May 26, 2022

Final Report of TEXTUS Project: An Open-Source Platform To Help With Reuse of Cultural Heritage Materials

The Open Knowledge Foundation was the lead institution for the TEXTUS project.

From the Project Summary

The combination of freely available digital copies of public domain works, open bibliographic data and open source tools has the potential to revolutionise research in the humanities. However, there are currently numerous obstacles which mean that they are often underutilised by scholars and students in teaching and research.

From classic literary and cultural works, to letters, drafts, notes, and other historical documents, there is a huge amount of freely available public domain material that is highly relevant to scholars and students engaged in research in the humanities. But these works can be difficult to find, difficult to work with, and works by a given author may be scattered in a variety of locations. Search results may be confusing or unclear. Automated Optical Character Recognition of texts may be inaccurate or incomplete.

There are a growing number of open source tools for transcribing, translating and annotating texts. However, many of these are one-off projects and it may not be clear how to deploy the tools in relation to a given text or collection of texts. Academic awareness of these tools and their potential benefits to research is not great.

The overall goal of TEXTUS is to provide an open-source platform through which scholars and students are able to re-use the vast and expanding amount of digitised cultural heritage material now available through portals such as the Internet Archive, Wikisource and Project Gutenberg. The aim is to enable scholarly communities to easily establish their own instances of TEXTUS catering to their specific community and to enable developers to easily extend the platform itself and develop other apps and services for it.

TEXTUS Final Report

Direct to TEXTUS Project

Direct to OpenPhilosophy (First Deployment of TEXTUS)

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.