May 24, 2022

Conference Paper: Mining Large Datasets for the Humanities

Here’s another paper that will be presented next month at the IFLA World Library and Information Congress (80th IFLA General Conference and Assembly) in Lyon, France.

infoDOCKET will continue to highlight and share papers from the IFLA Congress over the next month.


Mining Large Datasets for the Humanities


Peter Leonard
Yale University Library


International Federation of Library Associations/WLIC 2014


This paper considers how libraries can support humanities scholars in working with large digitized collections of cultural material. Although disciplines such as corpus linguistics have already made extensive use of these collections, fields such as literature, history, and cultural studies stand at the threshold of new opportunity.

Libraries can play an important role in helping these scholars make sense of big cultural data. In part, this is because many humanities graduate programs neither consider data skills a prerequisite, nor train their students in data analysis methods. As the ‘laboratory for the humanities,’ libraries are uniquely suited to host new forms of collaborative exploration of big data by humanists. But in order to do this successfully, libraries must consider three challenges:

1) How to evolve technical infrastructure to support the analysis, not just the presentation, of digitized artifacts.

2) How to work with data that may fall under both copyright and licensing restrictions.

3) How to serve as trusted partners with disciplines that have evolved thoughtful critiques of quantitative and algorithmic methodologies.

Direct to Full Text Paper (14 pages; PDF)

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.