January 16, 2022

New White Paper From ACRL: “Transforming Library Services for Computational Research with Text Data: Environmental Scan, Stakeholder Perspectives, and Recommendations for Libraries”

From the Association of College and Research Libraries:

ACRL announces the publication of a new white paper, Transforming Library Services for Computational Research with Text Data: Environmental Scan, Stakeholder Perspectives, and Recommendations for Libraries.

This report from the IMLS National Forum on Data Mining Research Using In-Copyright and Limited-Access Text Datasets seeks to build a shared understanding of the issues and challenges associated with the legal and socio-technical logistics of conducting computational research with text data. It captures preparatory activities leading up to the forum and its outcomes to (1) provide academic librarians with a set of recommendations for action and (2) establish a research agenda for the LIS community.

From the Executive Summary:

While responsibility for addressing the challenges of conducting text data mining (TDM) research with proprietary and IP-protected data does not fall solely on the shoulders of librarians, academic libraries have a key role to play in establishing a thriving scholarly ecosystem for TDM research. By working directly with researchers, communicating across units within the library, establishing campus-wide partnerships, and building coalitions with other external stakeholders, librarians can enact the recommendations outlined in this paper. In total, there are twenty-three recommendations organized along the six dimensions described below:

Fair use and licensing. Where possible, avoid agreeing to license terms that limit the use of public domain materials or otherwise limit fair use, including TDM. The terms of institutional licenses should also be clearly communicated and shared with the community to which they apply, while individual licenses from scholars’ data acquisition should be collected and stored to improve institutional memory and avoid duplicative effort.

Communication, outreach, and instruction. Adopt a “collections as data” mind-set and facilitate the use of digital data through both formal and informal training and instruction.

Workforce development. Establish text mining and legal literacies as core competencies for librarians working in the areas of digital scholarship and scholarly communication, create professional development opportunities for in-service professionals, and recruit information professionals with deep knowledge of TDM to work in academic libraries.

Research and governance. Convene a campus-wide task force to address issues of data governance and risk management, establish institutional workflows for acquiring and using proprietary data, clarify the role of librarians as mediators and facilitators empowered to support TDM research, and document case studies of TDM research from data acquisition through analysis and dissemination.

Advocacy. Work collaboratively with external stakeholders to develop a best practice guide for fair use in TDM, streamline scholar-initiated license negotiations, build awareness of TDM within scholarly and professional communities, and advocate for broad data access rights in matters of policy and legislation.

Infrastructure. Participate in standards-making efforts to establish shared strategies for data interchange, establish partnerships to support large-scale data storage and high-performance computing (HPC) initiatives with the library’s data, and explore opportunities for innovating data repositories to address the legal dimensions of data-intensive research along with its dissemination, preservation, and reuse.


Megan Senseney
University of Arizona
Eleanor Dickson Koehl
University of California, Los Angeles
Beth Sandore Namachchivaya
University of Waterloo
Bertram Ludäscher
University of Illinois at Urbana-Champaign

Direct to Full Text White Paper
61 pages; PDF.

Direct to Complete Blog Post

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.