
December 3, 2025 by Gary Price

The Public Interest Corpus Releases Principles and Goals

From a Blog Post (via Authors Alliance):

Today, we are pleased to release The Public Interest Corpus Principles and Goals. This release builds on the recap of our final planning workshop and anticipates release of our final deliverable later this month.

[Clip]

The Public Interest Corpus works with a growing coalition of stakeholders to develop a service that advances the library community’s ability to support the responsible use of their collections for AI research and development and computational research more generally. The initial focus of the service is on a corpus development, discovery, and access solution for books data (digitized and/or born digital text with metadata) at scale. Some estimates suggest that ~162,000,000 books have been created globally, with ~2,200,000 new books published each year. Collectively, libraries steward the most comprehensive source of human inquiry recorded in book form.

[Clip]

What principles guide The Public Interest Corpus? 

  1. The Public Interest Corpus … advances equitable access to books data for small, medium, and large organizations.
  2. The Public Interest Corpus … supports AI research and development and computational research that addresses public interest challenges (e.g., fighting misinformation, advancing understanding of the past and present, fostering a more informed citizenry).
  3. The Public Interest Corpus … addresses corpus limitations (e.g., linguistic bias, outmoded forms of knowledge present in the corpus, and data quality) through production of additional metadata in line with efforts like the Hugging Face Model Card and Data Nutrition Label.
  4. The Public Interest Corpus … commits to transparency with respect to corpus composition, modification, and agreements in order to increase public trust in research that makes use of the corpus.
  5. The Public Interest Corpus … values the labor of content creators and works to ensure that their work is recognized through promotion of credit and attribution practices.
  6. The Public Interest Corpus … adopts practices and infrastructure that aim to reduce the environmental impact of corpus development, discovery, and access.
  7. The Public Interest Corpus … forms partnerships that concretely address long-term collective needs of academic libraries and the communities they serve (e.g., maximizing access, reducing legal encumbrances).
  8. The Public Interest Corpus … is fundamentally guided by diverse stakeholders including but not limited to researchers, librarians, publishers, authors, and technologists.

What goals should The Public Interest Corpus work to achieve?

  1. Coordinate books data sourcing, discovery, and access across small, medium, and large organizations.
  2. Create cost efficiencies in access to books data.
  3. Minimize legal risk for those that seek to provide or make use of books data.
  4. Curate and provide access to fit-for-purpose books data that exceeds in quality and comprehensiveness what is otherwise available.
  5. Ensure consistent corpus growth and refinement over time in alignment with user community needs.
  6. Identify and adopt scalable author credit and attribution methods for authors and rights holders to track reuse.
  7. Deliver minimum viable solutions.
  8. Adopt a fit-for-purpose governance model.
  9. Develop a sustainability model that reduces barriers to books data access for small, medium, and large organizations on an ongoing basis.

Direct to Complete Blog Post

Filed under: Academic Libraries, Associations and Organizations, Companies (Publishers/Vendors), Data Files, Libraries, News

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.


© 2026 Library Journal. All rights reserved.