May 28, 2022

HathiTrust Begins Project to Build Comprehensive U.S. Government Documents Registry & Other HT News

The September 2012 HathiTrust Update is now available online. It includes news about a new HT project, anticipated to last two years, that involves U.S. government documents.

HathiTrust has initiated a project to build a comprehensive registry of U.S. federal government documents. The Registry is an emerging effort in a broader undertaking by HathiTrust partners to improve access to U.S. federal government documents. Further information and background on the project are available on the Registry project page. A two-year term Government Documents Registry Analyst position for the project was posted in September.

The project page has more info.

The nature of the proposed work will be decided by a group and a process determined by the Board of Governors, but key elements include:

  • Facilitating “collective action to create a comprehensive digital corpus of U.S. federal publications including those issued by GPO and other federal agencies.”
  • Coordinating “operational plans and a business model to further and sustain coordinated digitization, ingest, and display of U.S. federal publications including those issued by GPO and other federal agencies.”
  • And “that HathiTrust develop a process to implement enhanced access protocols to fully realize the potential of a comprehensive corpus of U.S. federal publications including those issued by GPO and other federal agencies.”

Perhaps the most significant impediment to accomplishing the goal of creating a comprehensive corpus of US federal publications is the absence of a reliable inventory of items in the corpus. In discussions related to digitizing government documents, participants recognize that even the promising inductive strategy of relying on the catalogs of regional depository libraries falls short. Many or perhaps all regional depository libraries have not cataloged their collections comprehensively; records exist at the bibliographic level rather than the volume level (e.g., more than 7,000 volumes corresponded to five bibliographic records submitted by Michigan’s Law Library); many US federal government publications are cataloged as serials but are regarded by users and librarians as monographs. In short, the fundamental chaos inherent in this non-inventory has led informed individuals in digitization discussions to produce estimates that range from 1.8m to 2.2m volumes, and to estimate average volume page counts from 60 pages to over 300 pages (a total range with a difference of more than 500m pages). Moreover, the absence of an inventory makes impossible tasks like correlating the more than 400,000 documents currently in HathiTrust with the total corpus, and coordinating collective effort across a group of institutions.
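
The "more than 500m pages" figure can be checked from the estimates quoted above. The following is a minimal sketch of that arithmetic, assuming the low end pairs the smallest volume estimate with the smallest page count and the high end pairs the largest volume estimate with 300 pages per volume. The source says "over 300", so the real upper bound would be higher still. The variable names are illustrative and do not come from HathiTrust.

```python
# Estimates quoted in the paragraph above (variable names are illustrative).
low_volumes, high_volumes = 1_800_000, 2_200_000
low_pages_per_volume = 60
high_pages_per_volume = 300  # the source says "over 300", so this is a floor

# Smallest and largest plausible page totals under those estimates.
low_total = low_volumes * low_pages_per_volume
high_total = high_volumes * high_pages_per_volume

# Width of the range between the two totals.
spread = high_total - low_total

print(f"{low_total:,} to {high_total:,} pages (spread: {spread:,})")
```

Under these assumptions the totals run from roughly 108m to 660m pages, a spread of about 552m, which is consistent with the "more than 500m" stated in the text.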

Other HT News Discussed in the September Update

+ 1st HathiTrust Research Center UnCamp Takes Place

+ Infrastructure Changes for Out of Print and Brittle

+ User Experience Advisory Group Continues Discussions About a New Homepage Design

+ First Phase of Improvements to Enhance Accessibility of HathiTrust Web Applications

More About These Items and More HathiTrust News in the September 2012 HT Update

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.