May 27, 2022

Conference Paper: “Detecting US Federal Documents to Expand Access” (HathiTrust Fed. Docs Program)

The following paper will be presented at the IFLA World Library and Information Congress 2016, taking place August 13-19, 2016 in Columbus, OH.


Detecting US Federal Documents to Expand Access


Mike Furlough

Valerie Glenn



This paper reports on HathiTrust’s Federal Documents Program, which facilitates collective action to create a comprehensive digital collection of United States government publications issued by the Government Printing Office and other agencies. Most government information is now produced and distributed digitally, but US research libraries, especially those that participate in the Federal Depository Library Program, hold large numbers of historical print publications that are difficult to discover, find, and use.

In June 2016 HathiTrust held over 700,000 items identified as federal documents, but we know this to be only a fraction of what exists. Because of varied cataloging practices we have limited understanding of the number of federal documents at the title level, as well as the corresponding number of volumes, the number of pages, and their distribution across libraries in North America. All of these are important details necessary to plan comprehensive mass digitization of federal documents. A major component of HathiTrust’s program has been the development of the US Federal Documents Registry, envisioned as a reliable inventory of items published at the expense of the US government. The methodology employed for the Registry’s development includes extensive comparative bibliographic analysis, based upon more than 20 million records submitted by 40 libraries in response to a request from HathiTrust.

This paper describes methods of de-duplication, relationship-detection, and record consolidation. While many potential use cases exist for such a registry, its primary role is as a tool for identification of materials to be digitized among HathiTrust member libraries and in partnership with other agencies and groups.
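
To make the de-duplication and record-consolidation idea concrete, here is a minimal sketch in Python. It is not the method used for the Registry, which the abstract does not detail. It assumes each submitted record is a simple dictionary with hypothetical `title`, `year`, `library`, and optional `oclc` fields, and it groups records on a hypothetical match key (the OCLC number when present, otherwise the normalized title plus year).

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())


def match_key(record):
    """Hypothetical match key: OCLC number if present, else title + year."""
    oclc = record.get("oclc")
    if oclc:
        # Leading zeros vary between catalogs, so drop them before comparing.
        return ("oclc", oclc.lstrip("0"))
    return ("title-year", normalize_title(record["title"]), record.get("year"))


def consolidate(records):
    """Group records sharing a match key and merge each group into one entry."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec)

    merged = []
    for recs in groups.values():
        merged.append({
            "title": recs[0]["title"],          # keep the first title seen
            "year": recs[0].get("year"),
            "holders": sorted({r["library"] for r in recs}),
            "source_count": len(recs),          # how many records were merged
        })
    return merged


if __name__ == "__main__":
    sample = [
        {"title": "Annual Report of the Secretary.", "year": 1950, "library": "A"},
        {"title": "annual report of  the secretary", "year": 1950, "library": "B"},
        {"title": "Census of Population", "year": 1960, "library": "A", "oclc": "00123"},
        {"title": "Census of population: 1960", "year": 1960, "library": "C", "oclc": "123"},
    ]
    for entry in consolidate(sample):
        print(entry)
```

A real workflow would need far more careful matching, for example handling serials, multi-volume sets, and conflicting identifiers, but the sketch shows how consolidated entries can also record which libraries hold each title.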

Direct to Full Text Paper (10 pages; PDF)

See Also: HathiTrust US Federal Government Documents Initiative Web Page

See Also: Access/Search HathiTrust Digital Library: United States Federal Government Documents Registry

