HathiTrust has an ambitious goal to build a comprehensive digital collection of U.S. federal documents distributed in print format. But what do we already have in our collective digital collection? And what can we learn about that collection? It is these questions that HathiTrust staff set out to answer in a project that we’ve called the “Federal Documents Collection Profile.” In January, we concluded an initial analysis of the U.S. Federal Documents collection as it existed on September 1, 2016. “Initial”, because this hadn’t been done before, and because we expect it to be the precursor of more robust collection analysis and comparisons to come. A goal of the project was to investigate a variety of metrics based on the data available to us in order to establish a baseline for reporting on the collection. We were cautiously optimistic that we would be able to characterize at least some aspects of the collection.
Because we did not limit our set to full view, we found that, in our federal documents collection, approximately 852,488 digital objects/documents are fully viewable in the U.S., while 117,827 are limited view/search only. Clearly, more investigation can be done here to understand this breakdown.
A brief look at usage metrics revealed “Library of Congress Catalogs 1976 V. 4” and “Annual Report of the Commissioner of Patents for 1916” in first and second place, as well as “A short guide to New Zealand” in ninth place, the latter apparently having gained fame by being discussed in a Reddit thread.
Direct to Full Text Report: U.S. Federal Documents in HathiTrust: A Collection Profile
15 pages; PDF.