December 9, 2017

Catalog Records: Library of Congress Releases 25 Million Free Records of Bibliographic Metadata

From LC:

The Library of Congress announced today that it is making 25 million records in its online catalog available for free bulk download at loc.gov/cds/products/marcDist.php.

This is the largest release of digital records in the Library’s history.

The records also will be easily accessible at data.gov. This is the first time a legislative branch agency has made its products available on the open-government website hosted by the General Services Administration (GSA).

Until now, these bibliographic records have only been available individually or through a paid subscription.

The Library is also joining with George Washington University and George Mason University to host a Hack-to-Learn workshop Wednesday, May 17 through Thursday, May 18, which will bring together librarians, digital researchers and coders to explore how the data (and other interesting data sets) can be used.

The new, free service will operate in parallel with the Library’s fee-based MARC Distribution Service, which is used extensively by large commercial customers and libraries.

All records use the MARC (Machine Readable Cataloging Records) format, which is the international standard maintained by the Library of Congress with participation and support of libraries and librarians worldwide for the representation and communication of bibliographic and related information in machine-readable form.

The data covers a wide range of Library items including books, serials, computer files, manuscripts, maps, music and visual materials.

The free data sets cover more than 45 years, ranging from 1968, during the early years of MARC, to 2014. Each record provides standardized information about an item, including the title, author, publication date, subject headings, genre, related names, summary and other notes.
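As a rough illustration of how such a record carries its title, subject headings and other fields, the sketch below builds one synthetic record in memory and reads it back, using the ISO 2709 layout that MARC 21 files follow (a 24-byte leader, a directory of 12-byte entries, then the field data). The sample values and the helper names build_record, parse_record and first_subfield are hypothetical, and UTF-8 encoding is an assumption. The files in the actual download may use a different character encoding, and a maintained MARC library would be a better choice for real work.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = b"\x1f"


def build_record(fields):
    """Assemble a synthetic ISO 2709 record from (tag, body_bytes) pairs."""
    directory = b""
    data = b""
    for tag, body in fields:
        field = body + FIELD_TERMINATOR
        # Directory entry: 3-char tag, 4-digit field length, 5-digit start offset.
        directory += f"{tag}{len(field):04d}{len(data):05d}".encode("ascii")
        data += field
    base_address = 24 + len(directory) + 1  # leader + directory + its terminator
    total_length = base_address + len(data) + 1  # plus the record terminator
    leader = f"{total_length:05d}nam a22{base_address:05d} a 4500".encode("ascii")
    return leader + directory + FIELD_TERMINATOR + data + RECORD_TERMINATOR


def parse_record(raw):
    """Return control fields as (tag, value) and data fields as
    (tag, indicators, [(subfield_code, value), ...])."""
    base_address = int(raw[12:17])  # leader bytes 12-16: where field data starts
    directory = raw[24:base_address - 1]
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        # Drop the trailing field terminator from the body.
        body = raw[base_address + start:base_address + start + length - 1]
        if tag.startswith("00"):
            # Control fields (001-009) have no indicators or subfields.
            fields.append((tag, body.decode("utf-8")))
        else:
            indicators = body[:2].decode("utf-8")
            subfields = [
                (chunk[:1].decode("utf-8"), chunk[1:].decode("utf-8"))
                for chunk in body[2:].split(SUBFIELD_DELIMITER)
                if chunk
            ]
            fields.append((tag, indicators, subfields))
    return fields


def first_subfield(fields, tag, code):
    """Return the first value of subfield `code` in field `tag`, or None."""
    for field in fields:
        if field[0] == tag and len(field) == 3:
            for sub_code, value in field[2]:
                if sub_code == code:
                    return value
    return None


# Synthetic sample data, not taken from the Library's files.
SAMPLE_FIELDS = [
    ("001", b"sample0001"),  # control number
    ("245", b"10" + SUBFIELD_DELIMITER + b"aAn example title /"
            + SUBFIELD_DELIMITER + b"cA. Author."),  # title statement
    ("650", b" 0" + SUBFIELD_DELIMITER + b"aLibraries."),  # topical subject
]

record = build_record(SAMPLE_FIELDS)
parsed = parse_record(record)
print(first_subfield(parsed, "245", "a"))
print(first_subfield(parsed, "650", "a"))
```

Tags 245 and 650 are the MARC 21 fields for the title statement and topical subject headings. The same lookup pattern would apply to other fields named in the announcement, such as author or summary, given their tags.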

The Hack-to-Learn workshop will bring together experts and enthusiasts to learn more about available research tools and to conduct hands-on exploration of largely unexplored data sets, including the 25 million MARC records; 52,000 index cards of jokes from the Phyllis Diller Gag File; and 8,000 documents from Eleanor Roosevelt’s “My Day” columns. For more information about the Hack-to-Learn, visit digitalpreservation.gov/meetings/hack-to-learn/hack-to-learn-site.html.

See Also: FAQ: MDSConnect (Open-access Version of LC’s MARC Records)
2 pages; PDF.

See Also: MDSConnect “Getting Started” Guide
2 pages; PDF.

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
