May 22, 2022

HathiTrust Releases Zephir, a New Bibliographic Metadata Management System

From HT:

HathiTrust is pleased to announce the release of a new, state-of-the-art bibliographic management system for its 11-million-volume digital repository. The new system, called Zephir, is developed and managed by the California Digital Library, and represents the first distributed development of a major repository component outside the University of Michigan.


California Digital Library (CDL) receives and manages bibliographic records in Zephir that are associated with digital items to be deposited in HathiTrust. Zephir stores all versions of submitted records and selects the “best” record when records for a given title are submitted from multiple sources. The records are then exported for use in HathiTrust’s catalog, data feeds, and APIs.

Laine Farley, Executive Director of the California Digital Library, noted that decades of experience managing bibliographic data for distributed campuses made the University of California system, and the California Digital Library in particular, well suited to this work. “Bibliographic metadata is critical to users’ ability to find and use materials, but managing this metadata is challenging. We knew we could make a strong contribution that would enhance the user experience and maximize the potential for use and enhancement of HathiTrust bibliographic records.” Farley remarked that, as one of the founding institutions of HathiTrust, the California Digital Library is “proud to bring its expertise to further our collective work.”

Learn More About Zephir (History, Documentation, Timeline; via HT)

But That’s Not All From HathiTrust Today…

HathiTrust Announces Intention to Become A Digital Preservation Network Replicating Node

HathiTrust is pleased to announce its intention to become a “replicating node” in the Digital Preservation Network, contingent upon acceptable terms and conditions for doing so. HathiTrust is one of five institutions or consortia that have been collaborating over the past year to build a robust technological infrastructure for DPN, envisioned to be a broad, collaborative preservation safety net undergirding digital repositories, ensuring that “a single point of failure cannot jeopardize centuries of scholarship.”


DPN replicating nodes together will maintain a “dark archive” of digital content submitted by DPN contributing nodes. The content will be replicated among the nodes with appropriate metadata and policies in place such that under certain conditions, content could be “brightened” or made available at a future time irrespective of who deposited the content.

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.