May 22, 2022

Crossref: 2022 Public Data File of More Than 134 Million Metadata Records Now Available

From a Crossref Blog Post by Patrick Polischuk:

In 2020 we released our first public data file, something we’ve turned into an annual affair supporting our commitment to the Principles of Open Scholarly Infrastructure (POSI). We’ve just posted the 2022 file, which can now be downloaded via torrent like in years past.

We aim to publish these in the first quarter of each year, though as you may notice, we’re a little behind our intended schedule. The reason for this delay was that we wanted to make critical new metadata fields available, including resource URLs and titles with markup.

Crossref metadata is always openly available via our API. We recommend you use this method to incrementally add new and updated records once you’re up and running with an annual public data file. If you’re interested in more frequent and regular “full-file” downloads, consider subscribing to our Metadata Plus program. Plus subscribers have access to monthly snapshots in JSON and XML formats.

Every year our metadata corpus grows. The 2020 file was 65GB and held 112 million records; 2021 came in at 102GB and 120 million records. This year the file weighs in at 160 GB and contains metadata for 134 million records, or all Crossref records registered up to and including April 30, 2022.

Learn More, Read the Complete Post

About Gary Price

Gary Price ( is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.