“The Road to Preprints (Part 3): Metadata Matters”
In the third and final post of our “Road to Preprints” series, we’re chatting with PKP Associate Director of Research Juan Pablo Alperin to learn more about the Preprint Uptake and Use Project, a joint research initiative between ASAPbio and the ScholCommLab that turned disappointing data into a metadata mission.
In 2019, ScholCommLab visiting scholars Mario Malički and Janina Sarol (under the supervision of PKP’s Juan Pablo Alperin) began analyzing preprint metadata to “better understand the status of preprint adoption and impact in specific research communities.” Mario and Janina looked at several preprint sources, including SHARE, OSF, bioRxiv, and arXiv. Their hope was to use data from these sources to answer questions such as “who publishes preprints?” and “how many preprints are published?” but the metadata they were mining turned out to be too unreliable to support those answers. Incomplete, incorrect, and inconsistent metadata (e.g., author, subject, date) was so pervasive that they couldn’t simply exclude the problematic entries from their analysis.
Despite these challenges, the team persevered with their analysis, coming up with suggestions along the way for preprint systems to improve their metadata. To learn more, including what this research meant – and will mean – for Open Preprint Systems (OPS), we asked Juan to share more about their unexpected findings.
Read the Complete Interview
About Gary Price
Gary Price (email@example.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.