May 16, 2022

Journal Article: “Open Data and Data Sharing in Articles About COVID-19 Published in Preprint Servers medRxiv and bioRxiv”

The article linked below was published yesterday by Scientometrics.


Open Data and Data Sharing in Articles About COVID-19 Published in Preprint Servers medRxiv and bioRxiv


Josip Strcic
Catholic University of Croatia

Antonia Civljak
Specialist Family Medicine Clinic Dr. Ljiljana Lipovac-Francuz, Croatia

Terezija Glozinic
Catholic University of Croatia

Rafael Leite Pacheco
Hospital Sírio-Libanês, Universidade Federal de São Paulo (Unifesp)
Centro Universitário São Camilo (CUSC)

Tonci Brkovic
University Hospital Split, Split, Croatia

Livia Puljak
Center for Evidence-Based Medicine and Health Care, Catholic University of Croatia


Scientometrics (2022)
DOI: 10.1007/s11192-022-04346-1


This study aimed to analyze the content of data availability statements (DAS) and the actual sharing of raw data in preprint articles about COVID-19. The study combined a bibliometric analysis and a cross-sectional survey. We analyzed preprint articles on COVID-19 published on medRxiv and bioRxiv from January 1, 2020 to March 30, 2020. We extracted data sharing statements, tried to locate raw data when authors indicated they were available, and surveyed authors. The authors were surveyed in 2020–2021. We surveyed authors whose articles did not include DAS, who indicated that data are available on request, or their manuscript reported that raw data are available in the manuscript, but raw data were not found. Raw data collected in this study are published on Open Science Framework. We analyzed 897 preprint articles. There were 699 (78%) articles with Data/Code field present on the website of a preprint server. In 234 (26%) preprints, data/code sharing statement was reported within the manuscript. For 283 preprints that reported that data were accessible, we found raw data/code for 133 (47%) of those 283 preprints (15% of all analyzed preprint articles). Most commonly, authors indicated that data were available on GitHub or another clearly specified web location, on (reasonable) request, in the manuscript or its supplementary files. In conclusion, preprint servers should require authors to provide data sharing statements that will be included both on the website and in the manuscript. Education of researchers about the meaning of data sharing is needed.
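The percentages reported in the abstract can be re-derived from the counts it states. The following is a minimal sketch that uses only the figures quoted above; the constant names and the `pct` helper are illustrative assumptions and do not come from the article or its Open Science Framework dataset.

```python
# Counts quoted in the abstract above.
TOTAL_PREPRINTS = 897               # preprints analyzed
WITH_DATA_CODE_FIELD = 699          # Data/Code field present on the server website
WITH_STATEMENT_IN_MANUSCRIPT = 234  # sharing statement reported within the manuscript
REPORTED_ACCESSIBLE = 283           # preprints reporting that data were accessible
RAW_DATA_FOUND = 133                # preprints for which raw data/code were located


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Hypothetical labels, chosen here only for readability.
shares = {
    "Data/Code field on website": pct(WITH_DATA_CODE_FIELD, TOTAL_PREPRINTS),
    "statement in manuscript": pct(WITH_STATEMENT_IN_MANUSCRIPT, TOTAL_PREPRINTS),
    "raw data found, of those reporting access": pct(RAW_DATA_FOUND, REPORTED_ACCESSIBLE),
    "raw data found, of all preprints": pct(RAW_DATA_FOUND, TOTAL_PREPRINTS),
}

if __name__ == "__main__":
    for label, value in shares.items():
        print(f"{label}: {value}%")
```

Rounded to whole numbers, these shares come out to 78%, 26%, 47%, and 15%, matching the figures given in the abstract.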


Direct to Full Text Article

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors of ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.