Jack H. Culbert, GESIS – Leibniz Institute for the Social Sciences
Anne Hobert, Göttingen State and University Library
Najko Jahn, Göttingen State and University Library
Nick Haupka, Göttingen State and University Library
Marion Schmidt, German Center for Higher Education Research and Science Studies (DZHW)
Paul Donner, German Center for Higher Education Research and Science Studies (DZHW)
Philipp Mayr, GESIS – Leibniz Institute for the Social Sciences
Source
Scientometrics (2025)
DOI:10.1007/s11192-025-05293-3
Abstract
OpenAlex is a promising open source of scholarly metadata, and competitor to established proprietary sources, such as the Web of Science and Scopus. As OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced in the community without licensing barriers. However, as OpenAlex is a rapidly evolving source and the data contained within is expanding and also quickly changing, the question naturally arises as to the trustworthiness of its data. In this report, we will study the reference coverage and selected metadata within each database and compare them with each other to help address this open question in bibliometrics. In our large-scale study, we demonstrate that, when restricted to a cleaned dataset of 16.8 million recent publications shared by all three databases, OpenAlex has average source reference numbers and internal coverage rates comparable to both Web of Science and Scopus. We further analyse the metadata in OpenAlex, the Web of Science and Scopus by journal, finding a similarity in the distribution of source reference counts in the Web of Science and Scopus as compared to OpenAlex. We also demonstrate that the comparison of other core metadata covered by OpenAlex shows mixed results when broken down by journal, where OpenAlex captures more ORCID identifiers, fewer abstracts and a similar number of Open Access status indicators per article when compared to both the Web of Science and Scopus.
Venn diagram of the intersection sizes of unique DOIs in each database, based on exact DOI match (without deduplication, i.e. cases of DOIs that have been assigned to multiple papers are now kept in the sets), for records published between 2015 and 2022. Source: 10.1007/s11192-025-05293-3
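To make the idea of "intersection sizes on exact DOI match" concrete, here is a minimal Python sketch of how the seven regions of a three-set Venn diagram could be counted. It is not the authors' code: the function name `venn_region_sizes` and the toy DOIs are invented for illustration, and the DOI strings are compared exactly as given, with no normalisation.

```python
def venn_region_sizes(openalex, wos, scopus):
    """Count the DOIs falling into each of the seven Venn regions.

    Matching is by exact string equality; no case folding or
    prefix stripping is applied (an assumption of this sketch).
    """
    a, b, c = set(openalex), set(wos), set(scopus)
    return {
        "openalex_only": len(a - b - c),
        "wos_only": len(b - a - c),
        "scopus_only": len(c - a - b),
        "openalex_wos_only": len((a & b) - c),
        "openalex_scopus_only": len((a & c) - b),
        "wos_scopus_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Toy DOIs, invented for illustration only (not real records).
openalex = {"10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"}
wos = {"10.1000/b", "10.1000/c", "10.1000/e"}
scopus = {"10.1000/c", "10.1000/d", "10.1000/e", "10.1000/f"}

sizes = venn_region_sizes(openalex, wos, scopus)
for region, count in sizes.items():
    print(f"{region}: {count}")
```

The seven regions are disjoint, so their counts add up to the number of distinct DOIs across the three sets. The "all_three" region corresponds to the kind of shared set the study restricts itself to before comparing reference counts and metadata.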
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area.
He earned his MLIS degree from Wayne State University in Detroit.
Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.