May 22, 2022

Research Article: “Citation Needed? Wikipedia Bibliometrics During the First Wave of the COVID-19 Pandemic”

The article linked below was recently published by GigaScience.


Citation Needed? Wikipedia Bibliometrics During the First Wave of the COVID-19 Pandemic


Omer Benjakob
Université de Paris, INSERM

Rona Aviram
Université de Paris, INSERM

Jonathan Aryeh Sobel
Weizmann Institute of Science, Israel


Volume 11, 2022, giab095

DOI: 10.1093/gigascience/giab095


With the COVID-19 pandemic’s outbreak, millions flocked to Wikipedia for updated information. Amid growing concerns regarding an “infodemic,” ensuring the quality of information is a crucial vector of public health. Investigating whether and how Wikipedia remained up to date and in line with science is key to formulating strategies to counter misinformation. Using citation analyses, we asked which sources informed Wikipedia’s COVID-19–related articles before and during the pandemic’s first wave (January–May 2020).

Top sources used in the Wikipedia COVID-19 corpus: A) source types, B) news agencies, C) websites, and D) publishers from the COVID-19 corpus sources (per Wikipedia’s citation template terminology). Several denominations for the same institution are present in the raw data, as highlighted here with the example of WHO and World Health Organization. Source: 10.1093/gigascience/giab095
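The caption's point about one institution appearing under several names can be illustrated with a small counting sketch. This is not the authors' pipeline: the alias table, the function names, and the sample list below are all hypothetical, chosen only to show why "WHO" and "World Health Organization" need to be merged before counting top sources.

```python
from collections import Counter

# Hypothetical alias table (an assumption for illustration, not from the paper):
# maps lower-cased variant spellings to one canonical institution name.
ALIASES = {
    "who": "World Health Organization",
    "world health organization": "World Health Organization",
}


def canonical_name(raw):
    """Collapse extra whitespace, then map known variants to a canonical name."""
    cleaned = " ".join(raw.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def count_sources(names):
    """Count how often each institution appears after merging its variants."""
    return Counter(canonical_name(name) for name in names)


# Made-up sample of publisher fields as they might appear in raw citation data.
sample = ["WHO", "World Health Organization", "Reuters", "who ", "Reuters"]
counts = count_sources(sample)
```

Without the alias step, the same institution would be split across several rows of a "top sources" table and would rank lower than it should.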

We found that coronavirus-related articles referenced trusted media outlets and high-quality academic sources. Regarding academic sources, Wikipedia was found to be highly selective in terms of what science was cited. Moreover, despite a surge in COVID-19 preprints, Wikipedia had a clear preference for open-access studies published in respected journals and made little use of preprints. Building a timeline of English-language COVID-19 articles from 2001–2020 revealed a nuanced trade-off between quality and timeliness. It further showed how pre-existing articles on key topics related to the virus created a framework for integrating new knowledge. Supported by a rigid sourcing policy, this “scientific infrastructure” facilitated contextualization and regulated the influx of new information. Last, we constructed a network of DOI-Wikipedia articles, which showed the landscape of pandemic-related knowledge on Wikipedia and how academic citations create a web of shared knowledge supporting topics like COVID-19 drug development.
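To make the idea of a DOI-Wikipedia article network concrete, here is a minimal sketch, assuming the input is a list of (article title, DOI) pairs. The function name, the data layout, and the sample DOIs are hypothetical and are not taken from the paper. Two articles are linked whenever they cite the same DOI.

```python
from collections import defaultdict
from itertools import combinations


def shared_doi_links(citations):
    """Return {(article_a, article_b): set of DOIs cited by both articles}.

    `citations` is an iterable of (article_title, doi) pairs. DOIs are
    lower-cased before comparison because DOI names are case-insensitive.
    """
    articles_by_doi = defaultdict(set)
    for article, doi in citations:
        articles_by_doi[doi.lower()].add(article)

    links = defaultdict(set)
    for doi, articles in articles_by_doi.items():
        # Every pair of articles citing this DOI shares one edge.
        for a, b in combinations(sorted(articles), 2):
            links[(a, b)].add(doi)
    return dict(links)


# Made-up sample data for illustration only.
sample = [
    ("Article A", "10.1000/example.1"),
    ("Article B", "10.1000/example.1"),
    ("Article B", "10.1000/example.2"),
    ("Article C", "10.1000/EXAMPLE.2"),
    ("Article C", "10.1000/example.3"),
]
links = shared_doi_links(sample)
```

In this toy data, Article A and Article B are connected through one shared reference, as are Article B and Article C, while the DOI cited only by Article C creates no link.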

Understanding how scientific research interacts with the digital knowledge-sphere during the pandemic provides insight into how Wikipedia can facilitate access to science. It also reveals how, aided by what we term its “citizen encyclopedists,” it successfully fended off COVID-19 disinformation and how this unique model may be deployed in other contexts.

Direct to Full Text Article

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.