April 13, 2021

Research Article (Preprint): “Online Disinformation and the Role of Wikipedia”

The following article (preprint) was recently shared on arXiv.


Online Disinformation and the Role of Wikipedia


Diego Saez-Trumper
Wikimedia Foundation


via arXiv


The aim of this study is to find key areas of research that can be useful to fight against disinformation on Wikipedia. To address this problem we perform a literature review trying to answer three main questions: (i) What is disinformation? (ii)What are the most popular mechanisms to spread online disinformation? and (iii) Which are the mechanisms that are currently being used to fight against disinformation?

In all these three questions we take first a general approach, considering studies from different areas such as journalism and communications, sociology, philosophy, information and political sciences. And comparing those studies with the current situation on the Wikipedia ecosystem.

We found that disinformation can be defined as non-accidentally misleading information that is likely to create false beliefs. While the exact definition of misinformation varies across different authors, they tend to agree that disinformation is different from other types of misinformation, because it requires the intention of deceiving the receiver. A more actionable way to scope disinformation is to define it as a problem of information quality. In Wikipedia quality of information is mainly controlled by the policies of neutral point of view and verifiability.

The mechanisms used to spread online disinformation include the coordinated action of online brigades, the usage of bots, and other techniques to create fake content. Underresouced topics and communities are especially vulnerable to such attacks.  The usage of sock-puppets is one of the most important problems for Wikipedia.

The techniques used to fight against information on the internet, include manual fact checking done by agencies and communities, as well as automatic techniques to assess the quality and credibility of a given information. Machine learning approaches can be fully automatic or can be used as tools by human fact checkers. Wikipedia and especially Wikidata play double role here, because they are used by automatic methods as ground-truth to determine the credibility of an information, and at the same time (and for that reason) they are the target of many attacks. Currently, the main defense of Wikimedia projects against fake news is the work done by community members and especially by patrollers, that use mixed techniques to detect and control disinformation campaigns on Wikipedia.

We conclude that in order to keep Wikipedia as free as possible from disinformation, it’s necessary to help patrollers to early detect disinformation and assess the credibility of external sources. More research is needed to develop tools that use state-of-the-art machine learning techniques to detect potentially dangerous content, empowering patrollers to deal with attacks that are becoming more complex and sophisticated.

Direct to Full Text Article (Preprint)
17 pages; PDF. 

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at Ask.com, and is currently a contributing editor at Search Engine Land.