May 28, 2022

Research Paper: “Wisdom of the Crowd or Wisdom of a Few? An Analysis of Users’ Content Generation”

An open access version of the following paper was recently made available via arXiv. It was published in the Proceedings of the 26th ACM Conference on Hypertext & Social Media, April 2015.


Wisdom of the Crowd or Wisdom of a Few? An Analysis of Users’ Content Generation


Ricardo Baeza-Yates
Yahoo Labs

Diego Saez-Trumper
Universitat Pompeu Fabra, Spain


via arXiv


In this paper we analyze how user generated content (UGC) is created, challenging the well-known wisdom of crowds concept.

Although it is known that user activity in most settings follows a power law, that is, few people do a lot, while most do nothing, there are few studies that characterize this activity well. In our analysis of datasets from two different social networks, Facebook and Twitter, we find that a small percentage of active users, and a much smaller percentage of all users, accounts for 50% of the UGC.
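To make the 50% figure concrete, here is a minimal sketch of how such a share could be computed from per-user post counts. It is not the authors' method: the function name `share_of_users_for_half` and the toy counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def share_of_users_for_half(posts_per_user, target=0.5):
    """Smallest fraction of users whose combined posts reach `target` of all content."""
    total = sum(posts_per_user)
    if total == 0:
        raise ValueError("no content in the dataset")
    running = 0
    # Walk through users from most to least prolific, accumulating their posts.
    for rank, count in enumerate(sorted(posts_per_user, reverse=True), start=1):
        running += count
        if running >= target * total:
            return rank / len(posts_per_user)


# Hypothetical toy data: 20 users, 11 of whom never post.
counts = [50, 20, 10, 5, 5, 4, 3, 2, 1] + [0] * 11

# Share measured against all users, then against active users only.
share_all = share_of_users_for_half(counts)
share_active = share_of_users_for_half([c for c in counts if c > 0])
```

With these made-up numbers, one user out of 20 already produces half of the content, and the same user is one out of 9 active users, which mirrors the distinction the abstract draws between active users and all users.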

We also analyze the dynamic behavior of the generation of this content to find that the set of most active users is quite stable over time. Moreover, we study the social graph, finding that those active users are highly connected among themselves. [Our emphasis] This implies that most of the wisdom comes from a few users, challenging the independence assumption needed to have a wisdom of crowds.

We also address the content that is never seen by anyone, which we call the digital desert, and which challenges the assumption that every person's content should be taken into account in a collective decision.

We also compare our results with Wikipedia data and address the quality of UGC using an Amazon dataset.

In the end, our results are not surprising, as the Web is a reflection of our own society, where economic or political power is also in the hands of minorities.

Direct to Full Text Article (6 pages; PDF)

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.