An open access version of the following paper was recently made available via arXiv. It was published in the Proceedings of the 26th ACM Conference on Hypertext & Social Media, April 2015.
Universitat Pompeu Fabra, Spain
In this paper we analyze how user-generated content (UGC) is created, challenging the well-known wisdom of crowds concept.
Although it is known that user activity in most settings follows a power law, that is, few people do a lot while most do nothing, there are few studies that characterize this activity well. In our analysis of datasets from two different social networks, Facebook and Twitter, we find that a small percentage of active users, and a much smaller percentage of all users, account for 50% of the UGC.
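To make the 50% figure concrete, here is a minimal sketch of how such a share could be measured. It is not the authors' method: the function name `smallest_share_for_half` and the toy post counts are hypothetical and chosen only for illustration.

```python
def smallest_share_for_half(posts_per_user, target=0.5):
    """Fraction of users, most active first, needed to cover `target` of all content."""
    total = sum(posts_per_user)
    if total == 0:
        raise ValueError("no content in the sample")
    running = 0
    # Walk through users from most to least active, accumulating their content.
    for rank, count in enumerate(sorted(posts_per_user, reverse=True), start=1):
        running += count
        if running >= target * total:
            return rank / len(posts_per_user)
    return 1.0


# Hypothetical toy data: posts per user, including users who never post.
sample = [50, 20, 10, 5, 5, 4, 3, 2, 1] + [0] * 11

share_of_all_users = smallest_share_for_half(sample)
# Restrict to users with at least one post to get the share of active users.
active = [count for count in sample if count > 0]
share_of_active_users = smallest_share_for_half(active)

print(f"share of all users:    {share_of_all_users:.2%}")
print(f"share of active users: {share_of_active_users:.2%}")
```

In this toy sample the share of all users is smaller than the share of active users, which mirrors the distinction the abstract draws between the two populations.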
We also analyze the dynamic behavior of the generation of this content and find that the set of most active users is quite stable over time. Moreover, we study the social graph, finding that those active users are highly connected among themselves. [Our emphasis] This implies that most of the wisdom comes from a few users, challenging the independence assumption needed to have a wisdom of crowds.
We also address the content that is never seen by anyone, which we call the digital desert, and which challenges the assumption that the content of every person should be taken into account in a collective decision.
We also compare our results with Wikipedia data, and we address the quality of UGC using an Amazon dataset.
In the end, our results are not surprising, as the Web is a reflection of our own society, where economic or political power is also in the hands of minorities.
Direct to Full Text Article (6 pages; PDF)