May 18, 2022

New Conference Paper: “WhatsApp, Doc? A First Look at WhatsApp Public Group Data”

Here’s a new paper and dataset accepted at the 12th International AAAI (Association for the Advancement of Artificial Intelligence) Conference on Web and Social Media (ICWSM2018), which is scheduled to take place in California in June.


WhatsApp, Doc? A First Look at WhatsApp Public Group Data


Kiran Garimella
EPFL, Switzerland

Gareth Tyson
Queen Mary University, London


via arXiv


In this dataset paper, we describe our work on the collection and analysis of public WhatsApp group data. Our primary goal is to explore the feasibility of collecting and using WhatsApp data for social science research. We therefore present a generalisable data collection methodology, and a publicly available dataset for use by other researchers. To provide context, we perform statistical exploration to allow researchers to understand what public WhatsApp group data can be collected and how this data can be used. Given the widespread use of WhatsApp, our techniques to obtain public data and potential applications are important for the community.

Direct to Full Text Paper
8 pages; PDF.

Direct to Dataset

Direct to arXiv Entry

Media Coverage

WhatsApp Public Groups Can Leave User Data Vulnerable to Scraping (via Venture Beat)

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.