Research Article (Preprint): The Rise of AI-Generated Content in Wikipedia
The article (preprint) linked below was recently shared on arXiv.
Title
The Rise of AI-Generated Content in Wikipedia
Authors
Creston Brooks
Princeton University
Samuel Eggert
Princeton University
Denis Peskoff
Princeton University
Source
via arXiv
DOI: 10.48550/arXiv.2410.08044
Abstract
The rise of AI-generated content in popular information sources raises significant concerns about accountability, accuracy, and bias amplification. Beyond directly impacting consumers, the widespread presence of this content poses questions for the long-term viability of training language models on vast internet sweeps.
We use GPTZero, a proprietary AI detector, and Binoculars, an open-source alternative, to establish lower bounds on the presence of AI-generated content in recently created Wikipedia pages. Both detectors reveal a marked increase in AI-generated content in recent pages compared to those from before the release of GPT-3.5.
With thresholds calibrated to achieve a 1% false positive rate on pre-GPT-3.5 articles, detectors flag over 5% of newly created English Wikipedia articles as AI-generated, with lower percentages for German, French, and Italian articles. Flagged Wikipedia articles are typically of lower quality and are often self-promotional or partial towards a specific viewpoint on controversial topics.
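The calibration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic scores, and the convention that a higher score means "more likely AI-generated" are all assumptions made for illustration. The idea is to pick the cutoff on a set of pre-GPT-3.5 articles (treated as human-written) so that at most 1% of them are flagged, then apply that same cutoff to newer articles.

```python
import math


def calibrate_threshold(human_scores, target_fpr=0.01):
    """Pick a cutoff so that at most target_fpr of known-human scores exceed it.

    Assumes (hypothetically) that higher scores mean "more likely AI-generated".
    """
    if not human_scores:
        raise ValueError("need at least one calibration score")
    if not 0 <= target_fpr < 1:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(human_scores)
    n = len(ordered)
    # Largest number of calibration articles we may flag (small epsilon
    # guards against floating-point error in target_fpr * n).
    allowed = math.floor(target_fpr * n + 1e-9)
    # Scores strictly above this value are flagged; at most `allowed` are.
    return ordered[n - allowed - 1]


def flag_rate(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Synthetic, made-up detector scores for illustration only.
pre_gpt35_scores = [i / 200 for i in range(200)]
recent_scores = [0.5] * 90 + [0.99] * 10

cutoff = calibrate_threshold(pre_gpt35_scores, target_fpr=0.01)
print("cutoff:", cutoff)
print("flagged, pre-GPT-3.5:", flag_rate(pre_gpt35_scores, cutoff))
print("flagged, recent:", flag_rate(recent_scores, cutoff))
```

Because the cutoff is fixed using articles assumed to be human-written, any flag rate on newer articles that is well above the 1% baseline suggests AI-generated text, which is why the abstract frames its figures as lower bounds.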
Direct to Full Text Article
13 pages; PDF.
UPDATE 10/16: Added article/link below
On a related note… How Wikipedia is staying relevant in the AI era (via Fast Company)
About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.