January 28, 2022

Access to Information: Carnegie Mellon Performs First Large-Scale Analysis of "Soft" Censorship of Social Media in China

From the CMU News Release:

Researchers in Carnegie Mellon University’s School of Computer Science analyzed millions of Chinese microblogs, or “weibos,” to uncover a set of politically sensitive terms that draw the attention of Chinese censors. Individual messages containing the terms were often deleted at rates that could vary based on current events or geography.

The study is the first large-scale analysis of political content censorship in social media, a topic that drew attention and controversy earlier this year when Twitter announced a country-by-country policy for removing tweets that don’t comply with local laws.


The CMU study also showed high rates of weibo censorship in certain provinces. The phenomenon was particularly notable in Tibet, a hotbed of political unrest, where up to 53 percent of locally generated microblogs were deleted.

The study by Noah Smith, associate professor in the Language Technologies Institute (LTI); David Bamman, a Ph.D. student in LTI; and Brendan O’Connor, a Ph.D. student in the Machine Learning Department, appears in the March issue of First Monday, a peer-reviewed, online journal.


Many weibos with high deletion rates included terms and names known to be politically sensitive, such as Fang Binxing, the architect of the Great Firewall of China, and references to state propaganda. Others reflect sensitivity to events; a term meaning “to ask someone to resign,” which apparently referred to the minister of railways, became subject to deletion following the high-speed rail crash that killed 40 people in Wenzhou last July.

Censored terms are not always political. Following the March 2011 Fukushima nuclear disaster in Japan, weibos containing such politically innocuous terms as iodized salt and radioactive iodine had high deletion rates. The researchers believe these deletions were the result of government efforts to quash false rumors about the nuclear accident causing salt contamination.

Not all deletions are necessarily state-instigated censorship, the researchers noted. Spam and pornographic messages are also subject to deletion, just as they are in the United States.

Read the Complete News Release

Read the Full Text Article (via First Monday)


About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.