January 23, 2022

White Paper: “Privacy-Preserving Data Analysis for the Federal Statistical Agencies”

The following white paper was published yesterday by the Computing Community Consortium (CCC).


Privacy-Preserving Data Analysis for the Federal Statistical Agencies


John Abowd

Lorenzo Alvisi

Cynthia Dwork

Sampath Kannan

Ashwin Machanavajjhala

Jerome Reiter


Prepared for the Computing Community Consortium Committee of the Computing Research Association

From the Paper

Government statistical agencies collect enormously valuable data on the nation’s population and business activities. Wide access to these data enables evidence-based policy making, supports new research that improves society, facilitates training for students in data science, and provides resources for the public to better understand and participate in their society. These data also affect the private sector. For example, the Employment Situation in the United States, published by the Bureau of Labor Statistics, moves markets. Nonetheless, government agencies are under increasing pressure to limit access to data because of a growing understanding of the threats to data privacy and confidentiality.

“De-identification” — stripping obvious identifiers like names, addresses, and identification numbers — has been found inadequate in the face of modern computational and informational resources (Sweeney 2007; Narayanan and Shmatikov 2006; Narayanan and Shmatikov 2010; Sweeney 2013; see also the report of the President’s Council of Advisors on Science and Technology 2014).

Unfortunately, the problem extends even to the release of aggregate data statistics (Dinur and Nissim 2003; Dwork, McSherry, and Talwar 2007; Homer et al. 2008; Kasiviswanathan, Rudelson, Smith, and Ullman 2010; De 2012; Kasiviswanathan, Rudelson, and Smith 2013; Muthukrishnan and Nikolov 2012; Dwork et al. 2015). This counter-intuitive phenomenon has come to be known as the Fundamental Law of Information Recovery. It says that overly accurate estimates of too many statistics can completely destroy privacy. One may think of this as death by a thousand cuts.
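To make the "too many overly accurate statistics" idea concrete, here is a minimal sketch of a simple differencing attack. It is an illustration only, not the reconstruction attacks from the papers cited above. The query interface `noisy_count`, the helper names, and the noise bound are all assumptions made for this example: an analyst asks how many of the first k records have a sensitive bit set, for every k, and each answer is perturbed by at most `max_noise`.

```python
import random


def noisy_count(secret_bits, k, max_noise, rng):
    """Hypothetical query: count of 1s among the first k records, plus bounded noise."""
    true_count = sum(secret_bits[:k])
    return true_count + rng.uniform(-max_noise, max_noise)


def reconstruct(answers):
    """Guess each record's bit by differencing consecutive prefix counts and rounding."""
    recovered = []
    previous = 0.0  # the count over zero records is known to be 0
    for answer in answers:
        recovered.append(1 if answer - previous >= 0.5 else 0)
        previous = answer
    return recovered


def run_attack(n=200, max_noise=0.2, seed=0):
    """Return the fraction of secret bits the attacker recovers correctly."""
    rng = random.Random(seed)
    secret = [rng.randint(0, 1) for _ in range(n)]
    # One aggregate statistic per record: far "too many" for this much accuracy.
    answers = [noisy_count(secret, k, max_noise, rng) for k in range(1, n + 1)]
    guess = reconstruct(answers)
    return sum(s == g for s, g in zip(secret, guess)) / n
```

In this toy setting, whenever the noise on each answer stays below 0.25, the error in each difference stays below 0.5, so rounding recovers every individual bit even though only aggregate counts were released. Calling `run_attack(max_noise=0.2)` exercises that case; raising `max_noise` well above 0.25 makes the rounded differences unreliable, which is the accuracy-for-privacy trade the excerpt describes.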

Direct to Full Text (7 pages; PDF)

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.