White Paper: “Privacy-Preserving Data Analysis for the Federal Statistical Agencies”
The following white paper was published yesterday by the Computing Community Consortium (CCC).
Title
Privacy-Preserving Data Analysis for the Federal Statistical Agencies
Authors
John Abowd
Lorenzo Alvisi
Cynthia Dwork
Sampath Kannan
Ashwin Machanavajjhala
Jerome Reiter
Source
Prepared for the Computing Community Consortium Committee of the Computing Research Association
From the Paper
Government statistical agencies collect enormously valuable data on the nation’s population and business activities. Wide access to these data enables evidence-based policy making, supports new research that improves society, facilitates training for students in data science, and provides resources for the public to better understand and participate in their society. These data also affect the private sector. For example, the Employment Situation in the United States, published by the Bureau of Labor Statistics, moves markets. Nonetheless, government agencies are under increasing pressure to limit access to data because of a growing understanding of the threats to data privacy and confidentiality.
“De-identification” — stripping obvious identifiers like names, addresses, and identification numbers — has been found inadequate in the face of modern computational and informational resources (Sweeney 2007; Narayanan and Shmatikov 2006; Narayanan and Shmatikov 2010; Sweeney 2013; see also the report of the President’s Council of Advisors on Science and Technology 2014).
Unfortunately, the problem extends even to the release of aggregate data statistics (Dinur and Nissim 2003; Dwork, McSherry, and Talwar 2007; Homer et al. 2008; Kasiviswanathan, Rudelson, Smith, and Ullman 2010; De 2012; Kasiviswanathan, Rudelson, and Smith 2013; Muthukrishnan and Nikolov 2012; Dwork et al. 2015). This counter-intuitive phenomenon has come to be known as the Fundamental Law of Information Recovery. It says that overly accurate estimates of too many statistics can completely destroy privacy. One may think of this as death by a thousand cuts.
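To make the "death by a thousand cuts" idea concrete, here is a toy Python sketch that is not taken from the paper. It assumes a hypothetical curator that answers counting queries with small bounded noise and places no limit on the number of queries. The names `noisy_count` and `reconstruct`, the noise level, and the repeat count are all illustrative assumptions. The attack shown is simple averaging and differencing, not the reconstruction attacks in the cited literature, but it illustrates the same point: enough accurate aggregate answers can reveal individual records.

```python
import random


def noisy_count(bits, members, rng, noise=2.0):
    """Hypothetical curator: count of members with the sensitive attribute,
    perturbed by uniform noise in [-noise, +noise]."""
    true_count = sum(bits[i] for i in members)
    return true_count + rng.uniform(-noise, noise)


def reconstruct(bits, rng, repeats=400, noise=2.0):
    """Toy attacker: repeat each query many times so the noise averages out,
    then difference 'everyone' against 'everyone except the target'."""
    n = len(bits)

    def averaged(members):
        answers = (noisy_count(bits, members, rng, noise) for _ in range(repeats))
        return sum(answers) / repeats

    total = averaged(list(range(n)))
    guesses = []
    for target in range(n):
        others = [i for i in range(n) if i != target]
        # The gap between the two averaged counts approximates the target's bit.
        guesses.append(1 if total - averaged(others) > 0.5 else 0)
    return guesses


rng = random.Random(0)  # fixed seed so the run is repeatable
secret = [rng.randint(0, 1) for _ in range(50)]  # made-up sensitive bits
recovered = reconstruct(secret, rng)
accuracy = sum(g == s for g, s in zip(recovered, secret)) / len(secret)
print(f"fraction of records recovered: {accuracy:.2f}")
```

Each individual answer in this sketch is off by up to two people, yet the attacker still recovers nearly every record, because the number of queries is unbounded and the noise is independent across repeats.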
Direct to Full Text (7 pages; PDF)