September 22, 2021

New Research: “Your ‘Anonymized’ Web Browsing History May Not Be Anonymous”

From Princeton University:

Raising further questions about privacy on the internet, researchers from Princeton and Stanford universities have released a study showing that a specific person’s online behavior can be identified by linking anonymous web browsing histories with social media profiles.

“We show that browsing histories can be linked to social media profiles such as Twitter, Facebook or Reddit accounts,” the researchers wrote in a paper scheduled for presentation at the 2017 World Wide Web Conference in Perth, Australia, in April.

“It is already known that some companies, such as Google and Facebook, track users online and know their identities,” said Arvind Narayanan, an assistant professor of computer science at Princeton and one of the researchers involved in the project. But those companies, which consumers choose to create accounts with, disclose their tracking. The new research shows that anyone with access to browsing histories (a great number of companies and organizations) can identify many users by analyzing public information from social media accounts, Narayanan said.

“Users may assume they are anonymous when they are browsing a news or a health website, but our work adds to the list of ways in which tracking companies may be able to learn their identities,” said Narayanan, an affiliated faculty member at Princeton’s Center for Information Technology Policy.

[Clip]

In the article, the authors note that online advertising companies build browsing histories of users with tracking programs embedded on webpages. Some advertisers attach identities to these profiles, but most promise that the web browsing information is not linked to anyone’s identity. The researchers wanted to know whether it was possible to de-anonymize web browsing and identify a user even if the web browsing history did not include identities.

They decided to limit themselves to publicly available information. Social media profiles, particularly those that include links to outside webpages, offered the strongest possibility. The researchers created an algorithm to compare anonymous web browsing histories with links appearing in people’s public social media accounts, called “feeds.”

“Each person’s browsing history is unique and contains tell-tale signs of their identity,” said Sharad Goel, an assistant professor at Stanford and an author of the study.

The programs were able to find patterns among the different groups of data and use those patterns to identify users. The researchers note that the method is not perfect, and it requires a social media feed that includes a number of links to outside sites. However, they said that “given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50 percent of the time.”
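To make the matching idea concrete, here is a minimal Python sketch of comparing one anonymous history against a handful of public feeds. It is an illustration only and is not the researchers' algorithm: the function name `rank_profiles`, the scoring rule (each shared link counts for more when fewer feeds contain it), and the example URLs and profile names are all assumptions made for this example.

```python
from collections import Counter


def rank_profiles(history, feeds):
    """Rank candidate profiles by how well their feed links match a history.

    history: iterable of URLs from an anonymous browsing history.
    feeds: dict mapping a profile name to the set of URLs in its public feed.
    Returns a list of (profile, score) pairs, best match first.
    """
    visited = set(history)
    # Count how many feeds contain each link. A link that shows up in
    # few feeds says more about who visited it than a widely shared one.
    popularity = Counter(url for links in feeds.values() for url in set(links))
    scores = {}
    for profile, links in feeds.items():
        shared = visited & set(links)
        # Hypothetical scoring rule: each shared link contributes 1 / popularity.
        scores[profile] = sum(1.0 / popularity[url] for url in shared)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up example data (not from the study).
feeds = {
    "alice": {"news.example/a", "blog.example/x", "shop.example/1"},
    "bob": {"news.example/a", "sports.example/b"},
    "carol": {"health.example/c"},
}
history = ["news.example/a", "blog.example/x", "shop.example/1", "mail.example"]
ranking = rank_profiles(history, feeds)
print(ranking[0][0])  # prints "alice"
```

In this toy data the history shares three links with one feed and only a widely shared link with another, so the first profile ranks highest. Matching against hundreds of millions of real feeds, as the study describes, would need a far more careful statistical model than this overlap score.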

The researchers had even greater success in an experiment they ran involving 374 volunteers who submitted web browsing information. The researchers were able to identify more than 70 percent of those users by comparing their web browsing data to hundreds of millions of public social media feeds. (The number of original participants in the study was higher, but some users were eliminated because of technical problems in processing their information.)

Read the Complete Article

Direct to Full Text Research Paper: De-anonymizing Web Browsing Data with Social Networks
9 pages; PDF.

More Research

“Online Tracking: A 1-Million-Site Measurement and Analysis” (May 23, 2016)
Research Paper Co-Authored by Arvind Narayanan.

New Research/Report Provides “Archaeological Study” About Use of Third-Party Tracking Technology on the Web (August 24, 2016)

Privacy and Data Leaks: “Location Data on Two Apps Enough to Identify Someone, Says Study” (April 15, 2016)

Conference Paper: “Cookies That Give You Away: The Surveillance Implications of Web Tracking” (June 6, 2015)

Online Privacy: How a Button Found on Many Web Pages Might Be Very Hazardous to Your Online Privacy (July 22, 2014)

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
