The following article (preprint) was recently shared by the authors on arXiv.
National Institute of Informatics/JFLI (France)
National Institute of Informatics (Japan)
Submitted December 19, 2017
Tracking is pervasive on the web. Third-party trackers acquire user data through information leaks from websites, and user browsing history through cookies and device fingerprinting. In response, several privacy protection techniques (e.g., the Ghostery browser extension) have been developed. To the best of our knowledge, our work is the first study that proposes a reliable methodology for comparing privacy protection techniques and extensively compares a wide set of them. Our contributions are the following.
First, we propose a robust methodology to compare privacy protection techniques when crawling many websites, and to quantify measurement error. To this end, we reuse the privacy footprint and apply the Kolmogorov-Smirnov test to browsing metrics. This test is likewise applied to HTML-based metrics to assess webpage quality degradation. To complement HTML-based metrics, we also design a manual analysis.
Second, we study the overlap in the resources blocked by the most popular browser extensions, and compare their performance using the proposed methodology. We show that protection techniques differ vastly in performance, and that the best of them exhibit a wide overlap.
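As a rough illustration of the comparison step, the two-sample Kolmogorov-Smirnov test can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the metric (third-party domains contacted per site), the sample values, and the function names are all hypothetical, and the critical value uses the standard large-sample approximation, which is only indicative for samples this small.

```python
from bisect import bisect_right
import math


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # The gap can only change at observed values, so checking those suffices.
    return max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for the two-sample KS statistic."""
    return math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))


# Hypothetical per-site counts of third-party domains contacted during a crawl.
baseline = [12, 15, 9, 20, 18, 14, 11, 16]   # no protection (made-up values)
protected = [3, 5, 2, 6, 4, 3, 5, 2]         # with an extension (made-up values)

d = ks_statistic(baseline, protected)
threshold = ks_critical_value(len(baseline), len(protected))
distributions_differ = d > threshold
print(d, threshold, distributions_differ)
```

Running the same function on two crawls made with the same configuration gives a sense of the measurement error: if the statistic stays below the threshold there, a larger value between two different configurations is less likely to be noise.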
Next, we analyze the impact of privacy protection techniques on webpage quality. We show that automated HTML-based analysis sometimes fails to expose quality reduction perceived by users.
Finally, we provide a set of usage recommendations for end-users and research recommendations for the scientific community. Ghostery and uBlock Origin provide the best trade-off between protection and webpage quality. Ghostery, however, requires a configuration step that is difficult for users. The RequestPolicy Continued and NoScript extensions offer the best protection but reduce webpage quality. Ghostery and uBlock Origin use manually built blocking lists, which are cumbersome to maintain.
Research efforts should focus on improving existing approaches that do not rely on blocking lists (such as Privacy Badger or MyTrackingChoices), and on automatically building reliable blocking lists.
Direct to Full Text Paper: (15 pages; PDF)