January 17, 2022

Roundup: Four Papers and Articles on Library and Online Privacy Issues

Here are four new articles/papers (three of them available full text) that focus on library and online privacy issues. We hope that one or more will be of interest to you, your colleagues, and library users.

Title: Privacy and Libraries
7 pages; PDF.

This paper by Martyn Wade will be presented later this month at the 2015 IFLA Annual Meeting/World Library and Information Congress in Cape Town, South Africa.

The information releases – or leaks – about surveillance by Edward Snowden in 2013 prompted a debate and awareness amongst the library and information profession about the issue of privacy. The paper briefly summarises the development of the concept of privacy until its acceptance today as a human right. The important role that library and information services have to play in ensuring individual privacy is described together with the role of librarians as advocates of privacy. The paper argues that the profession must take an ethical stance towards privacy, as reflected in professional codes of conduct and practice. The paper concludes that privacy is recognised as a human right and is important in every culture, and that the absence of privacy can have a chilling effect on both the citizen and society as a whole. Library and information services should respect this right and work in an ethical way when working to ensure citizens can enjoy and benefit from that right, but also should manage their own services in ways that respect their users’ privacy. Library and information workers must continue – as they have always done – to take a carefully thought out ethical stance to ensure that they also maintain their role of working to ensure freedom of access and freedom of expression.

Title: Web Tracking: Mechanisms, Implications, and Defenses
29 pages; PDF.
via arXiv

Here’s a recently posted paper by researchers in the Broadband Communications Research Group, Department of Computer Architecture, Universitat Politècnica de Catalunya in Barcelona.

This articles [sic] surveys the existing literature on the methods currently used by web services to track the user online as well as their purposes, implications, and possible user’s defenses. A significant majority of reviewed articles and web resources are from years 2012-2014. Privacy seems to be the Achilles’ heel of today’s web. Web services make continuous efforts to obtain as much information as they can about the things we search, the sites we visit, the people with who we contact, and the products we buy. Tracking is usually performed for commercial purposes.

We present 5 main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, or yet other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies. We also show how the users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is being used and its possible implications for the users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft).

For each of the tracking methods, we present possible defenses. Apart from describing the methods and tools used for keeping the personal data away from being tracked, we also present several tools that were used for research purposes – their main goal is to discover how and by which entity the users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible and easy to follow way. Finally, we present the currently proposed future approaches to track the user and show that they can potentially pose significant threats to the users’ privacy.
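
To make the fingerprinting idea mentioned in the abstract a bit more concrete, here is a minimal sketch in Python. It is not taken from the paper; the attribute names and values are hypothetical, and real trackers gather such values in the browser. It only illustrates the general principle that a combination of ordinary browser attributes can serve as an identifier without any cookie being stored.

```python
import hashlib


def fingerprint(attributes):
    """Hash a dict of browser attributes into a short identifier."""
    # Sort the keys so the same attributes always produce the same string.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute values, chosen only for illustration.
visit_one = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "UTC+1",
    "fonts": "Arial,Helvetica",
}
visit_two = dict(visit_one)                  # same browser, later visit, no cookies
other_user = dict(visit_one, screen="1366x768")

print(fingerprint(visit_one) == fingerprint(visit_two))   # True
print(fingerprint(visit_one) == fingerprint(other_user))  # False
```

The more attributes a site can read, the more likely the combination is unique to one browser, which is why clearing cookies alone does not defeat this kind of tracking.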

Title: It’s Almost Impossible to Stop Google and Facebook From Knowing About Your Health-Related Searches
via BGR

This article by Brad Reed discusses a March 2015 article in the Communications of the ACM titled, “Privacy Implications of Health Information Seeking on the Web” (paywalled).

Reed writes:

…a new study conducted by Tim Libert at the Annenberg School for Communication that found “91% of health-related pages relay the URL to third parties, often unbeknownst to the user, and in 70% of the cases, the URL contains sensitive information such as ‘HIV’ or ‘cancer’ which is sufficient to tip off these third parties that you have been searching for information related to a specific disease.”

Reed’s article also includes an embed of a video by the author of the ACM article, Tim Libert.

For a bit more about the research paper, see this February 2015 news release from the Annenberg School for Communication, “Your Privacy Online: Health Information at Serious Risk of Abuse.”

According to Libert, “Proving privacy harms is always a difficult task. However, this study demonstrates that data on online health information seeking is being collected by entities not subject to regulation oversight. This information can be inadvertently misused, sold, or even stolen. Clearly there is a need for discussion with respect to legislation, policies, and oversight to address health privacy in the age of the internet.”
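
The leak described in the passage Reed quotes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's method: the URL, the keyword list, and the function name `leaked_terms` are all hypothetical. It shows how a page address passed along in the HTTP `Referer` header can reveal a health topic to a third party.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; the study's actual term list is not given here.
SENSITIVE_TERMS = ("hiv", "cancer")


def leaked_terms(page_url, third_party_request_headers):
    """Return sensitive terms exposed when the Referer header relays page_url."""
    referer = third_party_request_headers.get("Referer", "")
    if referer != page_url:
        return []  # the page address was not passed along
    parsed = urlparse(page_url)
    text = (parsed.path + "?" + parsed.query).lower()
    return [term for term in SENSITIVE_TERMS if term in text]


# A made-up page address and the headers of a request to a third-party server.
page = "https://health.example/conditions/hiv-symptoms"
headers = {"Referer": page}

print(leaked_terms(page, headers))  # ['hiv']
print(leaked_terms(page, {}))       # []
```

Browsers typically attach this header automatically when a page loads images, scripts, or ads from another server, which is why the relay can happen without the user noticing.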

Title: Your Actions Tell Where You Are: Uncovering Twitter Users in a Metropolitan Area
11 pages; PDF.
via arXiv

This paper (at times highly technical) by researchers at Arizona State and the U. of Hawaii has been accepted by the IEEE Conference on Communications and Network Security (CNS) 2015.


Twitter is an extremely popular social networking platform. Most Twitter users do not disclose their locations due to privacy concerns. Although inferring the location of an individual Twitter user has been extensively studied, it is still missing to effectively find the majority of the users in a specific geographical area without scanning the whole Twittersphere, and obtaining these users will result in both positive and negative significance. In this paper, we propose LocInfer, a novel and lightweight system to tackle this problem. LocInfer explores the fact that user communications in Twitter exhibit strong geographic locality, which we validate through large-scale datasets. Based on the experiments from four representative metropolitan areas in U.S., LocInfer can discover on average 86.6% of the users with 73.2% accuracy in each area by only checking a small set of candidate users. We also present a countermeasure to the users highly sensitive to location privacy and show its efficacy by simulations.

NOTE: The Electronic Frontier Foundation (EFF) just released version 1.0 of Privacy Badger, a free add-on tool that helps stop third-party tracking across the web. 

As the FAQ points out, Privacy Badger is one of a number of tools to help reduce tracking of your movements across the web. Since each works a bit differently, you might want to consider using more than one. All of these tools also include documentation that can help you learn more about web tracking.


See Also: Conference Paper: “Cookies That Give You Away: The Surveillance Implications of Web Tracking” (June 6, 2015)

See Also: NoScript (Another Free Add-On Tool that You Might Want to Consider Using)
Btw, NoScript is part of the default Tor Browser download.

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.