January 25, 2021

New Report: “How Apps on Android Share Data with Facebook (Even if You Don’t Have a Facebook Account)”

From Privacy International

This question of whether Facebook gathers information about users who are not signed in or do not have an account was raised in the aftermath of the Cambridge Analytica scandal by lawmakers in hearings in the United States and in Europe. Discussions, as well as previous fines by Data Protection Authorities about the tracking of non-users, however, often focus on the tracking that happens on websites. Much less is known about the data that the company receives from apps. For these reasons, in this report we raise questions about transparency and use of app data that we consider timely and important.

Facebook routinely tracks users, non-users and logged-out users outside its platform through Facebook Business Tools. App developers share data with Facebook through the Facebook Software Development Kit (SDK), a set of software development tools that help developers build apps for a specific operating system. Using the free and open source software tool called “mitmproxy”, an interactive HTTPS proxy, Privacy International has analyzed the data that 34 apps on Android, each with an install base from 10 to 500 million, transmit to Facebook through the Facebook SDK.

All apps were tested between August and December 2018, with the last re-test taking place between 3 and 11 December 2018. The full documentation, including the exact date each app was tested, can be found at https://privacyinternational.org/appdata.
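The interception approach described above can be illustrated with a minimal mitmproxy script. This is a sketch only, not Privacy International's actual tooling: the host suffix used to recognize Facebook traffic, the `captured` list, and the script file name are assumptions made for illustration. mitmproxy calls a module-level `request` hook for each intercepted request, and the hook below simply records requests whose host falls under the assumed suffix.

```python
# Minimal sketch of a mitmproxy script. It could be loaded with
# `mitmdump -s fb_filter.py` (the file name is hypothetical).

# Assumption: SDK traffic is sent to hosts under facebook.com.
# The summary above does not name the endpoint.
FACEBOOK_SUFFIXES = ("facebook.com",)

# Requests recorded during the session (illustrative storage only).
captured = []


def is_facebook_host(host: str) -> bool:
    """Return True if host is an assumed Facebook domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in FACEBOOK_SUFFIXES)


def request(flow) -> None:
    """Hook that mitmproxy calls for every intercepted HTTP(S) request."""
    req = flow.request
    if is_facebook_host(req.pretty_host):
        captured.append(
            {
                "host": req.pretty_host,
                "path": req.path,
                # strict=False avoids an exception on bodies that fail to decode
                "body": req.get_text(strict=False),
            }
        )
```

Matching on the full domain suffix rather than a substring avoids recording unrelated hosts whose names merely contain "facebook".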

Findings

  • We found that at least 61 percent of apps we tested automatically transfer data to Facebook the moment a user opens the app. This happens whether people have a Facebook account or not, or whether they are logged into Facebook or not.
  • Typically, the data that is automatically transmitted first is events data that communicates to Facebook that the Facebook SDK has been initialized by transmitting data such as “App installed” and “SDK Initialized”. This data reveals the fact that a user is using a specific app, every single time that user opens an app.
  • In our analysis, apps that automatically transmit data to Facebook share this data together with a unique identifier, the Google advertising ID (AAID). The primary purpose of advertising IDs, such as the Google advertising ID (or Apple’s equivalent, the IDFA), is to allow advertisers to link data about user behavior from different apps and web browsing into a comprehensive profile. If combined, data from different apps can paint a fine-grained and intimate picture of people’s activities, interests, behaviors and routines, some of which can reveal special category data, including information about people’s health or religion. For example, an individual who has installed the following apps that we have tested, “Qibla Connect” (a Muslim prayer app), “Period Tracker Clue” (a period tracker), “Indeed” (a job search app), and “My Talking Tom” (a children’s app), could potentially be profiled as likely female, likely Muslim, likely a job seeker, and likely a parent.
  • If combined, event data such as “App installed”, “SDK Initialized” and “Deactivate app” from different apps also offer a detailed insight into the app usage behavior of hundreds of millions of people.
  • We also found that some apps routinely send Facebook data that is incredibly detailed and sometimes sensitive. Again, this concerns data of people who are either logged out of Facebook or who do not have a Facebook account. A prime example is the travel search and price comparison app “KAYAK”, which sends detailed information about people’s flight searches to Facebook, including: departure city, departure airport, departure date, arrival city, arrival airport, arrival date, number of tickets (including number of children), class of tickets (economy, business or first class).
  • Facebook’s Cookies Policy describes two ways in which people who do not have a Facebook account can control Facebook’s use of cookies to show them ads. Privacy International has tested both opt-outs and found that they had no discernible impact on the data sharing that we have described in this report.
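The linking concern raised in the findings, that a shared advertising ID lets events from separate apps be joined into one profile, can be made concrete with a short sketch. The records below are entirely hypothetical: the advertising ID values and field names are invented for illustration, and only the app and event names come from the report summary. This is not a description of how Facebook processes the data.

```python
from collections import defaultdict

# Hypothetical event records, one per SDK transmission.
# The "ad_id" values and the field names are invented for illustration.
events = [
    {"ad_id": "hypothetical-id-1", "app": "Qibla Connect", "event": "App installed"},
    {"ad_id": "hypothetical-id-1", "app": "Period Tracker Clue", "event": "SDK Initialized"},
    {"ad_id": "hypothetical-id-1", "app": "Indeed", "event": "SDK Initialized"},
    {"ad_id": "hypothetical-id-1", "app": "My Talking Tom", "event": "App installed"},
    {"ad_id": "hypothetical-id-2", "app": "KAYAK", "event": "SDK Initialized"},
]


def apps_by_advertising_id(records):
    """Group app names by advertising ID, joining events from separate apps."""
    profiles = defaultdict(set)
    for record in records:
        profiles[record["ad_id"]].add(record["app"])
    # Sort app names so the result is deterministic.
    return {ad_id: sorted(apps) for ad_id, apps in profiles.items()}


profiles = apps_by_advertising_id(events)
```

Because every record carries the same identifier, no name or account is needed to connect the four apps to a single device: the advertising ID alone serves as the join key.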

Read the Complete Summary Blog Post

Direct to Full Text Report
51 pages; PDF.

See Also: VIDEO: Presentation Featuring Findings From Report
Recorded at the 35th Chaos Communication Congress (35C3).

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
