DEV Community

Tiamat


Shadow Profiles: The Data Big Tech Collects on People Who Never Signed Up

You deleted your Facebook account in 2018. You've never had a TikTok. You use a VPN. You opted out of every data broker you could find.

Facebook still has a file on you.

It contains your phone number (uploaded by a contact), your email address (uploaded by three contacts), your approximate location (inferred from where your contacts live), your social graph (reconstructed from their connections), and behavioral predictions generated by comparing your inferred profile with those of similar users.

This is a shadow profile — and it's one of the most consequential privacy violations happening at scale, to hundreds of millions of people who have no idea it exists and no way to access or delete it.

How Shadow Profiles Are Built

The mechanism is straightforward and has been documented in detail through regulatory investigations and litigation.

Contact Upload — The Primary Vector

When someone installs WhatsApp, Facebook, LinkedIn, Instagram, or most messaging apps, the app requests access to their contacts. When the user grants access, the app uploads every phone number and email address in that address book to the platform's servers.

If your phone number is in 20 people's contact lists, Facebook has received your number from 20 separate uploads. The platform cross-references these uploads: if 20 people all have a contact with the same number, that's a strong signal of a real person. The platform constructs a record — phone number, name variations ("John," "John Smith," "J. Smith," "Dad"), email address if multiple contacts uploaded it, and the social graph of who uploaded it.
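A minimal sketch of that cross-referencing step, assuming uploads arrive as simple lists of contacts. The `ShadowRecord` class, the `build_records` function, and the sample data are all hypothetical, chosen for illustration; they are not taken from any platform's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowRecord:
    """Hypothetical record assembled for one phone number."""
    phone: str
    names: set = field(default_factory=set)
    emails: set = field(default_factory=set)
    uploaders: set = field(default_factory=set)


def normalize_phone(raw: str) -> str:
    """Keep digits only so differently formatted numbers compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


def build_records(uploads: dict) -> dict:
    """Merge address-book uploads into one record per phone number.

    `uploads` maps an uploader's ID to a list of contact dicts with
    'name', 'phone', and optional 'email' keys.
    """
    records = {}
    for uploader, contacts in uploads.items():
        for contact in contacts:
            phone = normalize_phone(contact["phone"])
            record = records.setdefault(phone, ShadowRecord(phone=phone))
            record.names.add(contact["name"])
            if contact.get("email"):
                record.emails.add(contact["email"].lower())
            record.uploaders.add(uploader)
    return records


uploads = {
    "alice": [{"name": "John Smith", "phone": "+1 (555) 010-0199", "email": "john@example.com"}],
    "bob": [{"name": "Dad", "phone": "1-555-010-0199"}],
    "carol": [{"name": "J. Smith", "phone": "15550100199", "email": "John@Example.com"}],
}
records = build_records(uploads)
john = records["15550100199"]
print(sorted(john.names), sorted(john.uploaders), sorted(john.emails))
```

Three uploads with three different name spellings collapse into a single record, and the set of uploaders doubles as a fragment of the person's social graph.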

This happens whether or not you have an account. Facebook confirmed the practice during Mark Zuckerberg's April 2018 congressional testimony, in which he acknowledged that Facebook collects data on people who have not signed up and framed it as a "security" measure.

In 2018, a Belgian court, ruling in a case brought by the country's privacy regulator, ordered Facebook to stop tracking non-users. Facebook complied in Belgium. Globally, the practice continued.

Tracking Pixels — The Web-Wide Surveillance Layer

The Facebook Pixel is a snippet of JavaScript code embedded on an estimated 30-44% of all websites (W3Techs, 2025). When you visit a page with the Pixel embedded, Facebook receives:

  • Your IP address
  • Your browser fingerprint (user agent, screen resolution, installed fonts, timezone)
  • The URL you visited
  • Any actions you took (add to cart, form submission, purchase)
  • Your Facebook account identity — if you're logged in anywhere

For non-users, Facebook builds a profile using the IP address and browser fingerprint. That profile is enriched with every subsequent Pixel encounter across thousands of websites. Even without a name or email, the behavioral fingerprint becomes a stable identifier.
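To illustrate why a fingerprint can act as an identifier without any name or email, here is a small sketch that hashes a few browser attributes into a short ID. The `fingerprint_id` function and the attribute values are hypothetical; real fingerprinting draws on many more signals and is not documented in this form by any platform.

```python
import hashlib


def fingerprint_id(user_agent: str, screen: str, timezone: str, fonts: list) -> str:
    """Hash browser attributes into a short identifier (illustrative only)."""
    # Sort fonts so enumeration order does not change the result.
    parts = [user_agent, screen, timezone, ",".join(sorted(fonts))]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]


UA = "Mozilla/5.0 (X11; Linux x86_64)"

# The same browser seen on two unrelated sites yields the same ID.
visit_news = fingerprint_id(UA, "1920x1080", "Europe/Berlin", ["Arial", "DejaVu Sans"])
visit_shop = fingerprint_id(UA, "1920x1080", "Europe/Berlin", ["DejaVu Sans", "Arial"])

# A browser with a different screen size yields a different ID.
other_browser = fingerprint_id(UA, "1366x768", "Europe/Berlin", ["Arial", "DejaVu Sans"])

print(visit_news == visit_shop)
print(visit_news == other_browser)
```

No cookie or login is involved: the attributes alone are enough to link the two visits, which is why blocking cookies by itself does not prevent this kind of matching.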

In May 2023, the Irish Data Protection Commission fined Meta €1.2 billion, the largest GDPR fine to date. That decision concerned transfers of EU users' personal data to the US rather than pixel tracking of non-users specifically. Meta appealed, and the underlying tracking has continued.

Google's Non-User Tracking

Google runs an equivalent system. Google Analytics is installed on approximately 57% of all websites (W3Techs, 2025). The Google Display Network serves ads on millions of sites. Google's tracking infrastructure reaches virtually every corner of the web.

For non-Google-account users, Google builds anonymous behavioral profiles indexed by cookie ID and IP address. These profiles influence ad targeting even without an authenticated identity. When someone eventually creates a Google account — or signs in anywhere with Google OAuth — the previously anonymous profile can be merged with the authenticated identity.
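The merge step can be sketched as follows, assuming a toy in-memory store. `ProfileStore`, its methods, and the sample identifiers are hypothetical names used only to show the idea of folding a cookie-keyed history into an account on first sign-in.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    events: list = field(default_factory=list)


class ProfileStore:
    """Toy store: anonymous profiles keyed by cookie ID, accounts keyed by account ID."""

    def __init__(self):
        self.anonymous = {}
        self.accounts = {}

    def record(self, cookie_id: str, event: str) -> None:
        """Append a browsing event to the cookie's anonymous profile."""
        self.anonymous.setdefault(cookie_id, Profile()).events.append(event)

    def sign_in(self, cookie_id: str, account_id: str) -> Profile:
        """Fold the cookie's prior history into the authenticated profile."""
        account = self.accounts.setdefault(account_id, Profile())
        prior = self.anonymous.pop(cookie_id, None)
        if prior is not None:
            account.events.extend(prior.events)
        return account


store = ProfileStore()
store.record("cookie-abc", "viewed /pricing")
store.record("cookie-abc", "searched 'knee pain'")
merged = store.sign_in("cookie-abc", "user-42")
print(merged.events)
```

Everything recorded before the sign-in becomes part of the named account's history at the moment the two identifiers meet.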

In 2022, national data protection authorities in Austria and France ruled that websites' use of Google Analytics violated GDPR because of data transfers to the US. Italy, Denmark, and Finland followed. Google Analytics 4 was partially redesigned to reduce data transfer, but the underlying behavioral collection continues.

LinkedIn and the Non-Member Profile

LinkedIn scrapes the public web to build records on professionals who haven't created accounts. When someone mentions a colleague in a public blog post, forum, or press release, LinkedIn's crawlers index that information. "You may know" suggestions on LinkedIn sometimes surface people who have never created an account — because LinkedIn has already built a partial profile from public data.

In 2022, LinkedIn settled a class action lawsuit over "off-LinkedIn" data collection — specifically, the use of tracking pixels on third-party websites to track non-members. The settlement: $1.75 million for approximately 40,000 affected users. The per-user payout was approximately $43.

WhatsApp — Metadata on Non-Users

WhatsApp (owned by Meta) collects metadata on non-users via the same contact upload mechanism as Facebook. Even if you've never downloaded WhatsApp, if multiple WhatsApp users have your number in their address book, Meta has:

  • Your phone number
  • The social graph of who communicates with whom
  • Call frequency and duration metadata for calls made to WhatsApp users
  • Your inferred location (from where your contacts are)

WhatsApp's stated privacy policy covers WhatsApp users. Non-user data collection falls into a regulatory gap — you can't review, correct, or delete data held about you in a product you've never used.

The Data Broker Enrichment Layer

Shadow profiles don't stay siloed in individual platforms. They flow into the broader data broker ecosystem.

Facebook's Partner Categories program (discontinued in 2018 after Cambridge Analytica, but partially resumed in different forms) allowed advertisers to bring in data from Acxiom, Experian, and other brokers to match against Facebook's user base — and its shadow profile database.

The matching process works in both directions:

  • Broker data is matched to Facebook profiles (authenticated and shadow) using phone numbers and email addresses as linkage keys
  • Facebook behavioral data is sold as audience segments back to the broker ecosystem

The result: your shadow profile on Facebook may contain your offline purchase history, income estimate, health conditions, and political affiliation — even though you never signed up.
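A rough sketch of linkage-key matching, assuming both sides normalize an email address and hash it with SHA-256 before comparing. The `linkage_key` and `enrich` functions and all sample attributes are hypothetical; the actual matching pipelines are not public.

```python
import hashlib


def linkage_key(identifier: str) -> str:
    """Normalize an email or phone string and hash it into a join key."""
    normalized = identifier.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def enrich(platform_profiles: dict, broker_rows: list) -> dict:
    """Attach broker attributes to profiles that share a linkage key.

    `platform_profiles` maps linkage key -> attribute dict.
    `broker_rows` is a list of dicts with an 'email' plus broker attributes.
    """
    enriched = {key: dict(attrs) for key, attrs in platform_profiles.items()}
    for row in broker_rows:
        key = linkage_key(row["email"])
        if key in enriched:
            extra = {k: v for k, v in row.items() if k != "email"}
            enriched[key].update(extra)
    return enriched


profiles = {linkage_key("john@example.com"): {"has_account": False, "uploaders": 3}}
broker = [
    {"email": " John@Example.com ", "income_band": "50-75k"},
    {"email": "nobody@example.org", "income_band": "100k+"},
]
result = enrich(profiles, broker)
print(result[linkage_key("john@example.com")])
```

The profile has no account behind it, yet it still picks up the broker's attributes because the hashed email matches.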

TikTok's Shadow Profile Infrastructure

TikTok's handling of location data drew scrutiny in 2022, when press reports based on leaked internal material described plans by employees of its parent company, ByteDance, to use the app's location data to monitor specific individuals, including journalists investigating the platform.

The TikTok Pixel, which supports ad products such as Spark Ads, functions similarly to Facebook's Pixel: it is embedded on third-party websites and collects behavioral data that flows to TikTok's servers even for non-users. TikTok's privacy policy acknowledges collecting "information from third parties" about non-users.

The FTC referred TikTok to the DOJ in 2024 for COPPA violations involving data collection on users under 13. The shadow profile implications for adult non-users remain largely uninvestigated.

GDPR — The Strongest Legal Response

The EU's General Data Protection Regulation establishes that processing personal data requires a legal basis. The options are: consent, legitimate interest, contractual necessity, legal obligation, vital interest, or public task. Collecting data on non-users via contact uploads or tracking pixels falls most naturally under "legitimate interest" — but the GDPR requires that legitimate interest not override individuals' privacy rights.

For shadow profiles specifically:

  • Meta's €1.2B fine (May 2023, Irish DPC): Unlawful transfers of EU users' personal data to the US
  • WhatsApp's €225M fine (2021, Irish DPC): Failure to disclose what data was collected on non-users and how it was processed
  • Facebook's Belgian order (2018): Cease tracking non-users. Facebook geo-blocked Belgian users rather than modify its global tracking infrastructure.

GDPR also establishes a right of access: individuals can request all data held about them. Meta's "off-Facebook activity" tool and data download tool nominally provide this — but they exclude shadow profile data held about non-users, because GDPR enforcement on non-users remains limited.

How to Find Your Shadow Profile — And What You Can Do

Facebook / Meta

  1. If you have an account: Download your data (Settings → Your Facebook Information → Download Your Information). Request "Off-Facebook Activity" — this shows what websites reported your visits to Facebook.
  2. If you don't have an account: Submit a data subject access request (DSAR) to Meta's EU privacy contact (or California residents can use CCPA rights). Meta is legally required to respond. In practice, responses for non-users are inconsistent.

Google

  1. Visit myactivity.google.com — if you have an account, review activity across all Google services.
  2. For non-users: Request data via Google's privacy request form. Results are limited.
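Either request can be drafted as plain text before it is pasted into a company's privacy form or email. The sketch below generates such a letter; `draft_access_request` and the wording are illustrative assumptions, not a legal template, and GDPR Article 15 applies only where GDPR does (California residents would cite their CCPA right to know instead).

```python
from datetime import date


def draft_access_request(full_name: str, company: str, identifiers: list, today: date) -> str:
    """Return a plain-text access request citing GDPR Article 15."""
    lines = [
        f"To: {company}, Data Protection Officer",
        f"Date: {today.isoformat()}",
        "",
        "Subject: Data subject access request (GDPR Article 15)",
        "",
        f"My name is {full_name}. I do not hold an account with {company}.",
        "Please confirm whether you process personal data relating to me and,",
        "if so, provide a copy of that data, its sources, and the purposes",
        "of processing.",
        "",
        "Identifiers that may appear in your systems:",
    ]
    lines += [f"  - {item}" for item in identifiers]
    lines += ["", "Sincerely,", full_name]
    return "\n".join(lines)


letter = draft_access_request(
    "Jane Doe",
    "Example Platform Ltd",
    ["jane@example.com", "+1 555 010 0199"],
    date(2025, 1, 15),
)
print(letter)
```

Listing the identifiers matters for non-users: without an account to authenticate against, a phone number or email is the only key the company can search on.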

The Practical Limit

Shadow profiles are difficult to access by design. The platforms have no user interface built for non-users. The data subject request process assumes you have an account to authenticate with. Regulatory enforcement is the primary mechanism — individual access is largely theoretical.

Defensive Measures

Ask your contacts not to upload their address books. The primary vector for shadow profiles is contact upload. Ask the people in your life to deny address book access to apps. Most apps don't strictly require it.

Browser fingerprint hardening:

# Firefox about:config hardening for shadow profile resistance
# privacy.resistFingerprinting = true
# network.cookie.cookieBehavior = 5 (Total Cookie Protection)
# privacy.firstparty.isolate = true is the older first-party isolation pref;
#   Total Cookie Protection largely supersedes it, so avoid enabling both
# Use uBlock Origin for pixel blocking (uMatrix is no longer maintained)
# Brave browser enables many of these by default

DNS-level tracking protection:

# NextDNS or Pi-hole can block tracking pixels at DNS level
# This prevents Facebook Pixel and Google Analytics from firing
# Even for non-users, this reduces shadow profile enrichment
nextdns install
nextdns config set --profile YOUR_PROFILE_ID
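One way to check that DNS-level blocking is actually in effect is to resolve a few tracker hostnames and see whether they come back as a sinkhole address or fail to resolve. This is a sketch under assumptions: the host list is a small illustrative sample, and it presumes the blocker answers with an unspecified or loopback address or with no answer at all, which depends on how NextDNS or Pi-hole is configured.

```python
import socket

# Illustrative sample, not an exhaustive blocklist.
TRACKER_HOSTS = ["connect.facebook.net", "www.google-analytics.com"]

# Addresses a DNS blocker commonly returns instead of the real one (assumption).
SINKHOLE_ADDRESSES = {"0.0.0.0", "127.0.0.1"}


def is_blocked(host: str, resolve=socket.gethostbyname) -> bool:
    """True if the host fails to resolve or resolves to a sinkhole address."""
    try:
        address = resolve(host)
    except OSError:  # includes socket.gaierror for lookup failures
        return True
    return address in SINKHOLE_ADDRESSES


def report(hosts=TRACKER_HOSTS, resolve=socket.gethostbyname) -> dict:
    """Map each host to True (blocked) or False (still resolving)."""
    return {host: is_blocked(host, resolve) for host in hosts}
```

Calling `report()` on a machine behind the blocker should return `True` for every host; any `False` means that domain still resolves normally. The `resolve` parameter exists so the check can be exercised without network access.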

Use Privacy-Preserving Services:
Tools that process your queries without tracking you (DuckDuckGo, Brave Search, Kagi) don't contribute to shadow profiles. Each search engine query you run on Google is logged — contributing to the behavioral fingerprint that enriches your shadow profile.

For Developers: Stop Building Shadow Profile Infrastructure

The Facebook Pixel, Google Analytics, and similar tracking scripts are shadow profile generators. Every time you embed them on your website, you're contributing behavioral data on your visitors — including non-users of those platforms — to surveillance infrastructure.

Alternatives:

  • Plausible Analytics: Privacy-respecting, no cookie banner required, no cross-site tracking
  • Matomo (self-hosted): Full control, no data leaves your infrastructure
  • Fathom: Privacy-first, GDPR compliant, no personal data collected
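Before switching, it can help to audit which pages still embed third-party trackers. The sketch below scans static HTML for `src` attributes pointing at a few well-known tracking hosts. `find_trackers` and the host list are illustrative assumptions, and the scan will miss trackers injected by inline JavaScript loaders, so treat a clean result as a starting point rather than proof.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative sample of tracking hosts, not an exhaustive list.
TRACKER_HOSTS = {
    "connect.facebook.net",
    "www.facebook.com",
    "www.google-analytics.com",
    "www.googletagmanager.com",
}


class TrackerFinder(HTMLParser):
    """Collect (tag, host) pairs for elements loaded from tracker hosts."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "img", "iframe"):
            return
        src = dict(attrs).get("src") or ""
        host = urlparse(src).hostname
        if host in TRACKER_HOSTS:
            self.found.append((tag, host))


def find_trackers(html: str) -> list:
    """Return tracker embeds found in a static HTML document."""
    finder = TrackerFinder()
    finder.feed(html)
    return finder.found


page = (
    '<html><head>'
    '<script src="https://www.googletagmanager.com/gtag/js?id=G-XXXX"></script>'
    '</head><body>'
    '<noscript><img src="https://www.facebook.com/tr?id=123&ev=PageView"/></noscript>'
    '<script src="/static/app.js"></script>'
    '</body></html>'
)
print(find_trackers(page))
```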

If you need AI-powered analytics, scrub user identifiers before they touch any external service:

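import hashlib
import hmac
import json
import logging
import os
import time

import requests

logger = logging.getLogger("analytics")

# Server-side secret (illustrative variable name; must be set). Without it, the
# small IPv4 address space makes a plain hash of IP + date easy to brute-force.
SECRET_KEY = os.environ["ANALYTICS_HASH_KEY"].encode()


def log_event(anon_id: str, event: str) -> None:
    """Placeholder sink; swap in your own analytics store."""
    logger.info("anon_id=%s event=%s", anon_id, event)


def privacy_safe_event(event_data: dict, user_ip: str) -> None:
    """
    Log analytics events without creating cross-site tracking profiles.
    Uses a rotating anonymous ID instead of persistent user identifier.
    """
    # Create a daily-rotating anonymous ID (not linkable across days)
    day_bucket = str(int(time.time() // 86400))  # Changes every 24 hours (UTC)
    message = f"{user_ip}|{day_bucket}".encode()
    anon_id = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]

    # Scrub any PII from event data before logging
    scrub_response = requests.post(
        "https://tiamat.live/api/scrub",
        json={"text": json.dumps(event_data, default=str)},
        timeout=10,
    )
    scrub_response.raise_for_status()
    clean_event = scrub_response.json()["scrubbed"]

    # Log with the anonymous ID only, never a persistent user fingerprint
    log_event(anon_id, clean_event)
    # Never log: IP address, user agent, screen resolution, installed fonts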

The Scale Problem

How many shadow profiles exist?

Meta has acknowledged that it collects data on non-users but has not disclosed scale. As a rough, back-of-envelope estimate: with approximately 5.5 billion internet users and 3 billion Facebook users, about 2.5 billion people are online without a Facebook account. Given the Pixel's reach across the web, a plausible range is 500 million to 1 billion non-user profiles. This is an estimate, not a disclosed figure.
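The arithmetic behind that range can be made explicit. The inputs are the approximate figures quoted above; the calculation only shows what share of non-Facebook internet users the estimate implies and does not validate the estimate itself.

```python
internet_users = 5.5e9   # approximate, from the figures above
facebook_users = 3.0e9   # approximate, from the figures above

# People online who do not have a Facebook account.
non_users_online = internet_users - facebook_users

estimates = [500e6, 1e9]  # low and high ends of the quoted range
shares = [estimate / non_users_online for estimate in estimates]

for estimate, share in zip(estimates, shares):
    print(f"{estimate / 1e6:.0f} million profiles = {share:.0%} of non-user internet users")
```

In other words, the range assumes that roughly one in five to two in five internet users without a Facebook account have a profile.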

These are people who made a deliberate choice not to use the platform. That choice was overridden by the surveillance infrastructure embedded across the web.

The Legal Status in the US — Almost No Protection

In the US, no federal law specifically prohibits shadow profile construction. The FTC Act's prohibition on unfair or deceptive acts or practices is the primary enforcement mechanism, but to bring an unfairness claim regulators must show that a practice causes substantial consumer injury that consumers cannot reasonably avoid and that is not outweighed by benefits to consumers or competition.

Meta has successfully argued that its tracking infrastructure provides legitimate value: fraud prevention, ad measurement, and product improvement. Without a federal privacy law establishing meaningful consent requirements, these arguments have largely prevailed.

The California Privacy Rights Act (CPRA) gives California residents the right to know what personal data is collected about them and to request deletion — including data collected by companies they've never interacted with. But enforcement requires knowing who has your data, and shadow profiles are defined by the fact that you don't know they exist.

You Are the Data

Shadow profiles represent a fundamental asymmetry: platforms acquire economic value from your data whether or not you consent, whether or not you use the service, and whether or not you even know you're being tracked.

The business model depends on this asymmetry. The advertising revenue generated from targeting you — based on data you didn't provide and can't access — subsidizes the free services your contacts use.

The solution isn't individual opt-out. The structural solution is:

  1. Opt-in consent requirements for any data collection, including shadow profile data
  2. Right of access for non-users — platforms must respond to data subject requests even from people without accounts
  3. Third-party tracking consent — browser-level signals (like Global Privacy Control) must be legally binding
  4. Data broker linkage prohibition — platforms cannot enrich profiles with broker data without explicit consent
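For the third point, the Global Privacy Control signal is sent as a `Sec-GPC: 1` request header, so honoring it on the server is a small check. The sketch below shows one way a site could gate tracking on it; `gpc_opt_out` and `tracking_allowed` are hypothetical helper names, and treating the signal as overriding a stored opt-in is a design assumption, not a legal requirement everywhere.

```python
def gpc_opt_out(headers: dict) -> bool:
    """True when the request carries the Global Privacy Control signal."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("sec-gpc") == "1"


def tracking_allowed(headers: dict, has_opt_in: bool) -> bool:
    """Allow third-party tracking only with explicit opt-in and no GPC signal."""
    return has_opt_in and not gpc_opt_out(headers)


print(tracking_allowed({"Sec-GPC": "1"}, has_opt_in=True))
print(tracking_allowed({"User-Agent": "Mozilla/5.0"}, has_opt_in=True))
print(tracking_allowed({"User-Agent": "Mozilla/5.0"}, has_opt_in=False))
```

Defaulting to "no tracking" unless there is an explicit opt-in matches the first point in the list: absence of a signal is not consent.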

Until federal legislation closes these gaps, the tracking continues. Your shadow profile is being updated right now — on sites you're visiting, by companies you've never heard of, for purposes you've never consented to.


TIAMAT's privacy proxy at tiamat.live is built on the principle that AI systems should process only the data they need, stripped of everything else. The shadow profile problem exists because platforms built the opposite default. /api/scrub is available for developers who want to build with the right default.
