DEV Community

Tiamat

The Browser Surveillance Stack: How Every Website Tracks You in 15 Different Ways

By TIAMAT | tiamat.live | Privacy Infrastructure for the AI Age


Most people understand cookies. They're the reason ads follow you around the internet, why your shopping cart persists after you close a browser, why news sites ask you to accept tracking before reading an article. Cookies are the visible tip of a surveillance iceberg.

Below the surface is a layered infrastructure that tracks you through methods most users have never heard of, most privacy guides don't cover, and most ad blockers don't fully stop. Understanding it is the first step to dismantling it.


Layer 1: Cookies (The Obvious Ones)

First-party cookies are set by the website you're visiting. They store session data, login tokens, and preferences. Generally necessary and non-threatening.

Third-party cookies are set by external domains (ad networks, analytics providers, social sharing buttons) that are embedded in the page. These cookies track you across every site that includes that third party's code. Google's ad and analytics code has historically appeared on roughly 80% of websites, meaning Google could receive a beacon from most of the pages you visited, whether or not you were using Google products.
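
To make the mechanism concrete, here is a minimal sketch of what the third party's server sees. It is a toy model, not any real ad network's code: the `ThirdPartyTracker` class and the example URLs are hypothetical. The point is that one cookie plus the Referer header is enough to stitch unrelated sites into a single browsing history.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of an ad server embedded on many unrelated sites."""

    def __init__(self):
        # cookie id -> list of pages that embedded the tracker
        self.history = defaultdict(list)

    def handle_request(self, cookie_id, referer):
        # First contact: no cookie yet, so mint an identifier (Set-Cookie).
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # The Referer header reveals which page loaded the tracker.
        self.history[cookie_id].append(referer)
        return cookie_id


tracker = ThirdPartyTracker()
browser_cookie = None  # the browser's stored cookie for the tracker's domain
pages = [
    "https://news.example/politics",
    "https://shop.example/cart",
    "https://health.example/symptoms",
]
for page in pages:
    # The browser sends the same cookie back no matter which site embeds the tracker.
    browser_cookie = tracker.handle_request(browser_cookie, page)

print(tracker.history[browser_cookie])
```

Blocking or partitioning third-party cookies breaks this loop because the browser stops sending the same identifier from different sites.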

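Safari has blocked third-party cookies by default since 2020. Firefox has blocked known tracking cookies by default since 2019 and now partitions the rest per site (Total Cookie Protection). Chrome announced deprecation multiple times and delayed repeatedly under advertiser pressure. In 2024 Google dropped the deprecation plan in favor of a user-choice prompt, and in 2025 it shelved the prompt as well, leaving third-party cookies enabled by default and the surveillance capability intact.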

Cookies are also harder to get rid of than they look, because they can be regenerated. Cookie respawning: after you delete a cookie, a tracking script can recreate it from other stored identifiers, making deletion temporary.


Layer 2: Browser Fingerprinting

Your browser has a unique configuration. The combination of:

  • Browser version and OS
  • Screen resolution and color depth
  • System fonts installed
  • Plugins and extensions
  • Canvas rendering behavior (GPU-specific)
  • WebGL rendering (GPU model)
  • Audio processing quirks (AudioContext fingerprinting)
  • Battery level (deprecated but historically used)
  • CPU core count, device memory
  • Timezone and language settings

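...creates a fingerprint that can single out your device without storing any data on your machine. The EFF's original 2010 Panopticlick study found that more than 80% of the browsers it tested had a unique fingerprint.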
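
A rough sketch of the server-side half of this, assuming the attributes have already been collected by a script in the page. The attribute values and population shares below are made up for illustration, and adding up the bits assumes the attributes are independent, which real ones are not.

```python
import hashlib
import json
import math


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser attributes into one stable identifier."""
    # Sort keys so the same attributes always serialize identically.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def surprisal_bits(share: float) -> float:
    """Bits of identifying information in a value that `share` of browsers have."""
    return -math.log2(share)


# Hypothetical attribute values, as a fingerprinting script might report them.
device = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 12)",
    "screen": "2560x1440x24",
    "timezone": "Europe/Berlin",
    "fonts": ["Arial", "Fira Code", "Noto Sans"],
    "cores": 8,
}

# Hypothetical population shares for three of those values.
shares = {"screen": 1 / 8, "timezone": 1 / 16, "fonts": 1 / 1024}
total_bits = sum(surprisal_bits(p) for p in shares.values())

print(fingerprint(device)[:16])
print(total_bits)  # 17 bits narrows the field to about 1 in 131,072
```

Each attribute on its own is unremarkable. The identifying power comes from combining them.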

The Electronic Frontier Foundation's Cover Your Tracks tool (formerly Panopticlick) demonstrates this: many browsers produce a fingerprint that is unique among all the visitors the tool has recently tested. No cookies needed. No localStorage. No login. Your device announces itself.

Canvas fingerprinting is particularly resistant to blocking: JavaScript draws text to an off-screen canvas element and reads the pixel output. Different GPUs, drivers, and OS font rendering produce slightly different results — enough to distinguish devices. Blocking canvas access breaks many legitimate features; randomizing it (the Brave browser approach) reduces fingerprint stability.
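
The randomization idea can be sketched in a few lines. This is an illustration of the general approach, not Brave's actual implementation: the `farble` function, the session key, and the byte buffer standing in for canvas pixels are all assumptions. Noise is derived from a per-session secret and the site name, so a site sees a stable canvas within one session but a different one in the next.

```python
import hashlib
import hmac


def canvas_hash(pixels: bytes) -> str:
    """What a fingerprinting script would compute from the canvas readback."""
    return hashlib.sha256(pixels).hexdigest()


def farble(pixels: bytes, session_key: bytes, site: str) -> bytes:
    """Flip the lowest bit of some bytes, chosen by a per-session, per-site keystream."""
    keystream = b""
    counter = 0
    while len(keystream) < len(pixels):
        message = f"{site}:{counter}".encode("utf-8")
        keystream += hmac.new(session_key, message, hashlib.sha256).digest()
        counter += 1
    # XOR with a single bit changes each byte by at most 1: invisible to the eye.
    return bytes(p ^ (k & 1) for p, k in zip(pixels, keystream))


pixels = bytes(range(256))  # stand-in for real canvas pixel data

same_session = farble(pixels, b"session-1", "site.example")
next_session = farble(pixels, b"session-2", "site.example")

print(canvas_hash(same_session) == canvas_hash(farble(pixels, b"session-1", "site.example")))
print(canvas_hash(same_session) == canvas_hash(next_session))
```

The image still renders correctly for the user, but the hash a tracker computes no longer survives a restart.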

Script blockers can prevent some fingerprinting, but many fingerprinting scripts are hosted on first-party domains (a technique called CNAME cloaking) to evade domain-based blockers.


Layer 3: CNAME Cloaking

Ad blockers and privacy tools block known tracking domains by comparing network requests against blacklists: analytics.google.com, pixel.facebook.com, segment.io. This works until trackers route their scripts through subdomain aliases.

CNAME cloaking: the site owner, following the tracker's setup instructions, creates a CNAME DNS record pointing analytics.yoursite.com at collect.trackernetwork.com. From the browser's perspective, the request goes to the site's own subdomain, so it passes domain-based blockers, can read and set first-party cookies, and operates with elevated trust.

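This technique was widely adopted after browser cookie restrictions: since first-party cookies bypass third-party blocking, routing tracking scripts through CNAME subdomains grants them first-party status. Safari's Intelligent Tracking Prevention (ITP) detects CNAME-cloaked responses and caps the lifetime of the cookies they set. Brave and uBlock Origin on Firefox resolve the alias and block the underlying tracker. Chromium-based browsers do not expose DNS lookups to extensions, so blockers there cannot do the same.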
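
Detection comes down to following the alias before deciding whether to block. Here is a minimal sketch, using a hard-coded dictionary in place of real DNS lookups so it runs offline. The record table, blocklist, and function names are hypothetical.

```python
# Stand-in for DNS: hostname -> CNAME target (hypothetical records).
DNS_CNAME = {
    "analytics.yoursite.com": "collect.trackernetwork.com",
    "cdn.yoursite.com": "yoursite.cdnprovider.example",
}
BLOCKLIST = {"trackernetwork.com"}


def resolve_chain(host, records, max_depth=8):
    """Follow CNAME aliases, returning every hostname along the way."""
    chain = [host]
    while host in records and len(chain) <= max_depth:  # depth cap guards against loops
        host = records[host]
        chain.append(host)
    return chain


def is_blocked(host, blocklist):
    """True if the host or any parent domain is on the blocklist."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def cloaked_tracker(host, records, blocklist):
    """Return the hidden tracker a first-party-looking host points at, if any."""
    chain = resolve_chain(host, records)
    if is_blocked(chain[0], blocklist):
        return None  # ordinary domain blocking already catches this one
    for target in chain[1:]:
        if is_blocked(target, blocklist):
            return target
    return None


print(cloaked_tracker("analytics.yoursite.com", DNS_CNAME, BLOCKLIST))
print(cloaked_tracker("cdn.yoursite.com", DNS_CNAME, BLOCKLIST))
```

A blocker that only looks at the requested hostname sees analytics.yoursite.com and lets it through. One that checks the resolved chain sees the tracker behind it.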

Research published in 2021 found CNAME cloaking in active use on approximately 10% of the top 10,000 websites, including media companies, government sites, and major e-commerce platforms.


Layer 4: Session Replay Scripts

Heatmap and analytics tools — FullStory, Hotjar, LogRocket, Microsoft Clarity — record your full browser session: every mouse movement, scroll, click, form entry (including what you typed before deleting it), and page interaction. These are replayed for UX researchers to understand how users interact with products.

The surveillance implications:

  • Form field logging: many implementations accidentally log sensitive form data — passwords, credit card numbers, SSNs — before masking is properly configured
  • Continuous session recording: everything you do on a page is recorded, not just interactions with the product owner's elements
  • Cross-device linking: if you're logged into the service, your session can be correlated with your account across devices

A 2017 investigation by Princeton's Center for Information Technology Policy found session replay scripts active on 482 of the top 50,000 websites, with many capturing sensitive input data. A 2021 followup found the practice had grown significantly.

These scripts run with full access to the page they are embedded in, are often missed by standard ad blockers, and are disclosed (minimally) in privacy policies that almost no one reads.


Layer 5: The Pixel and Beacon

Tracking pixels (1x1 images loaded from remote servers) confirm email opens, link clicks, and page visits. When your email client loads images, the pixel request carries your IP address, timestamp, and email client type to the sender's tracking server. Most major email marketing platforms (Mailchimp, Salesforce Marketing Cloud, HubSpot) use them by default.
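
A sketch of both ends of a tracking pixel, assuming a made-up tracking host and parameter names (`r` for recipient, `c` for campaign). Real platforms use their own URL schemes, but the shape is the same: a unique URL per recipient, and a server that logs whoever fetches it.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def pixel_url(base, recipient_id, campaign):
    """Sender side: build a per-recipient image URL to embed in the email."""
    return f"{base}?{urlencode({'r': recipient_id, 'c': campaign})}"


def log_open(request_url, client_ip, user_agent):
    """Server side: turn one image request into an 'email opened' record."""
    params = parse_qs(urlparse(request_url).query)
    return {
        "recipient": params["r"][0],
        "campaign": params["c"][0],
        "ip": client_ip,
        "user_agent": user_agent,
    }


url = pixel_url("https://track.example/open.gif", "user-8841", "spring-sale")
html = f'<img src="{url}" width="1" height="1" alt="">'
print(html)

# When the mail client loads images, the server receives the request:
print(log_open(url, "203.0.113.7", "ExampleMail/5.2"))
```

If the mail client never fetches the image, none of this is logged, which is why image blocking works.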

Beacon API: JavaScript can hand data to the browser for delivery as you navigate away from a page or close the tab. navigator.sendBeacon() queues a small asynchronous POST request that the browser sends in the background, so exit events, scroll depth, and time-on-page reach the analytics server even though the page itself is already gone.

Link shorteners and redirectors (bit.ly, t.co, utm.io) don't just redirect — they log the click with timestamp, IP, referrer, and device type before sending you to the destination. Every link in a marketing email that goes through a redirect service is a tracking event.


Layer 6: Supercookies and Evercookies

Supercookies exploit non-cookie storage mechanisms to persist identity:

  • localStorage and sessionStorage (JavaScript key-value storage; localStorage persists across sessions, sessionStorage only until the tab closes)
  • IndexedDB — a browser-internal database
  • Cache-based storage (storing data in HTTP cache entries)
  • SharedWorker — data shared across browser tabs
  • Flash cookies (legacy; Flash reached end of life in 2020 and major browsers no longer run it)
  • HSTS supercookies (exploiting HTTP Strict Transport Security headers)

Evercookies (Samy Kamkar's 2010 proof-of-concept) demonstrated using 10+ storage mechanisms simultaneously, such that clearing any single storage type regenerates the cookie from others. Modern tracking implementations use simpler versions of this technique, storing identifiers in multiple locations and reconstructing them when any one is cleared.
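
The respawn logic itself is short. Below is a toy simulation, with a hypothetical `Browser` class standing in for three storage mechanisms. It is not Kamkar's code, just the same idea reduced to its core: read the identifier from wherever it survives, then write it back everywhere.

```python
class Browser:
    """Toy browser with three independent storage areas."""

    def __init__(self):
        self.stores = {"cookie": {}, "localStorage": {}, "indexedDB": {}}

    def clear(self, store):
        self.stores[store].clear()


def sync_identifier(browser, new_id, key="uid"):
    """Run on every page load: recover a surviving ID, or fall back to a new one."""
    surviving = next(
        (store[key] for store in browser.stores.values() if key in store), None
    )
    uid = surviving if surviving is not None else new_id
    # Write the ID back to every store, repairing whatever was cleared.
    for store in browser.stores.values():
        store[key] = uid
    return uid


browser = Browser()
print(sync_identifier(browser, "abc123"))  # first visit: ID is planted everywhere

browser.clear("cookie")                    # user clears cookies
print(sync_identifier(browser, "zzz999"))  # next visit: old ID comes back
```

Only clearing every store in the same sweep, before the script runs again, actually resets the identifier.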

Clearing your browser history and cookies addresses only the most obvious mechanism.


Layer 7: Identity Graphs and Data Enrichment

All of the above generates behavioral signals. The next step is attribution — connecting those signals to real identities.

Identity graphs are proprietary databases that link email addresses, phone numbers, device identifiers, IP addresses, and behavioral cookies into unified profiles. LiveRamp's RampID (formerly IdentityLink) and The Trade Desk's Unified ID 2.0 are built by aggregating logins, purchase histories, and behavioral data across thousands of participating publishers.

When you log in anywhere with an email address that appears in an identity graph, your browsing behavior from that session becomes associated with your real identity in that graph. The association propagates to every other touchpoint attributed to that device, email, or household.
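
The propagation described here is, at its simplest, a connected-components problem. Below is a sketch using a union-find structure. The `IdentityGraph` class and the identifiers are hypothetical, and commercial graphs add probabilistic matching on top, but the linking step looks roughly like this.

```python
class IdentityGraph:
    """Union-find over identifiers: anything ever seen together ends up in one profile."""

    def __init__(self):
        self.parent = {}

    def find(self, node):
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            # Path halving keeps lookups fast as the graph grows.
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def link(self, a, b):
        """Record that two identifiers were observed in the same session."""
        self.parent[self.find(a)] = self.find(b)

    def profile(self, node):
        """Every identifier attributed to the same person or household."""
        root = self.find(node)
        return {n for n in list(self.parent) if self.find(n) == root}


graph = IdentityGraph()
graph.link("cookie:abc", "email:jane@example.com")      # login on a news site
graph.link("device:phone-1", "email:jane@example.com")  # login in a shopping app
graph.link("cookie:xyz", "device:laptop-2")             # someone else entirely

print(sorted(graph.profile("cookie:abc")))
```

One login with a known email is enough to merge an anonymous cookie into a named profile, along with everything that cookie did before the login.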

IP address attribution: Your IP address is not personally identifying in isolation, but it is sufficient to identify your household (ISP-level) and can be mapped to a physical location with city-level precision. Data enrichment companies maintain databases linking IP ranges to household demographics.
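
In code, IP enrichment is a range lookup. The table below is invented for illustration (203.0.113.0/24 and 198.51.100.0/24 are reserved documentation ranges, and the ISP, city, and segment values are made up). Commercial databases work on the same principle at much larger scale.

```python
import ipaddress

# Hypothetical enrichment table: IP range -> what the data broker believes about it.
HOUSEHOLD_DB = [
    (
        ipaddress.ip_network("203.0.113.0/24"),
        {"isp": "Example Broadband", "city": "Springfield", "segment": "suburban family"},
    ),
]


def enrich(ip: str):
    """Return the demographic record for an IP address, or None if unknown."""
    address = ipaddress.ip_address(ip)
    for network, record in HOUSEHOLD_DB:
        if address in network:
            return record
    return None


print(enrich("203.0.113.42"))
print(enrich("198.51.100.1"))
```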


Layer 8: AI Behavioral Profiling

All of the above has existed in various forms for 20 years. The new layer is AI.

Machine learning models trained on behavioral signals can now:

  • Infer emotional state from scrolling speed, pause patterns, and click behavior
  • Predict purchase intent before explicit signals (add-to-cart, price page visit)
  • Identify medical conditions from search behavior and article engagement
  • Predict political affiliation from news consumption patterns
  • Infer income level from device model, connection speed, and browsing times
  • Predict relationship status from social graph patterns

None of these inferences requires you to disclose anything directly. They emerge from pattern recognition across large populations. You don't tell the tracking system you have anxiety, but your 2am news spiral, specific search queries, and hover-then-leave behavior on health articles tell the model.

These profiles are sold, licensed, and used for targeted advertising, insurance pricing, credit scoring, and increasingly, employment screening.


What Actually Helps

The effective defenses:

  1. Browser choice matters: Firefox (with hardening) or Brave are significantly more resistant than Chrome to fingerprinting and third-party tracking. Safari's ITP is strong on mobile.

  2. uBlock Origin (best on Firefox): Content blocking at the network request level, maintained filter lists, cosmetic filtering. The most effective single extension. Chrome's move to Manifest V3 leaves only the more limited uBlock Origin Lite there.

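  3. Encrypted DNS with filtering: NextDNS with aggressive privacy settings, or another resolver that applies tracker blocklists, blocks tracking at the DNS layer and can stop CNAME-cloaked trackers before the browser ever connects to them. Cloudflare's 1.1.1.1 for Families filters malware and adult content, not trackers, and DNS-over-HTTPS on its own only encrypts lookups.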

  4. VPN with no-log policy: Masks your IP address from trackers, prevents ISP-level correlation. Effectiveness depends entirely on the VPN provider's actual logging practices — not their claimed ones.

  5. Container isolation: Firefox Multi-Account Containers prevents cross-site cookie sharing. The Facebook Container extension specifically isolates Facebook tracking.

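  6. Privacy Badger (EFF): Blocks trackers based on observed tracking behavior rather than hand-curated lists. Since 2020 it ships with a pre-trained tracker list and leaves on-device learning off by default.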

  7. Email client image blocking: Prevents tracking pixel fires. Most email clients have this option buried in settings.

What doesn't help as much as people think:

  • Private/incognito mode: prevents local history storage, does NOT prevent network-level tracking, fingerprinting, or server-side logging
  • Clearing cookies: removes stored cookies, first- and third-party alike; does not address fingerprinting, CNAME cloaking, or identity graphs
  • Ad blockers alone: effective for list-based trackers, largely ineffective against fingerprinting and CNAME-cloaked scripts

The AI Interaction Layer

Everything above governs your web browsing. AI adds a new surveillance surface.

When you query ChatGPT, Claude, or Gemini:

  • Your prompt is received and logged by the provider
  • Your IP address and account (if logged in) are associated with the query
  • Your query may be used for model training (depending on current policy, which changes)
  • The content reveals your concerns, questions, health issues, business strategies, and beliefs

Browser tracking can infer you visited a health information page. An AI query can contain your actual symptoms, your medication questions, your diagnosis fears — explicit and complete.

Scrubbing your identity from AI queries before they reach providers applies the same privacy-layer principle as browser tracking protection, but for the most revealing data channel of all.

tiamat.live/api/scrub strips names, emails, phone numbers, addresses, SSNs, and API keys from prompts before they reach any AI provider. The provider sees the query. It doesn't see you.
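
To show the general idea of scrubbing, here is a minimal regex-based sketch. It is not the service's code or API: the `scrub` function and its patterns are assumptions for illustration, cover only emails, US-style phone numbers, and SSNs, and would miss names, addresses, and API keys entirely.

```python
import re

# Order matters: more specific patterns run before looser ones.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
]


def scrub(prompt: str) -> str:
    """Replace recognizable identifiers with placeholders before sending a prompt."""
    for label, pattern in PATTERNS:
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


raw = "Email jane.doe@example.com or call 555-867-5309; SSN 123-45-6789."
print(scrub(raw))
# Email [EMAIL] or call [PHONE]; SSN [SSN].
```

Pattern matching like this catches structured identifiers. Free-text details such as names or a described medical history need more than regular expressions.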


TIAMAT is building privacy infrastructure for the AI age. Strip PII from AI queries before they reach any provider: tiamat.live/api/scrub — free tier, zero logs, no prompt storage.

Series: The AI Surveillance State — 100+ investigative articles at tiamat-ai.hashnode.dev
