William Weiner

Posted on Apr 9

Tracking, Propagation Attacks, and What We Found in Real Email Traffic

#security #privacy #webdev

A few weeks ago I posted about finding the same per-recipient identifier in three independent places inside a single marketing email -- pixel, click redirects, and technical headers -- and asked what other vectors people had seen.

That question surfaced some good ones: CSS media queries, hidden data attributes, MIME boundary patterns. I went looking. Here is what I found in real production traffic, and what turned out to be harder to close than expected.

1. CSS Tracking Is Broader Than I Thought

The original post focused on <img src=> pixels and click-redirect wrappers. CSS-based tracking is a separate attack surface that image-blocking tools don't touch, and it is more varied than the obvious background-image case.

The obvious case (external stylesheet link):

<link rel="stylesheet"
  href="https://tracker.esp.com/open/PERRECIPIENTTOKEN.css">

When the email client loads this stylesheet, the sender's server logs the open. The member never sees this. Apple Mail Privacy Protection pre-fetches images but not stylesheets, so this survives MPP.

The less obvious cases:

/* @import in a <style> block */
@import url('https://tracker.esp.com/t/PERRECIPIENTTOKEN.css');

/* Any property that accepts a URL, not just background-image */
li { list-style-image: url('https://tracker.esp.com/PERRECIPIENTTOKEN'); }
div { border-image: url('https://tracker.esp.com/PERRECIPIENTTOKEN'); }

The one I did not expect -- CSS custom property indirection:

:root {
  --emp-track: url('https://tracker.esp.com/PERRECIPIENTTOKEN');
}
p.hero::before {
  content: var(--emp-track);
}

The tracking URL is stored in a custom property and referenced indirectly. A filter that looks for url( in property values like background-image will not find it here. The custom property declaration looks innocuous; only following the reference chain reveals the external URL.

I have not yet seen this in a real email, but it is a documented evasion technique in 2025-2026 security research and straightforward to generate from a template.
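
A minimal sketch of a filter that does not care which property holds the URL, assuming the CSS has already been extracted from the style blocks and style attributes as text. The function and pattern names are hypothetical. It works on the raw stylesheet text rather than on a list of known properties, so a url() stored in a custom property is neutralized at the declaration and the var() reference resolves to nothing.

```python
import re

# Matches url(...) whose target is an absolute http(s) or protocol-relative
# URL, with or without quotes. It runs over the whole stylesheet text, so the
# property name (background-image, list-style-image, --anything) is irrelevant.
EXTERNAL_URL = re.compile(
    r"""url\(\s*['"]?\s*(?:https?:)?//[^)]*\)""",
    re.IGNORECASE,
)

# Matches any @import rule up to its terminating semicolon.
# Assumption: email CSS has no legitimate use for @import, so all are dropped.
IMPORT_RULE = re.compile(r"@import\s+[^;]+;?", re.IGNORECASE)


def strip_external_css_urls(css: str) -> str:
    """Drop @import rules and replace every external url() with 'none'."""
    css = IMPORT_RULE.sub("", css)
    return EXTERNAL_URL.sub("none", css)


sample = """
:root { --emp-track: url('https://tracker.esp.com/PERRECIPIENTTOKEN'); }
p.hero::before { content: var(--emp-track); }
"""
print(strip_external_css_urls(sample))
```

This is a regex pass, not a CSS parser. It will not catch escaped spellings of url( or URLs that contain a literal closing parenthesis, so a production filter would likely need a real tokenizer.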

Beyond CSS: HTML presentation attributes

These are not CSS at all, which is why they were a separate discovery. HTML has legacy presentation attributes that also trigger automatic network fetches:

<!-- table background attribute -- HTML4 era, still renders in many clients -->
<table background="https://tracker.esp.com/PERRECIPIENTTOKEN">

<!-- video poster -- loads even if the user never plays the video -->
<video poster="https://tracker.esp.com/PERRECIPIENTTOKEN" src="...">

<!-- object/applet legacy attributes -->
<object data="https://tracker.esp.com/PERRECIPIENTTOKEN">

The background= table attribute is the one I actually found in production email. It is visually identical to a background set via CSS, but the attribute lives outside any style block and is missed by a CSS-only filter pass.
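
A small sketch of a separate pass for these attributes, using Python's standard html.parser. The attribute set and the names FETCHING_ATTRS, FetchAttrScanner, and find_fetching_attrs are illustrative assumptions; the set covers only the three attributes shown above and is not exhaustive.

```python
from html.parser import HTMLParser

# Non-CSS attributes from the examples above that can trigger a fetch.
# Illustrative only; a real filter would need a longer, audited list.
FETCHING_ATTRS = {"background", "poster", "data"}


def is_external(value: str) -> bool:
    """True for absolute http(s) and protocol-relative URLs."""
    return value.strip().lower().startswith(("http://", "https://", "//"))


class FetchAttrScanner(HTMLParser):
    """Collects (tag, attribute, url) for every external fetching attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name in FETCHING_ATTRS and value and is_external(value):
                self.hits.append((tag, name, value))


def find_fetching_attrs(html: str) -> list:
    scanner = FetchAttrScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.hits


print(find_fetching_attrs(
    '<table background="https://tracker.esp.com/PERRECIPIENTTOKEN"><tr><td>hi</td></tr></table>'
))
```

The scanner only reports hits. Whether to drop the attribute or the whole element is a separate policy decision.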

2. The Reply Chain Attack: Propagation Tracking

This one came up as I was working through the threat model and it is arguably more valuable to trackers than open tracking.

The setup: When a sender issues an email, their ESP assigns a Message-ID that typically contains per-send and per-recipient tokens:

Message-ID: <campaign.abc123.recipient.def456@esp.example.com>

When a member replies to that email, their client includes the original Message-ID in the References header:

References: <campaign.abc123.recipient.def456@esp.example.com>
In-Reply-To: <campaign.abc123.recipient.def456@esp.example.com>

If the sender has instrumented their receiving infrastructure to parse incoming References headers, they can:

Recognize their own campaign token (campaign.abc123)
Recover the per-recipient token (recipient.def456)
Link the reply back to the original send record

This maps who replied to whom. For a relay service like the one I run, the risk is amplified: if a member receives a tracked email through the relay and then replies, the relay's outbound message carries the original sender's Message-ID in the References chain. The original sender now knows that the reply chain came through a relay and which member addresses it passed through.

The fix is to hash all sender-issued Message-IDs in the outbound References and In-Reply-To headers before delivery. The hash is irreversible (the sender cannot recover the token), consistent (the same input always produces the same output, so member-side threading still works), and emitted under your own namespace so it cannot be confused with any other IDs. The rewrite is legitimate because this is an email relay, not an email forwarding service.

# Original References in outbound reply -- before fix
References: <campaign.abc123.recipient.def456@esp.example.com>

# After hashing
References: <emp-a3f9b2c1d4e5f678@emparrot.com>

Our delivery IDs pass through unchanged -- members thread on these and they contain no sender-controlled token.
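
The rewrite can be sketched as follows. This is an illustration under assumptions, not a description of any production code: it uses a keyed hash (HMAC-SHA256) truncated to 16 hex characters to match the shape of the example above, where the text only says "hash". SECRET_KEY, hash_message_id, and rewrite_reference_header are hypothetical names.

```python
import hashlib
import hmac
import re

RELAY_DOMAIN = "emparrot.com"            # namespace used in the example above
SECRET_KEY = b"replace-with-real-secret"  # hypothetical; keeps tokens from being brute-forced

# One angle-bracketed msg-id, as found in References / In-Reply-To values.
MSG_ID = re.compile(r"<[^<>\s]+>")


def hash_message_id(msg_id: str) -> str:
    """Map a sender-issued Message-ID to a stable, opaque relay-namespace ID."""
    # IDs already in the relay namespace pass through unchanged. A real
    # implementation would probably check against IDs it actually issued,
    # since a sender could forge an ID under this domain.
    if msg_id.lower().endswith("@" + RELAY_DOMAIN + ">"):
        return msg_id
    digest = hmac.new(SECRET_KEY, msg_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<emp-{digest[:16]}@{RELAY_DOMAIN}>"


def rewrite_reference_header(value: str) -> str:
    """Hash every sender-issued msg-id in a References or In-Reply-To value."""
    return MSG_ID.sub(lambda m: hash_message_id(m.group(0)), value)


print(rewrite_reference_header(
    "<campaign.abc123.recipient.def456@esp.example.com> <delivery-1@emparrot.com>"
))
```

Because the mapping is deterministic for a given key, every reply in a thread carries the same rewritten ID, which is what keeps member-side threading intact.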

3. Direct-Destination URL Tokens (No Redirect Needed)

The original post covered click-redirect wrappers: links rewritten through a tracking server before reaching the real destination. There is a simpler variant that bypasses the redirect entirely.

Real example from a Nextdoor email (April 2026), every link included:

https://nextdoor.com/news_feed/?ct=ABCDEF123&ec=GHIJ456&token=UNIQUE789&auto_token=XYZ012&link_source_user_id=345678

The destination is real (nextdoor.com). There is no redirect. But the per-recipient token is in the query string and fires on click. Class A redirect unwrapping does not catch this because there is nothing to unwrap -- the destination IS the URL.

The naive fix is a known-bad-parameter blocklist (utm_*, fbclid, etc.), but that loses the arms race: senders rename parameters or use unfamiliar ones. Nextdoor's ct=, ec=, auto_token=, and link_source_user_id= would not appear on most blocklists.

The approach that works without needing to be right about every parameter: strip all query parameters from all links, and back up the original URLs in an attachment when non-utm_* parameters were present. The stripped link is the primary path. The backup attachment is a recovery mechanism for action links (unsubscribe, confirm) where tokens may be functionally required.

# Link in email body after stripping
https://nextdoor.com/news_feed/

# emparrot-original-links.txt attachment
[View your feed] https://nextdoor.com/news_feed/?ct=ABCDEF123&ec=GHIJ456&...
[Unsubscribe]    https://nextdoor.com/email/unsubscribe/?token=UNIQUE789&...

The utm_* parameters are stripped silently with no backup -- the destination always works without them.
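
A minimal sketch of the strip-and-decide step using the standard urllib.parse module. The function name strip_query is hypothetical, and writing the backup attachment is left out. Keeping the URL fragment is an assumption, since the rule above only covers query parameters.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def strip_query(url: str) -> tuple[str, bool]:
    """Return (url without its query string, whether the original needs a backup).

    A backup is needed when any parameter is not a utm_* parameter, because
    such a parameter might be functionally required (unsubscribe, confirm).
    """
    parts = urlsplit(url)
    # keep_blank_values=True so a bare token like "?ABC123" still counts.
    params = parse_qsl(parts.query, keep_blank_values=True)
    needs_backup = any(not key.lower().startswith("utm_") for key, _ in params)
    stripped = urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))
    return stripped, needs_backup


print(strip_query("https://nextdoor.com/news_feed/?ct=ABCDEF123&ec=GHIJ456"))
# -> ('https://nextdoor.com/news_feed/', True)
print(strip_query("https://example.com/a?utm_source=x&utm_medium=email"))
# -> ('https://example.com/a', False)
```

The decision depends only on whether a parameter name starts with utm_, so no blocklist of sender-specific names has to be maintained.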

4. Meta Tag Token Embedding

Several other examples exploited data attributes. Meta tags turned out to be a closely related case worth separating out.

<meta name="x-em-id" content="PERRECIPIENTTOKEN">
<meta name="campaign" content="CAMPAIGN123/PERRECIPIENTTOKEN">
<meta name="list-id" content="LIST456/MEMBER789">

These do not trigger a network fetch. They are completely invisible to image-blocking tools and to users. But they survive into quoted reply content -- if a member replies and includes the original email body, the quoted section carries these meta tags forward.

The threat model is the same as the References header case: the sender's receiving infrastructure can recognize their own tokens in the reply and reconstruct network information without ever seeing an address.

The fix is a whitelist. The set of meta tags that have any legitimate function in email is small and stable: charset, viewport, http-equiv for content-type, format-detection, color-scheme, theme-color, description, author, generator, robots. Strip everything else.
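
A sketch of that whitelist as a filter over raw HTML, with hypothetical names throughout. It assumes a strict reading of the list above: a meta tag is kept only if it is a bare charset tag, an http-equiv content-type tag, or a name/content pair whose name is on the list. Any extra attribute causes the tag to be dropped.

```python
import re
from html.parser import HTMLParser

ALLOWED_META_NAMES = {
    "viewport", "format-detection", "color-scheme", "theme-color",
    "description", "author", "generator", "robots",
}
ALLOWED_HTTP_EQUIV = {"content-type"}

# Finds meta tags in raw HTML. Assumes no literal '>' inside attribute values.
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)


class _AttrGrabber(HTMLParser):
    """Parses a single tag and records its attributes as a dict."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs = dict(attrs)


def _parse_attrs(tag_text: str) -> dict:
    grabber = _AttrGrabber()
    grabber.feed(tag_text)
    grabber.close()
    return grabber.attrs


def meta_is_allowed(attrs: dict) -> bool:
    keys = set(attrs)
    if keys == {"charset"}:
        return True
    if keys == {"http-equiv", "content"}:
        return (attrs["http-equiv"] or "").lower() in ALLOWED_HTTP_EQUIV
    if keys == {"name", "content"}:
        return (attrs["name"] or "").lower() in ALLOWED_META_NAMES
    return False


def strip_meta_tags(html: str) -> str:
    """Remove every meta tag that is not on the whitelist."""
    def keep_or_drop(match):
        tag = match.group(0)
        return tag if meta_is_allowed(_parse_attrs(tag)) else ""
    return META_TAG.sub(keep_or_drop, html)


print(strip_meta_tags(
    '<meta charset="utf-8"><meta name="x-em-id" content="PERRECIPIENTTOKEN"><p>hi</p>'
))
```

Requiring an exact attribute set is a deliberately conservative choice: it stops a sender from attaching a token-bearing attribute to an otherwise permitted tag.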

5. What Turned Out to Be Hard

CSS media query conditional loading -- The vector is real:

@media (prefers-color-scheme: dark) {
  body { background-image: url('https://tracker.esp.com/dark/PERRECIPIENTTOKEN'); }
}
@media (prefers-color-scheme: light) {
  body { background-image: url('https://tracker.esp.com/light/PERRECIPIENTTOKEN'); }
}

This leaks dark/light mode preference in addition to confirming the open. I have not seen this in a real email yet, but it is described in 2026 privacy research as an emerging pattern. The fix is the same (strip external URLs from all CSS properties) but the threat is worth naming: it is not just "did they open" but "what device profile are they using."

Body structural fingerprinting is the vector I cannot close without semantic understanding of the content. Unique per-recipient signals embedded in the body itself: specific inline color values, attribute ordering, whitespace patterns, synonym substitution. These survive into reply quotes and surface unexposed users when those replies propagate. Partially mitigated by hashing CSS class names (destroys class-based fingerprints in quoted content), but full removal is out of scope.

Class B opaque link IDs (c.gle, click.mailchimp.com, and similar) cannot be resolved without fetching the tracking URL, which would itself be tracked. The right behavior is to flag these with a warning before the member clicks rather than attempting to unwrap them. Fetching to resolve is not an option.
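
The flag-without-fetching rule reduces to a host check that never touches the network. A minimal sketch follows, with a hypothetical function name and a host set containing only the two examples named above; a real list would be longer and would need maintenance.

```python
from urllib.parse import urlsplit

# Only the hosts named in the text; illustrative, not a complete list.
OPAQUE_TRACKER_HOSTS = {"c.gle", "click.mailchimp.com"}


def needs_click_warning(url: str) -> bool:
    """True if the link's host is a known opaque tracker or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in OPAQUE_TRACKER_HOSTS)


print(needs_click_warning("https://c.gle/ABC123"))             # True
print(needs_click_warning("https://nextdoor.com/news_feed/"))  # False
```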

What We Shipped

For full disclosure: I built these fixes into the relay service EMail Parrot. The analysis above drove the release this month. The implementation handles all of sections 1-4. Body structural fingerprinting and Class B unwrapping remain open.

A writeup of the release is at the EMail Parrot blog if that is useful context: Pixels Were Just the Beginning

New Questions for the Community

The original post asked about CSS media queries, hidden data attributes, and MIME boundary patterns. The first two turned out to be real and addressed above. MIME boundary patterns remain on the list.

A few things I am still thinking about:

MIME boundary reuse: Is there evidence of ESPs using predictable MIME boundaries that encode per-recipient data? I have not seen it in real traffic but it seems plausible. Our relay rebuilds the email from scratch and replaces all MIME boundaries so this threat is architecturally not present for EMail Parrot.
Quoted-printable encoding patterns: Can per-recipient tokens be embedded through deliberate encoding choices (encoding specific characters to =XX form or not) that survive relay and appear in quoted content?
cid: Content-ID attributes: MIME attachment Content-IDs are sender-chosen. I am treating this as low risk but would be interested in evidence either way.
plain-text URL tracking: Class A redirect URLs appearing in plain-text body parts rather than HTML hrefs. Lower priority since plain text is less common in tracked email, but not zero.

Has anyone seen these in the wild? And are there vectors I have missed entirely?

Cross-posted from the EMail Parrot engineering blog. I am the sole developer of the service and this work grew directly out of the threat analysis in the original post.

Top comments (3)

Privacy.Fish • May 18

I’ve seen the direct-destination token pattern in the wild more than once: real destination URL, no redirect wrapper, but a per-recipient token tucked into query params on every link. Unsubscribe/preference-center URLs are especially easy to overlook because the token is functional and identifying at the same time.

One vector I’d add to the list is calendar/contact attachments. ICS invites and vCards can carry fairly sticky organizer/attendee IDs, URLs, and custom X-* fields, and people often import or forward them without thinking of them as “email content”. It is not the same as a pixel, but it can still propagate identity/context outside the cleaned HTML body.

The References/In-Reply-To bit is the most interesting part here though. A lot of people treat tracking as a rendering problem, but reply-chain metadata turns it into a conversation graph problem. Hashing sender-controlled Message-IDs under the relay namespace makes sense if you control both delivery and outbound reply handling. The tricky edge case I’d watch is support/ticketing systems that depend on exact original Message-ID/References values for threading on their side.

William Weiner • May 20

ICS rewriting is in the works right now. A little tricky but not impossible. We'll make it look like the relay address is the one accepting an invite - not the user's real email address.

There is a compromise here - can't break the message-id linkage without changing the message-id. Most systems today no longer rely on message-id (in-reply-to) as the only signal of ticketing chain.

Privacy.Fish • May 23

ICS is exactly where the alias model starts touching someone else’s workflow. Making the relay address the invitee feels like the right default to me, as long as the “what breaks” part is explicit: calendar threading may survive, but the original address should not leak just to keep invites pretty.