Most email open tracking in 2026 is broken. Apple Mail Privacy
Protection fires a fake open within minutes of delivery, before any
human sees the email. Corporate scanners do the same, usually within
seconds. Reported open rates end up inflated 2-3x.
The engineering problem: how do you tell a real human open from a
machine pre-fetch given only the HTTP request metadata of the pixel
load?
**Signals available**
Every open event arrives at the tracking endpoint with:
- Request IP
- User-Agent string
- Request timestamp (relative to email send)
- Accept-Language, Referer, other headers
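
To make the rest of the post concrete, here is a minimal sketch of how one of these events could be represented. The `OpenEvent` class and its field names are my own illustration, not a fixed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class OpenEvent:
    """One tracking-pixel request. All names here are illustrative."""
    ip: str
    user_agent: str
    sent_at: datetime        # when the email was sent
    requested_at: datetime   # when the pixel request arrived
    headers: dict = field(default_factory=dict)  # Accept-Language, Referer, ...

    @property
    def seconds_since_send(self) -> float:
        """Request time relative to email send, in seconds."""
        return (self.requested_at - self.sent_at).total_seconds()


sent = datetime(2026, 1, 5, 9, 0, 0, tzinfo=timezone.utc)
event = OpenEvent("17.58.0.1", "Mozilla/5.0", sent, sent + timedelta(seconds=42))
print(event.seconds_since_send)
```

Storing the send time next to the request time keeps the relative timestamp derivable, so the raw values stay available for relabeling later.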
**Patterns by source**
Apple MPP pre-fetches:
- IP from Apple-attributable ranges (17.0.0.0/8 mostly)
- User-Agent: Mac/iOS native with Apple's tracking-relay format
- Timing: typically 30 seconds to 5 minutes after delivery
- No subsequent click activity
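
As a rough illustration, the IP and timing signals for MPP can be combined into a simple rule. This is a sketch only: the function name is hypothetical, the window bounds are just the "typical" figures above, and a single /8 will not cover every MPP request.

```python
import ipaddress

# The Apple-owned block mentioned above; treat it as a starting point only.
APPLE_NET = ipaddress.ip_network("17.0.0.0/8")


def looks_like_mpp(ip: str, seconds_since_delivery: float) -> bool:
    """Heuristic: Apple-range IP inside the 30 s to 5 min window."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # malformed IP: no evidence either way
    in_apple_range = addr in APPLE_NET
    in_window = 30 <= seconds_since_delivery <= 300
    return in_apple_range and in_window


print(looks_like_mpp("17.58.0.1", 45))    # Apple range, inside the window
print(looks_like_mpp("203.0.113.9", 45))  # not an Apple range
```

A rule like this works better as a feature or a labeling aid than as a final verdict, since a real person on an Apple network can also open a message a minute after delivery.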
Corporate scanner pre-fetches (Defender, Mimecast, Proofpoint):
- IP from known scanner ranges (vendors either publish these or they can be found via reverse DNS lookup)
- User-Agent: scanner-specific signatures
- Timing: under 5 seconds after delivery
- Multiple link and image requests within a 1-3 second window from the same IP
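
The burst pattern in the last bullet is straightforward to detect with a sliding window over each IP's request times. A minimal sketch, assuming requests arrive as `(ip, unix_timestamp)` pairs; the function name and the default of three requests are my own choices.

```python
from collections import defaultdict


def burst_ips(requests, window_s=3.0, min_requests=3):
    """Return IPs with at least `min_requests` requests inside `window_s` seconds.

    `requests` is an iterable of (ip, unix_timestamp) pairs.
    """
    by_ip = defaultdict(list)
    for ip, ts in requests:
        by_ip[ip].append(ts)

    flagged = set()
    for ip, stamps in by_ip.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Advance the left edge until the window spans at most window_s.
            while stamps[end] - stamps[start] > window_s:
                start += 1
            if end - start + 1 >= min_requests:
                flagged.add(ip)
                break
    return flagged


sample = [
    ("198.51.100.7", 0.0), ("198.51.100.7", 0.4), ("198.51.100.7", 1.1),
    ("203.0.113.9", 0.0), ("203.0.113.9", 600.0), ("203.0.113.9", 1200.0),
]
print(burst_ips(sample))
```

In the sample, only the first IP is flagged: its three requests land within about a second, while the second IP's requests are minutes apart.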
Gmail image proxy:
- IP from googleusercontent.com range
- User-Agent: Google image-proxy signature (the string contains "GoogleImageProxy")
- Timing: variable
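
The Gmail proxy is the easiest of these sources to spot from the User-Agent alone, because its requests identify themselves with a `GoogleImageProxy` token. A minimal check (the function name is illustrative):

```python
def is_gmail_image_proxy(user_agent: str) -> bool:
    """True when the User-Agent carries Gmail's image-proxy marker."""
    return "googleimageproxy" in user_agent.lower()


ua = ("Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 "
      "(via ggpht.com GoogleImageProxy)")
print(is_gmail_image_proxy(ua))
```

Matching this token only identifies the proxy. It says nothing about whether a person triggered the fetch, which is why timing stays a separate signal.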
Real human opens:
- IP from residential or generic corporate range
- User-Agent: actual mail client used by a human
- Timing: rarely under 30 seconds after delivery; clusters at typical inbox-check times
- Often followed by click activity within 30 minutes
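
Two of the human-leaning signals above reduce to simple boolean features. The sketch below assumes Unix timestamps in seconds and uses made-up names. It also assumes the click timestamps have already had scanner-generated clicks filtered out, since scanners request links too.

```python
from typing import Dict, Sequence


def human_signals(open_ts: float, delivered_ts: float,
                  click_ts: Sequence[float]) -> Dict[str, bool]:
    """Derive two human-leaning features from raw timestamps (seconds)."""
    delay = open_ts - delivered_ts
    return {
        # Humans rarely open within 30 seconds of delivery.
        "opened_after_30s": delay >= 30,
        # A click at or after the open, no more than 30 minutes later.
        "click_within_30_min": any(0 <= c - open_ts <= 30 * 60 for c in click_ts),
    }


print(human_signals(open_ts=1000.0, delivered_ts=100.0, click_ts=[1600.0]))
```

Neither feature is decisive alone; they are meant as inputs to a classifier alongside the IP and User-Agent signals.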
**Model approach**
A gradient-boosted classifier on the feature set above gives 95-98%
agreement with human-rated labels on a held-out test set. Output is a
confidence score (0-100%) that maps to Tiers 1-5, where Tier 1 is the
most likely human and Tier 5 the most likely machine.
- False-positive rate (a machine open graded Tier 1): <2%
- False-negative rate (a human open graded Tier 4-5): <5%
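
Here is one way the score-to-tier mapping and the two error rates could be computed. Several things in this sketch are assumptions: that a higher score means "more likely human", the even 20-point tier cutoffs (placeholders, not the production thresholds), and the denominators, which take each rate over the true class.

```python
def score_to_tier(confidence_pct: float) -> int:
    """Map a 0-100 human-confidence score to Tier 1 (human) .. Tier 5 (machine).

    Assumed cutoffs: 80-100 -> 1, 60-80 -> 2, 40-60 -> 3, 20-40 -> 4, 0-20 -> 5.
    """
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence must be between 0 and 100")
    return 5 - min(int(confidence_pct // 20), 4)


def error_rates(labeled):
    """Compute (false_positive_rate, false_negative_rate).

    `labeled` is an iterable of (tier, is_human) pairs from a labeled test set.
    FP rate: share of machine events graded Tier 1.
    FN rate: share of human events graded Tier 4 or 5.
    """
    rows = list(labeled)
    machine_tiers = [tier for tier, is_human in rows if not is_human]
    human_tiers = [tier for tier, is_human in rows if is_human]
    fp = sum(t == 1 for t in machine_tiers) / len(machine_tiers) if machine_tiers else 0.0
    fn = sum(t >= 4 for t in human_tiers) / len(human_tiers) if human_tiers else 0.0
    return fp, fn


print(score_to_tier(91.5), score_to_tier(12.0))
```

Tiers 2 and 3 count as neither error in this reading, which matches the way the two rates are defined in the bullets.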
The production version of the model, along with its dashboard, is at:
https://outsolvi.com/features/confidence-scoring
**The retraining problem**
Apple keeps shifting MPP's IP block allocation. The model needs
retraining every few months as patterns drift. We've automated the
labeled-data collection so retraining is a 1-day job rather than a
1-week job.
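
One generic way to notice this kind of drift before accuracy drops is to compare how a categorical feature, such as the IP prefix bucket, is distributed in recent traffic versus the training window. The sketch below uses the Population Stability Index, a standard drift statistic. It is not necessarily what our pipeline runs, and the bucket labels are made up.

```python
import math
from collections import Counter


def psi(baseline, current, eps=1e-6):
    """Population Stability Index between two samples of categorical values."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    b_counts, c_counts = Counter(baseline), Counter(current)
    total = 0.0
    for cat in set(b_counts) | set(c_counts):
        # eps keeps the log finite when a category is missing from one sample.
        b = max(b_counts[cat] / len(baseline), eps)
        c = max(c_counts[cat] / len(current), eps)
        total += (c - b) * math.log(c / b)
    return total


# Hypothetical prefix buckets for MPP-labeled opens, training window vs. last week.
train = ["17.58/16"] * 90 + ["17.110/16"] * 10
recent = ["17.58/16"] * 50 + ["17.110/16"] * 50
print(round(psi(train, recent), 3))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift. Here that threshold is only a trigger for a closer look or a retrain, not a calibrated alarm.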
Anyone else working on email signal filtering? Curious about your
approach to the drift problem specifically.
— Nate Summers
Co-Founder, Outsolvi