DEV Community

Zenovay


The "Privacy-First" Mirage: Why Your Analytics Hash is Still Fingerprinting

"Privacy-first" has become the favorite marketing buzzword for every new analytics tool. But as developers, we shouldn't trust the landing page; we should trust the implementation.

I've been building Zenovay, a lean alternative to the GA4 nightmare, and I spent a lot of time auditing how "privacy-friendly" tools actually handle visitor identity. What I found was a lot of "Fingerprinting Lite" masquerading as privacy.

The Trap: User Agent Persistence

Many indie analytics tools use a hash to identify unique visitors without storing an IP. You'll often see logic similar to this in their backends (let's call this the "GhostlyX" approach):

// The problematic approach
$visitorHash = hash_hmac('sha256', $ip . $ua . $site->id . $today, config('app.key'));

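The issue is the $ua (User Agent). Including it in the hash turns a per-connection identifier into a per-device one, which is a light form of device fingerprinting.

Consider a household or office where several people share one public IP. Without the User Agent, they collapse into a single anonymous visitor; with it, each browser and OS combination gets its own identifier for the day. Because $ip is also an input, this particular hash still changes when a user moves from home Wi-Fi to 5G. But the User Agent is precisely the part that stays identical across networks, so any tool that feeds it into visitor identity is one step away from a persistent identifier that survives network changes, which is exactly what "privacy-first" is supposed to prevent.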
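To make the effect of `$ua` concrete, here is a minimal sketch that hashes the same IP with and without a User Agent. It uses Node's `node:crypto` so it can be run locally; the key, site ID, date, addresses, and User Agent strings are all made-up values for illustration, and the `|` separator is an addition for readability.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical key, site ID, and date, for illustration only.
const DEMO_KEY = "demo-app-key";
const SITE_ID = "site-1";
const DAY = "2024-01-01";

// Identifier that includes the User Agent (the "GhostlyX" shape).
function ipUaHash(ip: string, ua: string): string {
  return createHmac("sha256", DEMO_KEY)
    .update([ip, ua, SITE_ID, DAY].join("|"))
    .digest("hex");
}

// Identifier that leaves the User Agent out.
function ipOnlyHash(ip: string): string {
  return createHmac("sha256", DEMO_KEY)
    .update([ip, SITE_ID, DAY].join("|"))
    .digest("hex");
}

const sharedIp = "203.0.113.7"; // address from the documentation range
const laptopUA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)";
const phoneUA = "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)";

// With the User Agent: two devices behind one IP get two identifiers.
const uaSplitsDevices =
  ipUaHash(sharedIp, laptopUA) !== ipUaHash(sharedIp, phoneUA);

// Without it: both devices produce the same identifier, because the
// device never enters the hash at all.
const laptopId = ipOnlyHash(sharedIp);
const phoneId = ipOnlyHash(sharedIp);
const ipOnlyMergesDevices = laptopId === phoneId;

console.log({ uaSplitsDevices, ipOnlyMergesDevices });
// { uaSplitsDevices: true, ipOnlyMergesDevices: true }
```

The trade-off is visible in the output: dropping the User Agent undercounts visitors behind shared IPs, and that loss of resolution is the point.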

A Cleaner Approach: Daily Rotating Salts

When we built the tracking engine for Zenovay, we decided: if the data is supposed to be anonymous, it should be actually anonymous. No fingerprints. No persistence beyond 24 hours.

We use the Web Crypto API (available natively in edge environments like Cloudflare Workers) to process the IP the millisecond it hits our server.

The Implementation

async function hashIPForVisitor(ip: string, websiteId: string): Promise<string> {
    // Daily salt ensures no long-term tracking
    const today = new Date().toISOString().slice(0, 10); // YYYY-MM-DD (UTC)

    const encoder = new TextEncoder();
    // We combine IP, Website ID, and the daily salt.
    // NOTICE: No User Agent. No device fingerprinting.
    const data = encoder.encode(`${ip}|${websiteId}|${today}`);

    const hashBuffer = await crypto.subtle.digest('SHA-256', data);
    const hashArray = Array.from(new Uint8Array(hashBuffer));

    return hashArray
        .map(b => b.toString(16).padStart(2, '0'))
        .join('');
}
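A quick way to check the daily rotation is to make the clock an explicit parameter. The sketch below is a synchronous stand-in for the function above, written against Node's `node:crypto` rather than Web Crypto so it runs locally without `await`; `dayKey`, `hashForDay`, and the sample values are hypothetical names introduced here.

```typescript
import { createHash } from "node:crypto";

// UTC calendar day, the same value toISOString().slice(0, 10) yields above.
function dayKey(now: Date): string {
  return now.toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
}

// Same input layout as hashIPForVisitor, with the clock injected.
function hashForDay(ip: string, websiteId: string, now: Date): string {
  return createHash("sha256")
    .update(`${ip}|${websiteId}|${dayKey(now)}`)
    .digest("hex");
}

const sampleIp = "198.51.100.23"; // address from the documentation range
const morning = new Date("2024-01-01T08:00:00Z");
const beforeMidnight = new Date("2024-01-01T23:59:59Z");
const afterMidnight = new Date("2024-01-02T00:00:01Z");

// Stable within one UTC day...
const stableWithinDay =
  hashForDay(sampleIp, "site-1", morning) ===
  hashForDay(sampleIp, "site-1", beforeMidnight);

// ...and different once the UTC date changes.
const rotatesAtUtcMidnight =
  hashForDay(sampleIp, "site-1", beforeMidnight) !==
  hashForDay(sampleIp, "site-1", afterMidnight);

console.log({ stableWithinDay, rotatesAtUtcMidnight });
// { stableWithinDay: true, rotatesAtUtcMidnight: true }
```

Because `toISOString()` always reports UTC, the rotation happens at midnight UTC, not at the visitor's or the site owner's local midnight.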

Why this is technically superior:

Zero Persistence: Because we include ${today}, the hash for the same visitor changes at midnight UTC. The stored hashes alone give us no way of "following" a user across multiple days.

No Fingerprinting: By omitting the User Agent, we ensure we're only measuring "a connection at a specific time," not "a specific device."

Database Integrity: In our insert into Supabase, ip_address is hard-coded to null. The raw IP is never written to the database.

const visitorHash = await hashIPForVisitor(clientIP, website.id);

await supabase.from('hits').insert({
    website_id: website.id,
    ip_hash: visitorHash,
    ip_address: null, // Raw IP is never stored
    pathname: new URL(request.url).pathname, // request.url is a string, so parse it first
    // ...
});
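One caveat worth testing for any scheme that stores an `ip_hash`: the site ID and the date are not secret, and the IPv4 space is small (about 4.3 billion addresses), so an unkeyed digest can in principle be reversed by enumerating candidates. The sketch below shows this on a single /24 and contrasts it with a keyed HMAC. It is an illustration under stated assumptions, not a description of any tool's internals: it uses Node's `node:crypto`, and `serverSecret`, `recoverIp`, and all sample values are hypothetical. In a Worker, the keyed variant could be written with `crypto.subtle.importKey` and `crypto.subtle.sign` instead.

```typescript
import { createHash, createHmac } from "node:crypto";

// Unkeyed digest over public inputs (IP, site ID, date).
function unkeyedHash(ip: string, websiteId: string, day: string): string {
  return createHash("sha256").update(`${ip}|${websiteId}|${day}`).digest("hex");
}

// Keyed variant: candidates cannot be checked without knowing `secret`.
function keyedHash(ip: string, websiteId: string, day: string, secret: string): string {
  return createHmac("sha256", secret).update(`${ip}|${websiteId}|${day}`).digest("hex");
}

// Try every address in a /24 and return the one whose hash matches, if any.
function recoverIp(
  target: string,
  prefix: string,
  hashCandidate: (ip: string) => string,
): string | null {
  for (let host = 0; host < 256; host++) {
    const candidate = `${prefix}.${host}`;
    if (hashCandidate(candidate) === target) return candidate;
  }
  return null;
}

const siteId = "site-1"; // hypothetical
const day = "2024-01-01"; // hypothetical
const realIp = "203.0.113.42"; // address from the documentation range
const serverSecret = "kept-out-of-the-database"; // hypothetical secret

// Holding only the stored hash, site ID, and date is enough to find the IP.
const recovered = recoverIp(
  unkeyedHash(realIp, siteId, day),
  "203.0.113",
  (candidate) => unkeyedHash(candidate, siteId, day),
);

// Against the keyed hash, the same search with a wrong key finds nothing.
const recoveredKeyed = recoverIp(
  keyedHash(realIp, siteId, day, serverSecret),
  "203.0.113",
  (candidate) => keyedHash(candidate, siteId, day, "attacker-guess"),
);

console.log(recovered, recoveredKeyed); // 203.0.113.42 null
```

A secret that is itself rotated and discarded daily would go further still, since even the operator could no longer recompute old hashes; that is a design option to weigh, not something the code above implements.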

The Architecture: Cloudflare + Supabase + Next.js

Building on a modern edge stack allowed us to keep the entire tracking script under 1KB.

  • Cloudflare Workers: Handle incoming requests and hash the IP at the edge, before anything is forwarded to the database.
  • Supabase (PostgreSQL): Stores hashed events. No bloated tracking data means queries stay fast even at millions of rows.
  • Next.js (App Router): Powers the dashboard, connecting hits to real-time Stripe revenue attribution.
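The edge step in the list above can be sketched as two small pure functions. This is a rough outline under assumptions, not the production handler: `RequestLike`, `readHitInputs`, and `toHitRow` are hypothetical names, and the row shape simply mirrors the insert shown earlier. It relies on `CF-Connecting-IP`, the header Cloudflare sets to the client address on proxied requests.

```typescript
// Minimal shape of the request fields this sketch reads.
// A Workers `Request` satisfies it, and so does a plain test object.
interface RequestLike {
  url: string;
  headers: { get(name: string): string | null };
}

// Hypothetical row shape, following the insert shown earlier.
interface HitRow {
  website_id: string;
  ip_hash: string;
  ip_address: null;
  pathname: string;
}

// Pull out only what the tracker needs. Returns null when no client IP
// is present, in which case there is nothing to count.
function readHitInputs(
  request: RequestLike,
): { clientIp: string; pathname: string } | null {
  const clientIp = request.headers.get("CF-Connecting-IP");
  if (clientIp === null) return null;
  // request.url is a string, so it has to be parsed to get the path.
  const pathname = new URL(request.url).pathname;
  return { clientIp, pathname };
}

// Build the row from the already-computed hash. The raw IP is not a
// parameter here, so it cannot end up in the stored object by accident.
function toHitRow(websiteId: string, ipHash: string, pathname: string): HitRow {
  return { website_id: websiteId, ip_hash: ipHash, ip_address: null, pathname };
}
```

In a handler, the flow would be `readHitInputs(request)`, then hashing `clientIp`, then `toHitRow(...)` and the database insert, with the raw IP going out of scope as soon as the hash is computed.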

Privacy Marketing vs. Privacy Engineering

The difference comes down to transparency. If a tool doesn't tell you exactly how it generates visitor IDs, treat that as a red flag: it may well be fingerprinting you.

Real privacy isn't about obfuscating data; it's about making sure sensitive data is gone before it ever hits your database.

I'm bootstrapping Zenovay to prove that you can get world-class business insights and revenue attribution without turning your users into a product.


How are you handling PII in your tracking stacks? Are you using the Web Crypto API or sticking to traditional server-side hashing? Drop it in the comments.
