ahmet gedik

Posted on Jun 27

Building Per-Second Video Heatmap Analytics with HyperLogLog in Postgres

#postgres #analytics #php #performance

Every player on our network fires a heartbeat every five seconds: video ID, an anonymized viewer hash, the region the request landed in, and the offset into the video. Across the 8 regions we operate, that is roughly 40 million events on a normal day. The product question that kicked this whole project off sounded innocent: for every second of every video, how many unique viewers were actually watching? Not pageviews. Not heartbeats. Uniques — so we could render a heatmap under the scrubber and show editors exactly where audiences bail.

I run the backend for TrendVidStream, a global multi-region video streaming discovery site. Our serving stack is deliberately boring — PHP 8.4, SQLite with FTS5 for discovery search, cron jobs per region, FTP-based deploys to cheap shared hosts. None of that is exotic, and that's the point. But "distinct viewers per second per video per region per day" is a cardinality problem, and cardinality problems do not care how boring your stack is. This post is how we solved it with the postgresql-hll extension on a single modest Postgres box, what the schema looks like, and the knobs that actually matter.

The Query That Couldn't

The naive design is a raw events table and COUNT(DISTINCT viewer_hash) grouped by (video_id, second_bucket). We tried it, because you should always try the dumb thing first. The numbers killed it quickly:

40M rows/day × 90 days retention ≈ 3.6 billion rows just to keep the question answerable.
COUNT(DISTINCT) forces Postgres to materialize the distinct set per group — either a sort or a hash table per (video, bucket) pair. For a popular video with 200k uniques and 720 five-second buckets, that's a lot of hashing for one chart render.
The heatmap endpoint is user-facing. Editors open it dozens of times a day. A 9-second analytical query behind a dashboard is a 9-second analytical query behind every dashboard load.
The killer: uniques don't roll up. You cannot store "uniques per hour" and sum them to get "uniques per day," because the same viewer appears in multiple hours. Pre-aggregated counts are additive; distinct counts are not. So classic rollup tables — the trick that saves every other analytics workload — simply don't work here.

That last bullet is the real problem. Storage and query time are engineering annoyances; non-additivity is a math wall. What we needed was a representation of "the set of viewers" that is tiny, mergeable, and good enough on accuracy. That is exactly what HyperLogLog is.

HyperLogLog Without the Paper

The intuition fits in two paragraphs. Hash every viewer ID to a uniform 64-bit number. In a stream of random bit-strings, seeing a hash that starts with k leading zero bits is roughly a 1-in-2^k event — so if the longest run of leading zeros you've observed is 20, you've probably seen on the order of a million distinct values. One observation is a terrible estimator (one lucky hash ruins it), so HLL splits the hash space into 2^log2m registers, tracks the max leading-zero count per register, and combines them with a harmonic mean plus bias correction. With 2^14 = 16,384 registers the standard error is about 1.04 / sqrt(16384) ≈ 0.81%.

Two properties make it perfect for heatmaps. First, the sketch is fixed-size: ~10–12 KB at our settings whether it has seen 50 viewers or 50 million. Second — and this is the property that fixes the rollup wall — the union of two sketches is just the register-wise max, it's lossless relative to the sketches themselves, and it's exactly the sketch you'd have built from the combined stream. Per-region sketches union into a global sketch. Per-day sketches union into a weekly sketch. Distinct counts become additive in the only sense that matters.

Schema Design with postgresql-hll

We use the Citus postgresql-hll extension (apt install postgresql-17-hll on Debian-family, or build from the GitHub repo). The design is two tables: a dumb append-only staging table that ingestion writes to, and the sketch table that everything reads from.

CREATE EXTENSION IF NOT EXISTS hll;

-- Raw heartbeats land here. No indexes except the identity column:
-- this table exists to be appended to and drained, nothing else.
CREATE TABLE beacon_staging (
    id            bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    video_id      text NOT NULL,
    viewer_hash   text NOT NULL,
    region        text NOT NULL,
    second_bucket integer NOT NULL,
    received_at   timestamptz NOT NULL DEFAULT now()
);

-- One sketch per (video, region, day, 5-second bucket).
CREATE TABLE video_heatmap (
    video_id      text NOT NULL,
    region        text NOT NULL,
    day           date NOT NULL,
    second_bucket integer NOT NULL,
    viewers       hll NOT NULL DEFAULT hll_empty(14, 5),
    PRIMARY KEY (video_id, region, day, second_bucket)
);

-- The rollup statement the cron runs: fold staging rows into sketches.
INSERT INTO video_heatmap (video_id, region, day, second_bucket, viewers)
SELECT video_id, region, received_at::date, second_bucket,
       hll_add_agg(hll_hash_text(viewer_hash), 14, 5)
FROM beacon_staging
GROUP BY 1, 2, 3, 4
ON CONFLICT (video_id, region, day, second_bucket)
DO UPDATE SET viewers = video_heatmap.viewers || EXCLUDED.viewers;

The (14, 5) arguments are log2m and regwidth — 16,384 registers of 5 bits each. The || operator on two hll values is sketch union, which is what makes ON CONFLICT ... DO UPDATE the entire merge strategy. Each rollup run builds fresh sketches from the new staging rows and unions them into whatever is already stored. Idempotent in spirit, append-only in practice, no locking gymnastics.

A sizing sanity check before moving on: a 10-minute video at 5-second buckets is 120 rows per region per day. With 8 regions that's 960 rows/video/day, each ~10 KB dense — under 10 MB per video per day worst case, and far less in reality because most buckets stay in the sparse encoding (more on that below). Our entire 90-day heatmap dataset for ~12,000 tracked videos fits in about 40 GB. The raw-events version was projected at 1.2 TB before indexes.

Beacon Ingestion in PHP 8.4

The ingest endpoint is plain PHP because our whole serving tier is plain PHP — it ships over the same FTP deploy pipeline as everything else, no build step, no daemon to babysit on shared hosting. The only rules: validate hard, never make the player retry, and never store raw PII. The viewer hash is salted with a value that rotates daily, so a "unique viewer" is really "unique viewer-day" and the hash is useless after 24 hours. That's a privacy feature and an accuracy feature — daily salts mean per-day sketches are exactly what you think they are.

<?php
declare(strict_types=1);

// beacon.php — receives player heartbeats, appends to staging.
const REGIONS = ['us-east','us-west','eu-west','eu-north','sa-east','ap-south','ap-east','oceania'];
const BEACON_SALT = 'rotate-me-via-env-in-production';

final readonly class Beacon
{
    public function __construct(
        public string $videoId,
        public string $viewerHash,
        public string $region,
        public int $secondBucket,
    ) {}

    public static function fromRequest(array $post, array $server): self
    {
        $second = filter_var($post['t'] ?? null, FILTER_VALIDATE_INT,
            ['options' => ['min_range' => 0, 'max_range' => 86_400]]);
        $video = preg_replace('/[^A-Za-z0-9_-]/', '', (string)($post['v'] ?? ''));
        if ($second === false || $video === '') {
            throw new InvalidArgumentException('bad beacon');
        }
        return new self(
            videoId: substr($video, 0, 16),
            viewerHash: hash('sha1',
                ($server['REMOTE_ADDR'] ?? '')
                . ($server['HTTP_USER_AGENT'] ?? '')
                . date('Y-m-d') . BEACON_SALT),
            region: in_array($post['r'] ?? '', REGIONS, true) ? $post['r'] : 'unknown',
            secondBucket: intdiv($second, 5) * 5,
        );
    }
}

try {
    $b = Beacon::fromRequest($_POST, $_SERVER);
} catch (InvalidArgumentException) {
    http_response_code(204); // swallow garbage; players must never retry-loop
    exit;
}

$pdo = new PDO('pgsql:host=127.0.0.1;dbname=heatmap', 'beacon', getenv('PG_PASS'), [
    PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_PERSISTENT => true,
]);

$pdo->prepare(
    'INSERT INTO beacon_staging (video_id, viewer_hash, region, second_bucket)
     VALUES (?, ?, ?, ?)'
)->execute([$b->videoId, $b->viewerHash, $b->region, $b->secondBucket]);

http_response_code(204);

Notice what this endpoint does not do: it does not touch the hll type at all. Hashing into sketches at request time would mean a read-modify-write on a hot row for every heartbeat of a trending video — a lock convoy with extra steps. Appending to an unindexed staging table is the cheapest write Postgres can do, and it decouples ingest latency from rollup cost completely. If the rollup falls behind, staging grows and nobody notices; the player still gets its 204 in single-digit milliseconds.

Rolling Up on the Multi-Region Cron

We already run a cron fleet — every region has scheduled jobs at staggered offsets for video fetching and index refresh, so the heatmap rollup just joined the rotation. Each region's collector drains its own staging table into the central sketch store every 10 minutes. Because sketch union is commutative and associative, the regions don't need to coordinate at all: whatever order the eight jobs land in, the merged sketch is identical.

#!/usr/bin/env python3
"""rollup_heatmap.py — cron: */10 * * * *, one instance per region."""
import os
import sys
import time

import psycopg

BATCH = 500_000

ROLLUP = """
WITH moved AS (
    DELETE FROM beacon_staging
    WHERE id IN (SELECT id FROM beacon_staging ORDER BY id LIMIT %s)
    RETURNING video_id, viewer_hash, region, second_bucket, received_at
)
INSERT INTO video_heatmap (video_id, region, day, second_bucket, viewers)
SELECT video_id, region, received_at::date, second_bucket,
       hll_add_agg(hll_hash_text(viewer_hash), 14, 5)
FROM moved
GROUP BY 1, 2, 3, 4
ON CONFLICT (video_id, region, day, second_bucket)
DO UPDATE SET viewers = video_heatmap.viewers || EXCLUDED.viewers
"""

def main() -> int:
    dsn = os.environ["HEATMAP_DSN"]
    batches = 0
    with psycopg.connect(dsn) as conn:
        while True:
            t0 = time.monotonic()
            with conn.cursor() as cur:
                cur.execute(ROLLUP, (BATCH,))
            conn.commit()
            batches += 1
            print(f"batch {batches} folded in {time.monotonic() - t0:.1f}s")
            with conn.cursor() as cur:
                cur.execute("SELECT EXISTS (SELECT 1 FROM beacon_staging)")
                if not cur.fetchone()[0]:
                    break
    return 0

if __name__ == "__main__":
    sys.exit(main())

The DELETE ... RETURNING inside a CTE is the move worth stealing: drain-and-fold is one atomic statement, so a crash mid-batch loses nothing — either the rows are gone and folded, or neither. The 500k batch cap keeps each transaction's lock footprint and WAL burst bounded. On our hardware (4 vCPU, NVMe) a full batch folds in 6–9 seconds, so even a backlog from a cron outage clears in a few minutes.

One operational note for fellow travelers on modest infrastructure: this script deploys exactly like our PHP does — lftp push from the deploy script, staggered cron entries per region so eight rollups never pile onto the database in the same minute. No orchestrator, no queue broker. Postgres is the queue, and DELETE ... RETURNING is the consumer.

Serving the Heatmap

The read path is where HLL pays rent. The frontend wants one array per video: bucket offset, global unique count, and a per-region breakdown for the region toggle. With sketches stored per region, the global number is a union computed at query time — hll_union_agg across the eight region rows of each bucket, then hll_cardinality on the result. No distinct-set materialization, no re-scan of raw events; it's register-wise max over a few KB per row.

<?php
declare(strict_types=1);

// GET /api/heatmap?video=abc123&day=2026-06-12
$pdo = new PDO('pgsql:host=127.0.0.1;dbname=heatmap', 'reader', getenv('PG_PASS'),
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);

$video = preg_replace('/[^A-Za-z0-9_-]/', '', $_GET['video'] ?? '');
$day   = $_GET['day'] ?? date('Y-m-d');
if ($video === '' || !preg_match('/^\d{4}-\d{2}-\d{2}$/', $day)) {
    http_response_code(400);
    exit;
}

$sql = <<<'SQL'
SELECT second_bucket,
       hll_cardinality(hll_union_agg(viewers))::bigint            AS global_uniques,
       jsonb_object_agg(region, hll_cardinality(viewers)::bigint) AS by_region
FROM video_heatmap
WHERE video_id = :video AND day = :day
GROUP BY second_bucket
ORDER BY second_bucket
SQL;

$stmt = $pdo->prepare($sql);
$stmt->execute([':video' => $video, ':day' => $day]);

$buckets = [];
foreach ($stmt as $row) {
    $buckets[] = [
        't'       => (int) $row['second_bucket'],
        'uniques' => (int) $row['global_uniques'],
        'regions' => json_decode($row['by_region'], true),
    ];
}

header('Content-Type: application/json');
header('Cache-Control: public, max-age=600, stale-while-revalidate=1800');
echo json_encode(['video' => $video, 'day' => $day, 'buckets' => $buckets]);

For a 30-minute video this query touches at most 2,880 rows (360 buckets × 8 regions) via the primary key and returns in 15–40 ms — against the 9 seconds the COUNT(DISTINCT) version needed on a fraction of the retention window. We still front it with a 10-minute HTTP cache because the underlying sketches only change every rollup cycle anyway; the same stale-while-revalidate pattern we use on category pages applies unchanged.

Queries the Sketches Answer for Free

Once uniques are mergeable, a bunch of product questions stop requiring new pipelines. Two we use weekly:

-- 1) Where do viewers bail? Largest drop between consecutive buckets.
WITH curve AS (
    SELECT second_bucket,
           hll_cardinality(hll_union_agg(viewers)) AS uniques
    FROM video_heatmap
    WHERE video_id = 'abc123' AND day = '2026-06-12'
    GROUP BY second_bucket
)
SELECT second_bucket, uniques,
       uniques - lag(uniques) OVER (ORDER BY second_bucket) AS delta
FROM curve
ORDER BY delta ASC NULLS LAST
LIMIT 5;

-- 2) Audience overlap between two videos, via inclusion-exclusion:
--    |A ∩ B| = |A| + |B| - |A ∪ B|, and union is just the || operator.
SELECT hll_cardinality(a.h) AS video_a_uniques,
       hll_cardinality(b.h) AS video_b_uniques,
       hll_cardinality(a.h) + hll_cardinality(b.h)
           - hll_cardinality(a.h || b.h) AS shared_viewers
FROM (SELECT hll_union_agg(viewers) AS h FROM video_heatmap WHERE video_id = 'abc123') a,
     (SELECT hll_union_agg(viewers) AS h FROM video_heatmap WHERE video_id = 'def456') b;

The overlap query deserves a caveat: intersections via inclusion-exclusion inherit the error of three cardinality estimates, so the relative error on a small intersection of two large sets can be huge. We only surface overlap when the estimated intersection is at least ~10% of the smaller set, and we label it as an estimate in the UI. Unions are trustworthy; intersections are a party trick you must chaperone.

Sizing, Accuracy, and the Knobs That Matter

The extension exposes four parameters per sketch; two of them matter in practice.

log2m (we use 14): 2^14 registers ⇒ ~0.81% standard error and ~10 KB dense size. Going to 15 halves nothing useful (error drops to ~0.57%) while doubling storage. Going down to 11 (~1.6% error, ~1.3 KB) is defensible if you're tracking millions of objects; for a per-second heatmap, 1.6% wobble between adjacent buckets makes curves visibly noisy, so we paid for 14.
regwidth (we use 5): 5-bit registers handle cardinalities far beyond anything a video site sees. Don't overthink it.
Sparse-to-dense promotion is your storage savior. The extension stores small sketches in an EXPLICIT mode (literal hash list), then a compressed SPARSE mode, and only promotes to the full ~10 KB DENSE array past a threshold. The long tail of videos — buckets with 12 viewers — costs tens of bytes, not 10 KB. In our data, under 4% of sketch rows are dense; the average row is ~600 bytes.
Set parameters once, globally. Sketches with different log2m can't be unioned. Pin them in the column default and in every hll_add_agg call, and never "upgrade" them in place — you'd have to rebuild from raw data you deliberately threw away.
TOAST works in your favor. Dense sketches compress and store out-of-line automatically; the heap stays small enough that the heatmap working set lives in shared_buffers.

We validated accuracy the boring way before trusting it: ran staging-table COUNT(DISTINCT) and sketch cardinality side by side for two weeks on the top 50 videos. Worst observed error was 1.4%, median 0.6% — invisible on a heatmap whose y-axis exists to show shape, not census figures.

Pitfalls We Hit

Hash in the database, not the app. Our first version pre-hashed viewer IDs in PHP with a different function than hll_hash_text, which silently produces sketches that union fine and estimate garbage. Feed hll_hash_text the raw string and let the extension own the hashing.
Don't update sketches row-at-a-time. An early prototype did UPDATE ... SET viewers = viewers || hll_hash_text(...) per heartbeat. Hot videos turned one tuple into a write hotspot, and every update rewrote a multi-KB TOASTed value. Batch through staging; union once per cycle.
Watch staging bloat. A drained staging table is full of dead tuples. Autovacuum mostly copes, but after a backlog flush we schedule an explicit VACUUM beacon_staging from the same cron — on shared-host-grade hardware, the difference is measurable.
The daily salt defines your semantics. Because viewer hashes rotate at midnight UTC, cross-day unions count the same human twice. That's the deliberate trade we made for privacy; document it so nobody builds a "monthly unique viewers" KPI on top and presents it as people.
Mind the analytics/serving boundary. Our discovery search stays on SQLite FTS5 at the edge and is great at it; the mistake we almost made was forcing the same database to do cardinality analytics. Distinct-counting is the one workload where I'll happily run a second engine, because the sketch column is the only thing that makes the storage math work at all.

Conclusion

HyperLogLog turns the one non-additive metric in video analytics — unique viewers — into something you can pre-aggregate, merge across 8 regions without coordination, and serve from primary-key lookups in tens of milliseconds. The whole system is a staging table, one INSERT ... ON CONFLICT rollup, and a 10 KB ceiling per sketch, running on hardware that costs less per month than a team lunch. If your dashboards are gated on COUNT(DISTINCT) over an events table, you don't need a data warehouse first; you need 16,384 registers and the || operator. Trade 1% accuracy for three orders of magnitude in storage and latency — on a heatmap, nobody will ever see the 1%.

DEV Community