Building Edge-Side Video Personalization with Cloudflare Workers KV

#webdev #architecture #cloudflare #performance

Why move personalization to the edge

When you operate a video discovery site that serves audiences across continents, the personalization tax adds up fast. A single uncached request to origin to figure out "what should this user see on the homepage" can stack 80-200 ms of round-trip latency on top of whatever rendering time you already pay. At DailyWatch we hit this wall once we crossed a few thousand requests per minute from Asia and Latin America. The origin sat in a single region, and even with aggressive HTTP caching, the personalized slot at the top of the page kept punching holes in our cache hit ratio.

Edge personalization with Cloudflare Workers KV is the compromise that worked for us. KV is eventually consistent, low-write, and globally replicated — exactly the shape of a recommendation lookup table.

The data model

We keep three kinds of state:

User segment buckets — a key per user (or anonymous cookie ID) pointing to a small JSON blob: language preference, recent categories, country code, last-seen timestamp.
Segment-to-video maps — keys like seg:us-en-shorts:v3 holding the precomputed top-N video IDs for that segment.
Shell pointers — references into the Worker Cache API for the static HTML body, kept separate so we can rev them independently.

The trick is that the Worker never touches the origin database for read traffic. It composes the personalized response from two cheap KV lookups and a cached page shell.
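To make the first two record types concrete, here is a rough sketch of what the stored values might look like. The field names (recent_categories, affinity_bucket, last_seen) and the sample video IDs are assumptions for illustration, not an exact production schema; the segment fields mirror the payload written by the origin job.

```python
import json

# Hypothetical user segment bucket, stored under a key like "u:<uid>".
# Field names are illustrative assumptions.
user_record = {
    "lang": "en",
    "country": "US",
    "recent_categories": ["shorts", "music"],
    "affinity_bucket": "shorts",
    "last_seen": 1700000000,  # Unix timestamp
}

# Hypothetical segment-to-video map, stored under "seg:us-en-shorts:v3".
segment_record = {
    "ids": ["vid_101", "vid_202", "vid_303"],  # precomputed top-N video IDs
    "built_at": 1700000000,
    "version": 3,
}


def encoded_size(record: dict) -> int:
    """Return the size in bytes of the compact JSON encoding of a record."""
    return len(json.dumps(record, separators=(",", ":")).encode("utf-8"))


if __name__ == "__main__":
    # Both blobs should stay small so each lookup remains cheap.
    print(encoded_size(user_record), encoded_size(segment_record))
```

Keeping both values this small is what makes the two-lookup read path cheap.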

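# TTLs in seconds for the two KV record types.
SEGMENT_TTL = 60 * 60 * 6        # segment maps: 6 hours
USER_TTL = 60 * 60 * 24 * 30     # user blobs: 30 days

def build_segment_key(user_meta):
    """Derive the versioned segment key from a user's metadata blob."""
    # Normalize case so "US"/"us" and "EN"/"en" map to the same key.
    country = user_meta.get("country", "xx").lower()
    lang = user_meta.get("lang", "en").lower()
    bucket = user_meta.get("affinity_bucket", "default")
    return f"seg:{country}-{lang}-{bucket}:v3"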

The version suffix (:v3) on segment keys is critical. Whenever the offline pipeline regenerates segment definitions, the new keys go live without invalidating old ones — clients drain naturally as their user records refresh.
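One way to make such a version rollover tolerant of replication lag is to try the newest key first and fall back to the previous version while the new keys propagate. The sketch below assumes the reader knows both version numbers, and uses a plain dict as a stand-in for the KV namespace; the fallback order and the version 4 are illustrative assumptions, not a description of our pipeline.

```python
from typing import Optional, Sequence


def segment_key(country: str, lang: str, bucket: str, version: int) -> str:
    """Build a versioned segment key such as 'seg:us-en-shorts:v3'."""
    return f"seg:{country.lower()}-{lang.lower()}-{bucket}:v{version}"


def read_segment(
    kv: dict,
    country: str,
    lang: str,
    bucket: str,
    versions: Sequence[int] = (4, 3),  # assumed: newest first, then previous
) -> Optional[dict]:
    """Return the first segment found, trying versions in the given order.

    `kv` is a plain dict standing in for a KV namespace.
    """
    for version in versions:
        value = kv.get(segment_key(country, lang, bucket, version))
        if value is not None:
            return value
    return None  # caller should apply its own fallback segment
```

Because old keys are never overwritten, the previous version stays readable until its TTL expires.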

The Worker read path

Here is the read path, simplified. We read the user blob, derive the segment key, fetch the segment payload, and inline it into the HTML shell that the Worker pulled from cache.

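// handle serves the personalized homepage from two KV lookups and a cached shell.
// Helpers (cookieValue, newAnonID, defaultProfile, ...) are elided for brevity.
func handle(req *http.Request, kv *KVNamespace) (*http.Response, error) {
    // Reuse the visitor's ID, or mint an anonymous one for cold visitors.
    uid, _ := cookieValue(req, "dw_uid")
    if uid == "" {
        uid = newAnonID()
    }

    // Lookup 1: the user blob. A miss or error falls back to a default profile.
    user, err := kv.Get(req.Context(), "u:"+uid, "json")
    if err != nil || user == nil {
        user = defaultProfile(req)
    }

    // Lookup 2: the precomputed segment. A miss must never fail the request.
    segKey := buildSegmentKey(user)
    seg, err := kv.Get(req.Context(), segKey, "json")
    if err != nil || seg == nil {
        seg = fallbackSegment()
    }

    // The static shell comes from the Cache API, not KV.
    shell, err := cache.Match(req)
    if err != nil || shell == nil {
        // Don't pass a nil shell to injectSlot; let the caller fall through to origin.
        return nil, fmt.Errorf("shell cache miss for %s", req.URL.Path)
    }
    return injectSlot(shell, seg, uid), nil
}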

A few things worth calling out:

The fallback segment is non-negotiable. KV can return nil — keys expire, replication lag exists, regional caches can miss. Never return a 500 to a user because the recommender shrugged.
cache.Match is the Worker Cache API, separate from KV. The shell HTML lives there with a long TTL; KV holds the volatile bits. Two different storage layers, two different invalidation strategies.
Anonymous IDs minted in the Worker keep cold visitors out of the origin entirely.
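The fallback chain described above can be exercised without a Worker at all. Below is a minimal sketch that uses a plain dict in place of the KV namespace; the helper names, the default profile fields, and the contents of the fallback segment are assumptions for illustration.

```python
import uuid
from typing import Optional, Tuple

# Hypothetical generic shelf served whenever the segment lookup misses.
FALLBACK_SEGMENT = {"ids": ["editor_pick_1", "editor_pick_2"], "source": "fallback"}


def new_anon_id() -> str:
    """Mint a random anonymous visitor ID (32 hex characters)."""
    return uuid.uuid4().hex


def default_profile(country: str = "xx") -> dict:
    """Profile used when the user blob is missing or unreadable."""
    return {"country": country, "lang": "en", "affinity_bucket": "default"}


def build_segment_key(user_meta: dict) -> str:
    country = user_meta.get("country", "xx").lower()
    lang = user_meta.get("lang", "en").lower()
    bucket = user_meta.get("affinity_bucket", "default")
    return f"seg:{country}-{lang}-{bucket}:v3"


def resolve_segment(kv: dict, uid: Optional[str], country: str = "xx") -> Tuple[str, dict]:
    """Return (uid, segment), falling back at each step instead of failing."""
    if not uid:
        uid = new_anon_id()
    user = kv.get(f"u:{uid}") or default_profile(country)
    seg = kv.get(build_segment_key(user)) or FALLBACK_SEGMENT
    return uid, seg
```

Every branch returns something renderable, which is the property that matters: a missing user blob or segment degrades the shelf, not the page.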

Writing to KV from origin

KV is read-heavy on purpose. Writes to a single key are limited to roughly one per second, changes can take tens of seconds to propagate globally, and values are capped at 25 MB. That means the origin has to batch and version, not stream.

The PHP origin job that rebuilds segments looks roughly like this:

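function flushSegment(string $segmentKey, array $videoIds): void {
    // Keep the payload tight: top 50 IDs plus build metadata.
    $payload = json_encode([
        'ids' => array_slice($videoIds, 0, 50),
        'built_at' => time(),
        'version' => 3,
    ], JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);

    $url = sprintf(
        'https://api.cloudflare.com/client/v4/accounts/%s/storage/kv/namespaces/%s/values/%s',
        CF_ACCOUNT,
        CF_KV_NS,
        rawurlencode($segmentKey)
    );

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_CUSTOMREQUEST => 'PUT',
        CURLOPT_POSTFIELDS => $payload,
        CURLOPT_HTTPHEADER => [
            'Authorization: Bearer ' . CF_API_TOKEN,
            'Content-Type: application/json',
        ],
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT => 10, // don't let a slow API call hang the cron job
    ]);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error = curl_error($ch);
    curl_close($ch);

    // Fail loudly so the job can retry instead of silently leaving stale segments.
    if ($body === false || $status < 200 || $status >= 300) {
        throw new RuntimeException(
            sprintf('KV write failed for %s (HTTP %d): %s', $segmentKey, $status, $error)
        );
    }
}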

We trigger this from a cron task that runs every few hours. Anything more aggressive and you start fighting KV's write limits without measurable benefit — segment composition just doesn't shift that fast.
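Batching can be sketched as grouping many segment payloads into a few bulk requests rather than one PUT per key. The sketch below only builds the request bodies and makes no network calls. It assumes Cloudflare's bulk write endpoint, which takes a JSON array of objects with string `key` and `value` fields, and a cap of 10,000 pairs per request; check both against the current API documentation before relying on them.

```python
import json
import time
from typing import Dict, Iterator, List, Optional

MAX_PAIRS_PER_REQUEST = 10_000  # assumed bulk limit; verify against the API docs


def build_bulk_bodies(
    segments: Dict[str, List[str]],
    top_n: int = 50,
    batch_size: int = MAX_PAIRS_PER_REQUEST,
    now: Optional[int] = None,
) -> Iterator[str]:
    """Yield JSON request bodies, each holding at most `batch_size` key/value pairs."""
    built_at = int(time.time()) if now is None else now
    batch = []
    for key, video_ids in segments.items():
        # Each value is itself a JSON string, matching the single-key payload shape.
        value = json.dumps({"ids": video_ids[:top_n], "built_at": built_at, "version": 3})
        batch.append({"key": key, "value": value})
        if len(batch) == batch_size:
            yield json.dumps(batch)
            batch = []
    if batch:  # flush the final partial batch
        yield json.dumps(batch)
```

Each yielded body would be sent as one request, so a full segment rebuild costs a handful of API calls and each key is written only once per run.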

What this actually buys you

The numbers from our rollout, with US, EU, and APAC traffic mixed:

p50 personalized response time dropped from 240 ms to 41 ms.
p95 from 880 ms to 120 ms.
Origin egress on the homepage route fell roughly 70%.
Cache hit ratio on the shell jumped from 58% to 94% because we stopped baking user-specific bytes into the cached HTML.

The biggest non-performance win is freedom of deployment. Origin maintenance windows used to mean degraded personalization. Now the Worker keeps serving the last known segments from KV until origin comes back.

Pitfalls we hit

Hot keys: A single segment serving all anonymous US visitors becomes a hot key. KV handles this, but if your payload grows past a few hundred KB you will feel it. Keep segment payloads tight — IDs and minimal metadata, not full video records.
Stale anonymous IDs: We initially regenerated anon IDs too aggressively. Every regen meant a KV write and a cold segment lookup. Mint once, persist in a long-lived cookie, refresh metadata not identity.
Schema migrations: Always version the key namespace. Renaming a key shape in place will brick the Worker for the replication-lag window.
Observability: KV ops show up in Workers analytics but not in your origin logs. Wire your own metrics — request count, KV miss rate, fallback rate — into Logpush early. You will need them when things drift.
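As a starting point for the observability item, here is a minimal sketch of the three counters mentioned above and the rates derived from them. The class and field names are assumptions for illustration, and shipping the numbers to Logpush or any other sink is left out.

```python
from collections import Counter


class EdgeMetrics:
    """Tracks request count, KV misses, and fallback usage for one Worker route."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, kv_hit: bool, used_fallback: bool) -> None:
        """Record one request and whether it missed KV or served the fallback."""
        self.counts["requests"] += 1
        if not kv_hit:
            self.counts["kv_misses"] += 1
        if used_fallback:
            self.counts["fallbacks"] += 1

    def rates(self) -> dict:
        """Return miss and fallback rates as fractions of all requests."""
        total = self.counts["requests"]
        if total == 0:
            return {"kv_miss_rate": 0.0, "fallback_rate": 0.0}
        return {
            "kv_miss_rate": self.counts["kv_misses"] / total,
            "fallback_rate": self.counts["fallbacks"] / total,
        }
```

A rising fallback rate with a flat miss rate usually points at segment keys rather than user blobs, which is why it helps to track the two separately.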

When not to do this

Edge personalization with KV fits when the recommendation tier can be precomputed in batches and read many more times than written. If you need per-event real-time updates — a live counter, a chat presence flag, a leaderboard that ticks every second — Workers KV is the wrong tool and you want Durable Objects or D1 instead. Pick the storage shape that matches the read/write ratio, not the other way around.

For a discovery surface where many users share the same hundred curated shelves, edge KV is hard to beat.