DEV Community

ahmet gedik


Designing a Three-Tier Video Cache: Edge, Regional, and Origin Layers

Viral video traffic is brutal. One TikTok mention and you are suddenly serving 50x your baseline. A single-tier cache will not save you, and neither will throwing more origin servers at the problem. After two years operating across European regions, we settled on a three-tier cache architecture that balances cost, latency, and GDPR obligations.

Why three tiers?

A flat cache forces a binary choice: cache aggressively at the edge and lose freshness, or cache near origin and pay latency on every miss. Three tiers let you specialize:

  • Edge tier: short TTL, broad reach, absorbs the bulk of read traffic
  • Regional tier: longer TTL, fewer nodes, acts as a shield for cold edge nodes
  • Origin tier: durable page cache plus data cache on the application server

The interesting part is not the tiers themselves; it is the promotion and invalidation logic between them.

Tier 1: Edge cache

We use Cloudflare with cache rules tuned per route. Watch pages are cached for 6 hours, category listings for 3 hours, and search results for 10 minutes. The trick is stale-while-revalidate, which lets us keep serving an expired response (for up to 24 hours on watch pages) while a fresh copy is fetched in the background.

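// Watch pages: fresh for 6 hours (21600 s), then servable stale for up to 24 hours (86400 s) while revalidating.
header('Cache-Control: public, max-age=21600, stale-while-revalidate=86400');
// Keep separate cached variants per encoding and per region.
header('Vary: Accept-Encoding, X-Region');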
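The per-route TTLs can be captured in a small lookup. This is a minimal sketch in Python rather than our PHP; the route names and the `cache_control` helper are hypothetical, and reusing the 24-hour stale-while-revalidate window for every route is an assumption, since the header above only shows it for watch pages.

```python
# Hypothetical route names; the TTLs are the ones quoted in the text.
ROUTE_TTL_SECONDS = {
    "watch": 6 * 3600,     # watch pages: 6 hours
    "category": 3 * 3600,  # category listings: 3 hours
    "search": 10 * 60,     # search results: 10 minutes
}


def cache_control(route: str, swr_seconds: int = 86400) -> str:
    """Build a Cache-Control value for a public, cacheable route.

    swr_seconds defaults to the 24-hour window from the watch-page header;
    applying it to the other routes is an assumption.
    """
    max_age = ROUTE_TTL_SECONDS[route]  # unknown routes raise KeyError on purpose
    return f"public, max-age={max_age}, stale-while-revalidate={swr_seconds}"
```

Failing loudly on an unknown route is a deliberate choice in this sketch: a route with no explicit TTL should not become cacheable by accident.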

GDPR adds a wrinkle: we cannot include any user-identifying data in cacheable responses. Our edge cache key includes the region (resolved from CF-IPCountry) but never anything tied to a user. Personalization happens client-side only.
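To make the key rule concrete, here is a rough sketch of a key builder that reads only the country header, the path, and an allow-list of query parameters. The function name, the `ALLOWED_PARAMS` set, and the `XX` fallback are illustrative assumptions, not our production configuration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical allow-list: only these query parameters may vary the response.
ALLOWED_PARAMS = {"page", "q", "sort"}


def edge_cache_key(url: str, headers: dict) -> str:
    """Return a cache key built from country, path, and allow-listed params.

    Cookies, Authorization, and every other header are never read, so
    nothing tied to a user can end up in the key.
    """
    country = headers.get("CF-IPCountry", "XX").upper()  # "XX" = unknown here
    parts = urlsplit(url)
    params = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k in ALLOWED_PARAMS
    )
    query = urlencode(params)
    return f"{country}:{parts.path}" + (f"?{query}" if query else "")
```

An allow-list is safer than a deny-list here: a tracking or user parameter added later is dropped from the key by default instead of silently fragmenting the cache per user.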

Tier 2: Regional cache

Cloudflare's free tier does not give us tiered cache controls, so we built our own regional layer using small VPS nodes in three European data centers (Frankfurt, Amsterdam, Warsaw). Each acts as an origin shield for its region. When an edge POP misses, it asks the regional node before falling through to origin.
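The fall-through order on an edge miss can be sketched as follows. The two callables stand in for the regional node and the origin; their names and signatures are hypothetical.

```python
from typing import Callable, Optional, Tuple


def fetch_through_tiers(
    url: str,
    regional_get: Callable[[str], Optional[bytes]],
    origin_get: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Resolve an edge miss: try the regional shield, then fall back to origin.

    Returns the response body and the name of the tier that served it.
    """
    body = regional_get(url)
    if body is not None:
        return body, "regional"
    return origin_get(url), "origin"
```

Returning the serving tier alongside the body makes it easy to emit a per-tier hit counter, which is how ratios like the ones later in this post can be measured.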

The promotion rule is simple: a video is promoted to regional cache after three hits within a 60-second window. Cold videos stay at origin only. This single rule cut our origin egress by roughly 70%.

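// shouldPromote counts a hit and reports whether videoID has reached three hits
// in the current 60-second window. rdb is assumed to be a go-redis *redis.Client.
func shouldPromote(ctx context.Context, rdb *redis.Client, videoID, region string) (bool, error) {
    key := fmt.Sprintf("hits:%s:%s", region, videoID)
    count, err := rdb.Incr(ctx, key).Result()
    if err != nil {
        return false, err
    }
    if count == 1 {
        // The first hit opens the window; the counter expires 60 seconds later.
        if err := rdb.Expire(ctx, key, 60*time.Second).Err(); err != nil {
            return false, err
        }
    }
    return count >= 3, nil
}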

Hot videos pin themselves into regional cache for an hour. Lukewarm videos rotate naturally as their counters expire.
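For unit tests it helps to have the same rule without Redis. The sketch below is an in-memory stand-in with an injectable clock; the class name is hypothetical, and it assumes that hits on an already pinned video do not extend the one-hour pin, which the rule above does not specify.

```python
import time
from typing import Callable, Dict, Tuple

WINDOW_SECONDS = 60   # hit-counting window from the promotion rule
PROMOTE_AFTER = 3     # hits needed inside one window
PIN_SECONDS = 3600    # promoted videos stay in regional cache for an hour


class PromotionTracker:
    """In-memory stand-in for the Redis counters, meant for tests."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._windows: Dict[str, Tuple[float, int]] = {}  # key -> (expiry, hits)
        self._pinned: Dict[str, float] = {}               # key -> pin expiry

    def record_hit(self, region: str, video_id: str) -> bool:
        """Count one hit; return True if the video is, or becomes, pinned."""
        now = self._clock()
        key = f"{region}:{video_id}"
        if self._pinned.get(key, 0.0) > now:
            return True
        expiry, hits = self._windows.get(key, (0.0, 0))
        if expiry <= now:
            # Window expired or never existed: start a fresh one.
            expiry, hits = now + WINDOW_SECONDS, 0
        hits += 1
        self._windows[key] = (expiry, hits)
        if hits >= PROMOTE_AFTER:
            self._pinned[key] = now + PIN_SECONDS
            del self._windows[key]
            return True
        return False
```

Like the Redis version, this is a fixed window that starts at the first hit, so three hits spread over 61 seconds do not promote a video.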

Tier 3: Origin cache

The origin runs LiteSpeed with two caches stacked on top of the PHP application:

  • LiteSpeed page cache for full HTML responses
  • An application-level file cache for SQL query results and category data

The page cache is the workhorse. It serves 95% of requests without touching PHP. Cache keys include the region code so a German visitor and a Spanish visitor do not share a regionalized response.

The data cache holds shorter-lived hot paths: top-10 viral lists, regional trending, category metadata. We rebuild these on cron rather than invalidating reactively, so as long as the job runs, no entry is older than one cron interval.
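A cron-driven rebuild of a file cache entry might look like the sketch below. It is written in Python rather than our PHP, the JSON format and the function name are assumptions, and the write-to-temp-then-rename step is a common pattern I am illustrating rather than a description of our exact code.

```python
import json
import os
import tempfile
from typing import Any, Callable


def rebuild_cache_file(path: str, compute: Callable[[], Any]) -> None:
    """Recompute a cached value and swap it into place atomically.

    Readers see either the old file or the new one, never a partial write.
    """
    data = compute()  # e.g. the SQL query behind a top-10 list
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)  # do not leave half-written temp files behind
        raise
```

Because `compute()` runs before anything touches the disk, a failing query leaves the previous cache file in place instead of an empty one.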

Invalidation is the hard part

Caches are easy. Invalidation is where systems die. We use three signals:

  • Time-based: every layer has a TTL. Always. No exceptions.
  • Event-based: when an admin edits metadata or a video is removed, we issue a purge to all tiers in parallel.
  • Capacity-based: regional nodes evict their least-recently-used entries once memory use crosses 80%.
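The capacity signal is the easiest of the three to sketch. The class below is a minimal byte-budgeted LRU; its name is hypothetical, and it interprets "memory pressure" as bytes stored against a configured budget, which is an assumption about how the threshold is measured.

```python
from collections import OrderedDict
from typing import Optional


class RegionalLRU:
    """Byte-budgeted LRU that evicts once usage passes a high-water mark."""

    def __init__(self, capacity_bytes: int, high_water: float = 0.80) -> None:
        self._limit = capacity_bytes * high_water
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def get(self, key: str) -> Optional[bytes]:
        body = self._items.get(key)
        if body is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return body

    def put(self, key: str, body: bytes) -> None:
        if key in self._items:
            self._used -= len(self._items.pop(key))
        self._items[key] = body
        self._used += len(body)
        # Evict least-recently-used entries until usage is back under the mark.
        while self._used > self._limit and len(self._items) > 1:
            _, evicted = self._items.popitem(last=False)
            self._used -= len(evicted)
```

The `len(self._items) > 1` guard keeps the entry that was just written, even if it alone exceeds the mark; whether an oversized object should be cached at all is a separate policy decision.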

The event purge is the one that gets tricky. We learned the hard way that hitting the edge purge API synchronously inside a request handler will eventually time out. Now we enqueue purges and process them in a small Python worker.

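import logging

import requests

log = logging.getLogger("purge")

def process_purge_queue():
    # queue is assumed to be a redis-py client; cloudflare and origin are our own purge wrappers.
    while True:
        item = queue.brpop("purge:queue", timeout=30)
        if not item:
            continue  # nothing queued within 30 seconds; poll again
        url = item[1].decode()  # brpop returns a (key, value) pair of bytes
        try:
            cloudflare.purge(url)
            for region in ("fra", "ams", "waw"):
                resp = requests.delete(
                    f"https://{region}.internal/cache",
                    json={"url": url},
                    timeout=5,
                )
                resp.raise_for_status()  # treat HTTP errors as failed purges
            origin.purge_local(url)
        except Exception:
            # A failed purge must not kill the worker. Retries are omitted in this listing.
            log.exception("purge failed for %s", url)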
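The producer side of that queue is a one-liner inside the request handler. This is a sketch assuming a redis-py style client; the `enqueue_purge` name is hypothetical.

```python
def enqueue_purge(queue, url: str) -> int:
    """Push a URL onto the purge queue and return immediately.

    LPUSH pairs with the worker's BRPOP, so purges are processed first in,
    first out. `queue` is any client exposing redis-py's lpush(name, *values).
    Returns the queue length after the push.
    """
    return queue.lpush("purge:queue", url)
```

The handler only pays for one local Redis write, so a slow or unreachable purge API can no longer time out a user-facing request.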

If a purge fails on one tier, we retry with exponential backoff. If it fails permanently, we log and rely on the TTL as a backstop. Defense in depth is the only thing that keeps invalidation honest.
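A per-tier retry wrapper along these lines is enough. The attempt count and base delay are illustrative assumptions, and `purge_with_backoff` is a hypothetical name; the sleep function is injectable so the logic can be tested without waiting.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("purge")


def purge_with_backoff(
    purge: Callable[[], None],
    tier: str,
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run one tier's purge, retrying with exponentially growing delays.

    Returns True on success. After the final failure it logs and returns
    False, leaving the tier's TTL to expire the stale entry.
    """
    for attempt in range(attempts):
        try:
            purge()
            return True
        except Exception as exc:
            if attempt == attempts - 1:
                log.error("purge on %s failed permanently: %s", tier, exc)
                return False
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, 4 s, ...
    return False
```

Wrapping each tier separately means a failing regional node does not block the edge or origin purge for the same URL.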

GDPR considerations

European compliance shaped the design more than I expected. Three rules ended up baked into every tier:

  • No personal data in any cache key or response body. Personalization is client-side.
  • Geolocation uses country-level granularity only (CF-IPCountry), never city or coordinate.
  • A user data deletion request fires a purge across all tiers for any URL containing their handle.
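The deletion path can be sketched as a lookup from handle to cached URLs, each of which then goes onto the purge queue. This assumes you keep a list or index of cached URLs, which is not shown in this post; the function name and the choice to match whole path segments rather than substrings are also assumptions.

```python
from typing import Iterable, List
from urllib.parse import unquote, urlsplit


def urls_for_handle(handle: str, cached_urls: Iterable[str]) -> List[str]:
    """Return cached URLs whose path has a segment equal to the handle.

    Matching whole path segments (case-insensitively) avoids purging
    /u/annabel when the deleted handle is "anna".
    """
    wanted = handle.lstrip("@").lower()
    if not wanted:
        return []
    matches = []
    for url in cached_urls:
        segments = unquote(urlsplit(url).path).lower().split("/")
        if any(seg.lstrip("@") == wanted for seg in segments):
            matches.append(url)
    return matches
```

If handles can also appear in query strings or response bodies of shared pages, segment matching alone is not sufficient and the index needs to record those URLs explicitly.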

If you are building anything similar for a European audience, work the deletion path first. Most teams build caches they cannot purge by user, and that is a compliance landmine.

Numbers that matter

After moving to the three-tier model, we measured the change at ViralVidVault over a month:

  • Edge hit ratio: 88% to 94%
  • Origin requests per second under viral load: 1,200 to 180
  • p95 latency for cold European visitors: 480ms to 140ms
  • Egress bandwidth cost: down 62%
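To put the first three figures on the same scale as the egress number, here is the arithmetic as relative reductions. Nothing below is new data; it only restates the before and after values from the list.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative drop from before to after, as a percentage."""
    return (before - after) / before * 100


origin_rps_drop = percent_reduction(1200, 180)          # origin requests per second
p95_drop = percent_reduction(480, 140)                  # p95 latency in ms
edge_miss_drop = percent_reduction(100 - 88, 100 - 94)  # edge miss rate: 12% to 6%

print(f"origin rps: -{origin_rps_drop:.0f}%")   # origin rps: -85%
print(f"p95 latency: -{p95_drop:.0f}%")         # p95 latency: -71%
print(f"edge misses: -{edge_miss_drop:.0f}%")   # edge misses: -50%
```

The edge hit ratio looks like a modest six-point gain, but it halves the number of requests that miss the edge, which is what the lower tiers actually feel.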

Closing notes

You do not need exotic infrastructure for a tiered video cache. We run the regional tier on three small VPS nodes. The hard parts are the promotion rules, the invalidation pipeline, and treating GDPR as a first-class constraint instead of a checkbox.

If I were starting over, I would build the invalidation worker before the cache tiers. Everything else is downstream of that.
