DEV Community

Alberto Daniel Badia
Semantic Invalidation That Doesn't Suck

If you've worked on a web app for any length of time, you know the deal with caching. You add a cache, everything's fast, and then someone updates something and users see old data. Prices, inventory, whatever. TTL helps but you're always trading freshness for load.

The typical fix is manual invalidation. Update a product, invalidate the cache key. Fine for one endpoint. Less fine when that product has reviews, and reviews have comments, and the product belongs to a store, and the store belongs to an organization. Now you're tracking relationships and invalidating keys everywhere. It gets messy.

I built ZooCache to handle this differently. It's a Python caching library with a Rust core built around semantic invalidation: you invalidate based on what changed, not just on how much time has passed.

How It Works

You register dependencies when you cache something, even if you don't know them upfront:

from zoocache import cacheable, invalidate, add_deps, configure

configure()

@cacheable()
def get_product(pid):
    add_deps([f"product:{pid}"])
    return db.get_product(pid)

@cacheable()
def get_reviews(pid):
    add_deps([f"product:{pid}:reviews"])
    return db.get_reviews(pid)

@cacheable()
def get_store_products(sid):
    add_deps([f"store:{sid}:products"])
    return db.get_store_products(sid)

@cacheable()
def get_org_stores(oid):
    add_deps([f"org:{oid}:stores"])
    return db.get_org_stores(oid)

These tags form a hierarchy: a tag like org:1:stores:2:products:42 is a path in a PrefixTrie, one node per colon-separated segment.

When you update something, invalidate the relevant tag:

def update_product(pid, data):
    db.update_product(pid, data)
    invalidate(f"product:{pid}")

def update_store(sid, data):
    db.update_store(sid, data)
    invalidate(f"store:{sid}")

def update_org(oid, data):
    db.update_org(oid, data)
    invalidate(f"org:{oid}")

Invalidating org:1 clears everything below it. Product, reviews, store products, all gone. You don't have to remember which functions cached what.

The invalidation itself is O(D), where D is the tag depth; it doesn't matter how many items are cached.

Distributed Systems

If you're running multiple instances, ZooCache uses Hybrid Logical Clocks (HLC) for consistency. Each invalidation gets a timestamp that accounts for clock drift. If invalidation B happens after invalidation A, B's timestamp is guaranteed to be higher, even if the clocks are wrong.
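For context, a Hybrid Logical Clock is only a few lines of logic. This is a generic Python sketch of the idea; ZooCache's version lives in its Rust core, so none of these names are its API:

```python
# Minimal Hybrid Logical Clock sketch: wall-clock milliseconds plus a
# logical counter that breaks ties and survives clock stalls or skew.
import time

class HLC:
    def __init__(self):
        self.wall = 0       # highest physical time seen so far (ms)
        self.logical = 0    # tie-breaking counter

    def now(self):
        pt = int(time.time() * 1000)
        if pt > self.wall:
            self.wall, self.logical = pt, 0
        else:
            self.logical += 1   # clock stalled or went backwards: count up
        return (self.wall, self.logical)

    def observe(self, remote):
        # Merge a timestamp received from another node.
        pt = int(time.time() * 1000)
        m = max(pt, self.wall, remote[0])
        if m == self.wall == remote[0]:
            self.logical = max(self.logical, remote[1]) + 1
        elif m == self.wall:
            self.logical += 1
        elif m == remote[0]:
            self.logical = remote[1] + 1
        else:
            self.logical = 0
        self.wall = m
        return (self.wall, self.logical)

clock = HLC()
a = clock.now()
b = clock.now()
assert b > a   # later events always compare higher, even if wall time froze
```

Timestamps compare as plain tuples, which is what gives the "B after A implies B's timestamp is higher" guarantee even when physical clocks disagree.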

There's also passive resync. Every cached entry stores version info, and when a node reads data from another node, it compares those versions. If the other node's versions are newer, it catches up automatically. Reads keep things consistent without extra coordination.
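A hedged sketch of what read-driven resync can look like. The Node class and version layout below are assumptions for illustration, not ZooCache's wire format:

```python
# Sketch: passive resync. Every read compares the entry's version against
# the highest version this node has seen for that tag, so nodes converge
# without any dedicated synchronization protocol.

class Node:
    def __init__(self):
        self.tag_versions = {}   # tag -> highest invalidation version seen

    def read(self, tag, entry):
        """entry is (value, version), as stored by whichever node cached it."""
        value, version = entry
        known = self.tag_versions.get(tag, 0)
        if version < known:
            return None                       # stale: treat as a cache miss
        if version > known:
            self.tag_versions[tag] = version  # the peer knew more: catch up
        return value

node = Node()
node.tag_versions["product:42"] = 3
assert node.read("product:42", ("old", 2)) is None    # stale entry rejected
assert node.read("product:42", ("new", 5)) == "new"   # newer entry accepted
assert node.tag_versions["product:42"] == 5           # and the node caught up
```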

The Thundering Herd Thing

When a cache entry disappears and 100 requests hit your database at once, that's a problem. ZooCache handles this with a SingleFlight pattern. The first request does the work, the other 99 wait, then everyone gets the same result. Database sees one query instead of 100.
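The SingleFlight pattern itself is easy to sketch in plain Python with threads. This illustrates the pattern, not ZooCache's internals, and error handling is omitted:

```python
# Sketch: single-flight. Concurrent callers for the same key share one
# execution; the first caller does the work, the rest wait for its result.
import threading
import time

class SingleFlight:
    def __init__(self):
        self._lock = threading.Lock()
        self._calls = {}   # key -> (Event to wait on, dict holding the result)

    def do(self, key, fn):
        with self._lock:
            if key in self._calls:            # a flight is already in progress
                event, holder, leader = *self._calls[key], False
            else:
                event, holder, leader = threading.Event(), {}, True
                self._calls[key] = (event, holder)
        if leader:
            try:
                holder["result"] = fn()       # only the first caller does the work
            finally:
                with self._lock:
                    del self._calls[key]
                event.set()                   # wake everyone waiting on this key
        else:
            event.wait()
        return holder["result"]

calls = []
def slow_query():
    calls.append(1)      # count how many times the "database" is hit
    time.sleep(0.2)      # keep the flight open so the other callers pile up
    return "rows"

sf = SingleFlight()
results = []
threads = [threading.Thread(target=lambda: results.append(sf.do("product:42", slow_query)))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# all 10 callers got "rows"; slow_query ran once
```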

Other Stuff

  • Storage: in-memory, LMDB, or Redis
  • Distributed invalidation bus: Redis-based, storage-agnostic
  • Integrations: FastAPI, Django, Litestar
  • Serialization: MsgPack with LZ4 compression
  • Observability: logs, Prometheus, OpenTelemetry
  • CLI for monitoring if you want it

(Screenshot: the TUI monitoring tool)

Performance

Rust core, Python bindings. Benchmarks are on the docs site if you want numbers.

Try It

uv add zoocache

GitHub | Docs

If you try it and have thoughts, let me know. Issues, suggestions, whatever. Always happy to hear how things could work better.
