The "Distributed State Tax" and Why We Don't Always Need Redis: Introducing C3

#distributedsystems #database #microservices #architecture

One of the hardest lessons in software architecture is learning that every decoupling creates a new communication cost. We call this the "Distributed State Tax".

In a monolithic application, sharing data is essentially free. It’s just a memory lookup. But as soon as you split that application into microservices, you are "billed" for every piece of shared data. The currency isn't just latency; it's complexity. A simple question like "Is this user online?" or "Is this feature enabled?" transforms from a nanosecond variable check into a distributed systems problem involving networks, serialization, and failure modes.

For years, the industry has presented this as a binary choice. Whether you are a Junior Developer building your first service or a Principal Architect designing a platform, you have likely been forced to choose between two imperfect paths:

1. Remote Procedure Call ("Just Ask the Source"): You make an API call to the source of truth for every request.

The Trap: This is easy to implement but creates "Chatty Services." The network becomes a bottleneck, latency spikes, and if the source service goes down, every service that depends on it goes down too.

2. Caching Infrastructure ("Just Add Redis"): You deploy a dedicated cache cluster like Redis.

The Trap: This fixes performance but introduces "Infrastructure Bloat." You are now managing a complex distributed system just to read a simple value. If you treat Redis as a cache, you face the Dual-Write Problem (the database write succeeds, the cache write fails, and the two drift apart). If you treat Redis as a primary store (to avoid dual-writes), you must now manage persistence, backups, and high availability for a whole new database engine.

I believe there is a third way. We can stop treating caching as additional infrastructure and start treating it as a protocol rooted in the database we already use.

Enter C3 (Coherent Cluster Cache).

The Theoretical Vision: The Database IS the OS

The architecture of C3 is not just about saving money on Redis instances; it is rooted in the concept of DBOS (Database-Oriented Operating System). The thesis is simple: modern relational databases (like PostgreSQL) are powerful enough to handle low-level coordination concerns like messaging and inter-process communication.

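C3 also draws inspiration from Tuple Spaces and the Linda coordination language from the 1980s. In this model, processes don't message each other directly; they write "tuples" into a shared abstract space, and other processes read or take them from it. Linda's authors called this generative communication.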

In the C3 architecture, PostgreSQL acts as that persistent Tuple Space. By standardizing the schema and invalidation signals at the persistence layer, we can create a "Serverless Caching" architecture where the database acts as the coordination engine.

What is C3?

C3 is a Go library that implements the Database-Centric Cache Protocol (DCCP). It provides a two-tiered caching strategy that delivers the read performance of local memory with the consistency guarantees of a distributed cache.

The Tiered Architecture

L1 Cache (In-Memory): An LRU map inside your application's memory. This serves the "hot path" with nanosecond-level latency.
L2 Cache (PostgreSQL): The shared source of truth. If data isn't in L1, C3 fetches it from the cache_data table in Postgres.
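
The two-tier read path can be sketched roughly as follows. This is a minimal illustration, not C3's actual API: the names TieredCache, L2, and mapL2 are hypothetical, and the Postgres-backed tier is replaced by an in-memory map so the example runs on its own.

```go
package main

import (
	"container/list"
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is returned when a key is absent from both tiers.
var ErrNotFound = errors.New("key not found")

// L2 abstracts the shared tier. In C3 this is the cache_data table in
// PostgreSQL; the interface and its method are assumptions for this sketch.
type L2 interface {
	Get(key string) (string, bool)
}

type entry struct {
	key, value string
}

// TieredCache is a hypothetical read-through cache: a small LRU (L1)
// in front of a shared store (L2).
type TieredCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
	l2       L2
}

func NewTieredCache(capacity int, l2 L2) *TieredCache {
	return &TieredCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
		l2:       l2,
	}
}

// Get returns the value and the tier that served it ("L1" or "L2").
// For brevity the lock is held during the L2 lookup; a real
// implementation would likely release it around the network call.
func (c *TieredCache) Get(key string) (value, tier string, err error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(entry).value, "L1", nil
	}
	v, ok := c.l2.Get(key)
	if !ok {
		return "", "", ErrNotFound
	}
	c.items[key] = c.order.PushFront(entry{key, v})
	if c.order.Len() > c.capacity {
		// Evict the least recently used entry.
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
	return v, "L2", nil
}

// mapL2 is an in-memory stand-in for the Postgres table.
type mapL2 map[string]string

func (m mapL2) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

func main() {
	c := NewTieredCache(2, mapL2{"user:1": "alice"})
	for i := 0; i < 2; i++ {
		v, tier, _ := c.Get("user:1")
		fmt.Println(v, tier) // first "alice L2", then "alice L1"
	}
}
```

The first read falls through to L2 and populates L1; the second is served from local memory without touching the shared tier.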

But the real magic isn't in the storage; it's in the Coherency Engine.

How We Achieve Coherency

The hardest part of distributed caching is invalidation. C3 solves this by leveraging PostgreSQL's ACID guarantees and its native LISTEN/NOTIFY subsystem.

1. The Atomic Write Path

When you set a value in C3, you aren't just writing to a database; you are initiating a transaction.

Step 1: The library writes the key/value pair to the L2 table (INSERT ... ON CONFLICT DO UPDATE).
Step 2: Within the same transaction, it executes pg_notify('cache_invalidate', key).
Step 3: The transaction commits.

This guarantees atomicity: the data update and the invalidation signal commit together. PostgreSQL delivers a notification only when the transaction that issued it commits, so if the transaction fails, no signal is sent. This eliminates the "Cache Drift" common in dual-write systems, where an app writes to the DB and then fails to update Redis.
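
The three steps above might look like this in Go with database/sql. This is a sketch under assumptions, not the library's code: the cache_data table and cache_invalidate channel come from the description above, but the column names (key, value), the function names, and the recorder stub are invented for illustration. The stub lets the example run without a database.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// Column names (key, value) are assumptions made for this sketch.
const (
	upsertSQL = `INSERT INTO cache_data (key, value) VALUES ($1, $2)
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value`
	notifySQL = `SELECT pg_notify('cache_invalidate', $1)`
)

// execer is the subset of *sql.Tx this sketch needs.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// writeAndNotify runs steps 1 and 2 against an open transaction.
func writeAndNotify(ctx context.Context, tx execer, key string, value []byte) error {
	if _, err := tx.ExecContext(ctx, upsertSQL, key, value); err != nil {
		return fmt.Errorf("upsert %q: %w", key, err)
	}
	if _, err := tx.ExecContext(ctx, notifySQL, key); err != nil {
		return fmt.Errorf("notify %q: %w", key, err)
	}
	return nil
}

// Set wraps both statements in one transaction; step 3 is the commit.
// It needs a *sql.DB opened with a PostgreSQL driver of your choice.
func Set(ctx context.Context, db *sql.DB, key string, value []byte) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	// After a successful Commit this returns sql.ErrTxDone, which is ignored.
	defer tx.Rollback()
	if err := writeAndNotify(ctx, tx, key, value); err != nil {
		return err
	}
	return tx.Commit()
}

// recorder is a stand-in transaction that only records statements,
// so the sketch can run without a database.
type recorder struct{ queries []string }

func (r *recorder) ExecContext(_ context.Context, q string, _ ...any) (sql.Result, error) {
	r.queries = append(r.queries, q)
	return nil, nil
}

func main() {
	r := &recorder{}
	if err := writeAndNotify(context.Background(), r, "user:1", []byte("alice")); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(r.queries), "statements recorded") // 2 statements recorded
}
```

Sending only the key as the payload keeps the notification small, which matters because PostgreSQL limits NOTIFY payloads to just under 8000 bytes in its default configuration.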

2. Generative Communication

Every service instance running C3 maintains a background NotificationListener connected to the database. When Service A updates a user profile, the database broadcasts the cache_invalidate signal. Services B, C, and D receive this signal and perform a surgical invalidation of that specific key in their local L1 memory.
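
The receiving side can be sketched as a loop that drops each notified key from L1. The names L1 and applyInvalidations are hypothetical, and a plain Go channel stands in for the stream of cache_invalidate payloads; in a real service that stream would be fed by the LISTEN support of whichever PostgreSQL driver you use.

```go
package main

import (
	"fmt"
	"sync"
)

// L1 is a minimal thread-safe local cache (eviction omitted for brevity).
type L1 struct {
	mu    sync.RWMutex
	items map[string]string
}

func NewL1() *L1 {
	return &L1{items: make(map[string]string)}
}

func (c *L1) Put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = value
}

func (c *L1) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.items[key]
	return v, ok
}

// Invalidate removes a single key; unknown keys are ignored.
func (c *L1) Invalidate(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, key)
}

// applyInvalidations drops each notified key from L1 until the channel
// closes. The channel is a stand-in for notification payloads.
func applyInvalidations(l1 *L1, payloads <-chan string) {
	for key := range payloads {
		l1.Invalidate(key)
	}
}

func main() {
	l1 := NewL1()
	l1.Put("user:1", "alice")
	l1.Put("user:2", "bob")

	// Simulate another service updating user:1.
	payloads := make(chan string, 1)
	payloads <- "user:1"
	close(payloads)
	applyInvalidations(l1, payloads)

	_, ok1 := l1.Get("user:1")
	_, ok2 := l1.Get("user:2")
	fmt.Println(ok1, ok2) // false true
}
```

Only the notified key is removed; every other entry stays hot in local memory.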

3. Resilience and Fail-Safes

In IoT and Cloud systems, network partitions are a reality. What happens if the listener disconnects?

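C3 implements a strict fail-safe. PostgreSQL does not replay notifications that were sent while a listener was disconnected, so after a dropped connection a node cannot know which keys changed. If the connection to the database is lost, the library automatically purges the entire L1 cache. This ensures that a partitioned node "fails secure": once it reconnects, it serves higher-latency reads (L1 misses that fall through to L2) rather than stale, incorrect data.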
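
A rough sketch of that fail-safe is shown here. The names localCache, waitFunc, and listen are hypothetical, and waitFunc stands in for whatever blocking "wait for next notification" call your driver provides. The loop applies invalidations until the wait fails, then purges everything and hands the error back so the caller can reconnect.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errConnLost = errors.New("connection lost")

// localCache is a minimal L1 stand-in with a full purge operation.
type localCache struct {
	mu    sync.Mutex
	items map[string]string
}

func newLocalCache() *localCache {
	return &localCache{items: make(map[string]string)}
}

func (c *localCache) Put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = value
}

func (c *localCache) Invalidate(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, key)
}

// Purge drops every entry by swapping in a fresh map.
func (c *localCache) Purge() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items = make(map[string]string)
}

func (c *localCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.items)
}

// waitFunc blocks until the next invalidation payload arrives, or
// returns an error when the listener connection is lost (assumption).
type waitFunc func() (key string, err error)

// listen applies invalidations until wait fails. On failure it purges
// L1, because missed notifications cannot be recovered, and returns
// the error so the caller can reconnect and start over.
func listen(l1 *localCache, wait waitFunc) error {
	for {
		key, err := wait()
		if err != nil {
			l1.Purge()
			return err
		}
		l1.Invalidate(key)
	}
}

func main() {
	l1 := newLocalCache()
	l1.Put("a", "1")
	l1.Put("b", "2")

	// One normal invalidation, then a simulated disconnect.
	events := []string{"a"}
	wait := func() (string, error) {
		if len(events) == 0 {
			return "", errConnLost
		}
		key := events[0]
		events = events[1:]
		return key, nil
	}

	err := listen(l1, wait)
	fmt.Println(err, l1.Len()) // connection lost 0
}
```

Purging trades a burst of L2 reads after reconnect for the guarantee that no entry outlives an invalidation it never heard about.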

Operational Simplicity & Observability

As an architect, I value tools that are easy to run and even easier to maintain. Compared to a Service Mesh, which requires managing control planes and sidecars, C3 requires only a library import and a connection string.

However, we didn't skimp on visibility. C3 comes with built-in OpenTelemetry instrumentation. It emits telemetry for:

L1 vs. L2 Hit Rates: To tune your memory allocation.
Coherency Events: To track invalidation volume.
Tracing: To verify that L1 hits produce no network spans.
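
To make the hit-rate metric concrete, here is a small sketch of the arithmetic behind it. It deliberately uses plain atomic counters from the standard library rather than the OpenTelemetry SDK, and the Stats type and its methods are hypothetical, not part of C3.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Stats counts which tier served each read. It is a stdlib stand-in
// for real instrumentation, used only to illustrate the ratio.
type Stats struct {
	l1Hits atomic.Int64
	l2Hits atomic.Int64
}

// Record notes one read served by the given tier ("L1" or "L2").
func (s *Stats) Record(tier string) {
	switch tier {
	case "L1":
		s.l1Hits.Add(1)
	case "L2":
		s.l2Hits.Add(1)
	}
}

// L1HitRatio is the share of reads served from local memory.
// It returns 0 when no reads have been recorded.
func (s *Stats) L1HitRatio() float64 {
	l1, l2 := s.l1Hits.Load(), s.l2Hits.Load()
	if l1+l2 == 0 {
		return 0
	}
	return float64(l1) / float64(l1+l2)
}

func main() {
	s := &Stats{}
	for _, tier := range []string{"L1", "L1", "L1", "L2"} {
		s.Record(tier)
	}
	fmt.Printf("L1 hit ratio: %.2f\n", s.L1HitRatio()) // L1 hit ratio: 0.75
}
```

A ratio that stays low under steady traffic suggests the L1 capacity is too small for the working set, or that invalidation volume is high enough to keep evicting hot keys.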

When Should You Use C3?

C3 is not a replacement for Kafka or a high-throughput job queue. However, it excels where Redis is often overkill.

Use C3 if:

You want Durability by default: Unlike a typical Redis setup, C3 persists everything to Postgres. If your service restarts, the data is still there.
You want "Zero Ops": You don't want to manage, patch, or scale a separate Redis cluster.
Your workload is Read-Heavy: User profiles, feature flags, configuration, and session data.
You work in a Polyglot Environment: Any language that can talk to Postgres can participate in the protocol.

Conclusion

We often over-engineer our systems because we fear inconsistency. We deploy massive infrastructure to solve problems that could be handled by the database we already have.

C3 supports the hypothesis that for read-heavy, transient state, the database itself is the most efficient coordinator. It allows us to build "Smart Clients" that maximize local resources while maintaining a coherent view of the world.

The Database-Centric Cache Protocol (DCCP) is fundamentally language-agnostic, and we believe this pattern is too valuable to stay within the boundaries of a single language ecosystem. We are actively looking for contributors to help replicate the C3 library in Python (c3-py), TypeScript/Node.js, Rust, and Java.

It’s time to pay less tax on our distributed state. Check out the repository and the protocol spec, and let's build a coherent future together.

For those interested in the code, the C3 library is open-sourced and available for Go applications, complete with E2E tests validating the coherency protocols.

C3 repo on GitHub