Zied Hamdi
Rethinking ReBAC: From Accidental Discovery to Zero-Latency Reads

I needed to solve a common hierarchical permission problem on my project... and unknowingly arrived at ideas that extend Google's Zanzibar paper.


I was building a mechanism for "setting-spreading". If a parent company upgraded to a premium tier (enabled a specific feature module), or added a user as a manager of some data, that setting needed to automatically cascade down to specific subsidiaries, team members, and/or related entities based on dynamic rules.

Starting from a single case, the idea was seductive enough that I extended it to more complex scenarios before deciding to isolate it as a standalone solution. Once there, I did my due diligence and researched whether the problem I was tackling had already been solved and open sourced. Surprisingly, no widespread open-source solution existed, but the problem already had a name: ReBAC (Relationship-Based Access Control), formalized by Google's Zanzibar paper.

But as I explored Zanzibar and its modern SaaS implementations (like SpiceDB or Auth0 FGA), I realized a critical bottleneck: Latency.

Here's what I'll cover: the latency problem with standard ReBAC, how AOT materialization solves it, the three abstraction layers that make this engine reusable, and the governance benefits that emerged unexpectedly.

The Zanzibar Latency Paradigm

Google Zanzibar is a phenomenally robust distributed system designed to compute permissions in real-time. It operates as a centralized authorization engine. (Analogy: OIDC scopes and claims, but resolved live across interrelated services)

The standard approach for modern ReBAC environments is to operate as centralized microservices relying on Just-In-Time (JIT) graph traversal. When a user requests access to a resource, the engine calculates the permissions over the network.

This is elegant on paper and adopts a nice microservice SaaS architecture, but in the real world, it becomes a bottleneck:

  1. Network Hops: Every isAuthorized check requires a remote network call.
  2. Read-Time Overhead: As relationship depth increases, computing the graph at read-time causes latency spikes.

If your platform needs sub-millisecond read responses, taking a network hit on every single API check quickly becomes a problem that is hard to solve without abandoning the architecture.

The Pivot: Spread at Write, Optimize Read

My approach was iterative—I started denormalizing from the first version. The result is a highly flexible Ahead-Of-Time (AOT) Materialized Engine.

If a user's rights are inherited through their relationships in the hierarchy, we materialize those rights only once: the moment the relationship changes, not each time the user requests data.

How it Works

Our Materialized ReBAC engine acts as a background hydration pipeline:

  1. The Graph Traverser: When a trigger occurs (e.g., adding an Owner to a Holding Company, or activating a paid module), the engine consults declarative SpreadingConfig rules and uses a GraphNavigator to traverse the Holding-Subsidiaries database relationships. The navigator is a unified interface implemented in concrete project-specific code, which makes it easier to test and debug.
  2. Hydration Jobs: It traverses downstream (or upstream) and injects the resulting permissions or modules directly into the target entities' database records (granting the user inline permissions or activating the paid feature locally).
  3. Constant-Time Reads: Because the permissions are materialized on the target node, authorization at the API level requires zero mathematical computation or graph traversal. It is a simple object check: entity.modules.includes('ANALYTICS') or user.roles.includes('OWNER').

The read latency dropped from a network-dependent dozen milliseconds per call to zero algorithmic overhead: authorization resolves to a database read that is usually executed alongside the other data queried in the same network request.
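To make the write/read split concrete, here is a minimal sketch. The names (`hydrate`, `isAuthorized`) and the in-memory `db` standing in for the real database are my assumptions, not the engine's actual API:

```typescript
// Hypothetical in-memory stand-in for the real database.
type Entity = { id: string; modules: string[]; children: string[] };

const db = new Map<string, Entity>([
  ["holding", { id: "holding", modules: [], children: ["sub-a", "sub-b"] }],
  ["sub-a", { id: "sub-a", modules: [], children: [] }],
  ["sub-b", { id: "sub-b", modules: [], children: [] }],
]);

// Write path: runs once, when the setting or relationship changes.
function hydrate(rootId: string, module: string): void {
  const queue = [rootId];
  while (queue.length > 0) {
    const entity = db.get(queue.shift()!)!;
    if (!entity.modules.includes(module)) entity.modules.push(module);
    queue.push(...entity.children); // cascade downstream
  }
}

// Read path: no traversal, just the inline membership check.
function isAuthorized(entityId: string, module: string): boolean {
  return db.get(entityId)?.modules.includes(module) ?? false;
}

hydrate("holding", "ANALYTICS");
```

Upgrading the holding once materializes `ANALYTICS` onto both subsidiaries, so every later `isAuthorized` call is a local lookup.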

Three Abstraction Axes for Project-Agnostic Design

1. Graph Navigation Abstraction

The engine never hardcodes "how to find neighbors." It consumes an injected GraphNavigator interface that yields adjacent nodes. The engine knows that traversal happens, not how—whether across PostgreSQL, GraphQL, or REST endpoints.
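A plausible shape for that contract, sketched in TypeScript. The `GraphNavigator` name comes from the article; the concrete `MapNavigator` class is a hypothetical example:

```typescript
// The only thing the engine ever sees: something that yields neighbors.
interface GraphNavigator<Node> {
  neighbors(node: Node): Node[];
}

// Project-specific implementation backed by an in-memory edge map; a real one
// might run a SQL join, a GraphQL query, or a REST call instead.
class MapNavigator implements GraphNavigator<string> {
  constructor(private readonly edges: Map<string, string[]>) {}
  neighbors(node: string): string[] {
    return this.edges.get(node) ?? [];
  }
}

const navigator: GraphNavigator<string> = new MapNavigator(
  new Map([["holding", ["sub-a", "sub-b"]]])
);
```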

2. Policy Decision Abstraction

The engine delegates all condition evaluations to registered ConditionHandler plugins. Instead of parsing rigid rule strings, it passes typed entities to project-specific code that answers "yes/no" based on business logic (subscription tiers, payment status, etc.).
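For illustration, a sketch of such a plugin registry. The handler name and context shape are assumptions, not the engine's real types:

```typescript
// A plugin answers yes/no for a typed context; the engine never parses rules.
interface ConditionHandler<Ctx> {
  readonly name: string;
  evaluate(context: Ctx): boolean;
}

type SpreadContext = { subscriptionTier: "free" | "premium" };

const premiumOnly: ConditionHandler<SpreadContext> = {
  name: "premium-only",
  evaluate: (ctx) => ctx.subscriptionTier === "premium",
};

// The engine resolves handlers by name from its registry.
const handlers = new Map<string, ConditionHandler<SpreadContext>>([
  [premiumOnly.name, premiumOnly],
]);

const maySpread = handlers.get("premium-only")!.evaluate({ subscriptionTier: "premium" });
```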

3. Action Execution Abstraction

The engine uses the Command Pattern for state mutation. It receives an Action payload and blindly calls .execute(). The engine doesn't know if it's granting a role, activating a module, or firing a webhook—it just runs the command.
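Sketched as an interface, with illustrative names; the `undo` counterpart is what makes the FILO rollback described below possible:

```typescript
// The engine receives these and blindly calls execute(); undo() enables rollback.
interface Action {
  execute(): void;
  undo(): void;
}

class GrantRole implements Action {
  constructor(private readonly roles: string[], private readonly role: string) {}
  execute(): void {
    if (!this.roles.includes(this.role)) this.roles.push(this.role);
  }
  undo(): void {
    const i = this.roles.indexOf(this.role);
    if (i >= 0) this.roles.splice(i, 1);
  }
}

const userRoles: string[] = [];
const grant: Action = new GrantRole(userRoles, "OWNER");
grant.execute(); // the engine has no idea this grants a role
```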


Testing Advantage: Layered Confidence

Unit tests on the engine itself verify the mechanics: cycle detection, depth limits, queue ordering, FILO rollback logic, and correlation ID tracking. These run in milliseconds with zero infrastructure—pure Go/Rust/TypeScript logic.

Project-specific plugin tests verify the integration: that your CompanyGraphNavigator correctly queries subsidiaries, that your SubscriptionConditionHandler properly evaluates tier levels, and that your RoleMaterializer writes to the correct database tables.

Beyond reusability, the critical advantage of this design is isolation. Because the engine depends only on interfaces, you can test:

  • Engine behavior with mock plugins (fast, deterministic, exhaustive)
  • Plugin behavior against real databases using project-specific classes and methods. This removes abstraction layers and eases code reading—no unnecessary mental indirection.
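As an illustration of the first point, a unit test against mock plugins needs nothing but plain objects. The spreading loop below is a drastically simplified stand-in for the real engine:

```typescript
// In-memory fakes for the three abstractions.
const mockNavigator = {
  neighbors: (n: string): string[] => (n === "root" ? ["a", "b"] : []),
};
const mockCondition = { evaluate: (_n: string): boolean => true };
const applied: string[] = [];
const mockAction = { execute: (n: string): void => { applied.push(n); } };

// Toy spreading loop standing in for the engine under test.
function spread(root: string): void {
  const queue = [root];
  while (queue.length > 0) {
    const node = queue.shift()!;
    if (!mockCondition.evaluate(node)) continue;
    mockAction.execute(node);
    queue.push(...mockNavigator.neighbors(node));
  }
}

spread("root"); // deterministic, millisecond-fast, zero infrastructure
```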

Beyond Speed: Some Nice Side Effects

Beyond speed, a major side effect showed up: governance.

Because we materialize permissions through an event-driven queue, we can safely emulate (dry-run) outcomes before applying them. This allows for governance patterns that are traditionally difficult to achieve in JIT engines:

  • Terraform-Style Dry Runs & Pending Changes: By treating permissions as a queue of upcoming events, we can simulate massive cascades and stage "Pending Changes" before they are committed (navigating the tree again to activate nodes on payment success).
  • The "Human in the Middle": An administrator can review the simulated propagation tree and explicitly lock specific entities or deactivate spreading on specific modules before executing the materialization. It turns authorization into a clear Permission Staging pipeline that can be submitted for review before getting materialized.
  • Atomic Correlation & Rollbacks: Every wave of spreading is tied to a correlationId. If an upstream change is undone (or a human makes a mistake), the engine can perfectly roll back the entire materialized state across the tree, repeatedly, in a FILO (first in, last out) manner.
  • Decision Transparency: Every time the engine refuses to spread a module down a valid path (due to a failing conditional check), it records the detailed structural reasoning in a unified AuditLog, making the reasons for refusal easy to understand.
  • Cycle & Depth Safety: To prevent infinite loops, the queue maintains a strict pathHistory for cycle detection and enforces per-operation-type configuration-specific depth limits to stop cascade exhaustion.
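A sketch of both guards under assumed names (`pathHistory` comes from the article; the graph and function are illustrative):

```typescript
// A small graph containing a deliberate cycle: a -> b -> a.
const edges = new Map<string, string[]>([
  ["a", ["b"]],
  ["b", ["c", "a"]],
  ["c", []],
]);

function traverse(node: string, pathHistory: string[], maxDepth: number, out: string[]): void {
  if (pathHistory.includes(node)) return;     // cycle detection
  if (pathHistory.length >= maxDepth) return; // per-operation depth limit
  out.push(node);
  for (const next of edges.get(node) ?? []) {
    traverse(next, [...pathHistory, node], maxDepth, out);
  }
}

const reached: string[] = [];
traverse("a", [], 10, reached); // terminates despite the a -> b -> a cycle
```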

Solving for these edge cases effectively blurred the line between a basic Policy Engine and a full-fledged Identity Governance Administration (IGA) system.

Conclusion

Zanzibar made a compelling case that static roles are not enough and that relationships are the future of access control.

But adopting ReBAC doesn't mean you have to surrender your application to the latency of external API computations. By embracing AOT Materialization, it is possible to build a generic, extensible engine that guarantees the dynamic power of relationship-based access with the uncompromising speed of local read queries.
