Why Kairon runs a separate gRPC authorization service

#node #typescript #architecture #webdev

When you're building a multi-tenant platform where users run autonomous trading agents, "just check a middleware flag" isn't a safety model. It's a hope.

This is how we ended up with Guardian -- a standalone Node.js gRPC server on :50052 that every agent execution gates through before a single order can fire.

The problem with inline auth checks

Our initial instinct was the usual: tRPC middleware, a capability check on the procedure, done. It works fine for UI-driven actions where a bad outcome is a 403 and a sad user. It does not work when the "action" is an autonomous agent executing a trading strategy with real capital.

The failure modes are different. A misconfigured middleware might pass a stale session. A quota check might race against a concurrent execution. An unhandled exception might default-allow instead of default-deny. In a UI context those are bugs. In an agent runtime they're incidents.

We needed authorization to be:

Explicit -- every execution path calls it, no exceptions
Fail-closed -- if the auth service is unreachable, the run is rejected
Auditable -- every decision is a record, not a log line
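
The fail-closed requirement can be sketched as a thin client-side wrapper. This is a minimal illustration, not Guardian's actual client: the Decision shape, the Authorize function type, and authorizeFailClosed are hypothetical names, and the transport is assumed to enforce its own deadline so that a hung call eventually throws.

```typescript
// Hypothetical shapes; the real Guardian messages are not shown in this post.
type Decision =
  | { allowed: true }
  | { allowed: false; reason: string };

// Stand-in for whatever performs the actual gRPC call (assumed to reject
// its promise on connection errors and deadline expiry).
type Authorize = (agentId: string) => Promise<Decision>;

// Wraps the transport call so that any error becomes an explicit denial.
async function authorizeFailClosed(
  authorize: Authorize,
  agentId: string,
): Promise<Decision> {
  try {
    const decision = await authorize(agentId);
    // Only an explicit allow passes; a denial keeps the service's reason.
    if (decision.allowed === true) return decision;
    return { allowed: false, reason: decision.reason };
  } catch {
    // Unreachable, timed out, or threw: reject rather than pass through.
    return { allowed: false, reason: "guardian_unavailable" };
  }
}
```

The point of the wrapper is that the catch branch returns a denial. There is no code path in which an exception results in the agent running.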

What Guardian does

Guardian exposes a proto3 service with three RPCs:

service GuardianService {
  rpc CheckCapability(CapabilityRequest) returns (CapabilityResponse);
  rpc CheckQuota(QuotaRequest) returns (QuotaResponse);
  rpc AuthorizeAgentRun(AgentRunRequest) returns (AgentRunResponse);
}

AuthorizeAgentRun is the gate. It calls CheckCapability, then CheckQuota, then writes an execution record. If any step fails, the run is rejected. If Guardian itself is unreachable, the run is rejected with reason guardian_unavailable. No silent pass-through.
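
The ordering of that gate can be sketched roughly as follows. Everything here is an assumption for illustration: the RunRequest fields, the reason strings capability_denied and quota_exceeded, and the GateDeps interface are invented, and the sketch assumes denials are recorded as well as approvals, in line with the "every decision is a record" goal.

```typescript
// All names below are illustrative assumptions, not Guardian's real code.
type Reason = "capability_denied" | "quota_exceeded";

interface RunRequest {
  orgId: string;
  agentId: string;
  capability: string;
}

interface ExecutionRecord extends RunRequest {
  allowed: boolean;
  reason: Reason | null;
}

// The two checks and the record writer are injected so the ordering
// logic can be read (and tested) on its own.
interface GateDeps {
  hasCapability(req: RunRequest): boolean;
  withinQuota(req: RunRequest): boolean;
  record(entry: ExecutionRecord): void;
}

function authorizeAgentRun(req: RunRequest, deps: GateDeps): ExecutionRecord {
  let reason: Reason | null = null;
  if (!deps.hasCapability(req)) {
    reason = "capability_denied"; // quota is never consulted in this case
  } else if (!deps.withinQuota(req)) {
    reason = "quota_exceeded";
  }
  const entry: ExecutionRecord = { ...req, allowed: reason === null, reason };
  // Written for every decision. If this throws, the error propagates and
  // a fail-closed caller treats the run as rejected.
  deps.record(entry);
  return entry;
}
```

Checking capability first means a caller without permission never touches quota state, and writing the record last means a decision is only returned once it has been recorded.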

Why a separate process

Two reasons: practical and principled.

Practical: Guardian enforces hard rate limits in its own process, so quota enforcement is isolated from API server memory pressure.

Principled: a separate service audits independently. Our kairon_org_audit_log table has exactly one writer with one responsibility.

The tradeoff

Every agent execution pays for a gRPC round-trip. That cost is deliberate. Trading agent authorization isn't latency-sensitive -- if your strategy breaks because auth took 2ms, the strategy has bigger problems.

What we gained is a single place where "should this agent run?" is answered and recorded, with an immutable sequence of authorization decisions to replay when something goes wrong.
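
To make "an immutable sequence to replay" concrete, here is a small in-memory sketch. It is only a stand-in for the idea: the real store is the kairon_org_audit_log table, whose schema is not shown here, and the AuditEntry fields and AppendOnlyLog class are hypothetical.

```typescript
// Hypothetical entry shape; the real table's columns are not shown in this post.
interface AuditEntry {
  readonly seq: number;
  readonly agentId: string;
  readonly allowed: boolean;
  readonly reason: string | null;
}

// In-memory stand-in for an append-only audit table with a single writer.
class AppendOnlyLog {
  private readonly entries: AuditEntry[] = [];

  // Assigns the next sequence number and freezes the entry so it cannot
  // be mutated after the fact.
  append(decision: Omit<AuditEntry, "seq">): AuditEntry {
    const entry: AuditEntry = Object.freeze({
      seq: this.entries.length + 1,
      ...decision,
    });
    this.entries.push(entry);
    return entry;
  }

  // Returns one agent's decisions in the order they were made.
  replay(agentId: string): readonly AuditEntry[] {
    return this.entries.filter((e) => e.agentId === agentId);
  }
}
```

Because there is no update or delete method, the only way to change history is to append a new decision, which is the property that makes replay trustworthy during an incident review.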

Building this at kairon.trade. Source: github.com/greymoth-jp.