How Octorato does per-client FinOps: attribution + hard budget caps

dataqbs — Mon, 01 Jun 2026 16:44:18 +0000

Octorato runs one operator's "brain" across many sealed client arms. The moment you do that, one question decides whether you have a business or a money pit: which client's actions burned which tokens, and can you stop one client from running up a $4,000 bill?

This is the part people bolt on after the first surprise invoice. In Octorato it's native, because the architecture forces it. Here's exactly how it works — and, just as importantly, where it's an estimate vs. a hard guarantee.

Attribution: per arm, not per request

Every client is a sealed arm — its own repo/workspace. Cost is aggregated from local session logs at list price, keyed by repo path. So the unit of attribution is the arm (the client), rolled up across its sessions.

Being honest about the granularity: this is an estimate, not a billing-grade per-request meter. There's a small unattributed remainder, and it's list-price math, not your negotiated rate. The project says so out loud rather than implying precision it doesn't have — you can't price what you can't measure, but you also shouldn't pretend to measure what you estimate.
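To make the rollup concrete, here is a minimal sketch of per-arm estimation. It is not Octorato's actual code: the record fields (cwd, model, input_tokens, output_tokens), the price table, and the function name are all assumptions for illustration. It shows the two properties described above: cost is computed at list price, and anything that doesn't fall under a known arm's repo path lands in an explicit unattributed bucket.

```python
from collections import defaultdict

# Hypothetical list prices in USD per million tokens (illustrative values only).
LIST_PRICE_PER_MTOK = {
    "example-model": {"input": 3.00, "output": 15.00},
}


def estimate_cost_by_arm(records, arm_roots):
    """Estimate USD cost per arm from session-log records.

    records:   iterable of dicts with cwd, model, input_tokens, output_tokens
               (an assumed log shape, not a documented format)
    arm_roots: mapping of arm name -> repo path that identifies the arm
    """
    totals = defaultdict(float)
    for rec in records:
        # An unknown model raises KeyError rather than guessing a price.
        price = LIST_PRICE_PER_MTOK[rec["model"]]
        cost = (
            rec["input_tokens"] * price["input"]
            + rec["output_tokens"] * price["output"]
        ) / 1_000_000

        # Attribute by repo path: the session must sit at or under the arm root.
        arm = "unattributed"
        for name, root in arm_roots.items():
            root = root.rstrip("/")
            if rec["cwd"] == root or rec["cwd"].startswith(root + "/"):
                arm = name
                break
        totals[arm] += cost
    return dict(totals)
```

The explicit "unattributed" key is the design point: the remainder is reported rather than silently spread across clients.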

Enforcement: a hook that refuses the tool

Tracking that client X spent $40 is step one. Stopping X from spending $4,000 is the step that actually protects you — and it's the part most frameworks skip.

Octorato's budget gate is real code:

budget-check.py exits non-zero when an arm's grace-adjusted cap is burned through.
A PreToolUse hook refuses the expensive tool (sub-agent spawn, browser automation, etc.) before it runs.
Three tiers: alert → warn → hard_stop.

The honest caveat: the gate arms itself only once you set a per-arm cap in budgets.yaml. The mechanism is real; enforcement is opt-in.
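The tier logic can be sketched as follows. This is an illustration under stated assumptions, not the contents of budget-check.py: the 50% and 80% thresholds, the reading of "grace" as a fractional allowance on top of the cap, and the function names are all hypothetical. What it does take from the description above is the three tiers, the opt-in behavior when no cap is set, and a non-zero exit status only at hard_stop.

```python
# Hypothetical thresholds, expressed as fractions of the configured cap.
ALERT_AT = 0.50
WARN_AT = 0.80


def budget_tier(spent_usd, cap_usd, grace=0.0):
    """Classify one arm's spend as 'ok', 'alert', 'warn', or 'hard_stop'.

    cap_usd=None models an arm with no cap configured: the gate is not armed.
    grace is assumed to be a fractional allowance on top of the cap.
    """
    if cap_usd is None:
        return "ok"
    grace_adjusted_cap = cap_usd * (1.0 + grace)
    if spent_usd >= grace_adjusted_cap:
        return "hard_stop"
    if spent_usd >= cap_usd * WARN_AT:
        return "warn"
    if spent_usd >= cap_usd * ALERT_AT:
        return "alert"
    return "ok"


def exit_status(tier):
    """Only hard_stop maps to a non-zero status; alert and warn just notify."""
    return 1 if tier == "hard_stop" else 0
```

For example, with a $100 cap and 10% grace, $105 of estimated spend is still a warn, while $120 is a hard_stop. Because the input is the list-price estimate, the cap bounds estimated spend, not the invoiced amount.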

Why isolation gives you FinOps for free

The trick isn't a billing module bolted on top — it's the cell wall. The same wall that isolates a client is the wall that meters them. Because an arm never sees another arm, its session logs are already a clean per-client ledger. The arm is the ledger.

That's also the wager that puts you on the right side of Gartner's prediction that over 40% of agentic AI projects will be cancelled by the end of 2027, with escalating cost among the reasons it cites.

Try it

White paper: https://www.dataqbs.com/octorato
Source (MIT): https://github.com/CarlosCaPe/octorato

One brain, sealed arms, one ledger per client — because the arm is the ledger. 🐙

Octorato: an open-source AI agent OS with built-in per-client FinOps

dataqbs — Sun, 31 May 2026 00:42:23 +0000

Most agent frameworks assume one agent, one app, one bill. The moment you run agents for many clients, two problems appear that no runtime solves for you: you can't prove which client burned which tokens, and nothing stops one client's workspace from leaking into another's. I built Octorato to fix exactly that.

What Octorato is

Octorato is an open-source AI agent operating system: one file-native "brain" — rules, 190+ skills, 180+ specialist agents, all plain markdown under git — that a single operator runs across many sealed client "arms," with per-client token attribution and opt-in budget caps.

It's not a runtime you import. It's the agent's self as files you can read, diff, fork, and own — runtime-agnostic (it runs on Claude Code today).

The octopus model

One brain, many arms. The brain holds the shared self: rules (the constitution), skills (HOW to do things), agents (WHO does them). Each arm is a sealed deployment serving exactly one client. Knowledge flows down (generic skills cascade to every arm) and lessons flow up (anonymized patterns get distilled back into the brain). Like a real octopus, most of the neurons live in the arms, not the head.

Why "file-native" matters

Your agent's identity, skills, and memory normally live trapped inside vendor code and a cloud console — you can't read the whole self, diff a change, or move it. Octorato keeps all of it as plain markdown under version control. Identity becomes diffable, reviewable, portable, and ownable. Text outlives runtimes.

The part nobody else does: FinOps and isolation are the same wall

Because each arm is a sealed cell that no other arm can see, every token an arm spends is attributable to exactly one client by construction. Cellular isolation is per-client FinOps — the wall that seals a client is the wall that meters it. Concretely: per-arm USD rollup (estimated from local session logs at list price), cost-spike alerts, and an opt-in PreToolUse budget gate — wire the hook and set a client's cap in budgets.yaml, and it refuses the tool call (exits non-zero) once the cap is hit.
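As a rough sketch of what the gate's decision could look like, assume the hook receives a JSON event with a tool_name field and signals refusal through its exit status. The set of expensive tool names, the blocking status value, and the function names below are assumptions, not Octorato's code; check your runtime's hook documentation for the exact event shape and exit-code contract.

```python
import json

# Hypothetical names for tools considered expensive (e.g. sub-agent spawn).
EXPENSIVE_TOOLS = {"Task"}

# Assumed blocking status; the source only says the gate exits non-zero.
BLOCK_STATUS = 2


def gate(event, arm_over_cap):
    """Decide whether to refuse a tool call. Returns (exit_status, message)."""
    tool = event.get("tool_name", "")
    if arm_over_cap and tool in EXPENSIVE_TOOLS:
        return BLOCK_STATUS, f"budget cap reached: refusing {tool}"
    # Cheap tools, and any tool while under the cap, pass through untouched.
    return 0, ""


def run_hook(stdin_text, arm_over_cap):
    """Parse the raw hook payload and apply the gate."""
    return gate(json.loads(stdin_text), arm_over_cap)
```

A thin wrapper script would read the payload from stdin, call run_hook with the result of the budget check for the current arm, print any message to stderr, and exit with the returned status.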

Gartner predicts that over 40% of agentic AI projects will be cancelled by the end of 2027, with escalating cost among the reasons it cites. The boring discipline (attribute every token, cap every client) is what keeps you on the right side of that statistic.

How it compares (honestly)

CrewAI, LangGraph, and AutoGen are excellent Python agent-runtime frameworks: you define agents and graphs in code and they execute in-process. They have far larger communities. Octorato lives at a different layer — the self as files — and its defensible difference is multi-tenant arm isolation plus built-in FinOps, which runtime frameworks don't target. If you're building one app, use a runtime framework. If you're an operator or agency serving many clients from one brain, that's the gap Octorato fills.

Try it

It's MIT-licensed and public: https://github.com/CarlosCaPe/octorato

Read the white paper for the full model, or the FAQ for the short version. Contributions welcome — every contributor is credited.

DEV Community: dataqbs
