Most AI governance conversations stop at "we log everything." That is observability, not governance. Observability tells you what happened after the fact. Governance stops the bad thing before it executes.
Today we shipped two features that make that distinction concrete: a YAML policy engine and a SQLite audit brain. Here is what they do and why they matter.
## The Problem
We run 13 AI agents in production. Each agent has different permissions, different risk levels, and different access needs. A bookkeeper agent should never call external APIs. A legal counsel agent should have stricter escalation rules than a content writer. A translator does not need write access to financial files.
Hardcoding these rules inside the governance server works at 3 agents. It breaks at 13. It is impossible at 100.
## YAML Policy Engine
Every agent now has a policy file:
```
/governance/policies/
  global.yaml          # defaults for everyone
  roles/
    bookkeeper.yaml    # role-level rules
    legal.yaml
    coach.yaml
    ...                # 10 roles total
  agents/
    harry.yaml         # agent-specific overrides
    aram.yaml
    tamara.yaml
    ...                # 13 agents total
```
Resolution is deterministic: global defaults, then role policy, then agent override. Agent-level wins.
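The layered resolution can be sketched as a recursive dictionary merge. This is a minimal sketch of the semantics described above, not the actual NS Governor code; the function names and the exact merge rules (nested dicts merge, everything else is overwritten) are assumptions.

```python
def merge_policies(base: dict, override: dict) -> dict:
    """Recursively merge `override` into `base`; override wins on conflict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_policies(merged[key], value)  # merge nested sections
        else:
            merged[key] = value  # scalars and lists: override replaces outright
    return merged

def resolve_policy(global_p: dict, role_p: dict, agent_p: dict) -> dict:
    """Global defaults, then role policy, then agent override. Agent wins."""
    return merge_policies(merge_policies(global_p, role_p), agent_p)
```

So a role can tighten `escalation.level` to `strict` while the agent file only overrides `kill_threshold`, and the resolved policy carries both.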
A policy file looks like this:
```yaml
version: "1.0"
agent_id: harry-bookkeeper
role: bookkeeper

tools:
  allowed:
    - Bash
    - Read
    - Write
    - calculator
  denied:
    - WebFetch
    - ExternalAPI
    - email-send

escalation:
  level: strict
  kill_threshold: 2
  high_risk_kill: 1

runtime:
  max_runtime_minutes: 30
  max_cost_usd: 1.00
```
When Harry tries to call WebFetch, the governor checks his policy and blocks it. No code change required. Edit the YAML, reload, done.
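The pre-tool check implied here is small. A hedged sketch, assuming a resolved policy dict in the shape of the YAML above (the function name and the deny-wins rule are illustrative, not the actual governor API):

```python
def is_tool_allowed(policy: dict, tool: str) -> bool:
    """Return True if `tool` may run under `policy`."""
    tools = policy.get("tools", {})
    if tool in tools.get("denied", []):
        return False   # an explicit deny always wins
    allowed = tools.get("allowed")
    if allowed is None:
        return True    # no allow-list configured: allow by default
    return tool in allowed
```

With Harry's policy loaded, `is_tool_allowed(policy, "WebFetch")` returns `False` and the call never executes.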
This is what governance looks like when you need to scale: configuration, not code.
## SQLite Audit Brain
Every decision the governor makes now writes to a persistent SQLite database. Not a log file. A queryable, indexable, exportable database.
Schema:
- `audit_events`: every allow/deny decision, with agent ID, session, tool, policy rule, risk category, and escalation level
- `agents`: registered agents with last-seen timestamps
- `heartbeats`: agent health state
- `killswitch_events`: every forced termination, with reason
- `policy_versions`: which policy was active when
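The core table fits in a few lines of SQLite DDL. A sketch based on the columns named above; the id/timestamp columns and the index are assumptions about the real module:

```python
import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS audit_events (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    ts               TEXT DEFAULT (datetime('now')),
    agent_id         TEXT NOT NULL,
    session_id       TEXT,
    tool             TEXT,
    policy_rule      TEXT,
    risk_category    TEXT,
    escalation_level TEXT,
    allowed          INTEGER NOT NULL  -- 1 = allowed, 0 = denied
);
CREATE INDEX IF NOT EXISTS idx_audit_agent ON audit_events (agent_id, allowed);
"""

def init_db(path: str = "audit.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(DDL)
    return conn
```

Because it is a real database rather than a log file, the same rows back both the query API and compliance exports.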
You can query it:
```
GET /ns/audit/query?agent_id=harry&allowed=0
GET /ns/audit/stats
GET /ns/audit/export?format=json
```
The stats endpoint right now returns real data:
```json
{
  "total_events": 77,
  "allowed": 69,
  "denied": 8,
  "kill_events": 1,
  "by_category": [
    {"category": "filesystem.delete", "count": 6},
    {"category": "policy.tool_denied", "count": 1}
  ]
}
```
That is not a demo. That is production data from a live system.
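Numbers like these fall out of plain SQL aggregates over the `audit_events` table. A hedged sketch (column names assumed to match the schema described above; the real stats endpoint may compute more):

```python
import sqlite3

def audit_stats(conn: sqlite3.Connection) -> dict:
    # SQLite boolean expressions evaluate to 0/1, so SUM counts matches.
    total, allowed, denied = conn.execute(
        "SELECT COUNT(*), SUM(allowed = 1), SUM(allowed = 0) FROM audit_events"
    ).fetchone()
    by_category = [
        {"category": cat, "count": n}
        for cat, n in conn.execute(
            "SELECT risk_category, COUNT(*) FROM audit_events "
            "WHERE allowed = 0 GROUP BY risk_category ORDER BY COUNT(*) DESC"
        )
    ]
    return {"total_events": total, "allowed": allowed or 0,
            "denied": denied or 0, "by_category": by_category}
```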
## Why This Combination Matters
YAML policies without audit trails are unenforceable. You changed a policy, but can you prove what was active when an incident happened? No.
Audit trails without externalized policies are inflexible. Every change requires a code deploy. At scale, that is a bottleneck that becomes a liability.
Together, they give you:
- Configurable governance (YAML)
- Persistent proof (SQLite)
- Queryable history (API endpoints)
- No code changes for policy updates
If you are deploying AI agents in a regulated industry - banking, healthcare, government, legal - this is the minimum viable governance layer. Anything less is a compliance gap waiting to be discovered.
## The Stack
- NS Governor v2.1 (Python, 749 lines)
- SQLite audit module (347 lines)
- YAML policy engine (212 lines)
- 24 policy files
- 3 Claude Code hooks (pre-tool, post-tool, stop)
- Running on a $48/month VPS
Open source: npmjs.com/package/mcp-nervous-system
Arthur Palyan builds AI governance infrastructure at Levels Of Self. The Nervous System MCP is production-deployed and listed in the Anthropic MCP directory.