I built an open-source authorization layer for AI agents — here's what the audit trail looks like
When an AI agent takes an action in production (isolates a host, rotates credentials, applies a patch), you can log what happened, but a log alone can't prove the action was authorized. Logs record what happened; they don't record who said it was allowed to happen.
I built Shani to solve this. It sits between an agent's intent and execution, issues signed authorization tokens, and produces a tamper-evident audit trail.
The core idea
Agent ──DecisionProposal──► Shani ──ADO──► ExecutionBoundary ──Capability──► World
No ADO → no Capability → no execution.
An agent submits a DecisionProposal (what it wants to do + evidence). Shani evaluates it against a YAML policy — blast radius, reversibility, environment risk, evidence quality. If authorized, it issues a signed ADO (Authorized Decision Object). The agent can only act through a Capability derived from a valid ADO.
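To make the "no ADO, no execution" rule concrete, here is a minimal sketch of a signed token checked at the execution boundary. It is not Shani's actual implementation: the HMAC-SHA256 scheme, the payload layout, and the names `issue_ado`, `verify_ado`, and `execute` are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import json
import uuid

# Hypothetical signing key; a real deployment would keep this in a KMS or HSM.
SIGNING_KEY = b"demo-key-not-for-production"


def _canonical(payload: dict) -> bytes:
    """Serialize deterministically so the signature is reproducible."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def issue_ado(proposal: dict, dsal: int) -> dict:
    """Wrap an authorized proposal in a signed ADO (illustrative format only)."""
    payload = {"ado_id": str(uuid.uuid4()), "proposal": proposal, "dsal": dsal}
    signature = hmac.new(SIGNING_KEY, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_ado(ado: dict) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(SIGNING_KEY, _canonical(ado["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, ado["signature"])


def execute(ado: dict, action) -> str:
    """Execution boundary: refuse to run anything without a valid ADO."""
    if not verify_ado(ado):
        raise PermissionError("invalid or missing ADO: refusing to execute")
    return action(ado["payload"]["proposal"])


proposal = {"action": "upgrade_package", "target": "pip", "evidence": ["PYSEC-2026-196"]}
ado = issue_ado(proposal, dsal=1)
result = execute(ado, lambda p: f"executed {p['action']} on {p['target']}")

# Changing the target after issuance invalidates the signature.
tampered = json.loads(json.dumps(ado))
tampered["payload"]["proposal"]["target"] = "prod-db-12"
```

In this sketch, `execute(tampered, ...)` raises `PermissionError`, because the proposal no longer matches what was signed. That is the property that lets an audit entry tie an action back to a specific authorization.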
The risk level (D-SAL 0–4) is computed by Shani from proposal context — not declared by the agent. An agent cannot claim its own action is low-risk.
When D-SAL exceeds a threshold, Shani blocks and waits for human approval via Slack, webhook, or CLI.
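The scoring and the approval gate can be sketched in a few lines. Everything here is hypothetical: the field names, the weights, and the threshold value are assumptions for illustration, not Shani's real policy engine, which reads these rules from YAML.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProposalContext:
    blast_radius: int           # hypothetical scale: 0 (single package) .. 2 (fleet-wide)
    reversible: bool            # can the action be rolled back?
    environment: str            # "ci", "staging", or "prod"
    evidence_confidence: float  # 0.0 .. 1.0, as submitted with the proposal


def compute_dsal(ctx: ProposalContext) -> int:
    """Derive a 0-4 risk level from context; the agent never supplies it."""
    score = ctx.blast_radius
    if not ctx.reversible:
        score += 1
    if ctx.environment == "prod":
        score += 1
    if ctx.evidence_confidence < 0.8:
        score += 1
    return max(0, min(score, 4))


HITL_THRESHOLD = 2  # hypothetical: levels above this require a human decision


def authorize(ctx: ProposalContext, ask_human: Callable[[ProposalContext, int], bool]) -> bool:
    """Auto-approve low-risk proposals; otherwise defer to a human approver."""
    dsal = compute_dsal(ctx)
    if dsal <= HITL_THRESHOLD:
        return True
    return ask_human(ctx, dsal)


ci_upgrade = ProposalContext(blast_radius=0, reversible=True,
                             environment="ci", evidence_confidence=0.95)
prod_isolate = ProposalContext(blast_radius=2, reversible=False,
                               environment="prod", evidence_confidence=0.93)
```

The `ask_human` callback stands in for whichever approval channel is configured (Slack, webhook, or CLI). The point of the design is that `compute_dsal` takes context, not a self-reported risk level, so an agent has no parameter it can set to talk its way below the threshold.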
Real example: vulnerability remediation CI pipeline
# Scan
pip-audit --format=json > pip-audit.json
# Shani judgment (auto-approve in CI, HITL in prod)
SHANI_HITL_AUTO=1 python examples/vuln_remediation/scenario.py
Here's what a real run looks like: 5 vulnerabilities detected, evaluated, and patched (output trimmed to three of the five findings):
[scan] Running pip-audit…
5 vulnerability finding(s) across 47 packages
[govern] PYSEC-2026-196 [MEDIUM] pip 24.0
AUTHORIZED — ADO ff73fedc… dsal=1
OK: upgraded to 26.1.2
[govern] CVE-2025-8869 [MEDIUM] pip 24.0
AUTHORIZED — ADO 6590afbf… dsal=1
OK: upgraded to 25.3
[govern] CVE-2026-6357 [MEDIUM] pip 24.0
AUTHORIZED — ADO 4cd03165… dsal=1
OK: upgraded to 26.1
[audit] audit.json (executed=5 denied=0 skipped=0)
And the audit trail:
{
  "schema_version": "1",
  "agent_id": "vuln-remediation-agent/v1",
  "mode": "auto",
  "summary": { "total": 5, "executed": 5, "denied": 0, "skipped": 0 },
  "entries": [
    {
      "vuln_id": "PYSEC-2026-196",
      "package": "pip",
      "action": "executed",
      "ado_id": "ff73fedc-...",
      "approved_by": "SOC-Analyst",
      "detail": "OK: upgraded to 26.1.2"
    }
  ]
}
Every entry proves not just that the upgrade happened, but that it was authorized, by whom, and against which specific vulnerability finding.
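A reviewer can check that property mechanically. The sketch below assumes the field names from the audit excerpt above; the helper `unauthorized_entries` and the second sample entry (`EXAMPLE-0001`) are made up for illustration and are not part of Shani.

```python
import json

# Field names mirror the audit.json excerpt; the checks themselves are assumptions.
REQUIRED_FIELDS = ("vuln_id", "package", "action", "ado_id", "approved_by")


def unauthorized_entries(audit: dict) -> list[str]:
    """Return vuln_ids of executed entries that lack an ADO or an approver."""
    problems = []
    for entry in audit.get("entries", []):
        if entry.get("action") != "executed":
            continue  # denied or skipped entries did not change anything
        if any(not entry.get(field) for field in REQUIRED_FIELDS):
            problems.append(entry.get("vuln_id", "<unknown>"))
    return problems


sample = json.loads("""
{
  "summary": {"total": 2, "executed": 2, "denied": 0, "skipped": 0},
  "entries": [
    {"vuln_id": "PYSEC-2026-196", "package": "pip", "action": "executed",
     "ado_id": "ff73fedc-...", "approved_by": "SOC-Analyst"},
    {"vuln_id": "EXAMPLE-0001", "package": "demo", "action": "executed",
     "ado_id": "", "approved_by": ""}
  ]
}
""")

print(unauthorized_entries(sample))  # ['EXAMPLE-0001']
```

This only checks that the fields are present. Proving an entry is genuine would also require verifying the ADO's signature, which depends on Shani's actual token format.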
Also works with nanoclaw (zero TypeScript required)
pnpm run chat "Isolate host:prod-db-12. Evidence: EDR detected lateral movement (0.93), SIEM anomalous outbound (0.88)"
Claude reads the Shani SKILL.md, submits a DecisionProposal, waits for HITL approval, and only proceeds after receiving a signed ADO.
CI/CD pipeline integration
Ships a GitHub Actions workflow: Trivy + Grype + OSV-Scanner → Shani judgment → shani-audit.json. Gate your deployments on policy, not just vulnerability counts.
- name: Run Shani vulnerability judgment
  env:
    SHANI_HITL_AUTO: '1'
  run: |
    python examples/vuln_remediation/shani_vuln_judge.py \
      --trivy trivy.json --grype grype.json \
      --output shani-audit.json \
      --fail-on-denied
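If you want a later job to gate on the audit file itself rather than on the step's exit code, a small check like the one below would do it. This is a sketch under one assumption: that `shani-audit.json` carries the same `summary.denied` counter as the audit trail shown earlier. The `gate` function is hypothetical, not something Shani ships.

```python
import json
import sys
from pathlib import Path


def gate(audit_path: str) -> int:
    """Return a process exit code: 0 if nothing was denied, 1 otherwise."""
    audit = json.loads(Path(audit_path).read_text(encoding="utf-8"))
    denied = audit.get("summary", {}).get("denied", 0)
    if denied:
        print(f"{denied} remediation(s) denied by policy; blocking deploy", file=sys.stderr)
        return 1
    return 0
```

Calling `sys.exit(gate("shani-audit.json"))` at the end of a deploy job would fail the job whenever policy denied at least one action, regardless of how many vulnerabilities the scanners reported.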
What this gives you that observability tools don't
LangSmith answers "what did the agent do?" Shani answers "was it authorized, by whom, and if not, why was it blocked?" The two are complementary: for the conversation with your security team, compliance officer, or a regulator, you need both.
Honest limitations
- v0.4, self-hosted only
- Evidence is agent-submitted (push model — pull-based verification not yet implemented)
- YAML policy is expressive but not yet well-documented for complex cases
Apache 2.0: github.com/kmori-source/shani
If you're shipping AI agents to production and have hit the accountability problem, I'd genuinely like to hear what you've tried.
Tags: #security #ai #devops #opensource #llm