yuqiang

Posted on Apr 7

I Built a "Blame Finder" for AI Agents – So You Never Have to Guess Who Broke Production

#ai #agents #opensource #devops

The 3 AM Slack Message We All Fear

"Hey, the multi-agent pipeline just deleted the staging database. Any idea which agent did it?"

Your PM Agent says it passed a clean requirement.

Your Coder Agent says it followed the spec perfectly.

Your Verifier Agent says it never even got the output.

You spend the next 4 hours grepping through thousands of lines of logs. You find nothing.

This is the Accountability Vacuum. And it's a nightmare.

So I built a cure: Agent Blame-Finder – an open‑source cryptographic black box for multi‑agent systems.

What Does It Do?

In 3 seconds, it tells you exactly which agent messed up.

$ blame-finder blame incident-abc123

🎯 Verdict: Coder-Agent
💡 Reason: Input requirement was correct, but output didn't match expectations
🔗 Chain:
   ✅ PM-Agent – success
   ❌ Coder-Agent – failed
   ⏳ Verifier-Agent – not reached

No more finger‑pointing. No more log spelunking. Just a verifiable, signed receipt of every decision.

How It Works (The 10‑Second Technical Version)

Under the hood, it implements two IETF Internet‑Drafts:

JEP (Judgment Event Protocol) – a minimal, cryptographically signed log format for agent decisions.
JAC (Judgment Accountability Chain) – a task_based_on field that links every decision to its parent.

Each time an agent does something, a JEP receipt is created:

{
  "verb": "J",
  "who": "Coder-Agent",
  "when": 1742345678,
  "what": "sha256:...",
  "task_based_on": "parent-task-hash",
  "sig": "Ed25519 signature"
}
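
To make the receipt format concrete, here is a minimal sketch of building and checking one using only the Python standard library. It is not the library's actual code: `make_receipt`, `verify_receipt`, and `DEMO_KEY` are hypothetical names, and HMAC-SHA256 stands in for the Ed25519 signature because the standard library has no Ed25519 support.

```python
import hashlib
import hmac
import json
import time

# Hypothetical demo key. A real deployment would sign with a per-agent
# Ed25519 private key instead of a shared secret.
DEMO_KEY = b"demo-key-not-for-production"


def canonical(obj: dict) -> bytes:
    """Serialize deterministically so the same fields always give the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_receipt(verb, who, payload: bytes, parent=None, when=None, key=DEMO_KEY) -> dict:
    """Build a JEP-shaped receipt for one agent decision."""
    body = {
        "verb": verb,
        "who": who,
        "when": int(time.time()) if when is None else when,
        "what": "sha256:" + hashlib.sha256(payload).hexdigest(),
        "task_based_on": parent,
    }
    # ASSUMPTION: HMAC-SHA256 is a stand-in for the Ed25519 signature.
    body["sig"] = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return body


def verify_receipt(receipt: dict, key=DEMO_KEY) -> bool:
    """Recompute the signature over every field except 'sig' and compare."""
    unsigned = {k: v for k, v in receipt.items() if k != "sig"}
    expected = hmac.new(key, canonical(unsigned), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt.get("sig", ""))


receipt = make_receipt(
    "J", "Coder-Agent", b"print('hello world')",
    parent="parent-task-hash", when=1742345678,
)
assert verify_receipt(receipt)

# Changing any field after signing invalidates the receipt.
tampered = dict(receipt, who="PM-Agent")
assert not verify_receipt(tampered)
```

The point of the sketch is the order of operations: hash the output, fix the field set, then sign the canonical bytes, so any later edit to the receipt is detectable.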

The four verbs – J (Judge), D (Delegate), T (Terminate), V (Verify) – are all you need to model any accountability flow.
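
The `task_based_on` link is what lets a blame query walk from a failing task back to its root. A rough sketch of that walk follows, assuming receipts are stored in a dict keyed by receipt id and carry a `status` field. Both the ids (`r1`, `r2`) and the `status` field are my illustrative assumptions and do not appear in the receipt example above.

```python
def walk_chain(store: dict, leaf_id: str) -> list:
    """Follow task_based_on links from a leaf receipt back to the root."""
    chain = []
    seen = set()
    current = leaf_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at receipt {current!r}")
        if current not in store:
            raise KeyError(f"missing parent receipt {current!r}")
        seen.add(current)
        receipt = store[current]
        chain.append(receipt)
        current = receipt.get("task_based_on")
    chain.reverse()  # root first, leaf last
    return chain


def blame(store: dict, leaf_id: str):
    """Return the first agent in the chain whose step failed, or None."""
    for receipt in walk_chain(store, leaf_id):
        if receipt["status"] == "failed":  # 'status' is a hypothetical field
            return receipt["who"]
    return None


# The Verifier-Agent was never reached, so it left no receipt.
store = {
    "r1": {"who": "PM-Agent", "status": "success", "task_based_on": None},
    "r2": {"who": "Coder-Agent", "status": "failed", "task_based_on": "r1"},
}
print(blame(store, "r2"))  # Coder-Agent
```

Reading the chain root-first means the verdict lands on the earliest failure rather than on whichever agent happened to log last.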

Integration: One Decorator

from blame_finder import BlameFinder

finder = BlameFinder(storage="./blackbox_logs")

@finder.trace(agent_name="Coder-Agent")
def write_code(requirement: str) -> str:
    # Your existing logic – no changes needed
    return "print('hello world')"

# Later, when something breaks:
print(finder.blame(incident_id="task_123"))

That’s it. The decorator handles hashing, signing, storage, and chain linking.
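
If you are curious what a decorator like this might do internally, here is a simplified sketch. `SketchTracer` is a hypothetical stand-in, not the real `BlameFinder` class. It assumes a strictly sequential pipeline (each call's parent is the previous traced call), adds a hypothetical `status` field, and leaves signing out to stay short.

```python
import functools
import hashlib
import json
import tempfile
import time
from pathlib import Path


class SketchTracer:
    """Hypothetical stand-in for BlameFinder; not the library's implementation."""

    def __init__(self, storage: str):
        self.storage = Path(storage)
        self.storage.mkdir(parents=True, exist_ok=True)
        self.last_id = None  # receipt id of the most recent traced call

    def trace(self, agent_name: str):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                status, result = "success", None
                try:
                    result = func(*args, **kwargs)
                    return result
                except Exception:
                    status = "failed"
                    raise  # the caller still sees the original error
                finally:
                    self._record(agent_name, status, result)
            return wrapper
        return decorator

    def _record(self, who: str, status: str, result) -> None:
        body = {
            "verb": "J",
            "who": who,
            "when": int(time.time()),
            "what": "sha256:" + hashlib.sha256(repr(result).encode()).hexdigest(),
            "task_based_on": self.last_id,  # ASSUMPTION: linear pipeline
            "status": status,               # ASSUMPTION: not a field shown above
        }
        encoded = json.dumps(body, sort_keys=True).encode()
        receipt_id = hashlib.sha256(encoded).hexdigest()
        (self.storage / f"{receipt_id}.json").write_bytes(encoded)
        self.last_id = receipt_id


# Usage: write receipts to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tracer = SketchTracer(tmp)

    @tracer.trace(agent_name="Coder-Agent")
    def write_code(requirement: str) -> str:
        return "print('hello world')"

    write_code("say hello")
    assert len(list(Path(tmp).glob("*.json"))) == 1
```

Recording in a `finally` block means a receipt is written whether the wrapped function returns or raises, which is what makes a failed step visible later.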

Why You Should Care

Without Blame‑Finder	With Blame‑Finder
Hours of log hunting	`blame-finder blame <id>`
"Maybe Agent X?" finger‑pointing	Cryptographic proof
No audit trail	JEP receipts (signed, tamper-evident)
Broken causality	Full `task_based_on` tree

It’s like git blame but for AI agents.

And because the receipt format is specified in IETF Internet‑Drafts rather than a proprietary schema, it’s not another walled garden – it’s infrastructure.

The Road Ahead

✅ Rust core engine (fast)
✅ Python & TypeScript SDKs
🚧 LangChain / CrewAI native adapters
🚧 Visual dashboard (an early version already works via blame-finder dashboard)
🚧 One‑click PDF/HTML blame reports

Try It Right Now

pip install agent-blame-finder

Then launch the dashboard:

blame-finder dashboard

You’ll see a causality tree visualizer that looks like a Git graph – but for agent decisions.

Contribute

MIT licensed. We need:

Integrations with popular agent frameworks
More tests
Documentation improvements
Your crazy ideas

GitHub: https://github.com/hjs-spec/Agent-Blackbox

Stop the guessing game. Start the Blame‑Finder. 🔍

P.S. The name is intentionally provocative. Your PM will hate it. Your CTO will love it.

Top comments (5)

Mykola Kondratiuk • Apr 14

spent a week on this kind of debug - grep-based tracing across 8 agents is brutal. per-agent intent logging is the cleaner path.

Jonathan Murray • Apr 8

decorator approach is the right call, no rewrite just wrap what you have. curious about multi-session agents though. if an agent picks up a task wednesday that started monday, does the JAC chain stay intact across that gap or does each session start fresh?

yuqiang • Apr 8

Great question. The chain stays intact: JAC chains are persistent. Monday’s receipt is saved to disk. Wednesday’s task references that same receipt hash via task_based_on. The chain survives any gap because it lives in storage, not in memory. No fresh start, just a new link appended.
