How I built production quality gates into a multi-agent Claude Code workflow
Published to dev.to — cross-post from GitHub
1. The problem: agents that write code but never review it
When I started using Claude Code's Agent tool to dispatch subagents, I noticed a pattern quickly: the agent would write code, declare success, and move on. There was no review step unless I explicitly asked for one in the prompt — and prompts are unreliable. If the model was running low on context or the task was complex, the review step would get dropped.
The deeper issue is that multi-agent systems are composable but not automatically accountable. You can chain code-writer → commit → push in a plan, but nothing in the default setup prevents a buggy implementation from being committed and pushed before a human or reviewer has seen it. The agent doesn't know what it doesn't know.
I wanted a framework where review wasn't optional — where it was structurally impossible to skip.
2. CAST's hook-driven commit gate
Claude Code exposes a lifecycle hook system via settings.json. One of those hooks is PreToolUse — it fires before every tool call and can return {"decision": "block"} to reject the operation entirely.
I used this to build a hard commit gate. The hook script (pre-tool-guard.sh) intercepts every Bash tool call that matches git commit. If the command doesn't have a specific escape hatch prefix (CAST_COMMIT_AGENT=1), the hook exits with code 2, which Claude Code treats as a hard block — the commit does not happen.
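# pre-tool-guard.sh (simplified)
# FIRST_LINE is assumed to hold the first line of the intercepted Bash command;
# the CAST_COMMIT_AGENT=1 escape-hatch check is omitted here.
if echo "$FIRST_LINE" | grep -qE "(^|[[:space:]])git[[:space:]]+commit"; then
    # Exit code 2 blocks the tool call; the message goes to stderr so it is reported back.
    echo "**[CAST]** Raw git commit blocked. Dispatch the commit agent instead." >&2
    exit 2
fi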
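To make the decision logic easier to follow, including the escape hatch that the simplified snippet leaves out, here is a minimal Python sketch of the same check. It is not the shipped pre-tool-guard.sh. It assumes the hook payload is a JSON object with `tool_name` and `tool_input.command` fields, and the function name `gate_exit_code` is hypothetical.

```python
import re

# Same pattern as the shell snippet: "git commit" at the start of the line or after whitespace.
GIT_COMMIT = re.compile(r"(^|\s)git\s+commit")
# Escape-hatch prefix named in the article.
ESCAPE_PREFIX = "CAST_COMMIT_AGENT=1 "


def gate_exit_code(payload: dict) -> int:
    """Return 0 to allow the tool call or 2 to block it (hypothetical helper)."""
    # Assumption: the payload carries the tool name and, for Bash, the command string.
    if payload.get("tool_name") != "Bash":
        return 0
    command = payload.get("tool_input", {}).get("command", "")
    lines = command.splitlines()
    first_line = lines[0].strip() if lines else ""
    if not GIT_COMMIT.search(first_line):
        return 0  # not a commit, nothing to gate
    if first_line.startswith(ESCAPE_PREFIX):
        return 0  # the commit agent's own commit, allowed through
    return 2  # raw git commit: hard block
```

A real hook would read the payload from stdin, print the block message to stderr, and exit with the returned code. Keeping the decision in a pure function makes every path easy to test.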
The only way to commit is through the commit agent workflow, which:
- Reads staged changes
- Dispatches code-reviewer (Claude Haiku) and waits for a DONE status
- If the reviewer returns DONE_WITH_CONCERNS, surfaces those to the user before proceeding
- Only then runs CAST_COMMIT_AGENT=1 git commit with the escape hatch
The gate is enforced at the shell level, not at the prompt level. It can't be bypassed by rephrasing a request.
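The commit agent steps listed above can be sketched as a small control-flow function. This is an illustration only, not the agent's actual implementation. The `review` and `confirm` callables stand in for the code-reviewer dispatch and the user prompt, and treating any status other than DONE or DONE_WITH_CONCERNS as an abort is my assumption.

```python
from typing import Callable, List, Optional, Tuple

# Command string from the article: the only form the gate lets through.
COMMIT_COMMAND = "CAST_COMMIT_AGENT=1 git commit"


def commit_workflow(
    staged_diff: str,
    review: Callable[[str], Tuple[str, List[str]]],
    confirm: Callable[[List[str]], bool],
) -> Optional[str]:
    """Return the commit command to run, or None if the commit should not happen."""
    # Steps 1 and 2: hand the staged changes to the reviewer and wait for a status.
    status, concerns = review(staged_diff)
    if status == "DONE":
        return COMMIT_COMMAND
    if status == "DONE_WITH_CONCERNS":
        # Step 3: surface the concerns and let the user decide.
        return COMMIT_COMMAND if confirm(concerns) else None
    # Assumption: any other status aborts the commit.
    return None
```

Returning the command instead of running it keeps the sketch free of side effects. A caller would pass the result to the shell only when it is not None.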
The full framework ships 16 agents, 16 slash commands, and a hook architecture covering 19 hooks across 13 Claude Code lifecycle events. The BATS test suite has 301 tests covering every hook script. It's installable via Homebrew:
brew tap ek33450505/cast && brew install cast
3. cast.db as an event store
Every meaningful lifecycle event gets written to a SQLite database at ~/.claude/cast.db. The schema has four main tables:
- sessions: one row per Claude Code session, with start/end timestamps and token counts
- agent_runs: one row per subagent dispatch, tracking which agent ran, duration, and status
- routing_events: one row per tool call that hits a hook, with tool name, exit code, and latency
- hook_health: rolling health state for each hook script (last fired, last exit code)
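For readers who want a concrete picture of the four tables, here is a rough schema sketch. The table names come from the list above. Every column name and type is a guess for illustration, not the real cast.db schema.

```python
import sqlite3

# Hypothetical columns chosen to match the descriptions above; the real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id    TEXT PRIMARY KEY,
    started_at    TEXT NOT NULL,
    ended_at      TEXT,
    input_tokens  INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS agent_runs (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT REFERENCES sessions(session_id),
    agent       TEXT NOT NULL,
    duration_ms INTEGER,
    status      TEXT
);
CREATE TABLE IF NOT EXISTS routing_events (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    ts         TEXT NOT NULL,
    tool_name  TEXT NOT NULL,
    exit_code  INTEGER,
    latency_ms REAL
);
CREATE TABLE IF NOT EXISTS hook_health (
    hook           TEXT PRIMARY KEY,
    last_fired_at  TEXT,
    last_exit_code INTEGER
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure all four tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The default in-memory path keeps the sketch from touching a real database. Pointing it at a file path would create the tables there instead.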
The writes happen via PostToolUse hooks set to async: true, which means they don't block tool execution. The hook script spawns a Python process, parses the Claude Code hook payload from stdin, and appends to the DB. Because it's async, the latency hit to the tool call is effectively zero.
"PostToolUse": [
  {
    "matcher": "Write|Edit|Agent|Bash",
    "hooks": [
      {
        "type": "command",
        "command": "bash ~/.claude/scripts/post-tool-hook.sh",
        "if": "Write|Edit|Agent|Bash",
        "timeout": 10,
        "async": true
      }
    ]
  }
]
The if: field filters are important here — they scope each hook to only the tool types it actually cares about, so the cost tracker only runs when a Bash, Edit, Write, or Agent call completes, not on every Read or Glob.
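The Python side of the telemetry write described in this section might look roughly like the sketch below. It assumes the payload is JSON with a `tool_name` field and that the calling shell script supplies the exit code and latency. The column names and the `record_event` helper are hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone


def record_event(conn: sqlite3.Connection, raw_payload: str,
                 exit_code: int, latency_ms: float) -> None:
    """Append one routing_events row for a completed tool call (hypothetical helper)."""
    payload = json.loads(raw_payload)
    # Hypothetical columns; created on demand so the sketch is self-contained.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS routing_events (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts TEXT NOT NULL,
               tool_name TEXT NOT NULL,
               exit_code INTEGER,
               latency_ms REAL)"""
    )
    conn.execute(
        "INSERT INTO routing_events (ts, tool_name, exit_code, latency_ms) "
        "VALUES (?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),  # UTC timestamp of the write
            payload.get("tool_name", "unknown"),
            exit_code,
            latency_ms,
        ),
    )
    conn.commit()
```

In a real hook the payload string would come from `sys.stdin.read()`. Because the hook is async, a slow write here delays only the telemetry, not the tool call.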
4. The React dashboard: making agent activity queryable
The companion project (claude-code-dashboard) is a React 19 + Vite frontend backed by an Express 5 API that reads from cast.db. It runs locally at :5173 (Vite dev server) + :3001 (Express API).
Key pages:
- /activity — live event stream via SSE; shows tool calls in real time as they fire
- /sessions — session history with token spend per session
- /analytics — aggregate token spend over time, cost by agent, hook fire frequency
- /agents — per-agent run history with duration and status distributions
- /hooks — hook health dashboard: which hooks are firing, last exit codes, latency percentiles
- /token-spend — daily/weekly cost breakdown
The value of having SQLite as the backing store vs. just log files: you can query it. Want to know which agent costs the most per session? One SQL query. Want to see hook latency over the last week? Aggregate routing_events by day. The data is local, structured, and queryable without a cloud backend.
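As an example of the kind of query mentioned above, here is a sketch that aggregates hook latency by day. It assumes routing_events has `ts` (ISO-8601 text) and `latency_ms` columns. Those names are illustrative, and the sample rows are made up so the example runs on its own.

```python
import sqlite3


def sample_db() -> sqlite3.Connection:
    """Build an in-memory routing_events table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE routing_events (ts TEXT, tool_name TEXT, latency_ms REAL)")
    conn.executemany(
        "INSERT INTO routing_events VALUES (?, ?, ?)",
        [
            ("2024-12-01T08:00:00", "Bash", 99),  # outside the query window
            ("2025-01-01T10:00:00", "Bash", 10),
            ("2025-01-01T12:00:00", "Edit", 30),
            ("2025-01-02T09:00:00", "Bash", 5),
        ],
    )
    return conn


def daily_hook_latency(conn: sqlite3.Connection, since: str):
    """Return (day, average latency in ms, event count) for rows at or after `since`."""
    return conn.execute(
        """
        SELECT date(ts) AS day, AVG(latency_ms) AS avg_latency_ms, COUNT(*) AS events
        FROM routing_events
        WHERE ts >= ?
        GROUP BY day
        ORDER BY day
        """,
        (since,),
    ).fetchall()
```

Calling `daily_hook_latency(sample_db(), "2025-01-01")` gives one row per day in the window. Against a real database, `since` would be the date seven days ago.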
5. Lessons learned
Async hooks changed the performance profile. Early versions ran every hook synchronously. Making the telemetry hooks async (PostToolUse, SubagentStart/Stop, TaskCreated, Stop) removed the measurable latency that observability had been adding. The key insight: telemetry hooks can be async because you don't need their output to make a decision. Security and commit gates must stay synchronous because they need to block.
if: filters are essential at scale. Without them, every hook fires on every tool call. The security guard was running on ls commands. Adding if: "Bash(curl *)" filters means it only fires when curl is about to run — which is the only time it matters. The Claude Code if: field supports glob-style matching against the tool name and input.
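To show what glob-style matching on tool name and input means in practice, here is a toy matcher for rules shaped like `Bash(curl *)`. It is a mental model only, not Claude Code's actual matching code. The rule grammar here is simplified, and `rule_matches` is a hypothetical name.

```python
import re
from fnmatch import fnmatchcase

# Simplified rule shape: ToolName or ToolName(glob pattern).
RULE = re.compile(r"^(?P<tool>\w+)(?:\((?P<pattern>.*)\))?$")


def rule_matches(rule: str, tool_name: str, tool_input: str = "") -> bool:
    """Return True if the rule selects this tool call (illustrative only)."""
    m = RULE.match(rule)
    if m is None or m.group("tool") != tool_name:
        return False
    pattern = m.group("pattern")
    if pattern is None:
        return True  # bare tool name: any input matches
    # Case-sensitive glob match against the tool input, e.g. the Bash command.
    return fnmatchcase(tool_input, pattern)
```

Under this model, `Bash(curl *)` selects `curl https://example.com` but not `ls -la`, which is the behavior described for the security guard.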
effort frontmatter changes model behavior. Setting effort: low on lightweight agents (commit, code-reviewer, push, test-runner) and effort: high on deep analysis agents (security, planner, researcher, debugger) lets the runtime allocate thinking budget appropriately. A commit agent doesn't need extended thinking. A security agent reviewing auth code does.
isolation: worktree prevents file conflicts in parallel dispatches. When the orchestrator dispatches multiple agents in parallel — code-writer and test-writer running simultaneously on the same codebase — they can clobber each other's edits without worktree isolation. Adding isolation: worktree to parallelizable agents (code-writer, test-writer, security, frontend-qa) gives each agent its own git worktree.
The BATS test suite is non-negotiable. Shell scripts are easy to break silently. 301 BATS tests covering every hook, every exit code path, and every escape hatch means I can refactor hooks without guessing whether I broke the commit gate. CI runs on every push.
The repo is at github.com/ek33450505/claude-agent-team. Issues and PRs welcome — especially around the hook architecture and DB schema.