Edward Kubiak

How I built production quality gates into a multi-agent Claude Code workflow

Published to dev.to — cross-post from GitHub


1. The problem: agents that write code but never review it

When I started using Claude Code's Agent tool to dispatch subagents, I noticed a pattern quickly: the agent would write code, declare success, and move on. There was no review step unless I explicitly asked for one in the prompt — and prompts are unreliable. If the model was running low on context or the task was complex, the review step would get dropped.

The deeper issue is that multi-agent systems are composable but not automatically accountable. You can chain code-writer → commit → push in a plan, but nothing in the default setup prevents a buggy implementation from being committed and pushed before a human or reviewer has seen it. The agent doesn't know what it doesn't know.

I wanted a framework where review wasn't optional — where it was structurally impossible to skip.


2. CAST's hook-driven commit gate

Claude Code exposes a lifecycle hook system via settings.json. One of those hooks is PreToolUse, which fires before every tool call it is matched against and can reject the operation entirely, either by returning {"decision": "block"} or by exiting with code 2.

I used this to build a hard commit gate. The hook script (pre-tool-guard.sh) intercepts every Bash tool call that matches git commit. If the command doesn't have a specific escape hatch prefix (CAST_COMMIT_AGENT=1), the hook exits with code 2, which Claude Code treats as a hard block — the commit does not happen.

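# pre-tool-guard.sh (simplified)
# FIRST_LINE holds the first line of the Bash command from the hook payload.
# Let the commit agent's escape-hatch prefix through; block any other git commit.
if echo "$FIRST_LINE" | grep -qE "(^|[[:space:]])git[[:space:]]+commit" \
   && ! echo "$FIRST_LINE" | grep -qE "^CAST_COMMIT_AGENT=1[[:space:]]"; then
  # With exit code 2, Claude Code reads the block reason from stderr.
  echo "**[CAST]** Raw git commit blocked. Dispatch the commit agent instead." >&2
  exit 2
fi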

The only way to commit is through the commit agent workflow, which:

  1. Reads staged changes
  2. Dispatches code-reviewer (Claude Haiku) and waits for a DONE status
  3. If the reviewer returns DONE_WITH_CONCERNS, surfaces those to the user before proceeding
  4. Only then runs git commit with the CAST_COMMIT_AGENT=1 escape-hatch prefix
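
The four steps above can be sketched as a small control flow. This is an illustration under stated assumptions, not CAST's actual commit agent: `run_reviewer`, `confirm`, and `run_commit` are hypothetical callables standing in for the reviewer dispatch, the user prompt, and the shell call, and treating any status other than DONE or DONE_WITH_CONCERNS as a failed review is also an assumption.

```python
import shlex
from typing import Callable


def commit_workflow(
    staged_diff: str,
    message: str,
    run_reviewer: Callable[[str], tuple[str, list[str]]],  # hypothetical: returns (status, concerns)
    confirm: Callable[[list[str]], bool],                  # hypothetical: asks the user to proceed
    run_commit: Callable[[str], None],                     # hypothetical: executes a shell command
) -> bool:
    """Return True if a commit was issued, False if the gate stopped it."""
    # Step 1: read staged changes; with nothing staged there is nothing to review.
    if not staged_diff.strip():
        return False

    # Step 2: dispatch the reviewer and wait for its status.
    status, concerns = run_reviewer(staged_diff)

    # Step 3: concerns go to the user before anything is committed.
    if status == "DONE_WITH_CONCERNS":
        if not confirm(concerns):
            return False
    elif status != "DONE":
        return False  # assumption: unknown statuses count as a failed review

    # Step 4: only now commit, using the escape-hatch prefix the hook allows.
    run_commit(f"CAST_COMMIT_AGENT=1 git commit -m {shlex.quote(message)}")
    return True
```

Passing the collaborators in as callables keeps the ordering testable without a real reviewer or a real repository.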

The gate is enforced at the shell level, not at the prompt level. It can't be bypassed by rephrasing a request.
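
To see which commands the gate lets through, the rule can be restated as a testable predicate. This is a Python translation of the shell check, with `\s` standing in for `[[:space:]]`; requiring the escape hatch at the very start of the line is an assumption based on the word "prefix", and `gate_exit_code` is a hypothetical name.

```python
import re

# Same pattern as the shell hook: "git commit" at line start or after whitespace.
GIT_COMMIT = re.compile(r"(^|\s)git\s+commit")
# Assumption: the escape hatch must be the first token on the line.
ESCAPE_HATCH = re.compile(r"^CAST_COMMIT_AGENT=1\s")


def gate_exit_code(command: str) -> int:
    """Return 2 (hard block) for a raw git commit, 0 for everything else."""
    lines = command.splitlines()
    first_line = lines[0] if lines else ""
    if GIT_COMMIT.search(first_line) and not ESCAPE_HATCH.match(first_line):
        return 2
    return 0
```

Rephrasing the request to the model changes nothing here, because the decision depends only on the command string that reaches the hook.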

The full framework ships 16 agents, 16 slash commands, and a hook architecture covering 19 hooks across 13 Claude Code lifecycle events. The BATS test suite has 301 tests covering every hook script. It's installable via Homebrew:

brew tap ek33450505/cast && brew install cast

3. cast.db as an event store

Every meaningful lifecycle event gets written to a SQLite database at ~/.claude/cast.db. The schema has four main tables:

  • sessions — one row per Claude Code session, with start/end timestamps and token counts
  • agent_runs — one row per subagent dispatch, tracking which agent ran, duration, and status
  • routing_events — one row per tool call that hits a hook, with tool name, exit code, and latency
  • hook_health — rolling health state for each hook script (last fired, last exit code)
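
A rough sketch of what those four tables could look like in SQLite. The table names come from the list above; every column name and type is an illustrative guess from the descriptions, not CAST's actual schema.

```python
import sqlite3

# Column names are assumptions inferred from the table descriptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id    TEXT PRIMARY KEY,
    started_at    TEXT NOT NULL,
    ended_at      TEXT,
    input_tokens  INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS agent_runs (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT REFERENCES sessions(session_id),
    agent       TEXT NOT NULL,
    duration_ms INTEGER,
    status      TEXT
);
CREATE TABLE IF NOT EXISTS routing_events (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT REFERENCES sessions(session_id),
    tool_name  TEXT NOT NULL,
    exit_code  INTEGER,
    latency_ms INTEGER,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS hook_health (
    hook_name      TEXT PRIMARY KEY,
    last_fired_at  TEXT,
    last_exit_code INTEGER
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure all four tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keying `hook_health` by hook name makes it a rolling state table (one row per hook, overwritten on each fire), while the other three are append-only.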

The writes happen via PostToolUse hooks set to async: true, which means they don't block tool execution. The hook script spawns a Python process, parses the Claude Code hook payload from stdin, and appends to the DB. Because it's async, the latency hit to the tool call is effectively zero.
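
The parse-and-append step might look roughly like this. It is a sketch, not the script CAST ships: `record_event` is a hypothetical name, the two-column table is a stand-in for the real one, and it assumes the payload carries `session_id` and `tool_name` keys at the top level.

```python
import json
import sqlite3


def record_event(conn: sqlite3.Connection, payload_text: str) -> None:
    """Parse one hook payload (JSON text from stdin) and append a row."""
    payload = json.loads(payload_text)
    # Simplified stand-in table so the sketch runs on an empty database.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS routing_events ("
        "session_id TEXT, tool_name TEXT, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    # Parameterized insert: payload values never get spliced into the SQL.
    conn.execute(
        "INSERT INTO routing_events (session_id, tool_name) VALUES (?, ?)",
        (payload.get("session_id"), payload.get("tool_name")),
    )
    conn.commit()
```

A hook wrapper could call `record_event(sqlite3.connect(db_path), sys.stdin.read())`, where `db_path` points at the expanded `~/.claude/cast.db`.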

"PostToolUse": [
  {
    "matcher": "Write|Edit|Agent|Bash",
    "hooks": [
      {
        "type": "command",
        "command": "bash ~/.claude/scripts/post-tool-hook.sh",
        "if": "Write|Edit|Agent|Bash",
        "timeout": 10,
        "async": true
      }
    ]
  }
]

The if: field filters are important here — they scope each hook to only the tool types it actually cares about, so the cost tracker only runs when a Bash, Edit, Write, or Agent call completes, not on every Read or Glob.


4. The React dashboard: making agent activity queryable

The companion project (claude-code-dashboard) is a React 19 + Vite frontend backed by an Express 5 API that reads from cast.db. It runs locally at :5173 (Vite dev server) + :3001 (Express API).

Key pages:

  • /activity — live event stream via SSE; shows tool calls in real time as they fire
  • /sessions — session history with token spend per session
  • /analytics — aggregate token spend over time, cost by agent, hook fire frequency
  • /agents — per-agent run history with duration and status distributions
  • /hooks — hook health dashboard: which hooks are firing, last exit codes, latency percentiles
  • /token-spend — daily/weekly cost breakdown

The value of having SQLite as the backing store vs. just log files: you can query it. Want to know which agent costs the most per session? One SQL query. Want to see hook latency over the last week? Aggregate routing_events by day. The data is local, structured, and queryable without a cloud backend.
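
As a sketch of those two queries: the table names come from the schema described earlier, but the `cost_usd`, `latency_ms`, and `created_at` columns are assumptions made for illustration, so the real column names may differ.

```python
import sqlite3


def costliest_agents(conn: sqlite3.Connection) -> list[tuple]:
    """Average cost per session for each agent, most expensive first."""
    return conn.execute(
        """
        SELECT agent,
               SUM(cost_usd) / COUNT(DISTINCT session_id) AS cost_per_session
        FROM agent_runs
        GROUP BY agent
        ORDER BY cost_per_session DESC
        """
    ).fetchall()


def hook_latency_by_day(conn: sqlite3.Connection) -> list[tuple]:
    """Average hook latency per day and tool over the last seven days."""
    return conn.execute(
        """
        SELECT date(created_at) AS day,
               tool_name,
               AVG(latency_ms)  AS avg_ms,
               COUNT(*)         AS events
        FROM routing_events
        WHERE created_at >= date('now', '-7 days')
        GROUP BY day, tool_name
        ORDER BY day, tool_name
        """
    ).fetchall()
```

Both run against a local file with nothing but the standard library, which is the practical advantage over grepping log files.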


5. Lessons learned

Async hooks changed the performance profile. Early versions ran every hook synchronously. Moving the telemetry hooks (PostToolUse, SubagentStart/Stop, TaskCreated, Stop) to async removed the measurable latency that observability had been adding. The key insight: telemetry hooks can be async because nothing downstream needs their output to make a decision. Security and commit gates must stay synchronous because they need to block.

if: filters are essential at scale. Without them, every hook fires on every tool call. The security guard was running on ls commands. Adding if: "Bash(curl *)" filters means it only fires when curl is about to run — which is the only time it matters. The Claude Code if: field supports glob-style matching against the tool name and input.
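
To make the filter semantics concrete, here is one way a rule like `Bash(curl *)` could be evaluated. It is an approximation built on Python's `fnmatch`, not Claude Code's actual matcher, and `rule_matches` is a hypothetical helper.

```python
import fnmatch
import re

# "Tool" or "Tool(glob)"; the parenthesized glob is optional.
_RULE = re.compile(r"^(?P<tool>\w+)(?:\((?P<glob>.*)\))?$")


def rule_matches(rule: str, tool_name: str, tool_input: str = "") -> bool:
    """Return True if the rule selects this tool call."""
    m = _RULE.match(rule)
    if m is None or m.group("tool") != tool_name:
        return False
    glob = m.group("glob")
    # A bare tool name matches every input for that tool.
    return glob is None or fnmatch.fnmatchcase(tool_input, glob)
```

With this rule in place, a guard scoped to `Bash(curl *)` is skipped for `ls -la` and runs only for commands that start with `curl `.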

effort frontmatter changes model behavior. Setting effort: low on lightweight agents (commit, code-reviewer, push, test-runner) and effort: high on deep analysis agents (security, planner, researcher, debugger) lets the runtime allocate thinking budget appropriately. A commit agent doesn't need extended thinking. A security agent reviewing auth code does.

isolation: worktree prevents file conflicts in parallel dispatches. When the orchestrator dispatches multiple agents in parallel — code-writer and test-writer running simultaneously on the same codebase — they can clobber each other's edits without worktree isolation. Adding isolation: worktree to parallelizable agents (code-writer, test-writer, security, frontend-qa) gives each agent its own git worktree.
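
For readers unfamiliar with worktrees, this is roughly the git command that gives an agent its own checkout. With `isolation: worktree` set, the runtime handles this itself; the branch and directory naming below is invented for illustration, and `worktree_command` is a hypothetical helper that only builds the argument list.

```python
from pathlib import PurePosixPath


def worktree_command(
    agent: str,
    run_id: str,
    base_ref: str = "HEAD",
    root: str = ".worktrees",  # hypothetical location for per-agent checkouts
) -> list[str]:
    """Build a `git worktree add` invocation for one agent run."""
    branch = f"cast/{agent}-{run_id}"  # hypothetical naming scheme
    path = str(PurePosixPath(root) / f"{agent}-{run_id}")
    # -b creates a new branch at base_ref and checks it out in `path`,
    # so parallel agents never write into the same working directory.
    return ["git", "worktree", "add", "-b", branch, path, base_ref]
```

Two agents dispatched in parallel get two directories and two branches, so their edits only meet when the branches are merged.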

The BATS test suite is non-negotiable. Shell scripts are easy to break silently. With 301 BATS tests covering every hook, every exit code path, and every escape hatch, I can refactor hooks without guessing whether I broke the commit gate. CI runs on every push.


The repo is at github.com/ek33450505/claude-agent-team. Issues and PRs welcome — especially around the hook architecture and DB schema.
