# How I made AI-assisted coding safe: hook-based runtime interception instead of prompt instructions
Ask most AI coding tools how they prevent dangerous operations and they'll say something like: "The model is instructed not to do X."
That's not a safety system. That's a gentleman's agreement.
I built PocketTeam partly to solve a real workflow problem (solo devs skipping pipeline steps), but the most interesting engineering challenge was this: how do you make an agentic system safe in a way that actually holds up?
## The problem with prompt-based safety
Prompt instructions fail in at least three ways:
1. **Context compaction.** When an agent's context window fills, older content gets summarized or dropped. Your safety instructions might not survive.
2. **Prompt injection.** A malicious or malformed input can override instructions if they're just text in the conversation.
3. **Emergent behavior.** Even well-instructed models sometimes do unexpected things. "Please don't" is probabilistic guidance, not a hard constraint.
For a system that runs code, deploys to production, and has access to your filesystem, probabilistic guidance is not enough.
## The solution: hooks at the tool-call level
Claude Code has a hook system. Hooks run before and after tool calls. They're Python scripts — not LLM context.
PocketTeam's safety is implemented as 9 hook layers that run on every tool invocation:
```python
# Simplified example of what a hook checks
from fnmatch import fnmatch

BLOCKED_PATHS = [".env", ".ssh", ".aws", "*.pem", "*.key"]
BLOCKED_COMMANDS = ["rm -rf /", "DROP DATABASE", "TRUNCATE", ":(){ :|:& };:"]

def pre_tool_call(tool_name, tool_input):
    if tool_name == "write_file":
        if any(fnmatch(tool_input["path"], pattern) for pattern in BLOCKED_PATHS):
            return {"error": "Blocked: write to sensitive path rejected"}
    if tool_name == "bash":
        for blocked in BLOCKED_COMMANDS:
            if blocked in tool_input["command"]:
                return {"error": "Blocked: dangerous command rejected"}
    return None  # allow
```
The LLM never gets to execute the blocked operation. The hook rejects the call and returns an error. The LLM sees a tool failure and (usually) tries a different approach.
This survives context compaction because the hook code is not in the context — it's in the file system, registered with the Claude Code runtime.
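For a sense of the plumbing: Claude Code runs a PreToolUse hook as a subprocess, passes the tool call as JSON on stdin, and treats a nonzero exit code as "block this call," with stderr fed back as the error. The sketch below shows that shape; the check itself is cut down to two commands and is not PocketTeam's actual source.

```python
# Sketch of the stdin/exit-code contract a PreToolUse hook script follows.
# The block list here is illustrative, not PocketTeam's real rule set.
import json
import sys

BLOCKED_COMMANDS = ["rm -rf /", "DROP DATABASE"]

def decide(payload: dict) -> int:
    """Return the hook's exit code: 0 allows the tool call, 2 blocks it."""
    if payload.get("tool_name") == "Bash":
        command = payload.get("tool_input", {}).get("command", "")
        if any(bad in command for bad in BLOCKED_COMMANDS):
            # The reason goes to stderr, which the runtime surfaces to the model
            print("Blocked: dangerous command rejected", file=sys.stderr)
            return 2
    return 0

def main() -> None:
    # The runtime pipes {"tool_name": ..., "tool_input": ...} to stdin
    sys.exit(decide(json.load(sys.stdin)))
```

Because the decision lives in the exit code, the blocking behavior is deterministic: the model can argue with an instruction, but not with a process that already returned 2.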
## The pipeline enforcement
Beyond blocking dangerous operations, the hooks enforce pipeline sequencing. The DevOps agent's deploy tool will fail if QA and Security haven't marked their steps complete. This isn't a prompt instruction to "only deploy after testing" — it's a check in the hook that reads a status file.
```python
# Simplified
def pre_deploy(tool_input):
    status = read_status_file()
    if not status.get("qa_passed") or not status.get("security_passed"):
        return {"error": "Deploy blocked: QA and Security must pass first"}
```
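The other half of that check is whoever writes the status file. A minimal sketch, assuming a JSON file at a hypothetical `.pocketteam/status.json` (the real location and format may differ): the QA and Security post-hooks call `mark_step`, and `pre_deploy` reads the same file.

```python
import json
from pathlib import Path

# Hypothetical status file location; PocketTeam's real path/format may differ
STATUS_FILE = Path(".pocketteam/status.json")

def read_status_file() -> dict:
    """Return the current pipeline status, or an empty dict before any step ran."""
    return json.loads(STATUS_FILE.read_text()) if STATUS_FILE.exists() else {}

def mark_step(step: str) -> None:
    """Record a completed pipeline step, e.g. 'qa_passed' or 'security_passed'."""
    STATUS_FILE.parent.mkdir(parents=True, exist_ok=True)
    status = read_status_file()
    status[step] = True
    STATUS_FILE.write_text(json.dumps(status, indent=2))
```

The important property is that both sides go through the file system, not the context window, so the gate holds even if every prompt instruction has been compacted away.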
## Persistent learnings: the compounding advantage
The second interesting design decision: making the system improve over time.
After every completed task, an Observer agent runs. It writes structured learnings to agent-specific markdown files:
```markdown
# learnings/engineer.md

## Patterns in this codebase
- Always use the `db.transaction()` context manager for multi-step DB writes
- The test suite requires REDIS_URL to be set even for non-cache tests
- Migrations live in /db/migrations, not /migrations

## Common mistakes to avoid
- Do not import from `utils/legacy` — those functions are deprecated
- The `Config` class is a singleton; don't instantiate it directly
```
These files persist across sessions. Future Claude Code sessions inject them into agent context. The agents that run in month 3 have access to everything learned in months 1 and 2.
This is not RAG or vector search. It's structured, curated, agent-specific institutional memory written by an agent that watched the previous task run.
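Mechanically, "inject them into agent context" can be as simple as prepending the file to the agent's prompt. A sketch of that idea, with `build_agent_prompt` and the `learnings/` root as illustrative names rather than PocketTeam's actual API:

```python
from pathlib import Path

def build_agent_prompt(agent: str, task: str, root: Path = Path("learnings")) -> str:
    """Prepend the agent's accumulated learnings, if any, to its task prompt."""
    learnings_file = root / f"{agent}.md"  # e.g. learnings/engineer.md
    memory = learnings_file.read_text() if learnings_file.exists() else ""
    return f"{memory}\n\n## Task\n{task}".strip()
```

No retrieval step, no embedding index: the whole memory is small enough and curated enough to ship verbatim on every run.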
## Self-healing via GitHub Actions
The third piece: letting the pipeline fix broken builds autonomously.
The workflow:
- GHA build fails
- A GHA workflow triggers `pt fix --ci` via the Claude Code CLI
- An Investigator agent runs root cause analysis
- An Engineer agent creates a fix (on a branch)
- A Telegram notification is sent with the fix plan and a diff summary
- The developer approves from their phone
- The fix is merged and the pipeline reruns
The key insight: this uses GitHub Actions as the trigger, not a polling daemon. No persistent process to maintain. No always-on connection. Just a GHA step that runs `pt fix` when a build fails.
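A minimal workflow sketch of that trigger. Everything here is illustrative: the `ci` workflow name, the job layout, and the `ANTHROPIC_API_KEY` secret are assumptions, not PocketTeam's shipped configuration.

```yaml
# .github/workflows/self-heal.yml (hypothetical)
name: self-heal
on:
  workflow_run:
    workflows: ["ci"]        # assumed name of your build workflow
    types: [completed]

jobs:
  fix:
    # Only run when the watched workflow actually failed
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pipx install pocketteam
      - run: pt fix --ci
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

The `workflow_run` event is what makes this daemon-free: GitHub wakes the fixer up only when the build workflow concludes with a failure.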
## 59 built-in skills
Each agent in PocketTeam draws from a library of 59 structured skill procedures. These aren't just additional prompt instructions — they're markdown files with step-by-step workflows that agents follow.
A few examples:
- `owasp-audit.md` — step-by-step OWASP Top 10 check procedure for the Security agent
- `tdd-london.md` — London School TDD workflow for the Engineer agent
- `codebase-map.md` — procedure for generating a full codebase overview before planning
- `cost-tracker.md` — token cost estimation and reporting
- `fan-out.md` — wave-based parallel execution on git worktrees
- `atomic-commits.md` — commit structuring guidelines with format rules
Agents declare which skills they use in their frontmatter:
```yaml
---
name: engineer
model: claude-opus-4-5
tools: [read, edit, bash, mcp]
skills: [tdd-london, atomic-commits, fan-out, codebase-map]
---
```
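Resolving that declaration into loadable skill files only takes a small frontmatter reader. A stdlib-only sketch (PocketTeam's real loader is presumably more robust; this handles just the flat `key: value` and `key: [a, b]` forms shown above):

```python
def parse_frontmatter(text: str) -> dict:
    """Extract the YAML-style frontmatter block between the leading '---' fences.

    Only supports flat `key: value` pairs and inline `[a, b]` lists,
    which is all the agent files above use.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [item.strip() for item in value[1:-1].split(",")]
        else:
            meta[key.strip()] = value
    return meta
```

From `meta["skills"]` you can then read each `skills/<name>.md` file and append it to the agent's context before the task starts.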
## `ptbrowse`: built-in browser testing without the token overhead
The QA agent doesn't just run unit tests. It opens your app in a real headless Chromium browser and verifies the UI works.
The insight behind the implementation: screenshots are expensive. A single screenshot can cost thousands of tokens. For an agent doing E2E testing with multiple steps, that adds up fast.
ptbrowse solves this by using Accessibility Tree snapshots instead. An accessibility tree is a structured representation of what's on screen — element roles, labels, references — at around 100–300 tokens per snapshot. You get enough information to navigate and assert without the visual overhead.
The QA agent interacts with the browser using a command-line interface:
```shell
ptbrowse navigate http://localhost:3000/login
# → returns accessibility tree snapshot with element refs like @e1, @e2, @e3
ptbrowse fill @e3 "user@example.com"        # fill email field
ptbrowse fill @e4 "mypassword"              # fill password field
ptbrowse click @e5                          # click login button
ptbrowse wait text "Dashboard"              # wait for redirect
ptbrowse assert text "Login successful"     # verify outcome
```
Exit codes are structured for agent consumption: 0 success, 1 assertion failed, 2 stale element reference, 3 timeout. This means the QA agent can branch on outcomes without parsing text.
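That branching looks roughly like this from the caller's side. A sketch, not PocketTeam's internals: `classify` and `run_step` are illustrative names, and the exit-code table comes straight from the list above.

```python
import subprocess

# Exit-code meanings as documented by ptbrowse
EXIT_MEANINGS = {0: "success", 1: "assertion_failed", 2: "stale_element", 3: "timeout"}

def classify(returncode: int) -> str:
    """Map a ptbrowse exit code to a symbolic outcome the agent can branch on."""
    return EXIT_MEANINGS.get(returncode, "unknown_error")

def run_step(args: list[str]) -> str:
    # e.g. run_step(["assert", "text", "Login successful"])
    return classify(subprocess.run(["ptbrowse", *args]).returncode)
```

A `stale_element` outcome, for instance, tells the agent to re-snapshot and retry with fresh refs, while `assertion_failed` means the app is actually broken. No stdout parsing either way.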
Screenshots are also available for visual verification, saved to .pocketteam/screenshots/:
```shell
ptbrowse screenshot login-page.png
```
The browser daemon auto-starts on first use and shuts down after 30 minutes of idle time. No config file, no setup step, no Docker container to manage separately.
Set `PTBROWSE_HEADED=1` to run in headed mode — useful for watching the QA agent work visually during development.
The result: your AI team doesn't just run unit tests. It opens your app in a real browser and verifies the UI works. Token-efficiently.
## Magic keywords
One implementation detail worth sharing: workflow modes via UserPromptSubmit hooks.
- `autopilot: add dark mode toggle` → full pipeline, human gates bypassed
- `ralph: fix the payment tests` → TDD loop, max 5 iterations until green
- `quick: rename this function` → skip planning, implement directly
- `deep-dive: our auth architecture` → 3 parallel research agents, synthesized report
The hook detects the keyword in the first message and injects orchestration instructions into the session before the COO agent sees it. The COO then runs the appropriate pipeline. This keeps the behavior change entirely in the pipeline layer — the individual agents don't need to know which mode they're in.
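The detection itself is a few lines. A hedged sketch of the idea, with `inject_mode` and the instruction strings as illustrative stand-ins for whatever the real hook injects:

```python
# Hypothetical keyword detector for a UserPromptSubmit-style hook
MODE_INSTRUCTIONS = {
    "autopilot": "Run the full pipeline; bypass human gates.",
    "ralph": "Run a TDD loop; max 5 iterations until green.",
    "quick": "Skip planning; implement directly.",
    "deep-dive": "Spawn 3 parallel research agents; synthesize a report.",
}

def inject_mode(prompt: str) -> str:
    """If the prompt starts with a known mode keyword, prepend its instructions."""
    head, sep, rest = prompt.partition(":")
    mode = head.strip().lower()
    if sep and mode in MODE_INSTRUCTIONS:
        return f"[orchestration: {MODE_INSTRUCTIONS[mode]}]\n{rest.strip()}"
    return prompt  # no keyword: pass through unchanged
```

Because the hook rewrites the prompt before any agent sees it, a prompt without a keyword passes through untouched and the default pipeline applies.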
## The honest limitations
- Works best on structured codebases with existing tests. Messy legacy code will produce messy plans.
- Complex tasks consume real tokens. A full autopilot run on a medium-sized feature can be expensive.
- Telegram setup is documented but not zero-config. If you don't want it, you don't need it.
- The self-healing pipeline requires a bit of GHA configuration.
- It's v1.x, so there are rough edges. Open issues and I'll fix them.
## Try it
```shell
pipx install pocketteam
pt init
pt start
```
GitHub: https://github.com/Farid046/PocketTeam
MIT License. Open source. The hook system and skill library are the parts most worth reading if you want to build something similar.
