<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Farid</title>
    <description>The latest articles on DEV Community by Farid (@farid046).</description>
    <link>https://dev.to/farid046</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3865306%2F8e49133f-c2c4-47df-959d-f514965f3598.png</url>
      <title>DEV Community: Farid</title>
      <link>https://dev.to/farid046</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/farid046"/>
    <language>en</language>
    <item>
      <title>Self-healing builds, 59 skills, and runtime safety: what it took to build PocketTeam</title>
      <dc:creator>Farid</dc:creator>
      <pubDate>Tue, 07 Apr 2026 08:18:23 +0000</pubDate>
      <link>https://dev.to/farid046/self-healing-builds-59-skills-and-runtime-safety-what-it-took-to-build-pocketteam-h1b</link>
      <guid>https://dev.to/farid046/self-healing-builds-59-skills-and-runtime-safety-what-it-took-to-build-pocketteam-h1b</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tfkp8z64pr9qpm6kv5t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tfkp8z64pr9qpm6kv5t.png" alt=" " width="800" height="538"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How I made AI-assisted coding safe: hook-based runtime interception instead of prompt instructions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ask most AI coding tools how they prevent dangerous operations and they'll say something like: "The model is instructed not to do X."&lt;/p&gt;

&lt;p&gt;That's not a safety system. That's a gentleman's agreement.&lt;/p&gt;

&lt;p&gt;I built PocketTeam partly to solve a real workflow problem (solo devs skipping pipeline steps), but the most interesting engineering challenge was this: how do you make an agentic system safe in a way that actually holds up?&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The problem with prompt-based safety&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prompt instructions fail in at least three ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context compaction.&lt;/strong&gt; When an agent's context window fills, older content gets summarized or dropped. Your safety instructions might not survive.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prompt injection.&lt;/strong&gt; A malicious or malformed input can override instructions if they're just text in the conversation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Emergent behavior.&lt;/strong&gt; Even well-instructed models sometimes do unexpected things. "Please don't" is probabilistic guidance, not a hard constraint.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a system that runs code, deploys to production, and has access to your filesystem, probabilistic guidance is not enough.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The solution: hooks at the tool-call level&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude Code has a hook system: hooks are external commands that run before and after tool calls. PocketTeam's hooks are Python scripts, so they execute outside the LLM context.&lt;/p&gt;

&lt;p&gt;PocketTeam's safety is implemented as 9 hook layers that run on every tool invocation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Simplified example of what a hook checks
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fnmatch&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fnmatch&lt;/span&gt;

&lt;span class="n"&gt;BLOCKED_PATHS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.env&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.ssh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.aws&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.pem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*.key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;BLOCKED_COMMANDS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm -rf /&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP DATABASE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TRUNCATE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:(){ :|:&amp;amp; };:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pre_tool_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;write_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;fnmatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;BLOCKED_PATHS&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Blocked: write to sensitive path rejected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;blocked&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;BLOCKED_COMMANDS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;blocked&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;command&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Blocked: dangerous command rejected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# allow
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM never gets to execute the blocked operation. The hook rejects the call and returns an error. The LLM sees a tool failure and (usually) tries a different approach.&lt;/p&gt;

&lt;p&gt;This survives context compaction because the hook code is not in the context — it's in the file system, registered with the Claude Code runtime.&lt;/p&gt;
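
&lt;p&gt;One detail the simplified check glosses over: &lt;code&gt;fnmatch&lt;/code&gt; compares a pattern against the whole string it is given, so a bare pattern like &lt;code&gt;.env&lt;/code&gt; only matches a path that is exactly &lt;code&gt;.env&lt;/code&gt;. A minimal sketch of one way to handle nested paths is to test every path component separately. The helper names &lt;code&gt;is_blocked_path&lt;/code&gt; and &lt;code&gt;pre_write_check&lt;/code&gt; are hypothetical and not taken from PocketTeam's source.&lt;/p&gt;

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns copied from the simplified example in the article.
BLOCKED_PATHS = [".env", ".ssh", ".aws", "*.pem", "*.key"]


def is_blocked_path(path):
    """Return True if any single component of `path` matches a blocked pattern."""
    parts = PurePosixPath(path).parts
    return any(fnmatch(part, pattern) for part in parts for pattern in BLOCKED_PATHS)


def pre_write_check(tool_input):
    """Hypothetical helper: return an error dict to reject the call, or None to allow it."""
    if is_blocked_path(tool_input["path"]):
        return {"error": "Blocked: write to sensitive path rejected"}
    return None  # allow
```

&lt;p&gt;With this variant, &lt;code&gt;/home/me/.ssh/id_rsa&lt;/code&gt; is rejected because one of its components is &lt;code&gt;.ssh&lt;/code&gt;, while &lt;code&gt;src/environment.py&lt;/code&gt; passes. Variants such as &lt;code&gt;.env.local&lt;/code&gt; would still need their own pattern.&lt;/p&gt;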




&lt;p&gt;&lt;strong&gt;The pipeline enforcement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Beyond blocking dangerous operations, the hooks enforce pipeline sequencing. The DevOps agent's deploy tool will fail if QA and Security haven't marked their steps complete. This isn't a prompt instruction to "only deploy after testing" — it's a check in the hook that reads a status file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Simplified
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pre_deploy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;read_status_file&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa_passed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security_passed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy blocked: QA and Security must pass first&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
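
&lt;p&gt;The &lt;code&gt;read_status_file&lt;/code&gt; helper is not shown above. As a rough sketch, assuming the status lives in a JSON file (the path &lt;code&gt;.pocketteam/status.json&lt;/code&gt; is a guess, not something the project documents here), a fail-closed version could look like this. A missing or unreadable file counts as "nothing has passed", so the deploy stays blocked.&lt;/p&gt;

```python
import json
from pathlib import Path

# Hypothetical location; the real status file path is not given in the article.
STATUS_FILE = Path(".pocketteam/status.json")


def read_status_file(path=STATUS_FILE):
    """Load the pipeline status, treating a missing or corrupt file as empty."""
    try:
        data = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def pre_deploy(tool_input, path=STATUS_FILE):
    """Reject the deploy unless both gates are marked as passed."""
    status = read_status_file(path)
    missing = [gate for gate in ("qa_passed", "security_passed") if not status.get(gate)]
    if missing:
        return {"error": "Deploy blocked: missing " + ", ".join(missing)}
    return None  # allow
```

&lt;p&gt;Reporting which gate is missing gives the agent something concrete to act on instead of a generic failure.&lt;/p&gt;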






&lt;p&gt;&lt;strong&gt;Persistent learnings: the compounding advantage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The second interesting design decision: making the system improve over time.&lt;/p&gt;

&lt;p&gt;After every completed task, an Observer agent runs. It writes structured learnings to agent-specific markdown files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# learnings/engineer.md&lt;/span&gt;

&lt;span class="gu"&gt;## Patterns in this codebase&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Always use the &lt;span class="sb"&gt;`db.transaction()`&lt;/span&gt; context manager for multi-step DB writes
&lt;span class="p"&gt;-&lt;/span&gt; The test suite requires REDIS_URL to be set even for non-cache tests
&lt;span class="p"&gt;-&lt;/span&gt; Migrations live in /db/migrations, not /migrations

&lt;span class="gu"&gt;## Common mistakes to avoid&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Do not import from &lt;span class="sb"&gt;`utils/legacy`&lt;/span&gt; — those functions are deprecated
&lt;span class="p"&gt;-&lt;/span&gt; The &lt;span class="sb"&gt;`Config`&lt;/span&gt; class is a singleton; don't instantiate it directly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These files persist across sessions. Future Claude Code sessions inject them into agent context. The agents that run in month 3 have access to everything learned in months 1 and 2.&lt;/p&gt;

&lt;p&gt;This is not RAG or vector search. It's structured, curated, agent-specific institutional memory written by an agent that watched the previous task run.&lt;/p&gt;
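
&lt;p&gt;To make the file format concrete, here is a small sketch of how a learning could be appended under the right heading without duplicating existing bullets. The function &lt;code&gt;add_learning&lt;/code&gt; is hypothetical; in PocketTeam the Observer agent writes these files itself, and this is only one way the same result could be produced in plain Python.&lt;/p&gt;

```python
from pathlib import Path


def add_learning(path, section, note):
    """Append `note` as a bullet under `## section`, creating the section if needed.

    Returns False without writing when the same bullet already exists in the file.
    """
    path = Path(path)
    lines = path.read_text().splitlines() if path.exists() else []
    heading, bullet = "## " + section, "- " + note
    if bullet in lines:
        return False
    if heading not in lines:
        lines += ["", heading]
    start = lines.index(heading) + 1
    end = start
    # Walk to the end of this section: stop at the next heading or end of file.
    while end != len(lines) and not lines[end].startswith("#"):
        end += 1
    # Step back over trailing blank lines so the bullet joins the existing list.
    while end != start and lines[end - 1].strip() == "":
        end -= 1
    lines.insert(end, bullet)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines).lstrip("\n") + "\n")
    return True
```

&lt;p&gt;Because the result is plain markdown, the files stay readable and editable by hand, which is part of what keeps them curated rather than just accumulated.&lt;/p&gt;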




&lt;p&gt;&lt;strong&gt;Self-healing via GitHub Actions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The third piece: letting broken builds get fixed autonomously.&lt;/p&gt;

&lt;p&gt;The workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;GHA build fails&lt;/li&gt;
&lt;li&gt;A GHA workflow triggers &lt;code&gt;pt fix --ci&lt;/code&gt; via the Claude Code CLI&lt;/li&gt;
&lt;li&gt;An Investigator agent runs root cause analysis&lt;/li&gt;
&lt;li&gt;An Engineer agent creates a fix (on a branch)&lt;/li&gt;
&lt;li&gt;A Telegram notification is sent with the fix plan and a diff summary&lt;/li&gt;
&lt;li&gt;The developer approves from their phone&lt;/li&gt;
&lt;li&gt;The fix is merged and the pipeline reruns&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight: this uses GitHub Actions as the trigger, not a polling daemon. There is no persistent process to maintain and no always-on connection, just a GHA step that runs &lt;code&gt;pt fix --ci&lt;/code&gt; when a build fails.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;59 built-in skills&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each agent in PocketTeam draws from a library of 59 structured skill procedures. These aren't just additional prompt instructions — they're markdown files with step-by-step workflows that agents follow.&lt;/p&gt;

&lt;p&gt;A few examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;owasp-audit.md&lt;/code&gt; — step-by-step OWASP Top 10 check procedure for the Security agent&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tdd-london.md&lt;/code&gt; — London School TDD workflow for the Engineer agent&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;codebase-map.md&lt;/code&gt; — procedure for generating a full codebase overview before planning&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cost-tracker.md&lt;/code&gt; — token cost estimation and reporting&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fan-out.md&lt;/code&gt; — wave-based parallel execution on git worktrees&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;atomic-commits.md&lt;/code&gt; — commit structuring guidelines with format rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agents declare which skills they use in their frontmatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;engineer&lt;/span&gt;
&lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude-opus-4-5&lt;/span&gt;
&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;read&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;edit&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;bash&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;mcp&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;skills&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;tdd-london&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;atomic-commits&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;fan-out&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;codebase-map&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;ptbrowse: built-in browser testing without the token overhead&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The QA agent doesn't just run unit tests. It opens your app in a real headless Chromium browser and verifies the UI works.&lt;/p&gt;

&lt;p&gt;The insight behind the implementation: screenshots are expensive. A single screenshot can cost thousands of tokens. For an agent doing E2E testing with multiple steps, that adds up fast.&lt;/p&gt;

&lt;p&gt;ptbrowse solves this by using &lt;strong&gt;Accessibility Tree snapshots&lt;/strong&gt; instead. An accessibility tree is a structured representation of what's on screen — element roles, labels, references — at around 100–300 tokens per snapshot. You get enough information to navigate and assert without the visual overhead.&lt;/p&gt;

&lt;p&gt;The QA agent interacts with the browser using a command-line interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ptbrowse navigate http://localhost:3000/login
&lt;span class="c"&gt;# → returns accessibility tree snapshot with element refs like @e1, @e2, @e3&lt;/span&gt;

ptbrowse fill @e3 &lt;span class="s2"&gt;"user@example.com"&lt;/span&gt;   &lt;span class="c"&gt;# fill email field&lt;/span&gt;
ptbrowse fill @e4 &lt;span class="s2"&gt;"mypassword"&lt;/span&gt;          &lt;span class="c"&gt;# fill password field&lt;/span&gt;
ptbrowse click @e5                      &lt;span class="c"&gt;# click login button&lt;/span&gt;

ptbrowse &lt;span class="nb"&gt;wait &lt;/span&gt;text &lt;span class="s2"&gt;"Dashboard"&lt;/span&gt;          &lt;span class="c"&gt;# wait for redirect&lt;/span&gt;
ptbrowse assert text &lt;span class="s2"&gt;"Login successful"&lt;/span&gt; &lt;span class="c"&gt;# verify outcome&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit codes are structured for agent consumption: &lt;code&gt;0&lt;/code&gt; success, &lt;code&gt;1&lt;/code&gt; assertion failed, &lt;code&gt;2&lt;/code&gt; stale element reference, &lt;code&gt;3&lt;/code&gt; timeout. This means the QA agent can branch on outcomes without parsing text.&lt;/p&gt;
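
&lt;p&gt;As an illustration of branching on those codes, the sketch below maps each exit code listed above to a follow-up action. The action names and the &lt;code&gt;next_action&lt;/code&gt; and &lt;code&gt;run_step&lt;/code&gt; helpers are assumptions made for this example; only the four exit codes come from ptbrowse.&lt;/p&gt;

```python
import subprocess
from enum import IntEnum


class PtbrowseExit(IntEnum):
    """Exit codes as listed in the article."""
    SUCCESS = 0
    ASSERTION_FAILED = 1
    STALE_REF = 2
    TIMEOUT = 3


def next_action(returncode):
    """Hypothetical policy: decide what the agent does after a ptbrowse call."""
    try:
        code = PtbrowseExit(returncode)
    except ValueError:
        return "abort"  # unknown exit code: stop rather than guess
    return {
        PtbrowseExit.SUCCESS: "continue",
        PtbrowseExit.ASSERTION_FAILED: "report_failure",
        PtbrowseExit.STALE_REF: "resnapshot",
        PtbrowseExit.TIMEOUT: "retry",
    }[code]


def run_step(args):
    """Run one ptbrowse command and return the follow-up action."""
    result = subprocess.run(["ptbrowse", *args], capture_output=True, text=True)
    return next_action(result.returncode)
```

&lt;p&gt;A call such as &lt;code&gt;run_step(["click", "@e5"])&lt;/code&gt; then returns an action name directly, and a stale element reference leads to a fresh snapshot instead of a failed test.&lt;/p&gt;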

&lt;p&gt;Screenshots are also available for visual verification, saved to &lt;code&gt;.pocketteam/screenshots/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ptbrowse screenshot login-page.png
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The browser daemon auto-starts on first use and shuts down after 30 minutes of idle time. No config file, no setup step, no Docker container to manage separately.&lt;/p&gt;

&lt;p&gt;Set &lt;code&gt;PTBROWSE_HEADED=1&lt;/code&gt; to run in headed mode — useful for watching the QA agent work visually during development.&lt;/p&gt;

&lt;p&gt;The result: UI verification in a real browser, at a fraction of the token cost of screenshot-based testing.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Magic keywords&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One implementation detail worth sharing: workflow modes via UserPromptSubmit hooks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;autopilot: add dark mode toggle  → full pipeline, human gates bypassed
ralph: fix the payment tests     → TDD loop, max 5 iterations until green
quick: rename this function      → skip planning, implement directly
deep-dive: our auth architecture → 3 parallel research agents, synthesized report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hook detects the keyword in the first message and injects orchestration instructions into the session before the COO agent sees it. The COO then runs the appropriate pipeline. This keeps the behavior change entirely in the pipeline layer — the individual agents don't need to know which mode they're in.&lt;/p&gt;
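
&lt;p&gt;The detection step can be sketched in a few lines. The keywords below are the four from the examples above; the instruction strings, the function names &lt;code&gt;detect_mode&lt;/code&gt; and &lt;code&gt;build_context&lt;/code&gt;, and the output format are placeholders for illustration, not PocketTeam's actual hook code.&lt;/p&gt;

```python
# Keywords from the article's examples; the instruction strings are placeholders.
MODES = {
    "autopilot": "Run the full pipeline with human gates bypassed.",
    "ralph": "Run a TDD loop, at most 5 iterations, until tests are green.",
    "quick": "Skip planning and implement directly.",
    "deep-dive": "Start 3 parallel research agents and synthesize a report.",
}


def detect_mode(prompt):
    """Return (mode, task) if the prompt starts with a known keyword, else (None, prompt)."""
    keyword, sep, rest = prompt.partition(":")
    mode = keyword.strip().lower()
    if sep and mode in MODES:
        return mode, rest.strip()
    return None, prompt


def build_context(prompt):
    """Hypothetical: the text a hook would inject ahead of the user's task."""
    mode, task = detect_mode(prompt)
    if mode is None:
        return None  # no keyword: leave the prompt untouched
    return "[mode: " + mode + "] " + MODES[mode] + "\nTask: " + task
```

&lt;p&gt;Only an exact keyword before the first colon triggers a mode, so a prompt like &lt;code&gt;fix: the bug&lt;/code&gt; or one that merely contains a URL passes through unchanged.&lt;/p&gt;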




&lt;p&gt;&lt;strong&gt;The honest limitations&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works best on structured codebases with existing tests. Messy legacy code will produce messy plans.&lt;/li&gt;
&lt;li&gt;Complex tasks consume real tokens. A full autopilot run on a medium-sized feature can be expensive.&lt;/li&gt;
&lt;li&gt;Telegram setup is documented but not zero-config. If you don't want it, you don't need it.&lt;/li&gt;
&lt;li&gt;The self-healing pipeline requires a bit of GHA configuration.&lt;/li&gt;
&lt;li&gt;This is v1.x, so there are rough edges. Open an issue and I'll fix it.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Try it&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;pocketteam
pt init
pt start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/Farid046/PocketTeam" rel="noopener noreferrer"&gt;https://github.com/Farid046/PocketTeam&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;MIT License. Open source. The hook system and skill library are the parts most worth reading if you want to build something similar.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>opensource</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
