Hector Flores

Posted on May 14 • Originally published at htek.dev

Stop Trusting AI Agents with Git — Start Governing Them

#github #agenticdevelopment #devops #contextengineering

The 2 AM Disaster You Don't See Coming

I run AI agents as my primary development interface — not just for code generation, but for the entire lifecycle: branching, committing, pushing, PR creation, deployment. I use git worktrees extensively for parallel work streams, which means the surface area for agent mistakes is enormous.

And agents make mistakes. Not maliciously. Not because they're broken. Because they're non-deterministic.

You tell an agent to "commit and push this change," and most of the time it does exactly what you'd expect. But often enough to matter? It creates a branch off HEAD~3 instead of main. It pushes to the wrong remote. It runs git add . in the repo root instead of the worktree. It stages your .env file. It invents a merge strategy you didn't ask for.

The standard advice is to write better instructions. Put your git workflow in copilot-instructions.md. Be more specific. I tried all of that. It's insufficient.

Instructions are suggestions. Agents follow them until they don't. And when an agent deviates from your git workflow at 2 AM during an automated pipeline run, you don't find out until you're staring at a force-push that rewrote your commit history.

Instructions vs. Enforcement: A Critical Distinction

I've written about this tension before — in agent hooks, I showed how to build enforcement layers that protect architecture boundaries. In agent harnesses, I covered the broader infrastructure pattern that wraps around agent execution. And cryptographic approval gates demonstrated how digital signatures can enforce human approval.

Those are all valid patterns. But the one that's had the most impact on my daily workflow is the simplest: don't let agents use raw git at all.

The pattern is called primitive blocking + tool replacement. It works in two steps:

Block — Hookflows intercept raw git commands before they execute and return a denial
Replace — A CLI extension provides governed tool alternatives that enforce your workflow rules structurally

When an agent tries to run git commit, it gets blocked. When it uses dev_commit instead, it gets input validation, co-author tracing, and staged-change verification — automatically. The agent doesn't have to remember your rules. The rules are embedded in the tool.

What a Hookflow Rule Actually Looks Like

Copilot CLI hookflows are declarative markdown files in .github/hookflows/ with YAML frontmatter. Think of them as packet filters for AI behavior — iptables for your agent.

Here's the high-level shape of a git-blocking hookflow:

```markdown
---
name: Block git write commands
description: "Blocks raw git write commands. Use dev-workflow tools instead."
event: bash
action: block
lifecycle: pre
conditions:
  - field: command
    operator: regex_match
    pattern: "\\bgit\\s+(add|commit|push|pull|checkout|merge|rebase|reset|stash|worktree)"
---

## Blocked — Use Governed Tools

| Blocked Command | Use Instead    |
|-----------------|----------------|
| `git add`       | `dev_add`      |
| `git commit`    | `dev_commit`   |
| `git push`      | `dev_push`     |
| `git checkout`  | `dev_checkout` |
```

The event: bash means it fires on any shell command. The action: block stops execution before it happens. The lifecycle: pre means it intercepts before the command runs — not after the damage is done. The body below the frontmatter is what the agent sees when blocked: a mapping table that teaches it the correct tools.

Note what's not blocked: git log, git diff, git show, git blame. Read-only operations are safe. I only block operations that mutate state.
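To make the read/write split concrete, here is a minimal sketch that runs the same pattern from the frontmatter as a plain JavaScript regex. The `evaluate` helper is hypothetical and is not part of hookflows; it only mirrors the decision a pre-execution block rule would make.

```javascript
// Same pattern as the hookflow frontmatter, written as a JavaScript regex literal.
const GIT_WRITE = /\bgit\s+(add|commit|push|pull|checkout|merge|rebase|reset|stash|worktree)/;

// Hypothetical helper: returns the decision a "pre" block rule would reach.
function evaluate(command) {
  const match = GIT_WRITE.exec(command);
  return match
    ? { action: "block", subcommand: match[1] }
    : { action: "allow" };
}

const samples = [
  "git commit -m 'wip'",
  "git log --oneline",
  "git diff HEAD~1",
  "git -C ../wt push",
];
for (const command of samples) {
  console.log(command, "=>", evaluate(command).action);
}
```

Of these samples, only `git commit -m 'wip'` is blocked. The last one, `git -C ../wt push`, is allowed because a flag sits between `git` and the subcommand, so a pattern of this shape does not cover every spelling of a write on its own.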

This is the declarative firewall. One file, about 15 lines of configuration, and direct raw git writes are denied before they run. Indirect invocations need their own rule, which the bypass section below covers.

Why Governed Tools Are Better Than What They Replace

Blocking is only half the equation. The other half is providing replacement tools that are better than the primitives. If your governed tools are worse than raw git, agents will fight the system.

My dev-workflow extension provides governed alternatives for every blocked operation. Here's why they're genuinely superior:

dev_commit instead of git commit:

Validates that staged changes actually exist (no empty commits)
Auto-adds a Co-authored-by trailer on every commit — an audit trail that traces AI vs. human authorship
Requires a commit message (no git commit --allow-empty-message)
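The three checks above could be ordered roughly like this. This is a sketch, not the extension's actual code: the `devCommit` function, its parameter shape, and the injected `git` runner are all assumptions made for illustration.

```javascript
// Hypothetical sketch of a governed commit tool. `git` is an injected runner
// that takes an argument array and returns stdout as a string.
function devCommit({ message, coAuthor }, git) {
  const subject = typeof message === "string" ? message.trim() : "";
  if (!subject) {
    throw new Error("dev_commit: a non-empty commit message is required");
  }

  // `git diff --cached --name-only` lists the paths that are currently staged.
  const staged = git(["diff", "--cached", "--name-only"])
    .split(/\r?\n/)
    .filter(Boolean);
  if (staged.length === 0) {
    throw new Error("dev_commit: nothing is staged, refusing to create an empty commit");
  }

  // The trailer goes after a blank line so git treats it as a trailer, not body text.
  const trailer = `Co-authored-by: ${coAuthor.name} <${coAuthor.email}>`;
  const fullMessage = `${subject}\n\n${trailer}`;
  git(["commit", "-m", fullMessage]);
  return { committed: true, files: staged, message: fullMessage };
}
```

In a real extension the runner could be a thin wrapper around Node's `child_process.execFileSync("git", args, { encoding: "utf8" })`. Injecting it keeps the validation logic testable without touching a repository.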

dev_push instead of git push:

Auto-detects whether you're in a Vercel-connected repo
Polls for the preview URL after pushing a feature branch
Returns structured JSON with the preview deployment status

dev_reset instead of git reset:

Hard resets require an explicit confirm_hard=true parameter
Defaults to mixed reset (the safe option)
Returns a clear warning before destroying uncommitted work
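A minimal sketch of that confirmation gate, assuming a hypothetical `planReset` function. Only the `confirm_hard` parameter name comes from the description above; the return shape and the warning text are illustrative.

```javascript
const RESET_MODES = new Set(["soft", "mixed", "hard"]);

// Hypothetical sketch: decides which git arguments to run, or returns a warning instead.
function planReset({ target = "HEAD", mode = "mixed", confirm_hard = false } = {}) {
  if (!RESET_MODES.has(mode)) {
    throw new Error(`dev_reset: unknown mode "${mode}"`);
  }
  // A hard reset only proceeds when the caller passes confirm_hard explicitly.
  if (mode === "hard" && confirm_hard !== true) {
    return {
      ok: false,
      warning: "A hard reset discards uncommitted work. Re-run with confirm_hard=true to proceed.",
    };
  }
  return { ok: true, args: ["reset", `--${mode}`, target] };
}
```

Calling `planReset()` with no arguments yields a mixed reset, so the destructive path is never reached by default.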

Each governed tool does more than its primitive equivalent. The agent gets better outcomes by using governed tools, which means there's no incentive to fight the system. It's governance through better design, not just restriction.

The Bypass Problem (and How to Close It)

Here's something most people don't think about: agents try to work around blocks.

Not deliberately — but when a hookflow blocks git commit, an agent might try:

```powershell
$cmd = "git"
& $cmd commit -m "sneak past the filter"
```

Or:

```powershell
Invoke-Expression "git push origin main"
```

This is why you need a second hookflow that catches indirect execution patterns — variable expansion, Invoke-Expression, piped commands. In practice, I've seen agents attempt all of these when the primary block fires. A single hookflow isn't enough. You need defense in depth.
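The kind of matching a second rule might perform can be sketched in JavaScript. The pattern list and the `findIndirectExecution` helper are hypothetical, chosen only to cover the three evasion styles named above, and they are deliberately broad rather than precise.

```javascript
// Hypothetical patterns for indirect execution; deliberately broad, not exhaustive.
const INDIRECT_PATTERNS = [
  { name: "invoke-expression", pattern: /\b(Invoke-Expression|iex)\b/i },
  { name: "call-operator-variable", pattern: /&\s*\$\w+/ },
  { name: "pipe-to-shell", pattern: /\|\s*(bash|sh|pwsh|powershell)\b/i },
];

// Returns the names of every pattern the command text trips.
function findIndirectExecution(command) {
  return INDIRECT_PATTERNS
    .filter(({ pattern }) => pattern.test(command))
    .map(({ name }) => name);
}

console.log(findIndirectExecution('$cmd = "git"\n& $cmd commit -m "sneak past the filter"'));
console.log(findIndirectExecution('Invoke-Expression "git push origin main"'));
console.log(findIndirectExecution("git log --oneline"));
```

Broad patterns like these will occasionally flag harmless commands. That trade-off is usually acceptable when the denial message points the agent straight at the governed tool.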

The good news: once you've closed the three or four common evasion patterns, agents converge on using the governed tools reliably. They stop testing the boundaries because every path leads to the same answer: use the tool.

The Closed Loop: Self-Healing Tool Surfaces

The most elegant part of this architecture is what happens when an agent hits a git operation that isn't covered by the governed tools.

The catch-all hookflow blocks the command and tells the agent: "This operation doesn't have a governed tool yet. Add one to the dev-workflow extension."

And here's the thing — the agent can actually do that. Copilot CLI extensions are just JavaScript files in .github/extensions/. The agent can read the existing extension, understand the pattern, add a new tool function, and call extensions_reload to activate it in the same session.

The tool surface grows organically. Gaps get filled. The system self-heals. I started with three tools (dev_commit, dev_push, dev_add) and now have twelve — most added by agents when they hit operations I hadn't covered yet.
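The loop can be pictured as a lookup table that grows over time. The sketch below is not the Copilot CLI extension API; the `governedTools` map, `resolve`, and `registerTool` are hypothetical names used only to show how a catch-all response turns into a covered operation once a tool is added.

```javascript
// Hypothetical registry: git subcommand -> governed tool that replaces it.
// The three starting entries are the tools named in the article.
const governedTools = new Map([
  ["add", "dev_add"],
  ["commit", "dev_commit"],
  ["push", "dev_push"],
]);

// What the agent is told when it attempts a raw git subcommand.
function resolve(subcommand) {
  const tool = governedTools.get(subcommand);
  if (tool) {
    return { covered: true, message: `Blocked. Use ${tool} instead.` };
  }
  return {
    covered: false,
    message: "This operation doesn't have a governed tool yet. Add one to the dev-workflow extension.",
  };
}

// Filling a gap is one new entry; the naming check is an illustrative convention.
function registerTool(subcommand, toolName) {
  if (!/^dev_[a-z_]+$/.test(toolName)) {
    throw new Error(`tool name "${toolName}" must look like dev_<operation>`);
  }
  governedTools.set(subcommand, toolName);
}

console.log(resolve("stash").message);
registerTool("stash", "dev_stash");
console.log(resolve("stash").message);
```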

Why This Matters for Accelerated Development

If you're on the agentic development maturity curve, this pattern marks the transition from Stage 3 (over-engineering) to Stage 4 (structural simplicity).

Early in agentic development, you write longer and longer instructions trying to control agent behavior. You add more context, more rules, more guardrails. It works — until it doesn't. The instructions get so complex that agents misinterpret them, or they conflict, or they simply get ignored in a long context window.

The mature pattern is structural enforcement: make the correct behavior the only behavior. That's what hookflows + extensions achieve. The agent can't push to the wrong branch because dev_push validates the target. The agent can't skip the co-author trailer because dev_commit adds it automatically. The agent can't hard-reset without confirmation because dev_reset requires it.
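What "validates the target" could mean in practice is sketched below. The article does not spell out the actual checks, so the `validatePushTarget` function, the allowed-remote list, and the protected-branch list are all assumptions made for illustration.

```javascript
// Hypothetical target check; the policy fields are assumptions, not the author's rules.
function validatePushTarget({ remote, currentBranch, targetBranch }, policy = {}) {
  const { allowedRemotes = ["origin"], protectedBranches = ["main"] } = policy;
  const problems = [];

  if (!allowedRemotes.includes(remote)) {
    problems.push(`remote "${remote}" is not allowed`);
  }
  if (targetBranch !== currentBranch) {
    problems.push(`target "${targetBranch}" does not match the checked-out branch "${currentBranch}"`);
  }
  if (protectedBranches.includes(targetBranch)) {
    problems.push(`"${targetBranch}" is protected; open a pull request instead`);
  }
  return { ok: problems.length === 0, problems };
}

console.log(validatePushTarget({ remote: "origin", currentBranch: "feature/x", targetBranch: "main" }));
```

Because the check returns every problem at once rather than failing on the first, the agent gets enough information to correct the call in a single retry.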

This is the same evolution every other part of software engineering went through:

Deployment: manual FTP → CI/CD pipelines
Code quality: review comments → linters and formatters
Security: password policies → MFA and hardware keys

AI agent governance is next. And unlike most "governance frameworks," this one costs you about 50 lines of YAML and a single JavaScript extension. No vendor lock-in. No SaaS dependency. Just files in your .github/ directory.

The Deep Dive

Everything I've covered here is the pattern — the conceptual architecture that makes governed agentic development work. It's enough to understand why this matters and how the pieces fit together.

But patterns aren't production. If you want:

The complete hookflow rule files — all three, including the bypass catcher and the helper-tool blocker
The full dev-workflow extension code — 1,200 lines, twelve tools, zero dependencies
The story of dev-guard, the failed approach I tried before hookflows (and why it taught me everything)
The 3-layer architecture diagram that ties hookflows, extensions, and agent instructions together
How this pattern extends beyond git — to infrastructure, secrets, and deployment governance

That's all in Issue 001 of the htek.dev newsletter: Controlling Dev Workflows in an Age of Non-Deterministic Agentic Systems. It's the production-grade deep dive — twelve code snippets, real failure stories, and the exact configs I run daily.

The Bottom Line

Instructions tell agents what to do. Hookflows make everything else impossible.

If you're running AI agents in your development workflow — especially with git worktrees, parallel sessions, or automated pipelines — you need structural enforcement, not behavioral suggestions. The pattern is simple: block the primitive, replace it with something better, let the tool surface evolve.

You can start today with a single hookflow file and three governed tools. Your agents will thank you. Your commit history definitely will.

Want the full production configs, the 3-layer architecture, and the code I actually run? Subscribe to the newsletter →

Resources

Copilot CLI Hooks Configuration — Official GitHub documentation on hookflows
GitHub Copilot CLI Extensions: The Complete Guide — Full extension architecture and API reference
Copilot CLI Extensions Cookbook — 16 production-ready extension examples
Agent Hooks: Controlling AI in Your Codebase — Hook-based enforcement for architecture rules
Agent Harnesses: Controlling AI Agents — The broader control-plane pattern
Git Worktree for Agentic Development — Why worktrees are essential infrastructure
Cryptographic Approval Gates — Digital signature enforcement for AI agents
The Agentic Development Maturity Curve — Why experts return to simplicity
Anthropic: Effective Harnesses for Long-Running Agents — Anthropic's take on agent lifecycle management
