What this reads like
Continuation of Why AI Agents Don't Follow Rules. Same thesis: policy text settles at load time; physical constraints settle at execution time. Here we show artifacts you can cite inside a governed monorepo: hashed commits, enumerated checks, CI job lanes—without asking strangers to trust a private Actions permalink.
Hook-level code belongs in #003 — Binding AI agents with physics. Production failure patterns are in #005 — Four ways agents silently fail.
What we actually did
Inside a repo running under AOS v0.1 zone semantics, we stood up a thin smoke pillar—not a hero demo, but a tripwire so automated regressions bite when someone "helpfully" rewrites evals or oracle fixtures.
Typical layout (repo-specific paths, portable idea):
tools/smoke_pillar/
├── main.py
├── evals/
├── playwright/ # browser tests isolated from Python core
└── manifest.json # declares writable zones
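A minimal sketch of how a manifest that "declares writable zones" could be consulted before a write. The manifest shape (a `writable_zones` list of repo-relative paths) and the function name `is_writable` are assumptions for illustration; the actual AOS manifest schema is not shown in this article and may differ.

```python
import json
from pathlib import PurePosixPath

# Hypothetical manifest content; the key name "writable_zones" is an assumption.
EXAMPLE_MANIFEST = json.loads("""
{
  "writable_zones": [
    "tools/smoke_pillar/main.py",
    "tools/smoke_pillar/playwright/"
  ]
}
""")


def is_writable(manifest: dict, repo_relative_path: str) -> bool:
    """Return True if the path is a declared zone or sits inside one."""
    target = PurePosixPath(repo_relative_path)
    # Refuse absolute paths and ".." segments instead of trying to normalise them.
    if target.is_absolute() or ".." in target.parts:
        return False
    for zone in manifest.get("writable_zones", []):
        zone_path = PurePosixPath(zone)
        if target == zone_path or zone_path in target.parents:
            return True
    return False


print(is_writable(EXAMPLE_MANIFEST, "tools/smoke_pillar/playwright/smoke.spec.ts"))
print(is_writable(EXAMPLE_MANIFEST, "tools/smoke_pillar/evals/run_evals.py"))
```

In this sketch `evals/` is deliberately left out of the writable list, which mirrors the idea of keeping eval truth out of generation paths.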
Design before bytes
The directory tree was not hand-drawn and then backfilled. A scaffold generator (template that emits the full tool tree) ran first; humans and agents edited only inside Permitted zones afterward.
| Step | Action | Why |
|---|---|---|
| 1 | Register the tool shape in an internal design registry | Fix boundaries before line 1 |
| 2 | Generator emits manifest, evals harness, test config | Avoid cosmetic folder sprawl |
| 3 | Edits stay in implementation workspace | Keep oracle/eval truth out of generation paths |
Public vocabulary lives in AOS-spec. Internal ledgers are ops indexing—not something readers need to mirror verbatim.
CI mold — patterns you can copy
After the smoke pillar passed once, we hardened the template so new tools survive bare python3 on GitHub Actions matrices:
| Move | Purpose |
|---|---|
| `main.py --help` exits cleanly before heavy imports | survives venv-less CI |
| optional `.env` | secrets-free matrices |
| keep heavy type-check deps out of baseline requirements unless opted in | deterministic smoke band |
| timeout wrappers on local diagnostics | agents cannot hang infra silently |
| sibling regression probe tool | tripwire if the template starts lying |
The probe is not a vanity metric—it catches "forge stayed green once" rot after refactors.
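The first two rows of the table can be sketched as an entry point that parses arguments before importing anything heavy and treats `.env` as optional. This is an illustration under assumptions, not the template itself: the `--check` flag, the `load_optional_env` helper, and the use of `json` as a stand-in for a heavy dependency are all hypothetical.

```python
import argparse
import os
from pathlib import Path


def load_optional_env(path: str = ".env") -> int:
    """Read KEY=VALUE lines if the file exists; a missing file is not an error."""
    env_file = Path(path)
    if not env_file.is_file():
        return 0
    seen = 0
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Variables already set by the CI environment win over the file.
        os.environ.setdefault(key.strip(), value.strip())
        seen += 1
    return seen


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="main.py", description="Smoke pillar entry point (sketch)."
    )
    parser.add_argument("--check", action="store_true", help="run the smoke check")
    return parser


def main(argv=None) -> int:
    # Argument parsing comes first, so --help exits before any heavy import runs.
    args = build_parser().parse_args(argv)
    load_optional_env()
    if args.check:
        # Heavy or optional dependencies are imported only on the path that uses them.
        import json  # stand-in for a heavy third-party import

        print(json.dumps({"status": "ok"}))
    return 0
```

A real `main.py` would finish with `raise SystemExit(main())` under the usual `__main__` guard. The point of the ordering is that `python3 main.py --help` needs nothing beyond the standard library, so it still works on a runner with no virtualenv.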
Local gates before push
A rough checklist that had to pass before each push:
| Check | Passing means |
|---|---|
| `python3 evals/run_evals.py` | exit 0, no intentional skips |
| `npx playwright test` inside the tool's isolated test dir | 1 passed, scoped runs only |
| repo layout compliance script (structure audit) | OK / no critical drift |
pre-commit may re-run the structure audit so that changes which were only "green locally" reach main less often. Hooks (PreToolUse, exit 2) and CI are different layers with the same philosophy: stop the change right before it reaches disk or merge.
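The local gates and the "timeout wrappers" idea from the CI table can be combined in one small runner. The sketch below is an assumption about shape, not the repo's actual tooling: `GateResult`, `run_gate`, and the timeout values are hypothetical names and numbers. It only inspects exit codes, so "no intentional skips" would still need a check on the tool's own output.

```python
import subprocess
from typing import List, NamedTuple, Optional, Sequence


class GateResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def run_gate(
    name: str,
    command: Sequence[str],
    timeout_s: float,
    cwd: Optional[str] = None,
) -> GateResult:
    """Run one check; a hang or an unstartable command counts as a failure."""
    try:
        proc = subprocess.run(
            list(command), cwd=cwd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return GateResult(name, False, f"timed out after {timeout_s}s")
    except OSError as exc:
        # Covers a missing binary or a missing working directory.
        return GateResult(name, False, f"could not start: {exc}")
    return GateResult(name, proc.returncode == 0, f"exit {proc.returncode}")


def run_gates(gates) -> List[GateResult]:
    """Run every gate and report all results, rather than stopping at the first failure."""
    return [run_gate(*gate) for gate in gates]


# Example gate list using the two commands named in the checklist.
# The structure audit script is repo-specific and is left out here.
LOCAL_GATES = [
    ("evals", ["python3", "evals/run_evals.py"], 300.0),
    ("playwright", ["npx", "playwright", "test"], 600.0, "playwright"),
]
```

Calling `run_gates(LOCAL_GATES)` from the tool directory and refusing to push unless every `passed` is true gives the same stop-before-merge behaviour as the pre-commit layer, with the timeout making a hung diagnostic visible instead of silent.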
Commits as receipts (not folklore)
We anchor milestones to short SHAs (your fork will differ—the pattern is the point):
| SHA (prefix) | What changed |
|---|---|
| `d303ece0` | initial smoke scaffold + manifest |
| `85a524e0` | verification notes + metadata sync |
| `2bcbb52c` | import-order resilience for naked CI Python |
| `9870fa67` | template CI hardening + regression probe |
| `143dda68` | tip where the cited graph was green |
URLs rot. SHA + job lane names travel better in outbound writing.
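For a reader with access to a clone, a cited prefix can be checked mechanically. The sketch below assumes `git` is on the PATH and uses `git rev-parse --verify`, which fails when a prefix is unknown or ambiguous. The helper names and the 7-to-40 character bound are illustrative choices, not part of any spec.

```python
import re
import subprocess
from typing import Iterable, Optional

# Accept lowercase hex prefixes of 7 to 40 characters (an illustrative bound).
SHA_PREFIX = re.compile(r"[0-9a-f]{7,40}")


def is_sha_prefix(text: str) -> bool:
    return SHA_PREFIX.fullmatch(text) is not None


def resolve_commit(prefix: str, repo_dir: str = ".") -> Optional[str]:
    """Return the full SHA if the prefix names exactly one commit in the clone."""
    if not is_sha_prefix(prefix):
        return None
    try:
        proc = subprocess.run(
            ["git", "rev-parse", "--verify", "--quiet", prefix + "^{commit}"],
            cwd=repo_dir, capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return proc.stdout.strip() if proc.returncode == 0 else None


def format_receipt(prefix: str, green_lanes: Iterable[str]) -> str:
    """Build the portable citation: SHA prefix plus the lanes that were green with it."""
    if not is_sha_prefix(prefix):
        raise ValueError(f"not a SHA prefix: {prefix!r}")
    return f"{prefix} (green: {', '.join(sorted(green_lanes))})"


print(format_receipt("143dda68", ["independent-judge", "evals-matrix"]))
```

`format_receipt` only formats a claim; it proves nothing by itself. The evidence is that `resolve_commit` succeeds in a clone and the named lanes can be re-run against that commit.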
Why we skip raw Actions permalinks
The monorepo is private.
A pasted actions/runs/... badge 404s outside the org and fingerprints repo ownership. For external readers we ship:
- commit SHAs (above)
- job lanes that were green together, e.g. `evals-matrix`, `independent-judge`, Playwright smoke, structure-audit matrix
- cloneable AOS-spec as vocabulary proof
"We cannot show our CI UI" is fine if repeatable commands + public spec remain inspectable.
Agent-operated commits (with caveats)
During this milestone, the human operator did not manually type git commit / git push. An agent toolchain issued operations under consistent author metadata.
Git metadata alone is forgeable. Hence the layered receipts: evals, Playwright, structure audit, and an independent judge job green on the same graph as the cited SHA. "An agent did everything" ≠ "safe" without that stack.
Hook denials — a separate receipt class
Distinct from CI: PreToolUse hook returns exit 2 and the Write never reaches disk. That is execution-time denial with a log excerpt—not prompt theater. Same family as #003.
Independent judge lane
A CI job reviews diffs with a model from a different vendor than the authoring stack.
Letting the same session say "looks fine" is self-grading. That is verification contamination.
Scheduled CI embarrassment beats a chat message that says "all good."
Practical limits
| Constraint | Meaning |
|---|---|
| Private repo narrative | method essay, not a file tour |
| `permissions: contents: read` in workflows | narrower blast radius |
What we actually check before merge
"This change is safe" shows up in agent chat all the time. We do not merge on that sentence alone.
We ask for the commit SHA and the CI graph: independent-judge and evals-matrix green on the same workflow run. Run ID, Actions export, or a screenshot—all fine.
If that cannot be produced, the change waits. PRs with polished logs but no matching graph show up more often than you might expect.
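The rule "same SHA, same workflow run, both lanes green" is easy to state as a predicate. The sketch below assumes job results have already been collected into simple records; the `JobResult` fields and the `"success"` value are hypothetical and do not mirror any particular CI API.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple

REQUIRED_LANES = ("independent-judge", "evals-matrix")


class JobResult(NamedTuple):
    run_id: str       # identifier of the workflow run the job belonged to
    head_sha: str     # full commit SHA the run was executed against
    lane: str         # job lane name
    conclusion: str   # "success" means green in this sketch


def merge_evidence_ok(
    claimed_sha: str,
    jobs: Iterable[JobResult],
    required: Iterable[str] = REQUIRED_LANES,
) -> Tuple[bool, str]:
    """True only if one workflow run on the claimed SHA has every required lane green."""
    if not claimed_sha:
        return False, "no SHA supplied"
    green_by_run: Dict[str, Set[str]] = {}
    for job in jobs:
        if job.head_sha.startswith(claimed_sha) and job.conclusion == "success":
            green_by_run.setdefault(job.run_id, set()).add(job.lane)
    needed = set(required)
    for run_id, green in green_by_run.items():
        if needed <= green:
            return True, f"run {run_id}: all required lanes green"
    return False, "no single run has every required lane green on this SHA"
```

Grouping by `run_id` is the design choice that matters: two lanes that were each green on different runs do not satisfy the check, which is the case a polished log without a matching graph tends to hide.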
Where this series goes next
CI and hooks cover execution-time denial. Silent production failures—no trace, no persistence—are #005 plus physical-agent-patterns.
AOS Specification (GitHub)
The "physical governance" approach in this article is formalized as AOS (AI Operating Standard) — v0.2 adds runnable implementation examples.
👉 github.com/aos-standard/AOS-spec — specification
👉 github.com/aos-standard/physical-agent-patterns — patterns
If useful, please ⭐ star the repo. Issues and PRs welcome.