wolfejam.dev

Posted on Jul 4 • Edited on Jul 11

AGENTS.md, Hands-On: Build One Step by Step (and Watch an Agent Use It)

#ai #programming #tutorial #agents

In the field guide I covered what an AGENTS.md is and what belongs in it. This is the hands-on follow-up: we'll build a complete AGENTS.md for a real project, one section at a time, then point an AI coding agent at it and watch the difference it makes. By the end you'll have a working file — and you'll have seen it pay off.

New to AGENTS.md? It's a single Markdown file at the root of your repo that tells AI coding agents how to work in it — build steps, tests, conventions, guardrails. The "why" behind each section is in the field guide.

The project we'll use

We'll write the AGENTS.md for a small but real service: a URL shortener API in Python — FastAPI, SQLite, pytest. A couple of endpoints, a thin data layer, a test suite. Follow along with this, or swap in your own repo — the steps are identical.

Its shape:

linkshort/
  app/
    main.py        # FastAPI routes
    db.py          # SQLite access
    models.py      # Pydantic models
  migrations/      # generated SQL — not hand-edited
  tests/
  requirements.txt

Step 0 — Start with an empty file

At the repo root:

touch AGENTS.md

That's the whole step. We'll fill it in one section at a time, building toward a file an agent can read in thirty seconds.

Step 1 — Orientation: one line

Tell the agent what it's looking at. Add:

# AGENTS.md

A URL shortener API in Python — FastAPI, SQLite, pytest.

One sentence sets the agent's priors: it knows the language, framework, and storage before it reads a single line of code.

Step 2 — Setup and run

The agent can't help if it can't start the project. Add the real, copy-pasteable commands:

## Setup
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

## Run
uvicorn app.main:app --reload   # http://localhost:8000

Use the commands that actually work in your repo — no placeholders.

Step 3 — Tests: the agent's feedback loop

This is the most important section, because tests are how the agent checks its own work. Add:

## Test — all must pass before a change is done
pytest
ruff check .
mypy app

Now the agent knows how to verify a change and the bar it has to clear. An agent that has been told to run pytest will run it; one that hasn't may hand you a broken branch.
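If you'd rather give the agent a single entry point, the three commands can be wrapped in a small script. This is only a sketch: the names `CHECKS`, `run_command`, and `failed_checks` are made up for illustration, and it assumes pytest, ruff, and mypy are installed in the active environment.

```python
import subprocess
from typing import Callable, List, Sequence

# The same three commands as the Test section, written as argument lists.
CHECKS: List[List[str]] = [
    ["pytest"],
    ["ruff", "check", "."],
    ["mypy", "app"],
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code (127 if the tool is missing)."""
    try:
        return subprocess.run(list(cmd)).returncode
    except FileNotFoundError:
        return 127


def failed_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Run every check, even after a failure, and return the ones that failed."""
    failed = []
    for cmd in checks:
        if runner(cmd) != 0:
            failed.append(" ".join(cmd))
    return failed
```

Calling `failed_checks()` from the repo root runs the real tools; an empty list means the change clears the bar. The `runner` parameter exists only so the logic can be exercised without launching anything.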

Step 4 — The map: where things live

A short map so the agent finds its way without spelunking the whole tree:

## Structure
- app/main.py    route handlers
- app/db.py      SQLite access (parameterized queries only, never string-built SQL)
- app/models.py  Pydantic request/response models
- migrations/    generated SQL — do not hand-edit
- tests/         pytest, mirroring app/

Notice that the map already carries a convention ("parameterized queries only") and a guardrail ("do not hand-edit"), each placed right where it's relevant.
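To make "parameterized queries only" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `links` table and the function names are assumptions for illustration, not the project's actual schema or code.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical links table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (code TEXT PRIMARY KEY, url TEXT NOT NULL)")
    conn.execute(
        "INSERT INTO links (code, url) VALUES (?, ?)",
        ("abc123", "https://example.com"),
    )
    return conn


def get_url(conn: sqlite3.Connection, code: str) -> Optional[str]:
    """Look up a URL with a parameterized query."""
    # The ? placeholder passes `code` as data, so it is never parsed as SQL.
    row = conn.execute("SELECT url FROM links WHERE code = ?", (code,)).fetchone()
    return row[0] if row else None


def get_url_string_built(conn: sqlite3.Connection, code: str) -> Optional[str]:
    """The pattern the rule forbids, shown only for contrast."""
    # Interpolating `code` lets crafted input rewrite the WHERE clause.
    query = f"SELECT url FROM links WHERE code = '{code}'"
    row = conn.execute(query).fetchone()
    return row[0] if row else None
```

Given the input `nope' OR '1'='1`, the parameterized version finds nothing, while the string-built version returns a row it should never have matched. That gap is what the one-line rule in the map is protecting against.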

Step 5 — Conventions: the house style

The patterns you want followed. Be specific — vague rules are noise:

## Conventions
- Validate all input with Pydantic models at the route boundary.
- Raise HTTPException for client errors; never return raw dicts on failure.
- Type everything; mypy must stay clean.
- Match the style of the surrounding file.

"Type everything; mypy must stay clean" tells the agent exactly what to do. "Write good code" wouldn't.

Step 6 — Commits and PRs

If your agent opens PRs, give it the house rules:

## Commits & PRs
- Conventional Commits (feat:, fix:, chore:).
- One logical change per PR; update CHANGELOG.md.
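A rule like this is easy to check mechanically, for example in a commit hook. The sketch below is one possible check, not part of the project: `ALLOWED_TYPES` and `is_conventional` are hypothetical names, and the allowed list contains only the three types named above.

```python
import re

# Only the types named in the Commits & PRs section; extend for your repo.
ALLOWED_TYPES = ("feat", "fix", "chore")

# type, optional (scope), optional "!", then ": " and a non-empty description.
_SUBJECT = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<desc>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Return True if a commit subject looks like 'type(scope): description'."""
    match = _SUBJECT.match(subject)
    if match is None:
        return False
    return match.group("type") in ALLOWED_TYPES
```

With this check, `feat: add DELETE /links/{code} endpoint` passes, while `Added delete endpoint` and `feat:missing space` do not.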

Step 7 — Guardrails: the landmines

The "don'ts" that prevent expensive mistakes:

## Don't
- Don't hand-edit migrations/ — they're generated.
- Don't commit directly to main — branch and open a PR.
- Never run the seed script against a non-local database.
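A written guardrail is stronger when the code backs it up. As a sketch, a seed script could refuse to run unless its target is local. Everything here is an assumption: the article doesn't show the seed script, so the URL-style database setting and the names `is_local_database` and `assert_safe_to_seed` are hypothetical.

```python
from urllib.parse import urlparse

# Hosts treated as "this machine" for the purposes of the check.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_database(url: str) -> bool:
    """Return True only for SQLite files or servers on this machine."""
    parsed = urlparse(url)
    if parsed.scheme == "sqlite":
        # sqlite:///path/to/file.db has no host part; a file on disk is local.
        return parsed.hostname is None
    # Anything unparseable or remote falls through to False (refuse by default).
    return parsed.hostname in LOCAL_HOSTS


def assert_safe_to_seed(url: str) -> None:
    """Raise before any seeding happens if the target is not local."""
    if not is_local_database(url):
        raise RuntimeError("Refusing to seed: database target is not local")
```

Calling `assert_safe_to_seed(...)` as the first line of a seed script means a wrong environment variable fails loudly instead of overwriting shared data, whether a person or an agent ran it.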

Your finished AGENTS.md

Put it together and you have a complete, copy-pasteable file:

# AGENTS.md

A URL shortener API in Python — FastAPI, SQLite, pytest.

## Setup
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

## Run
uvicorn app.main:app --reload   # http://localhost:8000

## Test — all must pass before a change is done
pytest
ruff check .
mypy app

## Structure
- app/main.py    route handlers
- app/db.py      SQLite access (parameterized queries only, never string-built SQL)
- app/models.py  Pydantic request/response models
- migrations/    generated SQL — do not hand-edit
- tests/         pytest, mirroring app/

## Conventions
- Validate all input with Pydantic models at the route boundary.
- Raise HTTPException for client errors; never return raw dicts on failure.
- Type everything; mypy must stay clean.
- Match the style of the surrounding file.

## Commits & PRs
- Conventional Commits (feat:, fix:, chore:).
- One logical change per PR; update CHANGELOG.md.

## Don't
- Don't hand-edit migrations/ — they're generated.
- Don't commit directly to main — branch and open a PR.
- Never run the seed script against a non-local database.

Thirty seconds to read. Now let's see if it works.

Step 8 — Prove it: point an agent at it

This is the part that matters. Open your repo in an AI coding agent — Claude Code, Cursor, Codex, whatever you use — and give it a real task:

"Add a DELETE /links/{code} endpoint that removes a link, with a test."

Watch what it does with the AGENTS.md in place:

It reads the file first — it knows the stack and where routes live.
It adds the handler in app/main.py, validating input the way your conventions require.
It writes a pytest test in tests/, mirroring the structure.
It runs pytest, ruff, and mypy — because you told it that's the bar — and fixes what fails.
It doesn't touch migrations/, and it doesn't commit to main — it opens a branch.

Now picture the same task without the file. The agent has to guess: Which test runner? Where do routes go? Is there a lint step? So it asks you, or it guesses wrong, or it edits a generated file you'll have to revert. The AGENTS.md is the difference between an agent that interrupts you and one that just ships.

That's the whole payoff — and you can watch it happen in real time.

Keep it alive

One habit before you go: treat the file like code. When the test command changes, or you add a directory, or you catch yourself telling the agent the same thing twice — update AGENTS.md in the same breath. A stale file is worse than none, because the agent trusts it.
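One kind of staleness can be caught automatically: a Structure entry that points at a path that no longer exists. The sketch below assumes the file keeps the "## Structure" heading and bullet layout used in this article; `structure_paths` and `missing_paths` are hypothetical helper names.

```python
from pathlib import Path
from typing import List


def structure_paths(agents_md: str) -> List[str]:
    """Return the first token of each bullet under the '## Structure' heading."""
    paths = []
    in_structure = False
    for line in agents_md.splitlines():
        if line.startswith("## "):
            # Entering or leaving the Structure section.
            in_structure = line.strip() == "## Structure"
        elif in_structure and line.startswith("- "):
            parts = line[2:].split()
            if parts:
                # Tolerate backticked paths such as `app/main.py`.
                paths.append(parts[0].strip("`"))
    return paths


def missing_paths(agents_md: str, repo_root: Path) -> List[str]:
    """Return the listed paths that do not exist under repo_root."""
    return [p for p in structure_paths(agents_md) if not (repo_root / p).exists()]
```

Run in CI as `missing_paths(Path("AGENTS.md").read_text(), Path("."))`, a non-empty result is a signal that the map and the tree have drifted apart. It won't catch a changed test command, so the habit above still matters.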

That's the loop

You started with an empty file, added eight short sections, and watched an agent use every one of them to land a correct, tested change without hand-holding. Write it once, and every agent that walks into your repo gets the same briefing.

This was the hands-on build. For the principles behind each section — what belongs, the anti-patterns, why short beats complete — see the field guide.

Top comments (2)

Gábor Mészáros • Jul 8 • Edited

Good article! Some caveats:

Whenever you mention a specific tool or identifier, such as python -m, mypy, ruff, or HTTPException, add inline code formatting, like `mypy app`. This formatting helps greatly when the context is large and under pressure. Without it, in a saturated context, the instruction is usually treated as abstract.
Structure: worth moving into a separate YAML file
Instruction length: worth keeping each one around 8-12 tokens long (much better compliance)

One major thing that I think would be the most beneficial in this approach is to add an enabling instruction with on-topic, salient context, and only then follow it with the restrictive instruction. We've learned that giving agents only what not to do usually primes the model to DO the thing (reporails.com/rules/core/instructi...)

Looking forward to the next article, I liked this one too!

wolfejam.dev • Jul 10

Really appreciate this, Gábor — feedback from someone scoring instruction quality at scale carries real weight.

You're right on the inline-code formatting, and the why is the best part: a backticked mypy reads as a literal, not an abstraction, when the context is saturated. I was inconsistent there — fixing it.

The enabling-before-restrictive point is the one I keep turning over. The piece front-loads the enabling sections, but the Guardrails block is bare "Don't" — and you're right that a naked prohibition can prime the exact behavior it's warning against. Leading each with its positive form ("branch and open a PR" is the enable; "don't commit to main" is just the tail) is a genuine sharpening. It's going into the next one.

On the separate YAML: the file the agents read has to stay markdown (that's the standard), but I take the point that the structure wants to be more data-like than prose — real design space between "a markdown file" and "a typed source." That's actually the thread the next piece pulls on: keeping the file true as the code moves, right up to where you stop hand-editing it and generate it from something that can't drift.

Thanks for reading closely enough to make it better — genuinely curious what you'll make of the next one.