Daniel Butler

Posted on Feb 16 • Edited on Mar 2

Designing agentic workflows: the core loop

#ai #agents #tooling

The previous posts laid out the failure modes and the initial structure:

Designing agentic workflows: where agents fail and where we fail
https://dev.to/danielbutlerirl/designing-agentic-workflows-where-agents-fail-and-where-we-fail-4a95
Designing agentic workflows: a practical example
https://dev.to/danielbutlerirl/designing-agentic-workflows-a-practical-example-291j

Those posts described where agents fail, where we fail, and what a constrained workflow looks like in principle.

This post shows the core loop as it is implemented in practice:

https://github.com/daniel-butler-irl/sample-agentic-workflows

This is not a prompt collection. It is a sequence. The ordering matters.

Sessions are disposable. The repository is not

Every command in this workflow runs in a fresh session.

That decision drives everything else.

Long sessions drift. Context grows. Earlier constraints become less salient. Decisions get made implicitly and are hard to reconstruct later. Instead of trying to manage that complexity, the workflow treats sessions as disposable and moves all durable state into the repository being changed.

On the feature branch, alongside the code, you will find:

AGENTS.md (or CLAUDE.md for Claude Code)
.agents/tasks/<issue>/gates.md
.agents/tasks/<issue>/task-N.md
.agents/tasks/<issue>/cleanup.md

Those files are committed. They provide traceability of intent, an explicit definition of done, recorded discoveries, and an auditable cleanup step before the PR.

If it is not written down in the repository, it does not persist.
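
As a rough illustration of that layout, here is a minimal sketch that lists the artifacts a branch should carry for one issue and reports which are missing. The function names, the issue identifier format, and the assumption that task files are numbered from 1 are mine, not something the workflow repository defines.

```python
from pathlib import Path


def expected_artifacts(issue: str, task_count: int, guardrails: str = "AGENTS.md") -> list:
    """Return the relative paths the workflow expects for one issue.

    Assumption: task files are numbered task-1.md .. task-N.md.
    """
    task_dir = Path(".agents") / "tasks" / issue
    paths = [Path(guardrails), task_dir / "gates.md"]
    paths += [task_dir / f"task-{n}.md" for n in range(1, task_count + 1)]
    paths.append(task_dir / "cleanup.md")
    return paths


def missing_artifacts(repo_root: Path, issue: str, task_count: int) -> list:
    """Return the expected artifacts that do not exist under repo_root."""
    return [
        path
        for path in expected_artifacts(issue, task_count)
        if not (repo_root / path).is_file()
    ]
```

Something like `missing_artifacts(Path("."), "123", 2)` could run before opening a PR; pass `guardrails="CLAUDE.md"` to `expected_artifacts` if that is the file your tool reads.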

Intent exists before the workflow starts

Those artifacts (gates, tasks, cleanup) all assume an issue already exists with a defined objective, scope, and success criteria.

That issue can be a GitHub issue, a Jira ticket, a markdown file in .agents/issues/, or whatever your team uses. The format does not matter. The content does: objective defined, scope bounded, success criteria explicit.

The agent does not start by “figuring out what to do.” It starts from defined intent.

Everything that follows is downstream of that.

If you need to investigate a codebase or research architectural decisions before defining clear gates, the repository includes supplementary commands for those situations. This post focuses on the core loop once intent is established.

`AGENTS.md`: non-negotiables and evolving guardrails

This file is injected into every session. It is the first constraint in the system.

Agents optimise for visible signals: passing tests, plausible diffs, confident summaries. If you do not constrain that behaviour, they will take shortcuts.

The repository includes a base AGENTS.md template. The generic parts handle commit discipline and test protection. The real value is in project-specific rules.

That section evolves.

You will see patterns. Maybe the agent keeps adding axios when your standard is fetch. Maybe it refactors files you meant to leave stable. Maybe it rewrites tests instead of fixing the implementation.

The third time I saw axios added to a codebase that uses fetch, I added this rule:

Never add axios. This project uses native fetch.

That mistake never happened again.

Because sessions are fresh by design, the only durable memory is what you encode here.

This file lives in source control. The entire team works against the same rule set. When a new constraint is added, it is visible in the diff. Some rules are permanent. Some are temporary, tied to a migration or architectural transition. The file evolves with the codebase.

Keep it under 200 lines. Longer files dilute the important rules. This is not a coding standard. It is a focused control surface aimed at preventing known shortcuts and protecting architectural boundaries.
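
If you want that budget enforced rather than remembered, a check along these lines could run in CI. This is only a sketch: the function names are hypothetical, and the repository's template does not ship such a check as far as this post describes.

```python
def guardrail_line_count(text: str) -> int:
    """Count the lines in the guardrail file's contents."""
    return len(text.splitlines())


def within_budget(text: str, max_lines: int = 200) -> bool:
    """True when the file is strictly under max_lines (the post's 200-line limit)."""
    return guardrail_line_count(text) < max_lines
```

Reading the file with `Path("AGENTS.md").read_text()` and failing the build when `within_budget` returns False would be one way to wire it up.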

`wf-01`: Define gates before writing code

The agent does not immediately modify code.

wf-01 reads the issue and produces:

.agents/tasks/<issue>/gates.md

Each gate defines:

A concrete success condition.
How it will be verified.
A complexity classification (SIMPLE or COMPLEX).

The verification must be independent of the agent’s judgement.

A gate might look like:

email_validation_rejects_invalid_domains test fails before the change and passes after.
terraform plan shows exactly three new resources and no replacements.
Manual: saving preferences with an invalid form leaves the save button disabled.

Invalid gates look like “The implementation looks correct” or “The agent confirms this works.” Both rest on the agent’s own judgement.

Gates define the contract. They make “done” explicit before implementation begins.
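
One way to picture "independent of the agent's judgement" is a gate whose result comes from a command's exit code, never from a summary. The sketch below assumes a hypothetical in-memory `Gate` shape; the actual gates.md format in the repository may look nothing like this.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Gate:
    """Hypothetical representation of one entry in gates.md."""

    condition: str                      # concrete success condition
    command: Optional[Sequence[str]]    # verification command; None means manual
    complexity: str                     # "SIMPLE" or "COMPLEX"

    def __post_init__(self) -> None:
        if self.complexity not in ("SIMPLE", "COMPLEX"):
            raise ValueError(f"unknown complexity: {self.complexity!r}")


def verify(gate: Gate) -> Optional[bool]:
    """Return True or False from the command's exit code.

    Manual gates return None: a human has to check them, so the code
    refuses to report a result it cannot observe.
    """
    if gate.command is None:
        return None
    result = subprocess.run(list(gate.command), capture_output=True)
    return result.returncode == 0
```

For the first example above, the command might be something like `["pytest", "-k", "email_validation_rejects_invalid_domains"]`, assuming the project uses pytest. The "fails before the change" half still needs a run against the unchanged code.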

`wf-02`: Plan tasks as bounded units of change

wf-02 reads gates.md and creates:

.agents/tasks/<issue>/task-N.md

Each task:

Covers one or more specific gates.
Is sized to a single commit.
Produces a reviewable diff.

The default is to plan one task at a time.

Complex changes surface unknowns during implementation. Planning only the next task allows direction to change without rewriting a stale multi-step plan.

For trivial, deterministic changes, planning all tasks up front is fine. The SIMPLE/COMPLEX classification in gates.md makes that explicit.

Each task file includes:

An implementation checklist.
A completion checklist tied to gates.
A commit template.
An Implementation Notes section.

Implementation Notes capture discoveries that would otherwise be lost between fresh sessions.

For example, during one change I discovered that UserService already handled rate limiting. Instead of introducing new middleware, the next task reused that logic. That discovery was written into Implementation Notes so it became part of the durable context, not tribal knowledge from a single session.
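
To make the four parts of a task file concrete, here is a sketch that renders an empty skeleton. The heading text, checkbox style, and placeholder commit line are assumptions for illustration; the files wf-02 actually generates may be laid out differently.

```python
def render_task_file(number: int, gates: list, steps: list) -> str:
    """Render a hypothetical task-N.md skeleton with the four sections."""
    lines = [f"# Task {number}", "", "## Implementation checklist"]
    lines += [f"- [ ] {step}" for step in steps]
    lines += ["", "## Completion checklist"]
    lines += [f"- [ ] Gate verified: {gate}" for gate in gates]
    lines += [
        "",
        "## Commit template",
        "<summary of the change>",  # placeholder, filled in per task
        "",
        "## Implementation Notes",
        "",  # discoveries get written here during wf-03
    ]
    return "\n".join(lines)
```

The empty Implementation Notes section is deliberate: it is the slot where a discovery like the UserService one gets recorded so the next fresh session can read it.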

`wf-03`: Implement one task per session

wf-03 executes exactly one task.

The agent:

Reads task-N.md.
Executes the checklist.
Verifies the gates assigned to that task.
Records discoveries.
Stops.

The human reviews and commits.

The agent never commits.

Then the loop repeats:

Plan the next task.
Implement exactly one task.
Stop.
Commit.

The loop ends when all gates in gates.md are satisfied.

Not when “all tasks are complete.” Tasks are scaffolding. Gates define the contract.

Keeping changes at commit-sized units preserves review quality. It keeps the reviewer in verification mode instead of plausibility mode.
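
The control flow of the loop can be sketched in a few lines. Everything here is a stand-in: the callables represent fresh agent sessions and the human review step, and `max_tasks` is a safety bound I added for the sketch, not part of the workflow.

```python
from typing import Callable, Iterable


def run_core_loop(
    gates: Iterable[str],
    gate_passes: Callable[[str], bool],
    plan_next_task: Callable[[list], str],
    implement_task: Callable[[str], None],
    human_review_and_commit: Callable[[str], None],
    max_tasks: int = 20,
) -> int:
    """Repeat plan -> implement -> commit until every gate passes.

    Returns the number of tasks that were completed.
    """
    all_gates = list(gates)
    completed = 0
    while True:
        # The exit condition is the gates, not the task list.
        open_gates = [gate for gate in all_gates if not gate_passes(gate)]
        if not open_gates:
            return completed
        if completed >= max_tasks:
            raise RuntimeError("gates still open after max_tasks; revisit the plan")
        task = plan_next_task(open_gates)   # wf-02 in a fresh session
        implement_task(task)                # wf-03 in a fresh session, then stop
        human_review_and_commit(task)       # the human commits, never the agent
        completed += 1
```

Note that the gates are re-checked at the top of every iteration, so a task that happens to satisfy two gates shortens the loop without any plan needing an update.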

`wf-04`: Cleanup before the PR

Cleanup runs once per issue, after all tasks are implemented and all gates pass. Not before.

The branch must represent the complete intended change before you audit it.

Cleanup produces:

.agents/tasks/<issue>/cleanup.md

It performs three steps:

Audit the branch for residue.
Apply selected fixes.
Re-run all gates and the full test suite.

Only after cleanup passes do you open the PR.

This step catches the small things that accumulate during incremental work: temporary logging, unused imports, defensive code that became unnecessary. It also forces a final re-validation of the original contract.
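
A small part of the residue audit can be mechanical. The sketch below scans only the added lines of a unified diff for temporary logging and leftover markers. The pattern list is an assumption chosen for illustration; wf-04 itself is an agent command and is not described as working this way.

```python
import re

# Hypothetical residue patterns; extend per project.
RESIDUE_PATTERNS = {
    "debug logging": re.compile(r"\bconsole\.log\(|\bprint\("),
    "leftover marker": re.compile(r"\b(TODO|FIXME|XXX)\b"),
}


def find_residue(diff_text: str) -> list:
    """Return (label, line) pairs for added diff lines that match a pattern."""
    findings = []
    for line in diff_text.splitlines():
        # Only lines added on the branch; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for label, pattern in RESIDUE_PATTERNS.items():
            if pattern.search(added):
                findings.append((label, added.strip()))
    return findings
```

Feeding it the output of `git diff main...HEAD` (assuming `main` is your base branch) gives a starting list for cleanup.md. Unused imports and unnecessary defensive code still need a linter or a reader.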

How this constrains the failure modes

This workflow does not try to make the agent smarter. It changes the environment the agent operates in.

AGENTS.md prevents shortcuts the agent would otherwise take: deleting tests to make CI green, introducing inappropriate dependencies, bypassing established patterns. That constrains baby-counting, half-assing, and scope creep.

Gates force independent verification before any claim of completion. That constrains cardboard-muffin implementations and premature “done” signals.

Commit-sized tasks keep review within human cognitive limits. That constrains review fatigue and rubber-stamping.

The human-commits-only rule keeps architectural decisions explicit. That constrains decision delegation.

Implementation Notes turn discoveries into durable context across fresh sessions. That constrains intent drift.

Cleanup catches residue systematically rather than hoping a reviewer notices it. That constrains litterbug behaviour.

The workflow assumes these failures will occur if the structure allows them. The structure is designed to make them harder to hide and cheaper to detect.

The sequence

For each issue:

1. Define gates.
2. Plan a bounded task.
3. Implement exactly one task.
4. Repeat steps 2–3 until all gates pass.
5. Run cleanup.
6. Re-validate all gates.
7. Open the PR.

That ordering is not accidental.

If you change the order, you weaken the constraints.

If you keep the sequence intact, the failure modes described earlier are systematically constrained instead of managed informally.

What comes next

This loop is the default path. Most issues go straight through it.

When two pressures appear - context degradation or intent drift - supplementary commands exist to keep the loop viable. Those commands and when to use them are covered in the next post:

Designing agentic workflows: supplementary commands and pressure valves
