I’ve been experimenting with a spec-driven workflow, and I accidentally discovered something I didn’t expect: the agent started reviewing and improving its own work.
What I discovered isn't new; self-review is the whole point of agentic AI. But how I stumbled across it in my test was interesting nonetheless.
The Basic Idea
I created a spec.prompt.md file. This prompt accepts a ticket number and the pasted contents of the technical specifications. Then I run a command like:
/spec <ticket-number> <pasted-contents>
Originally, each time I ran /spec, it would overwrite everything and start from scratch. That worked, but it didn’t allow me to iterate or compare changes between passes.
So I added two modes:
-o = overwrite
-i = iterate
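To make the two modes concrete, here's a minimal sketch of how the flag handling could look if /spec were a wrapper script. The function name parse_spec_args is my own; the real command lives in a prompt file, not a script.

```python
import argparse

def parse_spec_args(argv):
    """Hypothetical argument parsing for a /spec-style command."""
    parser = argparse.ArgumentParser(prog="/spec")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("-o", dest="mode", action="store_const",
                      const="overwrite", help="overwrite everything and start fresh")
    mode.add_argument("-i", dest="mode", action="store_const",
                      const="iterate", help="add a new pass alongside the previous ones")
    parser.set_defaults(mode="overwrite")  # original behavior was always overwrite
    parser.add_argument("ticket")
    parser.add_argument("contents")
    return parser.parse_args(argv)
```

Making the flags mutually exclusive means a single run can never both overwrite and iterate, which mirrors how the two modes behave in the prompt.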
The Iterate Model
When I run:
/spec -i 1234 <contents>
It creates a folder structure like this:
/specs/1234/
  p01/
    spec.md
    plan.md
    implementation.md
    files...
  p02/
    spec.md
    plan.md
    implementation.md
    files...
  p03/
    ...
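The pass-numbering scheme is simple enough to sketch. Here's a hypothetical helper (the name next_pass_dir and the specs root are my own, not part of the original setup) that finds the highest existing pNN folder for a ticket and creates the next one:

```python
from pathlib import Path

def next_pass_dir(ticket: str, root: str = "specs") -> Path:
    """Return the next pNN directory for a ticket, creating it.

    Hypothetical helper: pass folders are numbered p01, p02, ...
    """
    ticket_dir = Path(root) / ticket
    existing = sorted(ticket_dir.glob("p[0-9][0-9]"))  # lexicographic sort works for p01..p99
    n = int(existing[-1].name[1:]) + 1 if existing else 1
    pass_dir = ticket_dir / f"p{n:02d}"
    pass_dir.mkdir(parents=True)
    return pass_dir
```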
The original intent was to give myself a comparison between p01 and p02, so I could see whether changes to the spec were implemented correctly. But something unexpected happened.
For my first test of -i, I didn't change the spec at all. I just ran the same spec again:
/spec -i 1234 <original contents>
I expected it to regenerate everything so I could compare Pass 1 to Pass 2. Instead, Pass 2 did something even smarter.
When I reviewed Pass 2, the original build scripts and the bulk of the work were missing; I was confused about why I found only a single file.
Then it clicked: the agent had reviewed its previous pass against the duplicate spec I provided, found no real gaps, determined the first pass was mostly correct, and simply refined it.
That’s when this stopped being just a prompt and started looking like an agent. I didn’t tell it to rewrite everything. I didn’t tell it to only fix what was wrong. I didn’t tell it to review its previous implementation. It chose to.
Its self-refinement process made me rethink my own process.
The Process
Currently I have these prompts:
- spec.prompt.md
- spec-implement.prompt.md
- spec-testing.prompt.md

using the backend-engineer agent.
Right now, I manually run each step after the previous one completes. But the long-term goal is for the agent to understand the workflow and run the steps itself.
Pass Workflow
Pass 1 — Spec → Plan → Implement
/spec 1234 <contents from Jira>
The agent:
- Reads the ticket
- Writes the spec
- Creates a plan
- Implements the code
- Saves everything into p01
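As a rough sketch, the Pass 1 pipeline could be wired together like this. The functions write_spec, make_plan, and implement are hypothetical stand-ins for the actual prompt/agent calls; only the folder layout comes from the workflow above.

```python
from pathlib import Path

def run_first_pass(ticket: str, contents: str, write_spec, make_plan,
                   implement, root: str = "specs") -> Path:
    """Hypothetical Pass 1 pipeline: ticket -> spec -> plan -> code -> p01."""
    pass_dir = Path(root) / ticket / "p01"
    pass_dir.mkdir(parents=True, exist_ok=True)
    spec = write_spec(contents)   # writes the spec from the ticket contents
    plan = make_plan(spec)        # creates a plan from the spec
    code = implement(plan)        # implements the code from the plan
    (pass_dir / "spec.md").write_text(spec)
    (pass_dir / "plan.md").write_text(plan)
    (pass_dir / "implementation.md").write_text(code)
    return pass_dir
```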
Pass 2 — Self-Refinement
/spec -i 1234
The agent:
- Re-reads the spec
- Reviews Pass 1
- Fixes gaps
- Improves implementation
- Adds missing tests
- Refactors if needed
- Saves into p02
This becomes a self-refinement loop:
Implement → Review → Refine → Review → Refine
The agent continues iterating until it believes the implementation matches the spec.
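The loop above can be sketched as a small control function, assuming hypothetical stand-ins for the agent's implement and review steps:

```python
def refine_until_converged(spec, run_pass, review, matches_spec, max_passes=5):
    """Implement -> Review -> Refine loop, capped at max_passes.

    Hypothetical sketch: run_pass, review, and matches_spec stand in
    for agent calls; the cap prevents an endless refinement loop.
    """
    output = run_pass(spec, previous=None)           # Pass 1: implement from scratch
    for _ in range(max_passes - 1):
        findings = review(spec, output)              # review the latest pass
        if matches_spec(findings):                   # stop when the spec is satisfied
            break
        output = run_pass(spec, previous=output)     # refine, building on the prior pass
    return output
```

The cap is a design choice worth keeping: "iterate until it believes the implementation matches the spec" needs a bound, or a fuzzy spec can loop forever.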
Pass 3 — Spec Updates from Engineer
After reviewing, the engineer may realize:
- Something was unclear
- Requirements changed
- Edge cases were missed
- Naming should be improved
- Logic should be handled differently
The engineer updates the spec, then runs:
/spec -u 1234 <updated contents>
Now the agent:
- Compares original spec vs updated spec
- Compares p01 vs p02 vs p03 etc.
- Determines what changed in the spec and how that changes the code
- Implements only what's different, not everything
- Creates a new pass
This becomes iterative spec-driven development.
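The "determines what changed in the spec" step could lean on an ordinary text diff. Here's a minimal sketch using Python's difflib; the helper name is mine, and the real comparison happens inside the prompt:

```python
import difflib

def spec_diff(old_spec: str, new_spec: str) -> str:
    """Unified diff between two spec versions, so the agent (or a human)
    can see exactly which requirements changed between passes."""
    return "\n".join(difflib.unified_diff(
        old_spec.splitlines(), new_spec.splitlines(),
        fromfile="spec (previous pass)", tofile="spec (updated)", lineterm=""))
```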
Final Phase — Testing Instructions
When the engineer believes the implementation is ready for validation:
/spec-testing 1234
The agent then:
- Reviews the latest pass
- Identifies all changed files
- Provides test scenarios to validate the changes
At that point, the engineer validates the changes rather than writing every test scenario from scratch.
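The "identifies all changed files" step amounts to comparing the latest pass against the previous one. A minimal sketch, assuming passes are plain directories; changed_files is a hypothetical helper:

```python
from pathlib import Path

def changed_files(prev_pass: str, latest_pass: str) -> list:
    """Files added or modified in the latest pass relative to the previous.

    Hypothetical helper: compares file contents byte-for-byte, so
    untouched files are excluded from the test scenarios.
    """
    prev, latest = Path(prev_pass), Path(latest_pass)
    changed = []
    for f in latest.rglob("*"):
        if f.is_file():
            old = prev / f.relative_to(latest)
            if not old.exists() or old.read_bytes() != f.read_bytes():
                changed.append(f)
    return changed
```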
Expect the Unexpected and "Just Keep Swimming"
The interesting part of this experiment wasn’t that the agent wrote code. It was that it started reviewing, refining, and improving its own work in passes — the same way a developer does.
I didn’t set out to build this agent or this workflow. I set out to write a better prompt. Then I wanted it to do a little more, and a little more. Somewhere along the way, the prompt turned into a process.
This isn’t just prompt engineering anymore.
It’s process engineering.