TL;DR
- While using Copilot coding agent, I ran into regressions where the design doc (file tree) drifted out of sync with the implementation — without anyone noticing
- The root cause wasn't the AI. It was that assumptions kept changing, but we had no way to lock them down
- I introduced ADRs (Architecture Decision Records) to lock assumptions down explicitly and have the agent execute steps in order
- Result: the same class of regression hasn't happened since, and we got the workflow stabilized in about 2 hours
Background: The "Silent Regression" Problem
I was using Copilot coding agent to automate the flow from design → implementation → testing.
At some point, even though the implementation looked correct, I started noticing:
- The design doc and the actual implementation were subtly misaligned
- The file structure didn't match the original intent
- Follow-up changes broke something elsewhere
This was a "regression that doesn't look like one at first glance" — and that made it hard to catch.
Root Cause: Not the AI — The Assumptions Weren't Locked
My first instinct was to blame:
- The agent's understanding being too shallow?
- Weak prompts?
- Model limitations?
But when I traced the logs, that wasn't it.
What Was Actually Happening
- The file tree definition in `architect.md` had changed mid-way
- But the agent kept implementing based on the assumptions it read at the start
- And I hadn't explicitly communicated that the assumptions had changed
In short:
The assumptions were changing, but there was no mechanism to lock or update them
That was the real cause.
Before vs. After: Visualized
Before: Assumptions drift
```mermaid
sequenceDiagram
    participant H as Human
    participant I as Issue
    participant A as Agent
    participant C as Code
    H->>I: File issue (assumptions at this point in time)
    Note over I: ⚠️ Assumptions change mid-way
    H->>I: Design changes via another PR
    I-->>A: Old assumptions passed to agent
    A->>C: Implements based on stale assumptions
    Note over C: Regression occurs (hard to notice)
```
After: Assumptions locked via ADR
```mermaid
sequenceDiagram
    participant H as Human
    participant ADR as ADR
    participant A as Agent
    participant C as Code
    H->>ADR: Document decisions (goal, constraints, verification)
    H->>A: Point Issue to ADR
    A->>ADR: Reads latest assumptions
    A->>C: Implements based on ADR
    A->>A: Runs verification, confirms zero drift
    Note over C: Implementation matches design
```
Why Agents Are Particularly Vulnerable to This
In a human team, you can share context implicitly — through conversation, memory, shared understanding. You fill in the gaps naturally.
Agents can't do that. They can't carry context across sessions. If the assumptions change but aren't explicitly communicated, the agent will keep treating whatever it last read as ground truth.
ADR has traditionally been framed as a "large team documentation practice." But in agentic development, it plays a different role: a handoff document from humans to the agent. The more you rely on agents, the more critical it becomes to have a clear answer to "where are the decisions written down?"
Reference: The github/gh-aw Design Philosophy
I drew inspiration from github/gh-aw — GitHub's reference repository for Agentic Workflows.
Its core philosophy is:
- Humans explicitly write down what was decided
- Agents work within those fixed assumptions
- When assumptions change, decisions must be updated first
Solution: Lock Assumptions with ADR
The approach I took was ADR-based workflow.
What is an ADR?
Architecture Decision Record — a document that captures "what" was decided, "why," and "to what extent." See adr.github.io for more.
How I Used It
- Scoped to Chapter 4 of `architect.md`: the file tree definition
- Explicitly marked what was "already decided" vs. "still open"
- Had the agent execute steps sequentially, using the ADR as its ground truth
Repository Structure
```
.
├── architect.md                  # Overall design
├── adr/
│   └── ADR-001.md                # Locked decisions and assumptions
└── .github/
    └── copilot-instructions.md
```
Example ADR Content
| Field | Content |
|---|---|
| Scope | File tree structure |
| Decision | Responsibility separation under `pkg/` |
| Out of scope | Future extension directories |
| Prohibited | Any structural changes without updating the ADR first |
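Written out as `adr/ADR-001.md`, those fields might look like this. The headings are my own convention; ADRs have no single mandated format (adr.github.io collects several common templates):

```markdown
# ADR-001: File tree structure

## Status
Accepted

## Decision
Responsibilities are separated under `pkg/`.

## Out of scope
Future extension directories.

## Prohibited
Any structural change without updating this ADR first.
```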
How to Drive the Agent (The Important Part)
I wrote the following steps into the ADR and had the agent execute them in order:
1. Read the ADR
2. Check the relevant section of `architect.md`
3. Implement
4. Test and verify
5. Report results
This stopped two recurring failure modes:
- Expanding scope beyond what was intended
- Continuing to implement based on stale assumptions
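Step 4 ("Test and verify") is where drift gets caught mechanically rather than by eyeballing diffs. Here is a minimal sketch of what such a check could look like; everything in it (the `## Decided layout` section name, the helper names) is my own assumption for illustration, not part of Copilot coding agent or gh-aw:

```python
# Hypothetical drift check: compare the directory layout declared in an
# ADR against the actual repository tree. Section name and helpers are
# invented for this sketch.
from pathlib import Path


def declared_dirs(adr_text: str) -> set[str]:
    """Collect directory names from a '## Decided layout' bullet list,
    assuming lines like '- pkg/' (an invented convention)."""
    dirs, in_section = set(), False
    for line in adr_text.splitlines():
        if line.startswith("## "):
            in_section = line.strip() == "## Decided layout"
        elif in_section and line.startswith("- "):
            dirs.add(line[2:].strip().rstrip("/"))
    return dirs


def drift(adr_text: str, repo_root: str) -> set[str]:
    """Top-level directories that exist on disk but are absent from the ADR."""
    actual = {p.name for p in Path(repo_root).iterdir() if p.is_dir()}
    return actual - declared_dirs(adr_text)
```

A non-empty result from `drift()` is exactly the "assumption broke" signal: the agent (or CI) can fail the run and name the offending directory instead of silently shipping a mismatched tree.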
Results: Regressions Became a Solvable Problem
Since switching to this workflow:
- The same class of regression hasn't happened again
- When something does go wrong, it's immediately clear which assumption broke
- Review load on the human side has gone down
The AI didn't get smarter. We just locked the assumptions.
Key Takeaways
- AI will fill in ambiguous assumptions on its own — that's not a bug, it's by design
- If assumptions change, humans need to document the updated decision explicitly
- Copilot coding agent is powerful, but it doesn't eliminate the need for design and decision-making
If anything:
"Humans decide. Agents execute."
The cleaner that division is, the more stable your development gets.
Closing Thoughts
Regressions aren't caused by AI. They're caused by workflows that don't record decisions.
ADRs are unglamorous. But in agentic development, they work surprisingly well.
What's Next (Under Consideration)
- Publish an ADR template
- Build a minimal sample based on the gh-aw structure
- Turn this into a team-facing hands-on workshop