TL;DR
- While using Copilot coding agent, I ran into regressions where the design doc (file tree) drifted out of sync with the implementation — without anyone noticing
- The root cause wasn't the AI. It was that assumptions kept changing, but we had no way to lock them down
- I introduced ADRs (Architecture Decision Records) to lock assumptions down explicitly and have the agent execute steps in order
- Result: the same class of regression hasn't happened since, and we got the workflow stabilized in about 2 hours
Background: The "Silent Regression" Problem
I was using Copilot coding agent to automate the flow from design → implementation → testing.
At some point, even though the implementation looked correct, I started noticing:
- The design doc and the actual implementation were subtly misaligned
- The file structure didn't match the original intent
- Follow-up changes broke something elsewhere
This was a "regression that doesn't look like one at first glance" — and that made it hard to catch.
Root Cause: Not the AI — The Assumptions Weren't Locked
My first instinct was to blame:
- The agent's understanding being too shallow?
- Weak prompts?
- Model limitations?
But when I traced the logs, that wasn't it.
What Was Actually Happening
- The file tree definition in `architect.md` had changed mid-way
- But the agent kept implementing based on the assumptions it read at the start
- And I hadn't explicitly communicated that the assumptions had changed
In short:
The assumptions were changing, but there was no mechanism to lock or update them
That was the real cause.
Before vs. After: Visualized
Before: Assumptions drift
```mermaid
sequenceDiagram
    participant H as Human
    participant I as Issue
    participant A as Agent
    participant C as Code
    H->>I: File issue (assumptions at this point in time)
    Note over I: ⚠️ Assumptions change mid-way
    H->>I: Design changes via another PR
    I-->>A: Old assumptions passed to agent
    A->>C: Implements based on stale assumptions
    Note over C: Regression occurs (hard to notice)
```
After: Assumptions locked via ADR
```mermaid
sequenceDiagram
    participant H as Human
    participant ADR as ADR
    participant A as Agent
    participant C as Code
    H->>ADR: Document decisions (goal, constraints, verification)
    H->>A: Point Issue to ADR
    A->>ADR: Reads latest assumptions
    A->>C: Implements based on ADR
    A->>A: Runs verification, confirms zero drift
    Note over C: Implementation matches design
```
Why Agents Are Particularly Vulnerable to This
In a human team, you can share context implicitly — through conversation, memory, shared understanding. You fill in the gaps naturally.
Agents can't do that. They can't carry context across sessions. If the assumptions change but aren't explicitly communicated, the agent will keep treating whatever it last read as ground truth.
ADR has traditionally been framed as a "large team documentation practice." But in agentic development, it plays a different role: a handoff document from humans to the agent. The more you rely on agents, the more critical it becomes to have a clear answer to "where are the decisions written down?"
Reference: The github/gh-aw Design Philosophy
I drew inspiration from github/gh-aw — GitHub's reference repository for Agentic Workflows.
Its core philosophy is:
- Humans explicitly write down what was decided
- Agents work within those fixed assumptions
- When assumptions change, decisions must be updated first
Solution: Lock Assumptions with ADR
The approach I took was ADR-based workflow.
What is an ADR?
Architecture Decision Record — a document that captures "what" was decided, "why," and "to what extent." See adr.github.io for more.
How I Used It
- Scoped to Chapter 4 of `architect.md`: the file tree definition
- Explicitly marked what was "already decided" vs. "still open"
- Had the agent execute steps sequentially, using the ADR as its ground truth
Repository Structure
```
.
├── architect.md                  # Overall design
├── adr/
│   └── ADR-001.md                # Locked decisions and assumptions
└── .github/
    └── copilot-instructions.md
```
Example ADR Content
| Field | Content |
|---|---|
| Scope | File tree structure |
| Decision | Responsibility separation under `pkg/` |
| Out of scope | Future extension directories |
| Prohibited | Any structural changes without updating the ADR first |
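Written out as `adr/ADR-001.md`, those fields might look like this. The headings are my own convention; ADRs have no single mandated format (adr.github.io collects several common templates):

```markdown
# ADR-001: File tree structure

## Status
Accepted

## Decision
Responsibilities are separated under `pkg/`.

## Out of scope
Future extension directories.

## Prohibited
Any structural change without updating this ADR first.
```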
How to Drive the Agent (The Important Part)
I wrote the following steps into the ADR and had the agent execute them in order:
1. Read the ADR
2. Check the relevant section of `architect.md`
3. Implement
4. Test and verify
5. Report results
This stopped two recurring failure modes:
- Expanding scope beyond what was intended
- Continuing to implement based on stale assumptions
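Step 4 ("Test and verify") is where drift gets caught mechanically rather than by eyeballing diffs. Here is a minimal sketch of what such a check could look like; everything in it (the `## Decided layout` section name, the helper names) is my own assumption for illustration, not part of Copilot coding agent or gh-aw:

```python
# Hypothetical drift check: compare the directory layout declared in an
# ADR against the actual repository tree. Section name and helpers are
# invented for this sketch.
from pathlib import Path


def declared_dirs(adr_text: str) -> set[str]:
    """Collect directory names from a '## Decided layout' bullet list,
    assuming lines like '- pkg/' (an invented convention)."""
    dirs, in_section = set(), False
    for line in adr_text.splitlines():
        if line.startswith("## "):
            in_section = line.strip() == "## Decided layout"
        elif in_section and line.startswith("- "):
            dirs.add(line[2:].strip().rstrip("/"))
    return dirs


def drift(adr_text: str, repo_root: str) -> set[str]:
    """Top-level directories that exist on disk but are absent from the ADR."""
    actual = {p.name for p in Path(repo_root).iterdir() if p.is_dir()}
    return actual - declared_dirs(adr_text)
```

A non-empty result from `drift()` is exactly the "assumption broke" signal: the agent (or CI) can fail the run and name the offending directory instead of silently shipping a mismatched tree.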
Results: Regressions Became a Solvable Problem
Since switching to this workflow:
- The same class of regression hasn't happened again
- When something does go wrong, it's immediately clear which assumption broke
- Review load on the human side has gone down
The AI didn't get smarter. We just locked the assumptions.
Key Takeaways
- AI will fill in ambiguous assumptions on its own — that's not a bug, it's by design
- If assumptions change, humans need to document the updated decision explicitly
- Copilot coding agent is powerful, but it doesn't eliminate the need for design and decision-making
If anything:
"Humans decide. Agents execute."
The cleaner that division is, the more stable your development gets.
Closing Thoughts
Regressions aren't caused by AI. They're caused by workflows that don't record decisions.
ADRs are unglamorous. But in agentic development, they work surprisingly well.
What's Next (Under Consideration)
- Publish an ADR template
- Build a minimal sample based on the gh-aw structure
- Turn this into a team-facing hands-on workshop