DEV Community

hagishun


How I Stopped Silent Regressions with GitHub Copilot Coding Agent and ADR

TL;DR

  • While using Copilot coding agent, I ran into regressions where the design doc (specifically its file tree) silently drifted out of sync with the implementation
  • The root cause wasn't the AI. The assumptions kept changing, and we had no way to lock them down
  • I introduced an ADR (Architecture Decision Record) to pin those assumptions down explicitly, and had the agent execute its steps in order
  • Result: the same class of regression hasn't happened since, and we got the workflow stabilized in about 2 hours

Background: The "Silent Regression" Problem

I was using Copilot coding agent to automate the flow from design → implementation → testing.

At some point, even though the implementation looked correct, I started noticing:

  • The design doc and the actual implementation were subtly misaligned
  • The file structure didn't match the original intent
  • Follow-up changes broke something elsewhere

This was a "regression that doesn't look like one at first glance" — and that made it hard to catch.


Root Cause: Not the AI — The Assumptions Weren't Locked

My first instinct was to blame:

  • The agent's understanding being too shallow?
  • Weak prompts?
  • Model limitations?

But when I traced the logs, that wasn't it.

What Was Actually Happening

  • The file tree definition in architect.md had changed mid-way
  • But the agent kept implementing based on the assumptions it read at the start
  • And I hadn't explicitly communicated that the assumptions had changed

In short:

The assumptions were changing, but there was no mechanism to lock or update them

That was the real cause.

Before vs. After: Visualized

Before: Assumptions drift

sequenceDiagram
    participant H as Human
    participant I as Issue
    participant A as Agent
    participant C as Code

    H->>I: File issue (assumptions at this point in time)
    Note over I: ⚠️ Assumptions change mid-way
    H->>I: Design changes via another PR
    I-->>A: Old assumptions passed to agent
    A->>C: Implements based on stale assumptions
    Note over C: Regression occurs (hard to notice)

After: Assumptions locked via ADR

sequenceDiagram
    participant H as Human
    participant ADR as ADR
    participant A as Agent
    participant C as Code

    H->>ADR: Document decisions (goal, constraints, verification)
    H->>A: Point Issue to ADR
    A->>ADR: Reads latest assumptions
    A->>C: Implements based on ADR
    A->>A: Runs verification — confirms zero drift
    Note over C: Implementation matches design

Why Agents Are Particularly Vulnerable to This

In a human team, you can share context implicitly — through conversation, memory, shared understanding. You fill in the gaps naturally.

Agents can't do that. They can't carry context across sessions. If the assumptions change but aren't explicitly communicated, the agent will keep treating whatever it last read as ground truth.

ADR has traditionally been framed as a "large team documentation practice." But in agentic development, it plays a different role: a handoff document from humans to the agent. The more you rely on agents, the more critical it becomes to have a clear answer to "where are the decisions written down?"
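One way to make stale context detectable rather than silent is to record a fingerprint of the design text an issue was written against, then compare it before the agent starts work. The following is a minimal sketch under that assumption. The helper names `fingerprint` and `is_stale` are hypothetical and are not part of Copilot coding agent or of the workflow described in this post.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a stable hash of a design section.

    Trailing whitespace and leading/trailing blank lines are ignored,
    so purely cosmetic edits do not count as a changed assumption.
    """
    normalized = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_stale(recorded_fingerprint: str, current_text: str) -> bool:
    """True when the design text no longer matches what the issue recorded."""
    return fingerprint(current_text) != recorded_fingerprint


if __name__ == "__main__":
    # Hypothetical design snippets, for illustration only.
    at_issue_time = "pkg/\n  api/\n"
    after_redesign = "pkg/\n  api/\n  cli/\n"

    recorded = fingerprint(at_issue_time)
    print(is_stale(recorded, at_issue_time))   # design unchanged
    print(is_stale(recorded, after_redesign))  # design changed mid-way
```

A check like this does not decide which version is right. It only forces the question "did the assumptions change?" to be asked before implementation starts.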


Reference: The github/gh-aw Design Philosophy

I drew inspiration from github/gh-aw — GitHub's reference repository for Agentic Workflows.

Its core philosophy is:

  • Humans explicitly write down what was decided
  • Agents work within those fixed assumptions
  • When assumptions change, decisions must be updated first

Solution: Lock Assumptions with ADR

The approach I took was an ADR-based workflow.

What is an ADR?

Architecture Decision Record — a document that captures "what" was decided, "why," and "to what extent." See adr.github.io for more.

How I Used It

  • Scoped to Chapter 4 of architect.md: the file tree definition
  • Explicitly marked what was "already decided" vs. "still open"
  • Had the agent execute steps sequentially, using the ADR as its ground truth

Repository Structure

.
├── architect.md        # Overall design
├── adr/
│   └── ADR-001.md      # Locked decisions and assumptions
└── .github/
    └── copilot-instructions.md

Example ADR Content

| Field        | Content                                           |
|--------------|---------------------------------------------------|
| Scope        | File tree structure                               |
| Decision     | Responsibility separation under pkg/              |
| Out of scope | Future extension directories                      |
| Prohibited   | Any structural changes without updating ADR first |
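To make the shape of such a record concrete, here is a rough sketch that models the four fields above as a small Python data structure and renders them as text for an ADR file. The class name `DecisionRecord`, the title string, and the rendered layout are assumptions for illustration. The post does not prescribe a particular ADR template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a recorded decision is replaced, not mutated
class DecisionRecord:
    title: str
    scope: str
    decision: str
    out_of_scope: str
    prohibited: str

    def to_markdown(self) -> str:
        """Render the record as a short document an agent can be pointed to."""
        return "\n".join(
            [
                f"# {self.title}",
                "",
                f"- Scope: {self.scope}",
                f"- Decision: {self.decision}",
                f"- Out of scope: {self.out_of_scope}",
                f"- Prohibited: {self.prohibited}",
            ]
        )


# Field values are taken from the example table; the title is hypothetical.
EXAMPLE_ADR = DecisionRecord(
    title="ADR-001: File tree structure",
    scope="File tree structure",
    decision="Responsibility separation under pkg/",
    out_of_scope="Future extension directories",
    prohibited="Any structural changes without updating ADR first",
)

if __name__ == "__main__":
    print(EXAMPLE_ADR.to_markdown())
```

Keeping "out of scope" and "prohibited" as required fields is a deliberate choice in this sketch: it forces whoever writes the record to state the boundaries, not just the decision.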

How to Drive the Agent (The Important Part)

I wrote the following steps into the ADR and had the agent execute them in order:

  1. Read the ADR
  2. Check the relevant section of architect.md
  3. Implement
  4. Test and verify
  5. Report results

This stopped two recurring failure modes:

  • Expanding scope beyond what was intended
  • Continuing to implement based on stale assumptions
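The "test and verify" step can include a mechanical drift check between the documented file tree and what is actually on disk. The sketch below assumes the tree is written in the same `├──` / `└──` style as the repository structure shown earlier, with four characters of indentation per level. The functions `parse_tree`, `list_paths`, and `find_drift` are hypothetical helpers, not the verification the author actually ran.

```python
from pathlib import Path

# Same layout as the repository structure shown in this post.
EXAMPLE_TREE = """\
.
├── architect.md        # Overall design
├── adr/
│   └── ADR-001.md      # Locked decisions and assumptions
└── .github/
    └── copilot-instructions.md
"""


def parse_tree(tree_text: str) -> set[str]:
    """Turn a tree diagram into a set of POSIX-style relative paths."""
    paths: set[str] = set()
    stack: list[str] = []  # directories from the root down to the current parent
    for raw in tree_text.splitlines():
        line = raw.split("  #", 1)[0].rstrip()  # drop trailing comments
        marker = line.find("── ")
        if marker == -1:
            continue  # the root "." line or a blank line
        depth = (marker - 1) // 4  # each nesting level adds four characters
        name = line[marker + 3:].strip()
        del stack[depth:]  # climb back up to this entry's parent
        paths.add("/".join(stack + [name.rstrip("/")]))
        if name.endswith("/"):
            stack.append(name.rstrip("/"))
    return paths


def list_paths(root: Path) -> set[str]:
    """Collect every file and directory under root, skipping .git internals."""
    found: set[str] = set()
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if rel.parts[0] != ".git":
            found.add(rel.as_posix())
    return found


def find_drift(expected: set[str], actual: set[str]) -> tuple[list[str], list[str]]:
    """Return (missing, unexpected) paths; both empty means zero drift."""
    return sorted(expected - actual), sorted(actual - expected)


if __name__ == "__main__":
    missing, unexpected = find_drift(parse_tree(EXAMPLE_TREE), list_paths(Path(".")))
    print("missing:", missing)
    print("unexpected:", unexpected)
```

Reporting missing and unexpected paths separately matters here: a missing path suggests the implementation lags the decision, while an unexpected one suggests scope was expanded without updating the ADR.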

Results: Regressions Became a Solvable Problem

Since switching to this workflow:

  • The same class of regression hasn't happened again
  • When something does go wrong, it's immediately clear which assumption broke
  • Review load on the human side has gone down

The AI didn't get smarter. We just locked the assumptions.


Key Takeaways

  • AI will fill in ambiguous assumptions on its own — that's not a bug, it's by design
  • If assumptions change, humans need to document the updated decision explicitly
  • Copilot coding agent is powerful, but it doesn't eliminate the need for design and decision-making

If anything:

"Humans decide. Agents execute."

The cleaner that division is, the more stable your development gets.


Closing Thoughts

Regressions aren't caused by AI. They're caused by workflows that don't record decisions.

ADRs are unglamorous. But in agentic development, they work surprisingly well.


What's Next (Under Consideration)

  • Publish an ADR template
  • Build a minimal sample based on the gh-aw structure
  • Turn this into a team-facing hands-on workshop
