Building ARCHITECTURE.md Files That Prevent AI From Making Silent Architectural Decisions

Jai kora

A few weeks into our experiment with an AI coding assistant, I noticed something odd. Variable names were shifting. Not dramatically. A rename here, a slightly reworded method there. Each change seemed reasonable in isolation. Professional, even.

But collectively, they were rewriting our architecture one rename at a time.

The Silent Coup

It started innocuously. The assistant would suggest replacing a small, plainly named method with a longer, more "enterprise"-sounding one. The PR looked fine. More descriptive, right? Then it began restructuring how we organized service classes, introducing new abstraction layers that felt Enterprise Java-ish. Again, each individual change passed code review because none were objectively wrong.

Our human reviewers saw isolated improvements. Our automated reviewer, another AI, rubber-stamped the changes because they followed general best practices. No linter flags a variable rename. No formatter catches architectural drift.

A few weeks later, our codebase felt foreign. The naming conventions had subtly shifted toward verbose enterprise patterns. Service boundaries had blurred as the assistant introduced "helper managers" and "utility coordinators." The original lean architecture was drowning in abstraction.
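To make the drift concrete, here is a hypothetical before-and-after sketch in TypeScript. The names are invented for illustration and are not from our codebase; they just show how a rename plus one "helpful" extraction per week compounds.

```typescript
interface User {
  id: string;
  name: string;
}

const users = new Map<string, User>();

// Before: the lean original. One small function, plainly named.
async function getUser(id: string): Promise<User | undefined> {
  return users.get(id);
}

// After a month of individually "reasonable" suggestions: a rename,
// an extracted "helper manager", and an abstraction layer between
// the caller and a Map lookup. No single step was wrong; the sum is drift.
class UserRepositoryManager {
  async fetchUserEntityByIdentifier(userIdentifier: string): Promise<User | undefined> {
    return users.get(userIdentifier);
  }
}

class UserRetrievalCoordinator {
  constructor(private readonly repositoryManager: UserRepositoryManager) {}

  async retrieveUserEntity(userIdentifier: string): Promise<User | undefined> {
    return this.repositoryManager.fetchUserEntityByIdentifier(userIdentifier);
  }
}
```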

The honest truth hit during standup when someone asked: "Why does this keep coming up?" We had spent half our sprint discussions untangling unnecessarily complex class hierarchies that had not existed a month prior.

Why Traditional Safeguards Failed

This was not a tooling problem. Our static analysis caught syntax violations perfectly. Our formatters maintained consistent indentation. Linters enforced style rules religiously.

But architectural decisions live in the spaces between the rules. When the assistant chose composition over inheritance, or decided to extract a new service class, or renamed methods to follow different conventions, these choices were invisible to mechanical constraints.

Human reviewers failed too, and for understandable reasons. Who flags a PR that renames a couple of variables and extracts a helper method? The changes looked professional. They followed general principles we had all learned in computer science classes.

The AI reviewer was even worse. It enthusiastically approved changes that made the code "more maintainable" according to whatever training data informed its judgment. Two AIs agreeing does not create architectural coherence. It creates compounded drift.

What Actually Worked

I started keeping an ARCHITECTURE.md file. Not as punishment for the assistant, but as explicit guidance about our actual architectural decisions and why we made them.

The file documents our naming conventions, why we keep service classes thin, and why we avoid deep inheritance hierarchies. More importantly, it explains the reasoning behind these choices: performance characteristics, team preferences, and integration constraints.
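As a sketch of the shape, an excerpt might look like this. The specific rules are illustrative, not a template to copy; the important part is that every rule carries its "why."

```markdown
## Naming

- Prefer short, direct names: `getUser`, not `retrieveUserEntityByIdentifier`.
- Why: the codebase is small enough that context disambiguates; verbose
  names add noise without adding information.

## Service classes

- Service classes stay thin: orchestration only, no business logic.
- Why: business logic lives in plain domain functions so it can be tested
  without wiring up services, and so service boundaries stay legible.

## Inheritance

- No inheritance hierarchies deeper than one level.
- Why: deep hierarchies hide control flow and make integration
  constraints (serialization, API versioning) harder to trace.
```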

Every time the assistant made a questionable architectural call, I updated the document. When it introduced unnecessary abstraction layers, I added a section explaining our preference for composition patterns. When it renamed methods to be more "descriptive," I clarified our naming philosophy and provided examples.
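A section added in response to one of those calls might read something like this (again, a sketch rather than our literal entry):

```markdown
## Composition over inheritance

- Build behavior by composing small functions and passing dependencies
  explicitly; do not introduce base classes to share code.
- Why: shared base classes couple otherwise unrelated services;
  composition keeps each piece independently testable and replaceable.
- If you (human or AI) believe a new abstraction layer is needed,
  propose it in the PR description first instead of introducing it silently.
```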

This created a feedback loop that actually worked. The assistant began suggesting changes that aligned with our documented architecture. Not because it suddenly understood our preferences, but because the context was explicit rather than implied.

The Documentation Defense

The ARCHITECTURE.md file serves as a kind of architectural constitution, a set of principles that constrain both human and AI decision-making. It is not comprehensive documentation of every class and method. It is the theory of the system: how business problems map to code structure, what rules must always hold, and why certain patterns are preferred over alternatives.

I organize it around decision rationale rather than implementation details. Instead of documenting what the code does, it explains why it is structured that way. This gives both humans and AI agents the context needed to make consistent architectural choices.

The file evolves with the system. When we change architectural direction, we update the documentation. When AI agents make questionable choices, we refine the guidance. It is a living document that captures institutional knowledge about how we build software.

The Uncomfortable Reality

AI coding assistants are exceptionally good at writing code that works. They are terrible at understanding why your architecture exists and what constraints shaped its evolution. Without explicit guidance, they will optimize for whatever patterns dominated their training data, usually enterprise Java or academic computer science examples.

This is not the assistant's fault. It is doing exactly what we asked: writing better code according to general principles. The problem is that "better" is contextual, and context lives in documentation that most teams never write.

The ARCHITECTURE.md approach is not elegant or automated. It requires discipline to maintain and vigilance to enforce. But it is the only method I have found that consistently prevents AI agents from silently rewriting your architectural decisions.

Because the alternative, discovering weeks later that your codebase has quietly become someone else's idea of good architecture, is far worse than the effort of writing it down.
