DEV Community

Architecture Documentation as a First-Class Engineering Asset

Alexander Tyutin on April 16, 2026

How autonomous AI agents can generate a complete architecture snapshot of your microservices platform - while you do push-ups - and why that docume...

Read full post

Mykola Kondratiuk • Apr 17

auto-generated docs are a snapshot, not a contract. feeding stale docs into a quality pipeline doesn't catch systemic failures - it just fails silently until someone notices the docs are 3 sprints behind

Alexander Tyutin Google Developer Group • Apr 17 • Edited

That's it! But the quality gate may also be configured to block the MR in case of significant changes without additional documentation.
For example, my quality gate didn't pass my MR with new docs which shown my problems 😅 I was turned into documenting security/arch exceptions and dates when they should be fixed 😆

Mykola Kondratiuk • Apr 17

quality gates that block on undocumented security exceptions — that's the forcing function most teams skip. pain upfront > surprise in prod.

Alexander Tyutin Google Developer Group • Apr 17 • Edited

Yeah, that's why I'm working in security 15+ years and still do not afraid to be unemployed even in the agentic AI era 😁

Mykola Kondratiuk • Apr 17

15 years of security intuition is exactly what AI can’t replicate — you know why gates exist, not just that they should. Honestly the agentic shift probably makes your pattern recognition more valuable, not less. What’s your biggest concern right now: trust boundaries or supply chain?

Alexander Tyutin Google Developer Group • Apr 17

What’s your biggest concern right now: trust boundaries or supply chain?
The tricky question 🙃 I've already faced topics AI security and AI governance in my projects and it was extremely interesting. Fighting hallucinations, catching jailbreaking...
Right now I'm diving into ML trying to understand "behind the scene" of AI overall. And the deeper I go the more interesting it is. Because now the supply chain is not only about the software but about the data too. It is a really not deterministic new world. So the supply chain control is extremely important not only in terms of software but in terms of data too.
Other hand agents with access to the system definitely bring a lot of new challenges. I'm still experimenting with prompts and still do not allow agent execute all commands by default even inside a repo. Because 1 of 10 requests leads the agent to a command which I would deny. But as I understand it is not a general approach and a lot of people just allow an agent to execute any command inside the user context 🫠
So I think trust boundaries or supply chain are both important and I can't prioritize any of them without enough context about the environment.
I believe in a future there will be some kind of a governance around agents and models where we will be able to create a lightweight fast models trained on our (or trusted) data with possibility of defining boundaries and monitor agents behavior like antivirus, MS group policies or GCP organizational policies do 😇

Thomas Landgraf • Apr 18

Landed on almost the same thesis from the requirements side rather than architecture. The thing that unlocked it for me was making the docs both hierarchical and machine-readable — one file per entity (goal / feature / requirement / AC) with YAML frontmatter holding IDs, status, and trace links, committed alongside the code. Colocating is only half the value; the other half is that the AI pipeline can ingest it as structured context instead of free-form prose.

I ended up building a VS Code extension around this called SPECLAN (disclosure: I'm the creator). What makes it click is the MCP server — the spec tree is exposed as tools, so the AI quality gate queries architecture and requirements the same way it queries tests. That's the piece that turned "docs in git" into "docs the gate actually uses."

Your Vertex AI gateway sounds like the natural place this lands — did the agents start quoting architecture doc IDs back in quality findings, or do they still frame concerns in natural language? That's been the hardest part of the loop for me.

Alexander Tyutin Google Developer Group • Apr 18 • Edited

Hello @thlandgraf ! It looks like on the screenshot. But as it is the Gemini answer it can be formatted as needed. I'm using critical findings section (blockers) and overall recommendations (non-blockers). This is a bit old screenshot because now it also writes additional info about OWASP. Next week I will share more info about the gate as I'm planning to perform GDG Workshop about it :)

Also it is screenshot from the developer's IDE. I decided to give me (the developer) possibility to run it from the IDE before creating MR to save time :D

Thomas Landgraf • Apr 18

Thanks @alexandertyutin — the blockers / non-blockers split is exactly the piece I was missing. Severity-bucketed findings map cleanly to CI logic ("block MR if blockers bucket is non-empty") in a way strict doc-ID quoting never really does, and it's the bit a human reviewer scans first anyway. That's a much nicer forcing function than what I was imagining.

The IDE-before-MR placement is interesting too — do you run it in both places (IDE + pre-merge CI), or did you drop the CI gate once the IDE version was fast enough? I keep hitting the same tension between "fast feedback" and "actually enforced."

Would love to catch the GDG Workshop writeup if you publish slides afterwards. Which region / date are you running it in?

Alexander Tyutin Google Developer Group • Apr 20

@thlandgraf Thanks for your interest and questions! They really make me thinking deeply about the practical side 🙌
This gate was born during indie-development of a niche product which intended to reach adults and children. So I understood that I was facing huge conflict of interest between CEO, CTO and CISO in my head from the very beginning 😁 Also I have a teammate and understanding that there may be a point in future when additional people will added to the process.
So my first intent was to provide another point of me for any stage of the SDLC. When I'm implementing a new feature I have my code and approach at whole be reviewed by a security guy. And when I'm a security guy I need a developer/architect counterparty. This approach turned me into the process of finding trade-offs.
Other hand I was interested to test this approach of process automation and documenting. I've discovered that there may be interesting side effect as the practically approved approach of bringing required competence into the IDE and CI/CD.

The IDE-before-MR placement is interesting too — do you run it in both places (IDE + pre-merge CI)

Firstly I've added the check into the CI and was happy as CISO. But then I realized as a developer that wait several minutes for the "MR Failed" response it too expensive 😁 So I just cloned the CI step into a bash script and started use the script as a developer (before the final push or just when I feel that I need a fresh eyes). But I haven't removed the CI check because of human aspect (I can forget to check locally) and for possible case of sharing the development process.

And sometime this cross check brings the value 😁 I saw different cases when I forgot to check locally of when the CI check provided additional view. It some kind of a real process modelling when different approvers may provide different details 😁

Also it turned me to use documented security and architecture exceptions with approved compensatory measures and due dates 😂 So I can state the it has definitely improved my development and deploy discipline and provided the scalable part of change management process at the same time.

Alexander Tyutin Google Developer Group • Apr 20

Would love to catch the GDG Workshop writeup if you publish slides afterwards. Which region / date are you running it in?

We will run it in Russian at 25th of April (this Saturday). Link here. But I plan to prepare repo with code samples and explanatory video. Based on threads here I understand now that I should compile threads and questions into some kind of a supportive demo video and process description. At least for myself 😅 I think I will do it and publish here also 🙌 And maybe additional workshops will be ran in English then...

Thomas Landgraf • Apr 21

The CEO/CTO/CISO-in-one-head framing really resonates — I've spent the last 7 years in Head-of-Digitization roles where the same conflict plays out, except spread across actual humans rather than one person. The gate ends up doing the same job in both cases: forcing a structured trade-off conversation that would otherwise happen as a vibes-based argument in a meeting. The artifact becomes the place where the conflict resolves instead of where it starts.
Watching for the English follow-up. Happy to be a second pair of eyes on the workshop materials before the English run if it would help.

Manoj Mishra • Apr 16

Treating architecture documentation as a first-class engineering asset is long overdue. When documentation lives alongside code and follows the same workflows, it naturally stays relevant and actionable.

Appreciate the emphasis on keeping it lightweight, continuously updated, and developer-friendly — that’s what makes it actually usable rather than just existing for compliance.

Well articulated — this is the kind of discipline that truly scales engineering teams.

Alexander Tyutin Google Developer Group • Apr 16

Yeah! Exactly 🙌

PEACEBINFLOW • Apr 22

The finding about the trace header stripping hit me. Not because it's a dramatic bug, but because it's the exact kind of decision that looks correct in isolation and becomes obviously wrong only when you zoom out. The engineer who wrote that middleware probably felt responsible. "I'm protecting our internal traces from external tampering." Good instinct. Wrong layer.

What's interesting is that this class of error is almost impossible to catch with traditional tooling. A linter sees a function that modifies headers. Fine. A security scanner might even flag it as a good practice—sanitizing inputs at the boundary. You need the intent of the system to recognize that this particular header isn't a threat vector, it's a load-bearing piece of observability infrastructure.

The documentation didn't just help the AI find the bug. It gave the AI permission to reason about what the system was supposed to do. Without that, it's just pattern-matching against a corpus of code. With it, it's evaluating whether the implementation honors the design.

Makes me wonder about the inverse failure mode. If the documentation is wrong—if it describes an intent that never made it into the code, or that rotted over time—does the Quality Gate become an engine for confidently flagging "violations" of a fictional standard? An AI that trusts stale docs might be worse than no AI at all. How are you handling the drift problem? Is the agent also responsible for detecting when the implementation has moved on and the ARCHITECTURE.md needs a refresh?

Thomas Landgraf • Apr 22

The drift problem is the one that keeps me up at night too. My current mental model splits it in two: mechanical drift (the doc references a symbol, endpoint or table that no longer exists) and semantic drift (the artifact is still there, but its behavior moved on). Mechanical drift you can catch with plain structural checks — each doc entity carries a pointer to a code symbol, and CI fails when the target is missing. Semantic drift is the hard one. An agent can flag "this function's behavior diverges from the description," but it's often just the agent re-reading the code and convincing itself of whichever story is more polished. I haven't found a purely automated answer. Best I've landed on is scheduled re-reviews of docs older than N weeks, with the agent surfacing "sections most likely to have drifted" to shorten the reviewer's path — which is kind of an admission that the problem isn't solved. Your "engine for confidently flagging violations of a fictional standard" line nails the failure mode I worry about most.

Alexander Tyutin Google Developer Group • Apr 23

The finding about the trace header stripping hit me. Not because it's a dramatic bug, but because it's the exact kind of decision that looks correct in isolation and becomes obviously wrong only when you zoom out. The engineer who wrote that middleware probably felt responsible. "I'm protecting our internal traces from external tampering." Good instinct. Wrong layer.

Exactly! That engineer was me 😅 And I had precise the same thoughts you've described 😆 Especially I understood that when in several days I've got a tricky bug and realized that I can't trace it from the client (mine too but in another GCP project). And while from one side I still thinking about the trace header security now I understand that some kind of transparent traceability should be not only inside the core service mesh but between the platform and its client. Will dig into it a bit later. Because it was a trade-off between the MVP speed and quality level (just as mentioned in other comments). But the exception is documented for the quality gate and due date is also defined 😁

Makes me wonder about the inverse failure mode. If the documentation is wrong—if it describes an intent that never made it into the code, or that rotted over time—does the Quality Gate become an engine for confidently flagging "violations" of a fictional standard?

Yes, good point 💯 Here are lot things to think about and perform experiments.

Is the agent also responsible for detecting when the implementation has moved on and the ARCHITECTURE.md needs a refresh?

Another good point for experiments 💯

Thank you for such a deep dive and such a meaningful comment! 🙌

Henry A • Apr 17

The finding about distributed tracing headers being stripped is a perfect example of something no linter will ever catch. I've seen the same class of problem with security groups and VPC endpoint policies — the code-level decision looks reasonable in isolation, but violates a system-level invariant that only exists in someone's head (or, if you're lucky, in an architecture doc).

The practical insight that resonates most: colocating documentation with code in the same commit. The moment architecture docs live in a wiki, they're fiction within two sprints. An ARCHITECTURE.md next to the Dockerfile, updated in the same PR that changes the service — that's the only pattern I've seen survive past month three. The agent-generated first draft approach is smart too. The blank page problem is real, and a structured template (Intent, Principles, Interaction Diagram) gives the agent enough constraints to produce something worth editing rather than something worth deleting.

Alexander Tyutin Google Developer Group • Apr 17 • Edited

I'm using approach to create a task doc in the new branch initially. Before the MR I append (with agents of course) a what was done section. It was helpful during quality gate check because it looks not only through code but also through a supportive doc . But now I realized that not only what was done should be added but also why was done 😁

Wasey Jamal • May 24

This hits close to home. I built NetworksInsights — a live B2B SaaS — and the moment I started treating architecture decisions as documented artifacts rather than things "in my head," the entire build quality improved. Your point about AI quality gates needing architectural context to reason at system level rather than code level is exactly right. Without that context layer, any automated tool is just a sophisticated linter. The distributed tracing finding is a perfect example — that's the kind of bug that only makes sense when you can see the whole system at once. Going to look at the ARCHITECTURE.md template approach for my next project.

Alexander Tyutin Google Developer Group • May 24

🙌

Jonathan Demir • Apr 16

Or just scan with vouch-secure and be sure 🤷‍♂️

Alexander Tyutin Google Developer Group • Apr 16

Subject for research :)

Archit Mittal • Apr 18

Treating architecture docs as a first-class asset is one of those things every team agrees with in principle and almost no one does in practice. The trick that worked for my last team: making ADRs a required part of every PR that touches a system boundary (new service, new external dep, schema change). Not 'should write one' — the PR template literally has an adr-link field that fails CI if empty for those changes. Suddenly the docs stay current because they're a precondition for shipping, not an afterthought. Curious whether you've found a forcing function that works without becoming bureaucratic.

Thomas Landgraf • Apr 21

The ADR-as-PR-precondition is honestly the cleanest version of that pattern I've seen — works because the cost of writing one is small if the change deserves it and large if it doesn't, which is exactly the right signal.

The variant I've landed on moves the gate one level up: requirements can't transition from review to approved without sign-off, and a check fails if code references something still in review. Same shape as your ADR-link field, just on the spec entity rather than the PR. Bureaucratic-ness depends almost entirely on how granular you make the unit — too fine and every comma needs documentation, too coarse and you're back to free-form prose. Haven't found a clean answer to that beyond tuning per team.

Muggle AI • Apr 18

Agree with the core move here — ARCHITECTURE.md as context turns AI review from linting into reasoning. The Vertex finding (the two architectural violations) maps cleanly to what we've seen.

The honest limitation of architecture-aware review: the AI still reasons from the map the team drew. If the team didn't think to worry about a specific user journey, the architecture doc doesn't mention it, and the review doesn't catch it either. An internal pentest catches what the company already knows to worry about. The value of an outside bug bounty is the adversarial ignorance the team doesn't have.

That's roughly where we've been spending time — behavioral testing that starts from observable user intent rather than from our own ARCHITECTURE.md. Not instead of your approach; the pair is stronger than either alone.

Interested if you've seen Vertex bring in observable-behavior context yet or if it's still pure static-structure input.

ASKAR AITUOV • May 2

Loved the technical debth of the article. Hello from Almaty!

Alexander Tyutin Google Developer Group • May 4

Thanks Askar! 🙌