Oresztesz Margaritisz

Posted on • Originally published at dev.to

Advancing Your Own AWS AI Architect with DrawIO Skills and Living Documentation

Introduction

Since the last article, I've improved quite a few things in my AWS Architect agent. You've probably already spotted the sketchy, unprofessional diagrams it created. I also noticed two recurring issues: it consistently over-engineered solutions, and the generated responses were falling short on quality: verbose, cluttered with emojis, and padded with the kind of filler that makes AI output feel generic. In this article, I'll show you how I turned it into a genuine professional copilot — one capable of delivering client-facing outputs. I found it to be a great aid for our team in many ways, and I hope you will too. Here's what we're adding:

  • DrawIO MCP and skills — the agent now generates draw.io diagrams directly as structured XML, instead of describing them in prose.
  • C4 documentation skill — enforces the right level of abstraction per audience, keeping deployment details separate from logical structure.
  • ADR skill — Architecture Decision Records become a first-class output: every non-obvious choice comes with documented alternatives and a revisit trigger.
  • Minimalism in the system prompt — a "default to no" philosophy baked in; every service and pattern must earn its place.
  • Complexity budget — architectural additions carry a point cost tied to user scale, forcing justification before anything gets added.
  • arc42 skill — living architecture documentation in Markdown, decision-focused and co-located with code rather than written once and forgotten.
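To make the complexity-budget idea concrete, here is a toy sketch of how such a scoring rule could work. The point costs and user thresholds are invented for illustration and are not the actual numbers in the system prompt:

```python
# Illustrative complexity budget -- point costs and thresholds are
# made up for this sketch, not taken from the actual system prompt.
COSTS = {"lambda": 1, "dynamodb": 1, "sqs": 2, "eks": 8}

def budget_for(expected_users: int) -> int:
    """More users justify more moving parts."""
    if expected_users < 10_000:
        return 5
    if expected_users < 1_000_000:
        return 12
    return 20

def within_budget(services: list[str], expected_users: int) -> bool:
    return sum(COSTS[s] for s in services) <= budget_for(expected_users)

print(within_budget(["lambda", "dynamodb"], 5_000))  # True: 2 points vs budget of 5
print(within_budget(["eks"], 5_000))                 # False: 8 points vs budget of 5
```

The agent then has to argue for a bigger budget (i.e. a larger expected scale) before it can add an expensive component, instead of defaulting to one.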

What are we going to build?

Our previous architect was not capable of self-improvement, so we're going to introduce a feedback loop through a knowledge base. The knowledge base will actually be our architecture documentation in a well-structured format. For this particular example I used the arc42 architecture template, as it's publicly available and offers Markdown as a format.

simple-feedback-loop

I also have a detailed diagram showing the whole component decomposition. The main difference from the previous version is that usage of the AWS Diagram MCP server is now minimized to only two tool calls: get_diagram_examples and list_icons. I found that performing these preliminary steps before the agent uses DrawIO diagram generation improves the quality of the output.

component-decomposition

Prerequisites

To make the MCP servers work, you need npx installed and available on your machine. I recommend Node Version Manager to manage versions.

DrawIO MCP

The biggest visible change is diagram generation. The agent now opens proper draw.io diagrams directly in your browser. This is powered by the official DrawIO MCP server from jgraph.

Here's how to configure it.

Configuration

I have a ready-made sample for OpenCode with the DrawIO MCP server. To define the DrawIO MCP server, add it to your opencode.jsonc inside the mcp section.

```jsonc
{
  "drawio": {
    "type": "local",
    "command": [
      "npx",
      "@drawio/mcp"
    ],
    "environment": {
      "FASTMCP_LOG_LEVEL": "ERROR"
    },
    "enabled": true
  }
}
```
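For orientation, this entry nests under the top-level mcp key of opencode.jsonc. A minimal complete file might look like the sketch below; the $schema line is optional and the surrounding structure is illustrative:

```jsonc
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "drawio": {
      "type": "local",
      "command": ["npx", "@drawio/mcp"],
      "environment": { "FASTMCP_LOG_LEVEL": "ERROR" },
      "enabled": true
    }
  }
}
```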

Permissions

Make sure the MCP servers above are disabled for all other agents by adding the following lines to your opencode.jsonc:

```jsonc
  "tools": {
    "awslabs*": false,
    "drawio": false
  },
```

We're going to enable them only for the AWS architect agent. Set these permissions inside your aws-architect.md file's frontmatter:

```yaml
permissions:
  bash: deny
  skill:
    "c4-documentation": "allow"
    "drawio": "allow"
tools:
  write: true
  read: true
  grep: true
  glob: true
  awslabs*: true
  awslabs.aws-diagram-mcp-server: true
  drawio: true
```

Testing MCP configuration

Check if the specified MCP servers are working correctly with the following OpenCode CLI command:

```shell
opencode mcp list
```

Alternatively you can just type /mcps in OpenCode.

Add the DrawIO skill

Fortunately, DrawIO offers an official skill. It isn't meant for this version of their MCP server, but including it greatly improves XML correctness and reusability. You can find the official skill over here, but I have my own version, introduced in my opencode-agents GitHub repo.

Place it in the same location as you see in my GitHub repository config, so it goes into .opencode/skills/drawio/SKILL.md

Referring to AWS blueprints

One modification that differs from the official skill is that the agent has to make two invocations to the traditional AWS diagramming MCP server first:

  • get diagram examples using get_diagram_examples.
  • get AWS icons using list_icons.

Based on my experience this improves the outcome a bit.
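As a sketch, the relevant part of the skill instructions might read something like the excerpt below; the exact wording in my SKILL.md differs, this is just to show the shape of the guidance:

```markdown
## Before generating DrawIO XML

1. Call `get_diagram_examples` to study layout conventions and grouping.
2. Call `list_icons` to resolve the exact AWS icon names you will reference.
3. Only then emit the `.drawio` XML, reusing the icon names verbatim.
```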

Giving it a spin

Now that all the configuration is complete, let's revisit our previous example and redo the diagram in DrawIO. Don't forget to add the skill we've created with /drawio:

```
/drawio Look at the @raw/problem_statement.md. Redo the previous architecture diagram at @raw/previous_aws_architecture.webp in DrawIO format. Don't use the same arrangement, arrange it a better way. Use AWS documentation and best practices to enrich your understanding of the target architecture.
```

As you can see, the generated diagram is a big step forward from the previous approach. First of all, it's editable; secondly, it's reusable in further conversations when adjustments are required.

aws-user-flow

Let's see a deployment view generated from the collected information so far. You'll notice that you only need to make slight adjustments to the generated content. If you are willing to invest the effort, I encourage you to use the Claude Opus model, as it will significantly improve the quality of the analysis and generated content.

aws-deployment

Updating diagrams

If you need the agent to update an existing diagram, the best way is to simply refer to the .drawio XML file in the conversation. A nice touch is that the agent annotates the XML with its own comments, which improves understandability later on.
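For illustration, an annotated .drawio file might look roughly like this. The element names, styles, and comment text here are hypothetical, but the mxfile/mxGraphModel skeleton is standard draw.io XML:

```xml
<mxfile>
  <diagram name="AWS User Flow">
    <mxGraphModel>
      <root>
        <mxCell id="0" />
        <mxCell id="1" parent="0" />
        <!-- API layer: all client requests enter through API Gateway -->
        <mxCell id="apigw" value="API Gateway" style="..." vertex="1" parent="1">
          <mxGeometry x="40" y="40" width="120" height="60" as="geometry" />
        </mxCell>
      </root>
    </mxGraphModel>
  </diagram>
</mxfile>
```

Those XML comments are exactly what the agent can pick up again in a later session to understand the intent behind each group of shapes.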

C4 documentation skill

The C4 model is a pragmatic approach to architecture modelling. There are many ways to introduce it into your architecture agent, for instance by using Structurizr syntax plus tooling, as they also have official MCP servers. For now we're just going to rely on a simple skill from my GitHub repo and the existing DrawIO MCP and skills. Configuration is super simple: drop the skill into .opencode/skills/c4-documentation/SKILL.md and you should be able to use it.

Example for C4 diagrams

OK, now that this new skill is in place, let's try it out. Just out of curiosity, I switched to Claude Opus for this exercise.

```
/c4-documentation Create a C4 contextual diagram from the @raw/problem_statement.md based on the AWS architecture created at @raw/aws_deployment.drawio and @raw/aws_architecture.drawio Use DrawIO for visualization.
```

⚠️ I did not manually modify any of the generated output.

c4-context

c4-containers

For token-friendly, quick visualization I recommend Mermaid diagrams. If you use the C4 skill, you can prompt your AWS architect for a Mermaid diagram as output as well.

```
/c4-documentation Create a Mermaid diagram version from the C4 container diagram above.
```

c4-mermaid
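For reference, Mermaid has native C4 support. A container-level sketch in that syntax looks roughly like the following; the element names are illustrative, not the agent's generated output:

```mermaid
C4Container
    title Container diagram (illustrative)
    Person(player, "Player", "Plays the game in a browser")
    System_Boundary(aws, "AWS") {
        Container(web, "Game Client", "WebGL/WASM", "Static assets via CloudFront")
        Container(match, "Matchmaking", "Lambda + SQS", "Buffers FlexMatch requests")
    }
    Rel(player, web, "Uses", "HTTPS")
    Rel(web, match, "Requests match", "HTTPS")
```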

I personally prefer DrawIO most of the time, as those diagrams are better arranged. That said, the notation for C4 diagrams still needs to be corrected in the underlying SKILL.md file.

Adding the feedback loop to your architecture agent - templates

Even when working on architectures in a traditional way, templates are a great aid. They not only guarantee quality output, but also represent a thinking system. For this purpose I created a separate architecture template skeleton on GitHub. If you want to start a new project with your architecture agent, just clone the repository, give it a name and start using it. You can find the project template over here. The content mainly relies on arc42 templates.

Introduction to the project structure

All sections of the architecture documentation live in separate files. This not only helps the agent navigate, but also keeps input token usage economical. If you want to give the agent more context, simply refer to multiple files under this repository.
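The layout looks roughly like this; only a couple of sections are shown, and the exact folder names follow the arc42 section numbering:

```
content/
├── 01._Introduction_and_Goals/
│   └── 1.2_Quality_Goals/
│       └── Quality_Goals.md
├── 02._Architecture_Constraints/
└── ...
```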

The engine behind the project is based on Nextra. To start the project you need three commands:

```shell
yarn install
yarn build
yarn dev
```

A search bar is also included, so you can search in each section and look for specific terms.

project-skeleton

Prompt instructions during documentation

The prompt instructions live in two places. General-purpose arc42 description and guidance is in the arc42-documentation skill, which you configure the same way as the other skills.

File-specific instructions are commented out with HTML comments, so they're visible to the agent when it reads the file but invisible in the generated documentation.
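As a sketch, a template file with such hidden guidance might look like this; the instruction text here is illustrative, not the exact wording in the skeleton:

```markdown
# Quality Goals

<!--
Agent guidance (invisible in the rendered docs):
List the top 5 quality goals as a priority-ordered table with
columns Priority, Quality Goal, Scenario. Keep scenarios measurable.
-->
```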

Let's try this in practice:

```
/arc42-documentation extract the @content/01._Introduction_and_Goals/1.2_Quality_Goals/Quality_Goals.md based on the formatting guidelines in the document. Use the @raw/problem_statement.md and @raw/aws_deployment.drawio for reference.
```

Even though our prompt is quite minimalistic, there's a lot going on in the background. This should also guide your future work: every general-purpose instruction should go either into the specific template or into a skill, to eliminate repetition.

Let's see the generated content:

Quality Goals

| Priority | Quality Goal | Scenario |
| --- | --- | --- |
| 1 | Scalability | System handles millions of concurrent players by auto-scaling Lambda functions across AZs, using DynamoDB on-demand capacity, and GameLift fleet scaling for game sessions. |
| 2 | Performance | Game client loads under 3s via CloudFront CDN; matchmaking completes within 10s using SQS-buffered FlexMatch; in-game latency stays below 50ms through direct UDP/TCP to GameLift servers. |
| 3 | Browser Accessibility | Full game experience (WebGL/WASM) runs in any modern browser without plugins, delivered as static assets from S3 via CloudFront. |
| 4 | Reliability | Multi-AZ deployment with Redis replication, NAT Gateway redundancy, and Lambda cross-AZ failover ensures no single point of failure for matchmaking and session management. |
| 5 | Extensibility | Architecture supports future extraction as a reusable framework: stateless Lambda functions, decoupled matchmaking via SQS, and separated game server fleet allow independent evolution of components. |

As you advance through your architecture investigation, you'll make the agent smarter and smarter, either by referring back to previous content or by using semantic search. I'll explain semantic search in more detail in the final chapter.

Comprehensive understanding and proactivity - semantic search

By default, OpenCode comes with simplistic search capabilities like Glob and Grep. So if you start a new session and ask questions, these will most probably be the tools the architecture agent uses to look for context in your existing documentation:

```
What are the key quality attributes?
```

traditional-search

Where this approach falls short is when working with information spread across multiple files. For instance, if you have to produce a summary page drawing on multiple files, or harmonize content spread across several resources, semantic search is more efficient.

Another reason for using semantic search is performance: including a smaller, focused chunk of information in the agent's context is much more efficient than reading full files. OpenCode is also quite slow when reading segments of files based on results from its built-in Grep tool.
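The core idea can be sketched in a few lines of Python. This toy version replaces the real embedding model with simple term-frequency vectors, so treat it as an illustration of chunk-level cosine-similarity search, not the actual tessera implementation:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a term-frequency vector over lowercase words.
    # A real setup uses a model such as BAAI/bge-small-en-v1.5.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(chunks: list[str], query: str, top_k: int = 1) -> list[str]:
    # Rank every chunk against the query and return the best matches.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:top_k]

chunks = [
    "Matchmaking completes within 10 seconds using SQS-buffered FlexMatch.",
    "Multi-AZ deployment with Redis replication avoids single points of failure.",
    "Game client loads under 3 seconds via the CloudFront CDN.",
]
print(search(chunks, "How fast does matchmaking complete?")[0])  # the matchmaking chunk
```

The agent gets back only the best-matching chunks instead of whole files, which is exactly where the token savings come from.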

All we need is a local embedded vector store, like LanceDB, and some simple functionality: reindexing files and exposing vector search through MCP. I found a project that had almost everything, but it had to be trimmed down so the agent could use the exposed MCP tools with minimal instruction effort. That leads to my third project on GitHub, project-tessera, forked from another open-source solution.

Setting up local vector search

Clone the project, then create a local workspace.yaml in the project root. You can copy workspace.yaml.example and start from there. The config should refer to your architecture documentation:

```yaml
workspace:
  root: /home/gitaroktato/Projects/architecture-project-skeleton
  name: "webcraft"

# Define which directories to index
sources:
  - path: content
    type: architecture
    project: webcraft

# Project definitions (for status tracking and filtering)
projects:
  webcraft:
    display_name: "Webcraft"
    root:  "/"

models:
  embed_model: BAAI/bge-small-en-v1.5
  # embed_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

# Search tuning
search:
  max_top_k: 50                  # Upper bound for top_k parameter
  reranker_weight: 0.7           # Semantic vs keyword (0.0-1.0, higher = more semantic)
  fetch_multiplier: 6            # Over-fetch factor before reranking
  result_text_limit: 1500        # Max chars per search result
  unified_text_limit: 800        # Max chars per unified search result

# Ingestion tuning
ingestion:
  chunk_size: 1024               # Text chunk size for indexing
  chunk_overlap: 100             # Overlap between chunks
  max_node_chars: 800            # Truncate chunk text to this length
```

We're going to use a small embedding model that only works on English text but is more lightweight than multilingual models.
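The chunk_size and chunk_overlap settings above describe a sliding window over each document. A minimal sketch of that windowing (my own illustration of the semantics, not tessera's actual code):

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 100) -> list[str]:
    # Sliding window: each new chunk starts (chunk_size - chunk_overlap)
    # characters after the previous one, so consecutive chunks share
    # chunk_overlap characters of context.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

doc = "".join(str(i % 10) for i in range(2000))
parts = chunk_text(doc)
print(len(parts))                         # 3 chunks for a 2000-char document
print(parts[0][-100:] == parts[1][:100])  # True: consecutive chunks overlap
```

The overlap matters: without it, a sentence cut in half at a chunk boundary would be unfindable from either side.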

Indexing files and configuring MCP API

Indexing files

Just run python ./main.py ingest to build the LanceDB index. After files are updated, run python ./main.py sync to refresh your vector index.

Configuring MCP

You can start the FastMCP server exposing LanceDB with the python ./main.py serve command. To use these additional tools, change your opencode.jsonc configuration and add the following lines to your mcp section:

```jsonc
"tessera": {
    "type": "remote",
    "url": "http://localhost:8000/mcp",
    "enabled": true
}
```

I collected these changes in a separate pull request, as I don't consider this solution complete yet.

You can apply the same trick as before: disable this tool for all other agents and enable it only for the AWS architect agent in its frontmatter. I'll skip this step here. Checking the MCP configuration works the same way as before.

Let's see if adding this MCP server will improve our agent's self-learning capabilities.

```
What are the key performance related quality attributes for our project?
```

semantic-search

You can see that the architecture agent is now using the newly introduced tools to search our documentation.

What's missing, what's next?

A couple of things to keep in mind: the semantic search is at an early stage and requires a lot of tweaking and polishing. I consider it a valuable centerpiece, tightly connected to the quality of the output, since it affects the agent's thought process. If you run into issues during usage, remember that you can still refer to a file directly when search fails you.

Also, I think I need to explore better C4 diagramming capabilities, probably through the Structurizr MCP and Structurizr DSL. This would mean a better version of the C4 skill that explains the aforementioned syntax to the LLM more thoroughly.

AWS cost estimation: this could be done with the AWS MCP, but the related tools require AWS authorization, so I left it out. Another option would be a simple reference to the pricing page of each AWS service, similar to an AGENTS.md. It's an important topic that remains unsolved in my implementation.
