Davide de Paolis

Kiro showcase: Automating Changes Across Several Repos with Spec-Driven Development and Custom Sub-Agents

As I mentioned in my previous post about engineering leadership and AI, I'm constantly looking for opportunities to experiment with AI-native development. Given how scarce hands-on time is in a manager's calendar, I'm often "vibe-coding between meetings" - trying to make the most of short, focused sessions with an AI-native SDLC.

In my recent talk at AWS Community Day Slovakia, I referenced the four archetypes of AI adoption by AWS Serverless Hero Lee Gilmore. Until last year, I was a "dismissive expert" myself, and I still encounter that scepticism around AI in engineering almost daily - at work and online.

Many sceptics limit AI to debugging tasks or use it as a glorified search engine. That's a missed opportunity.

This post documents a real-world case where I used AI for a small but necessary "side-quest" that would have required several hours of tedious refactoring. It's a practical example of spec-driven development and AI-native SDLC in action.

The Problem

While reviewing IAM policies on a colleague's PR, I noticed references to DynamoDB tables that were used for Terraform state locking. HashiCorp deprecated DynamoDB-based state locking in Terraform 1.10 (November 2024) in favor of S3 native locking. I was sure we had already updated all 200+ of our repositories' Terraform backend configurations to avoid the deprecated lock mechanism.
Was this policy an unnecessary leftover, or had we forgotten some repos?
Or had other teams recently created new IaC using the old mechanism?

Searching and fixing this was definitely out of scope for the original ticket and PR. We have bigger, more important problems to solve than this small cleanup. But as my team is responsible for our Cloud Infrastructure Platform and for Cloud Engineering practices, it bothered me that we had such an inconsistency, that we were still running outdated Terraform versions, and that we risked pipeline failures if a Terraform update suddenly broke things, causing panic and wasted time in feature teams.

Since a manual migration/refactoring would be tedious and error-prone, this appeared like a great task for AI - not just to run a search-and-replace across 200 repos, but to explore the capabilities of several MCP servers, spec-driven development, and agent orchestration with Kiro.

A well-integrated AI-native SDLC use case: simple but useful.

Phase 1: Discovery

Rather than manually searching through repositories, I used Kiro with the GitHub MCP server to:

  1. Find all repositories using the deprecated config
  2. Identify repos using Terraform versions below 1.10
  3. Retrieve the actual files (Terraform configs, Terragrunt files, CI workflows)
  4. Group them by codeowners for easier review coordination
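For illustration, the repository discovery could be sketched with the GitHub CLI instead of the MCP server. This is a hedged stand-in, not what Kiro actually ran; my-org and the search query are placeholders:

```shell
# Sketch: find repos whose Terraform code still references the
# deprecated dynamodb_table lock setting, via GitHub code search.
list_affected_repos() {
  # $1: organisation name (placeholder: my-org)
  gh search code --owner "$1" 'dynamodb_table language:HCL' \
    --json repository --jq '.[].repository.nameWithOwner' | sort -u
}

# Usage (hits the GitHub API):
#   list_affected_repos my-org > affected-repos.txt
```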

Since switching to S3 native locking could corrupt state if bucket versioning isn't enabled, I also wanted to verify our buckets' configurations across all AWS accounts.

Using the AWS MCP server and our own connect-aws skill, I could:

  • Automatically reconnect to all accounts in our AWS config
  • Check every S3 bucket used by these repos
  • Verify versioning was enabled
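The bucket check itself is easy to sketch with the AWS CLI - a simplified stand-in for our connect-aws skill, with the bucket list file as a placeholder:

```shell
# Print the versioning status of one bucket:
# "Enabled", "Suspended", or "None" (never configured).
check_bucket_versioning() {
  local bucket="$1"
  aws s3api get-bucket-versioning --bucket "$bucket" \
    --query 'Status' --output text
}

# Warn about every bucket (one name per line in $1) that is
# not safe for S3 native locking.
check_all_buckets() {
  while read -r bucket; do
    status=$(check_bucket_versioning "$bucket")
    [ "$status" = "Enabled" ] || echo "WARNING: $bucket versioning is $status"
  done < "$1"
}
```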

The discovery phase took less than a few minutes of actual work. The AI handled the tedious data gathering while I focused on validation and decision-making.

With the data collected, I used the Atlassian MCP server to create a Jira ticket documenting the context, affected repos, and migration risks. Without leaving my IDE, all the relevant information was nicely formatted in a user story.

Phase 2: Spec-Driven Development

Rather than jumping straight to coding, I used this as an opportunity to practice spec-driven development and AI-native SDLC.

I started a new project and asked Kiro to pull the Jira ticket and help write a spec.

*Screenshot: spec-driven development with Kiro*

Requirements Phase

The AI-generated requirements document captured the key aspects well:

  • Load migration targets from repositories.json
  • Validate Terraform version compatibility (>= 1.10 required)
  • Update backend configurations via text replacement
  • Handle edge cases (older Terraform versions, CI pipeline overrides)

I refined a few details around how pipelines use backend-config flags, but overall the requirements were solid.
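The version gate from the requirements (>= 1.10) can be expressed as a tiny shell check; a sketch, with sort -V doing the semantic version comparison:

```shell
# True if the given Terraform version supports S3 native locking
# (use_lockfile requires >= 1.10).
version_ok() {
  # sort -V orders versions semantically; if 1.10.0 is the minimum
  # of the pair, the candidate is at least 1.10.0.
  [ "$(printf '%s\n%s\n' "$1" "1.10.0" | sort -V | head -n1)" = "1.10.0" ]
}
```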

Design Phase

This is where iteration became necessary. The AI's initial approach was a monolithic Python script - procedural and difficult to parallelize (let alone review and debug).

After several iterations where I emphasized the need for orchestration, progress tracking, error correction, and race condition avoidance - basically where I tried to achieve the so-called "Ralph Wiggum Loop" (see Ralph Wiggum Pattern by G. Huntley and Massimo Ferre's Kiro implementation) - we arrived at an agent-based architecture:

Orchestrator Script
├── Spawns multiple worker agents
├── Each agent claims a repository atomically
├── Agents work in parallel
├── Shared state via migration-status.json
└── Learning system via corrections.md

The design included:

  • Atomic operations: Each repo migration is independent
  • Status tracking: migration-status.json prevents duplicate work
  • Learning loop: corrections.md records failures and solutions
  • Race condition protection: Agents verify ownership after claiming
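The claim-then-verify pattern can be sketched with jq; the file schema here is illustrative, not the exact one Kiro generated:

```shell
# migration-status.json maps repo name -> {status, owner}.
STATUS_FILE="migration-status.json"

# Claim a repo for a worker; returns success only if this worker
# ends up owning it.
claim_repo() {
  local repo="$1" worker="$2"
  # Rewrite the file in one jq pass: claim only if still pending.
  jq --arg r "$repo" --arg w "$worker" \
     'if .[$r].status == "pending"
      then .[$r] = {status: "in_progress", owner: $w}
      else . end' "$STATUS_FILE" > "$STATUS_FILE.tmp" \
    && mv "$STATUS_FILE.tmp" "$STATUS_FILE"
  # Verify ownership after claiming to catch races between workers.
  [ "$(jq -r --arg r "$repo" '.[$r].owner' "$STATUS_FILE")" = "$worker" ]
}
```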

Tasks Phase

Despite the agent-based design, the tasks Kiro initially generated had issues: the first task was to write a script for subsequent tasks to use, which wouldn't work for parallel, orchestrated execution.

However, the task descriptions themselves were useful. Therefore I pivoted and asked Kiro to use its /custom-agent-creator to create an agent (not a script) with proper tool settings and a prompt to execute the workflow.

*Screenshot: custom-agent-creator*

One annoyance: the Kiro IDE creates and uses agents written in Markdown, but if you invoke agents as subagents or start kiro-cli directly with an agent, you need JSON format. So I usually write (or ask Kiro to create) the agent in Markdown (with a frontmatter section) and then in JSON as well, importing the prompt from the Markdown file: "prompt": "file://migrator-agent.md".

Phase 3: Building the Migrator Agent

The agent prompt handled the complete workflow:

  1. Claim a repository from migration-status.json atomically
  2. Load corrections from previous failures
  3. Clone the repo and create a branch
  4. Search for the deprecated configuration using dynamodb_table
  5. Replace the config with the new S3 use_lockfile
  6. Check and update Terraform versions if needed
  7. Format the code (terraform fmt / terragrunt hclfmt)
  8. Commit, push, and create a PR
  9. Update status (completed/failed)
  10. Clean up temp directories
  11. Record errors in corrections.md for learning
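Step 5, the actual replacement, boils down to something like this - a sketch assuming the one-line dynamodb_table = "..." form (the real workflow also covers Terragrunt files and the CI pipeline overrides):

```shell
# Swap the deprecated DynamoDB lock setting for S3 native locking
# in a backend config file:
#   dynamodb_table = "tf-locks"  ->  use_lockfile   = true
migrate_backend_file() {
  local file="$1"
  sed -i.bak -E \
    's/dynamodb_table[[:space:]]*=[[:space:]]*"[^"]*"/use_lockfile   = true/' \
    "$file" && rm -f "$file.bak"
}
```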

The agent configuration (.kiro/agents/migrator-agent.json) defined the tools available to the agent and which of them are allowed with or without approval:

{
  "tools": [
    "read",
    "write",
    "shell",
    "mcp:github",
    "mcp:atlassian"
  ],
  "allowedTools": [
    "fs_read",
    "write",
    "shell",
    "mcp:github",
    "mcp:atlassian"
  ],
  "toolsSettings": {
    "write": {
      "allowedPaths": ["*"]
    },
    "fs_read": {
      "alwaysAllow": [
        "*"
      ]
    },
    "shell": {
      "allowedCommands": [".*"],
      "autoAllowReadonly": true
    }
  }
}

Defining the tools and what is allowed is one of the trickiest parts. I want to orchestrate agents and automate as much as possible, and I definitely want to spend my time solving other problems rather than acting as a guardian/gatekeeper clicking "approve" or "reject" for each action. On the other hand, granting full write or shell access is concerning. Finding the right balance is important.

After running the agent on its own and addressing a couple of issues, the agent was ready to be called as a subagent.

Phase 4: Orchestration

The agent worked well sequentially, but processing 30+ repos one at a time would be slow. Therefore I asked Kiro to create an orchestrator script that:

  • Spawns 4 agents in parallel (the current maximum of kiro-cli)
  • Monitors progress
  • Shows real-time stats (completed, in progress, failed, pending) and pipes progress and errors
  • Waits for all workers to finish
  • Generates final report
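Stripped to its skeleton, the orchestration loop looks like this; the worker body is a placeholder for however kiro-cli launches the migrator agent, and the real script additionally streams stats and pipes errors:

```shell
# Placeholder worker: in the real script this starts kiro-cli
# with the migrator agent.
run_worker() {
  local id="$1"
  echo "worker $id: started"
}

main() {
  pids=""
  # 4 parallel workers: the current maximum of kiro-cli.
  for i in 1 2 3 4; do
    run_worker "$i" &
    pids="$pids $!"
  done
  # Block until every worker finishes before reporting.
  for pid in $pids; do
    wait "$pid"
  done
  echo "all workers done"
}
```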

The agents coordinate through migration-status.json using atomic file updates with jq. Progress is saved to that file, while a more detailed corrections.md documents every error and attempted fix, so retries avoid repeating the same mistakes and try different approaches instead.

Necessary Reminder About AI-Generated Code

AI-generated code can be confidently incorrect. During this project/experiment, Kiro repeatedly produced:

  • Made-up JSON schemas for Kiro agents - particularly surprising, since I'd expect Kiro to know how its own agents work...
  • Non-existent commands
  • Illogical implementations

Always review what's generated. Think critically about the problem and solution. Challenge the AI when something seems off. Otherwise, you'll waste time debugging broken code.

The Results and What This Really Demonstrates

I spent probably less than 90 minutes iterating on the entire "side-quest": identifying the scope of the changes (number of files and repos, type of change), defining the agent's prompt, and putting everything together with the orchestrator (once again, not in one focused session but spread across meetings).
Once working, all 30+ PRs were created within a few minutes.
The amount of credits consumed was roughly 80 (out of the 1K monthly allowance on the $20/month plan).

Was that worth it?

The manual alternative would likely have required 4-6 hours of very repetitive work:

  • Clone each repo
  • Find the files
  • Make the changes
  • Handle version updates
  • Create PRs

Tedious and error-prone. With breaks and meetings factored in, working through this ticket manually would have cost an engineer a full day, and largely a wasted one: no great conceptual work, no real learning (beyond getting faster at git clone, search & replace, git commit, and gh pr create --title --body --base main).

The value isn't just time saved.

This project provided:

  1. Hands-on experience with AI automation and parallel agent orchestration
  2. Practice with spec-driven development (requirements → design → tasks)
  3. A learning system (corrections.md) that improves over time
  4. Reusable patterns for any bulk repository operation

I took this small opportunity (a necessary, but not so crucial, refactoring task) as a training ground for more complex problems.
You don't learn advanced techniques by tackling the hardest problems first - you build skills incrementally.

When I face truly challenging automation needs, these patterns will be ready.

Key Takeaways

What Worked

  • MCP servers (GitHub, AWS, Atlassian) enable AI to interact with real APIs effectively - although I am already aware that MCP servers, especially for local development, are losing the attention they received last year because actually using any CLI is likely faster and cheaper. Using them allowed me to really start experimenting more with AI and see the benefit of integrating different tools to improve productivity and focus. (Read more about the MCP vs CLI discussion and why CLI tools are beating MCP for AI agents.)
  • Spec-driven development forces clarity before implementation: this is something we engineers should always have done - receiving well-described tickets with clear requirements, then spending time thinking about the problem and the solution rather than diving straight into coding. Spec-driven development finally forces us to do that.
  • Agent-based architecture with shared state scales simply
  • Learning loops (corrections.md) turn failures into improvements

What Required Iteration

  • AI defaults to monolithic solutions; modular design requires explicit guidance
  • Generated code needs critical review before execution
  • Edge cases emerge during testing that specs don't anticipate
  • Despite refined prompts and clear guidance, context rot and hallucination are relatively common

Key Lessons

  1. Use AI for discovery and data gathering before making decisions
  2. Document requirements and design - specs are thinking tools
  3. Iterate on design; don't accept the first solution
  4. Review generated code critically
  5. Test incrementally before full-scale execution

Conclusion

By taking on this side-quest - likely an improvement ticket that would have been postponed for months, if ever tackled, due to low urgency and low importance (and almost a day of effort) - I practiced AI-native development workflows, built reusable automation patterns, and experimented with spec-driven development and the Ralph Wiggum loop.

When faced with tedious, automatable work, using it as an opportunity to experiment and learn is more valuable than optimizing for immediate completion. The patterns and experience gained here will apply to far more complex problems in the future.

Hope it helps!
