As a developer who's worked with various IDEs, AI coding tools, and agent-assisted workflows, I recently spent some time using Kiro. It bills itself as an agentic IDE that brings spec-driven development into the mainstream. In this post I'll walk through what's great, what's frustrating, and where the rough edges show, based on my hands-on experience.
What is Kiro
Before we go into the pros & cons, a quick overview to set the context.
- Kiro (by AWS) is an AI-powered IDE built on a VS Code core.
- Its standout idea is spec-driven development (SDD). Instead of just vibe coding (talk with an AI to jump in and code), you start by defining requirements, then design, then implementation tasks.
- Supports "steering" (steering files: product.md, tech.md, structure.md) so the AI agent has real project context to work with. It's kind of like rules, to be honest, but more transparent, since Kiro shows you which steering files were loaded into context.
- Hooks / agent hooks: automations or triggers that run on file events etc., to enforce standards, generate docs/tests, etc.
1. Spec
This feature basically gives you three documents (requirements.md, design.md, and tasks.md) that flow naturally from one another.
- requirements.md is a collection of user stories generated from your prompt.
- design.md is a technical guideline created using both the requirements and your original prompt, with some extra research to make sure it uses correct technical terms, diagrams, and architectural notes.
- tasks.md pulls it all together, breaking the design down into actionable checklists with states like in progress, done, and failed.
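To make the checklist idea concrete, here is a minimal Python sketch of tasks that carry a state. It is an illustration only: the post does not show Kiro's actual tasks.md format, so the `Task` class, the extra `todo` state, and the sample task titles are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskState(Enum):
    TODO = "todo"                # assumed starting state, not named in the post
    IN_PROGRESS = "in progress"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Task:
    title: str
    state: TaskState = TaskState.TODO


def summarize(tasks: list[Task]) -> dict[str, int]:
    """Count how many tasks sit in each state."""
    counts = {state.value: 0 for state in TaskState}
    for task in tasks:
        counts[task.state.value] += 1
    return counts


def next_task(tasks: list[Task]) -> Optional[Task]:
    """Return the first task that still needs work (todo or failed), in list order."""
    for task in tasks:
        if task.state in (TaskState.TODO, TaskState.FAILED):
            return task
    return None


# Hypothetical checklist, standing in for what a tasks.md might break a design into.
tasks = [
    Task("Define data model", TaskState.DONE),
    Task("Implement API endpoint", TaskState.IN_PROGRESS),
    Task("Write unit tests", TaskState.FAILED),
    Task("Wire up UI"),
]

print(summarize(tasks))
print(next_task(tasks).title)
```

Treating a failed task as "still needs work" is a design choice of this sketch, not something the post says Kiro does.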
The Good
In my month of working with Kiro, the Spec feature stood out as a surprisingly useful thinking partner. It's great at taking a rough prompt and slicing it into different perspectives, whether that's from a user's point of view, from a developer's perspective, or through acceptance criteria. The benefit I felt from this structure was the ability to self-assess my own prompt and refine it into manageable chunks with clear success criteria. Sometimes I'd start with a very broad, messy prompt. Running it through requirements.md gave me a spread of user stories that became a kind of prompt inspiration engine: I could select, refine, or even pivot based on what it generated.
The design.md stage was also a game changer. It translated my vague ideas into more concrete, technical language. There were times when I didn't know the exact term or pattern I needed, but the design doc filled that gap, often pointing me toward the right flow, architecture type, or even the correct jargon. It wasn't just generating code; it was helping me learn along the way.
Finally, the tasks.md output turned that high-level thinking into a neat checklist. What I loved most is how it integrated with my workflow. For example, I could tell it to create a new branch and commits from the requirement, and it would produce proper branch names and git messages without me having to overly explain. When I started the tasks, it automatically followed the flow outlined in the spec. In practice, this meant I could automate my git flow through the spec alone, without hand-holding the AI every step of the way.
Another success case I had with the Spec feature was using it to create a design system—basically a UI template guideline. At first, it was just an experiment, because my app looked like every other vibe-coded project. You can’t just tell Kiro “make it pretty,” but you can be precise in the spec prompt. For example:
"Create a spec for a design system. Scan through the #codebase and transform every component into [your style]."
And it worked like magic. In one run, I transformed a standard shadcn look and feel into a full neobrutalist UI style—all from a single Spec. That moment genuinely impressed me. It felt less like AI throwing random CSS at the wall, and more like having a design system generator built right into my workflow. The neat part? You can even use images as references, and Kiro will generate a surprisingly solid design system spec doc around them.
The Bad
My first frustration with the Spec feature is how rigid the flow is. It only comes in three flavors: requirements, design, and tasks. You have to follow them in that order. Let's say I just want a design doc. Nope, I have to generate requirements first. Or maybe I want to jump straight to tasks from my prompt. Again, not possible. I'm forced to walk through the same steps every single time.
What frustrated me even more was the pricing model. Apparently, creating a spec is charged as a "vibe" request because it's generated through the chatbox. But… why? Specs are supposed to be structured, not casual chat. This means every time I try to refine or have a conversation about a spec, I'm burning vibe credits. That felt like paying extra just to use the feature the way it's meant to be used.
The Ugly
The very first time I ran tasks, I noticed how over-eager Kiro was about testing. Unit tests? ✅ Integration tests? ✅ E2E tests? ✅ ✅ ✅. It was like Kiro assumed I was building software for a billion users with an enterprise QA department on standby. Meanwhile, I wasn’t even sure if I had the core feature right, and my vibe credits were burning away on a mountain of tests—half of which didn’t even pass. Instead of helping me move fast, I ended up with this over-engineered jungle of tests, no working feature, and a sinking feeling that the AI was planning for “999 updates from 999 developers” when in reality it was just me, alone, trying to get one feature working. Yes, this is tweakable via steering and that's exactly what I did. Pro tip: if you're not sure what you're building yet, strip out all the tests and optimizations from the design spec. Even if you are sure, limit them as much as possible. Otherwise, you’ll watch your vibe requests evaporate faster than you can say npm run test.
2. Hooks
Hooks in Kiro are basically automations that trigger on file events. They can enforce standards, generate docs, or even keep your tests up to date. Think of them as little invisible helpers running alongside your workflow.
The Good
Hooks shine when they take care of the boring stuff. Generating API docs when I change an endpoint, scaffolding a test file when I add a new module—those things just happen without me lifting a finger. It really does feel like having a junior dev who quietly cleans up after me, making sure nothing slips through the cracks.
The Bad
The downside is that every hook run costs a full vibe request, no matter how small the automation. If the only available model is Claude Sonnet, I get why it’s expensive—but burning the same credit for a tiny linting fix as for a big multi-file refactor feels… off. I often catch myself hesitating to set up hooks for small, tedious tasks, because the “cost to benefit” just isn’t there.
The Ugly
Here’s where it got frustrating: hooks only trigger if you explicitly save a file or create a new one inside the editor. 🙃 So when I tried to drive automation from outside the IDE, nothing happened. I had a bunch of ideas for how hooks could turbocharge my workflow—but the “automatic” part turned out to be not so automatic. That killed a lot of the excitement for me.
3. Overall Agentic Experience
Kiro calls itself an agentic IDE, and it delivers on that promise in several ways: a chat sidebar, context management, RAG-style indexing of the codebase, terminal execution, and multi-file awareness.
The Good
The context management is excellent, especially with Steering. I found steering a far better approach than static rules or agent.md files. You can create multiple steering files and decide which ones apply to which folders or files, which gives you far more control.
And then there's the main course: the chat sidebar. This is where agentic coding really shines. Powered by Claude Sonnet 4, Kiro can handle long prompts, write huge chunks of code (I've seen it generate 2k+ lines in one go), and search context quickly with the #codebase keyword. With a clear, well-structured prompt, it rarely fails. When it works, it feels like coding with a diligent, context-aware partner.
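The idea of scoping steering files to folders or files can be pictured with a small Python sketch. This is a conceptual illustration, not Kiro's actual mechanism: the `STEERING_SCOPES` mapping, the `frontend.md` and `api.md` file names, and the glob patterns are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical mapping: steering file -> glob pattern for the paths it applies to.
# product.md and tech.md come from the post; the other names are made up.
STEERING_SCOPES = {
    "product.md": "*",                # applies everywhere
    "tech.md": "*",                   # applies everywhere
    "frontend.md": "src/components/*",
    "api.md": "src/api/*",
}


def steering_for(path: str) -> list[str]:
    """Return the steering files whose pattern matches the given file path."""
    return sorted(
        name for name, pattern in STEERING_SCOPES.items() if fnmatch(path, pattern)
    )


print(steering_for("src/components/Button.tsx"))
print(steering_for("src/api/users.ts"))
print(steering_for("README.md"))
```

The point of the sketch is that only the context relevant to the file being edited gets loaded, which is what makes scoped steering feel more controllable than one global rules file.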
The Bad
But the agent isn’t perfect. Terminal execution, for example, constantly hangs unless I type commands manually. And yes—it still insists on generating tests (sometimes too many). Those are annoyances, but the real headache is pricing. Kiro’s vibe requests sit in a strange middle ground. The free tier is generous (50 per month), but I’ve blown through all of them in a single 2-hour session. Each task the agent works on takes about 5–10 minutes, and with back-and-forth clarification, 20 messages is enough to wipe you out. The pricing feels confusing and opaque: it’s marketed like Windsurf, but tallies up like Cursor—somehow landing in the worst of both worlds.
The Ugly
If pricing is the "bad", then the ugly part is how vibe requests are consumed by context summarization. Asking Kiro to summarize context burns 1 vibe request. Then re-entering a refined prompt burns another. It feels like the system punishes you for trying to debug or iterate quickly. More than once, I found myself losing momentum, not because the AI failed, but because I was too busy watching my credits drain away. It's a real buzzkill when you're excited to implement a feature and the meter is working against you.
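A rough back-of-the-envelope sketch shows how quickly this adds up. It assumes a flat cost of one vibe request per chat prompt, per context summary, and per hook run, against the 50-request free tier mentioned earlier. Kiro's real metering may differ, and my own sessions suggest actual consumption can be higher, so treat the function names and the flat-cost model as assumptions.

```python
FREE_TIER_REQUESTS = 50  # monthly free allowance quoted in this post


def requests_used(prompts: int, summaries: int = 0, hook_runs: int = 0) -> int:
    """Total vibe requests, assuming each action costs exactly one request."""
    return prompts + summaries + hook_runs


def remaining(prompts: int, summaries: int = 0, hook_runs: int = 0,
              budget: int = FREE_TIER_REQUESTS) -> int:
    """Requests left in the budget after a session (never below zero)."""
    return max(budget - requests_used(prompts, summaries, hook_runs), 0)


def iterate_cycles_affordable(budget: int = FREE_TIER_REQUESTS) -> int:
    """A summarize-then-reprompt cycle costs 2 requests under this model."""
    return budget // 2


# Hypothetical session: 20 prompts, 5 context summaries, 10 hook-triggering saves.
print(requests_used(20, 5, 10))
print(remaining(20, 5, 10))
print(iterate_cycles_affordable())
```

Even under this optimistic model, a whole month's free tier covers only 25 summarize-and-retry cycles, before counting a single hook run.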
Verdict
Kiro is one of the boldest takes on AI-assisted coding I’ve tried. The Spec feature is structured and surprisingly educational. Hooks are powerful when they work, though they cost too much for small tasks. And the agentic experience is genuinely impressive—but dragged down by confusing pricing and rough edges in automation.
For medium-to-large projects, it feels like a promising glimpse of the future. But for fast-moving, vibe-coded sessions, Kiro’s structure, costs, and quirks can get in the way. It’s not the magic bullet for every dev—but it’s definitely one of the most ambitious steps toward making AI a true coding partner.
I believe more IDEs will soon offer features like planning, documenting, diagnosing, and generating checklists. Right now, Kiro is one of the few that does spec-driven development well, but if it doesn’t refine its approach and fix some of the flaws I’ve highlighted, it risks being left behind as the space matures. The competition is moving fast, and “good at specs” won’t be enough if the overall developer experience feels costly, rigid, or frustrating.