How a wrong turn in San Francisco led to building a Unified Agent Experience (AX) and pragmatic memory for AI coding agents
After an amazing experience exhibiting at the ODSC AI West Conference in San Francisco (October 28–30), I decided to stay in the city for a few extra days, partly to explore and partly to experience the city's AI energy. I'd heard so much about the hype, energy, and vibes on social media, and I wanted to experience it firsthand. On the morning of October 31st (Halloween), I set out early to explore without any agenda or plan. I called an Uber to drop me close to the OpenAI office on 18th Street, excited to soak in the AI atmosphere. On the way, I saw huge AI billboards lining the highways. When I got dropped off, the experience was... not what I expected. I found myself surrounded by a group of people on a still-dark street. They were shouting and acting erratically. For a moment, I felt genuinely unsafe. I managed to get out of the situation by stepping into a local shop, where the shop owner helped me find my way to the Financial District. Enough about that. What I really want to share is what happened next: the vibrant, electric AI scene that made San Francisco live up to its reputation.
In San Francisco: The Financial District
After reaching the Embarcadero area and walking around, I needed a place to charge my phone and MacBook. It was still early morning. I started looking for a co-working space nearby, and Maps pointed me toward something called the AWS Builders Loft. I assumed it was simply a co-working hub run by AWS but decided to check it out. When I arrived, I asked the front desk if they had co-working space available. The receptionist mentioned there was an event going on upstairs, so he wasn't sure about the space, but he was kind enough to let me try anyway. I went up to the second floor, and the moment I stepped in, I could sense the energy of an AI coding hackathon in full swing: developers hunched over laptops. I told the receptionist on the second floor that I had come from London and would love to join. She asked me to register on Luma, checked my passport, and let me in. The event was called Kiroween, a clever mix of "Kiro" and "Halloween."
I'd recently heard about Kiro at a prompt engineering conference in London, where Ricardo Sueiras gave a talk right before mine. But I'd never tried Kiro myself. I've always been skeptical of VS Code forks and AI-driven IDEs like Cursor or Windsurf. I loved coding with CLI tools and lightweight text editors, and I didn't want to go back to VS Code forks, Kiro included. I had no intention of trying Kiro anytime soon, and probably not ever.
On another note, in recent months, since the AI boom, I've noticed a shift in the hackathon landscape. Many AI hackathons today feel less like genuine innovation events and more like user acquisition campaigns funded by VC money, offering modest prizes and platform credits in exchange for engagement. The authentic spirit of pre-AI tech hackathons, where builders came together purely to create and learn, seems to have faded. Don't get me wrong: hackathons still offer valuable opportunities for free food, platform credits, and making new connections. But as a founder, I've become increasingly selective about which events I join. Why invest time building on someone else's platform, potentially navigating IP complexities, when that energy could go toward my own startup?
But here I was, in the middle of San Francisco at the AWS Builders Loft, surrounded by Silicon Valley builders hacking away. A kind lady at the registration desk handed me a badge and pointed me to the breakfast bar. It felt like fate: breakfast, the possibility of a free lunch, a place to charge my phone and Mac, and coding, all in one place in the heart of San Francisco. I figured, why not stay for a few hours and give Kiro a try instead of paying for co-working space elsewhere?
The AWS office was vibrant and inspiring. After a few attempts to get my UK-to-US adapter working, I finally settled down, downloaded Kiro, and officially joined the Kiroween Hackathon kickoff event.
First Impressions of Kiro: The Good Parts
Kiro is marketed as structured AI coding with spec-driven development. I've been hearing a lot about Spec-Driven Development (SDD) for agentic coding lately, especially since GitHub launched SpecKit and AWS launched Kiro. Some startups in the Bay Area and in London have also been promoting SDD practices recently, but I was still not convinced by their case for Spec-Driven Development or by what it would achieve in the agent space. The debate is still on about whether Spec-Driven Development is real or just hype, fueled by articles from ThoughtWorks and Marmelab and by a talk at the recent AI Engineer Code Summit from Dex Horthy of HumanLayer.
As someone who has always appreciated TDD and BDD practices, has used RSpec and Cucumber since 2012, and has implemented BDD practices at major companies like AOL and BCC, I can pick up the ideas and concepts pretty quickly. At Superagentic AI, we've applied similar principles to our own work, in particular through SuperOptiX and our SuperSpec DSL, which lets users define agent specifications in a human-readable way and follows TDD/BDD principles in the development of AI agents. But this was my first experience with an IDE that directly supports Spec-Driven Development as a core workflow. I launched Kiro for the first time on my MacBook.
The onboarding experience with Kiro was smooth and intuitive. The setup was quick, and I had a project running within minutes. The only caveat was that I needed an existing project, either on disk or remotely on GitHub. I wish Kiro had given me a template project to get started. Anyway, with uv I spun up a new Python project within seconds to get ready for Kiro. What immediately stood out to me was how Kiro structured development around specifications written in the same language used to build the software, reminiscent of frameworks like RSpec or Cucumber, but fully integrated into the IDE.
Kiro divides the workflow into three logical stages, much like a Product Owner, a Technical Architect, and Developers working together.
- Requirements (Product Owner/Business Analyst)
This is where you write high-level requirements, user stories, and acceptance criteria, exactly how business analysts and development teams have always done. Stories follow the familiar As a… I want… So that… format. Remember the top part of the Gherkin feature file?
This is followed by scenario-style acceptance criteria. However, the acceptance criteria in Kiro didn't follow the proper Given/When/Then syntax. It looks a bit awkward, but I can treat it as a new DSL to learn if needed.
- Design (Tech Lead and Architects)
Here you define the system's technical architecture: not the implementation code, but the conceptual design and structure. It's the space where tech leads and developers brainstorm the architecture before diving into the code. Think of this as technical architecture documentation. The design step here means technical architecture, not the kind of design that designers do in Figma. I wish this step were called "Architect" or "Plan" rather than "Design".
- Tasks (Developers)
This is where the actual implementation tasks are defined. You can skip unnecessary tasks, view individual changes, and observe execution details in real time. You can watch your tasks being executed in a modular way, which is amazing: it's like assigning tasks to developers from a project management tool like JIRA, except that it all lives inside the IDE now.
This structure makes Kiro feel like a truly behavior-driven IDE, bringing the principles of RSpec and Cucumber to modern AI-driven development. It reminded me of two books I bought back in 2013, The RSpec Book and The Cucumber Book, which feel more relevant than ever in the era of agentic coding. It felt like going back to 2013, with the modern touch of AI agents.
- Lightweight and Focused
Despite my skepticism toward heavyweight AI IDEs, Kiro genuinely surprised me. Having used numerous editors including VS Code and its various forks, I've grown accustomed to the trade-off between features and performance. Most feel heavy and resource-intensive, especially when AI capabilities are layered on top. Kiro, by contrast, felt remarkably lightweight and responsive, comparable to Zed with only a slight overhead. There's clearly thoughtful engineering happening under the hood. The experience is noticeably smoother than most IDEs in its class, which suggests the team has prioritized performance alongside functionality. For developers who value a snappy editing experience, this is a meaningful differentiator.
First Impressions of Kiro: Limitations
Kiro shows tremendous promise in bringing TDD and SDD concepts to software development with AI coding agents. However, coming from extensive use of Claude Code, Codex, and other CLI-based tools, I spotted several areas for improvement at first glance.
Model Support
Currently, Kiro supports only a limited selection of models, primarily Claude models and an "Auto" mode. For developers who rely on model diversity, this is a significant limitation. My typical workflow spans multiple models for different phases of development:
Research: GPT, Grok
Planning: Gemini (for its broad web coverage)
Architecture/Code: Claude or Qwen
Since Kiro doesn't yet integrate with local models or allow flexible model selection, I couldn't fully apply my preferred workflow.
Spec DSL in Requirements.md
The domain-specific language used in Kiro for writing requirements doesn't feel entirely consistent with established frameworks like RSpec or Cucumber's Gherkin. It blends elements from both but doesn't fully adopt either style, using keywords such as WHEN, THE, and WHERE. While readable, this feels slightly unconventional for developers familiar with traditional BDD syntax. That said, the natural language approach is promising and could evolve into a strong industry standard with further refinement.
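To make the syntax difference concrete, here is a minimal Python sketch that labels a single acceptance-criterion line by the keywords it uses. The `classify_criterion` helper, the keyword lists, and the sample sentences are assumptions made for illustration. This is not Kiro's actual grammar or a real Gherkin parser.

```python
# Illustrative only: a rough keyword check, not Kiro's grammar or a Gherkin parser.
GHERKIN_STEPS = ("Given", "When", "Then", "And", "But")  # title-case step keywords
UPPERCASE_KEYWORDS = ("WHEN", "WHERE")                   # uppercase openers seen in Kiro specs


def classify_criterion(line: str) -> str:
    """Return 'kiro-style', 'gherkin', or 'unknown' for one criterion line."""
    words = line.strip().split()
    if not words:
        return "unknown"
    # Single-sentence style: an uppercase opener followed later by an uppercase THE.
    if words[0] in UPPERCASE_KEYWORDS and "THE" in words[1:]:
        return "kiro-style"
    # Gherkin style: each step is its own line starting with a step keyword.
    if words[0] in GHERKIN_STEPS:
        return "gherkin"
    return "unknown"


if __name__ == "__main__":
    # Both sample sentences are made up for this example.
    print(classify_criterion("WHEN a user submits valid credentials THE system SHALL create a session"))
    print(classify_criterion("When the user submits valid credentials"))
```

A check like this is only good for spotting which convention a file follows. Converting between the two styles would need a real parser.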
Task Planning
Kiro's tasks.md file is a valuable feature, listing all generated tasks in one place. However, it sometimes creates tasks that aren't necessary, which can disrupt developer flow, especially when tasks have dependencies. Making the task list more editable, allowing developers to easily prune irrelevant tasks, would significantly streamline the experience.
Testing Integration
There are ways to write tests as part of a task, but Kiro did not turn the clearly refined acceptance criteria into executable specifications that could automatically become API or UI tests.
Executable Specifications and Living Documentation
One of the most exciting opportunities for Kiro lies in making specifications executable. In TDD and BDD, executable specs naturally become tests, eliminating the need for separate test suites. If business stakeholders could run these specs directly to validate requirements, it would create a powerful "living documentation" system, aligning business and technical teams around a single source of truth.
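As a rough illustration of what "executable" could mean here, the Python sketch below turns plain-text acceptance criteria into pytest-style test stubs that fail until someone implements them. The helper names and the sample criteria are hypothetical. Neither Kiro nor any other tool mentioned in this post is claimed to work this way.

```python
import re


def criterion_to_test_name(criterion: str) -> str:
    """Turn a criterion sentence into a valid pytest-style function name."""
    slug = re.sub(r"[^a-z0-9]+", "_", criterion.lower()).strip("_")
    return f"test_{slug}"


def render_test_stubs(criteria: list[str]) -> str:
    """Render one pending test function per acceptance criterion."""
    blocks = []
    for criterion in criteria:
        name = criterion_to_test_name(criterion)
        blocks.append(
            f"def {name}():\n"
            f'    """{criterion}"""\n'
            f'    raise NotImplementedError("pending: implement this check")\n'
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Sample criteria are made up for this example.
    print(render_test_stubs([
        "User can log in with a valid password",
        "Account locks after five failed attempts",
    ]))
```

The point of the sketch is the direction of travel: each criterion keeps a one-to-one link to a named test, so a failing or missing test points straight back to the requirement it came from.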
Limitations as Inspiration
Rather than viewing these limitations as dealbreakers, I saw them as opportunities. These gaps sparked an idea: what if I could build something that addresses these challenges while benefiting both Kiro and the broader agentic coding ecosystem?
The Kick-Off & Interview at the AWS Builders Loft San Francisco Office
The Kiroween Hackathon kick-off itself was electric. The AWS Builders Loft had an incredible atmosphere filled with creativity and collaboration. I met several amazing builders and founders throughout the day, had enriching conversations, and shared ideas about AI development. I also enjoyed a friendly San Francisco vs London argument with Aymen. I had the chance to give a short interview to the Kiro team, sharing my first impressions and how I could see Kiro fitting into my future workflow. I look forward to seeing that interview published on Kiro or Devpost channels soon. Participating in Kiroween turned what was meant to be a casual day of sightseeing into one of the most memorable and productive experiences of my trip. A big thank you to AWS Builders Loft, the AWS team, and everyone at the hackathon who helped me with setup and made me feel welcome. Thanks to Helen, Vinni, and Erik for making my day memorable with the interview. It was an unforgettable experience.
Back to London: The Problem That Kept Nagging Me
I came back to London with amazing memories of showcasing the products at ODSC AI and of the Kiroween kick-off at the AWS Builders Loft, but the current problems in agentic coding kept nagging me. After returning to London, I got pulled into attending conferences, business shows, hosting meetups, and various other commitments. I couldn't start working on the hackathon project until November 30th, when my Kiro credits finally loaded. I had only 6 days left to build something. And with the SaaStr AI London conference on December 1-2, I realistically had 4 days to work on the Kiroween hackathon project. Meanwhile, Gemini 3 and Claude Opus 4.5 launched with massive buzz, adding to the momentum in the agentic coding space.
Markdown Madness in Agentic Coding
AI conferences and talks everywhere discussed the overload of markdown files that Claude Code generates. Experts shared advice on writing CLAUDE.md and AGENTS.md files effectively. Some promoted how they turned coding agents into better code reviewers by sharing hacked markdown files as prompts, often thinly veiled product promotions. It felt less like engineering and more like a prompting guide for coding models. This approach troubled me. Every provider was trying to lock developers into their specific coding agent through proprietary file formats and prompting strategies. Developers were drowning in markdown madness, following prompting guides that varied wildly between tools. All those file-system-based approaches to generate context felt... messy.
AI Engineer Code Summit: Everyone Promoting Their Own Approach
At the AI Engineer Code Summit in New York, talks covered Agent Skills by Anthropic, Antigravity by Google DeepMind, and various tools advocating their own ways to build context. Some companies like Amp were taking interesting, opinionated approaches combining large and small language models. Dex claimed outright that Spec-Driven Development is broken, referencing a ThoughtWorks blog post arguing that specs are just detailed prompts. Swyx mentioned the Fast Agent approach from Devin. I watched every single talk carefully. What emerged was a pattern: everyone was promoting their own prompting techniques with different terminology. Context Engineering. Skills. Harness Engineering. Eval Engineering. Compressed Context. The concepts overlapped, but the branding differed. Another recurring theme was "don't outsource thinking" and "keep humans in the loop." I appreciated Amp's opinionated approach, but nobody talked about how to optimize the code or prompts that go into coding agents in a portable way. I kept thinking: this space is getting so messy. How can developers switch coding agents without rebuilding context again and again?
Why are providers trying to lock developers into their ecosystems with proprietary prompting strategies and ideologies? They promote modular approaches for using coding models and embedding models, but nobody is talking about being modular when it comes to selecting coding agents. One thing became clear: a lot of this mess is caused by the file-system-based approach to gathering context.
Does Spec-Driven Development Really Work?
Spec-Driven Development is being criticized, particularly in blog posts by ThoughtWorks and Marmelab. Coming from the TDD/BDD world, I'd experienced firsthand how few developers actually practice Test-Driven Development. Hardly anyone does it consistently. So why are we spending so much time reviewing both specs and code (double work)? What happens to specs once they're implemented? They rot. I also tried other SDD frameworks like GitHub SpecKit and browsed Tessl during this time. I noticed something troubling that crystallized the pain points:
Problems with Spec-Driven Development:
- Too verbose, feels like waterfall
- Specs rot after features ship
- Nobody maintains the documentation
- Bureaucratic gates that slow development
But pure vibe coding was equally brittle:
- Agents forget everything between sessions
- No constraints, no memory
- Repeated mistakes
- Unpredictable behavior
What's the middle ground here? The Fast Agent approach by Devin, Harness Engineering by HumanLayer, semantic search by Cursor, or the model-based approach by Amp? Or the Kiro and SpecKit approach of Spec-Driven Development? There are no clear answers. Specs are necessary, but they shouldn't be over-engineered. I kept thinking: the specs generated by these tools could be reused somehow, whether for living docs, executable tests, or... memory.
The Agent Experience and Agent Memory Paradigm
Having spent the last 10+ years in Developer Experience roles, building tools and frameworks to make developers productive, I realized something fundamental was shifting. Developer Experience is evolving into Agent Experience. I was inspired by Netlify's CEO, Matt Biilmann discussing the Agent Experience paradigm and made it one of the core pillars for Superagentic AI. We're not building for human developers anymore. We're building for agents. With all this chaos happening in the coding agent space, I thought: why not build the Agent Experience layer for coding agents by giving them Pragmatic Memory?
As far as I could tell, nobody had thought about building centralised memory as context for coding agents that could be used across various tools and tasks. Memory where relevant context would be retrieved dynamically by the agent, regardless of which coding agent you're using. Memory that doesn't blindly follow Anthropic's file-system-based approach. Memory that keeps development spec-centric without the bureaucracy. Agentic memory has been an active research topic, and cool tools like Zep, mem0, and Letta are evolving, so I thought this could be a great opportunity to build agent memory for coding agents.
That's when it hit me. Specs as Memory. Memory for Specs.
SpecMem was born.
What I Built at Kiroween: SpecMem
SpecMem is the first-ever Agent Experience (AgentEx) platform: a unified, embeddable cognitive memory layer for AI coding agents.
The Burning Problems I Tried to Solve
- Developers Are Drowning in Markdown
CLAUDE.md, AGENTS.md, .cursorrules, requirements.md, design.md, tasks.md... the list grows with every feature. These specifications represent hours of careful thought. But what happens after the feature ships? They rot. They're forgotten. They become digital dust. Agentic coding needs a more scalable approach than file-system search for gathering context.
- AI Coding Agents Have Amnesia
Modern coding agents suffer from catastrophic forgetting. Sessions reset, context is lost, previous decisions vanish. Agents write code without knowing your specs, acceptance criteria, or earlier decisions. Agentic coding needs dedicated memory that can be retrieved on demand.
- Vendor Lock-In Is Real
Every coding agent uses its own proprietary format. Claude uses CLAUDE.md, Cursor uses .cursorrules, Kiro uses .kiro/specs/. Switching agents means rewriting all your specs. Your project knowledge is trapped in one tool. Agentic coding tools need modularity.
- No Agent Experience Layer Exists
We have DevEx (Developer Experience) for humans. But where is the Agent Experience layer for AI coding agents? There's no unified memory layer, no context optimization, no impact analysis.
Key Features Built
- Framework Adapters: Pro support for Kiro, limited support for SpecKit and Tessl, and experimental support for non-spec-driven coding agents like Claude Code, Cursor, Codex, Factory, Warp, and Gemini CLI
- Cognitive Memory: Vector-based semantic search with LanceDB, ChromaDB, Qdrant, or AgentVectorDB
- SpecImpact Graph: Bidirectional relationships between specs, code, and tests
- SpecDiff Timeline: Track spec evolution, detect drift, find contradictions
- SpecValidator: 6 quality rules for specification health
- Spec Coverage: Map acceptance criteria to tests, identify gaps
- Health Scores: Project health grades (A-F) with improvement suggestions
- Web UI: Interactive dashboard with live sync and WebSocket updates
- GitHub Action: CI integration with PR comments and configurable thresholds
- MCP Server: Native Kiro Powers integration via Model Context Protocol
- Multiple CLI Commands: Full-featured command-line interface
- Python API: Programmatic access via SpecMemClient
The Killer Feature: Swap agents without losing context.
SpecMem creates a unified, normalized, agent-agnostic context layer. Switch from Kiro → Claude Code → Cursor → SpecKit without rewriting spec files or losing project knowledge. Your specifications become portable. Your memory persists. Your agents remember.
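To illustrate the idea of an agent-agnostic layer, here is a minimal Python sketch that maps files from different tools onto one common record type. It is not SpecMem's adapter code: `SpecRecord`, `detect_source`, and `normalize` are hypothetical names, and the only details taken from this post are the file conventions themselves (CLAUDE.md, .cursorrules, and .kiro/specs/).

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SpecRecord:
    """One normalized piece of project knowledge, whatever tool produced it."""
    source_agent: str
    kind: str
    path: str
    text: str


def detect_source(path: str) -> tuple[str, str]:
    """Guess (agent, kind) from a repo-relative path. Hypothetical rules."""
    p = PurePosixPath(path)
    if ".kiro" in p.parts and "specs" in p.parts:
        return "kiro", p.stem  # requirements, design, or tasks
    if p.name == "CLAUDE.md":
        return "claude-code", "instructions"
    if p.name == ".cursorrules":
        return "cursor", "rules"
    return "unknown", "document"


def normalize(path: str, text: str) -> SpecRecord:
    """Wrap raw file content in the common record type."""
    agent, kind = detect_source(path)
    return SpecRecord(source_agent=agent, kind=kind, path=path, text=text.strip())


if __name__ == "__main__":
    record = normalize(".kiro/specs/login/requirements.md", "As a user, I want to log in.\n")
    print(record.source_agent, record.kind)
```

Once everything is a `SpecRecord`, whatever sits downstream (search, validation, a dashboard) no longer needs to know which agent wrote the file.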
Pragmatic SDD: The Balance Struck
Pure Spec-Driven Development feels like waterfall. Pure vibe coding is chaos.
SpecMem strikes the balance:
- Specs as Memory: Not bureaucratic gates, but searchable knowledge
- Selective Context: SpecImpact gives agents only relevant specs, not everything
- Living Docs: SpecDiff detects drift, SpecValidator finds contradictions
- Gradual Adoption: Start with any format, no big-bang migration
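The "specs as searchable knowledge" idea can be sketched in a few lines of Python. The toy below ranks spec snippets against a query using word counts and cosine similarity. That is a deliberate stand-in: a real setup would use an embedding model and a vector database, and nothing here reflects SpecMem's actual implementation. The function names and sample specs are made up.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag of lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, specs: dict[str, str], top_k: int = 2) -> list[str]:
    """Return up to top_k spec ids with non-zero similarity, best match first."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(text)), spec_id) for spec_id, text in specs.items()), reverse=True)
    return [spec_id for score, spec_id in scored[:top_k] if score > 0]


if __name__ == "__main__":
    specs = {
        "auth": "Users log in with email and password",
        "billing": "Invoices are sent monthly by email",
        "export": "Reports export to CSV",
    }
    print(search("password reset", specs))
```

The useful property is that the agent asks a question and gets back only the handful of specs that match, instead of reading every markdown file in the repository.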
SpecMem 😍 Kiro: First-Class Integration for Spec-Driven Development
SpecMem was born during Kiroween 2025 with one mission: make Kiro's Spec-Driven Development workflow even more powerful. We've built first-class support for Kiro IDE with native adapters, MCP server integration, and seamless workflow enhancements that feel like they've always been part of Kiro.
⚡ Kiro Powers Integration: Install SpecMem as a Kiro Power and unlock persistent memory for your coding agent. Query specs without leaving Kiro, analyze impact in real-time, and get context-aware suggestions that understand your entire project history. Your agent finally remembers.
🔗 MCP Server: Full Model Context Protocol support means Kiro's agent can query your specifications, analyze change impact, and retrieve optimized context automatically. No manual copy-pasting. No context switching. Just intelligent, on-demand memory that knows what your agent needs.
📄 Native Kiro Adapter: SpecMem understands .kiro/specs/ structure natively. Your requirements.md, design.md, and tasks.md files are parsed into searchable, semantic memory. Every user story, acceptance criterion, and design decision becomes queryable knowledge.
🎯 Visualize Your Specs: Build the SpecMem dashboard to see your Kiro specifications come alive. Validate them against tests, detect drift, track coverage, and generate health scores. Show this dashboard to your Product Owner and watch their face light up when they see living, trackable specs.
⚙️ CI/CD Integration: Add SpecMem to your GitHub pipelines to validate specs and generate coverage reports, just like you do for test coverage. Treat spec quality as a first-class citizen in your delivery process.
🔍 Smarter Pull Requests: Integrate SpecMem into your PR workflow. Get instant insights on specification impact, coverage gaps, and potential drift with every code change. Catch spec issues before they merge.
🧠 Specs as Memory: Index your Kiro specifications using your preferred vector database—LanceDB, ChromaDB, or Qdrant. Transform static markdown into searchable, semantic memory that coding agents can query across sessions.
⚡ Selective Testing: Run SpecMem against your code changes to identify only the impacted tests. When you modify auth/service.py, SpecMem knows which specs are affected and which tests to run. Save CI time, reduce compute costs, and accelerate your feedback loop.
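Selective testing of this kind boils down to walking a graph from changed code to specs to tests. The Python sketch below shows one way that could look. The `ImpactGraph` class, the `spec:`/`code:`/`test:` node prefixes, and the sample links are assumptions for illustration, not SpecMem's data model or API.

```python
from collections import defaultdict


class ImpactGraph:
    """Tiny bidirectional graph linking specs, code files, and tests."""

    def __init__(self) -> None:
        self._edges: dict[str, set[str]] = defaultdict(set)

    def link(self, a: str, b: str) -> None:
        """Record an undirected relationship between two nodes."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def neighbors(self, node: str, prefix: str) -> list[str]:
        """Neighbors of a node whose id starts with the given prefix."""
        return sorted(n for n in self._edges.get(node, ()) if n.startswith(prefix))

    def tests_for_change(self, code_path: str) -> list[str]:
        """Changed file -> affected specs -> tests linked to those specs."""
        tests: set[str] = set()
        for spec in self.neighbors(f"code:{code_path}", "spec:"):
            tests.update(self.neighbors(spec, "test:"))
        return sorted(tests)


if __name__ == "__main__":
    graph = ImpactGraph()
    graph.link("spec:auth-login", "code:auth/service.py")
    graph.link("spec:auth-login", "test:tests/test_login.py")
    graph.link("spec:billing", "code:billing/invoice.py")
    graph.link("spec:billing", "test:tests/test_invoice.py")
    print(graph.tests_for_change("auth/service.py"))
```

Because the links are stored in both directions, the same structure can answer the reverse question too: given a spec, which files implement it.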
SpecMem amplifies Kiro. Your Kiro specs become living documentation, your agent gains persistent memory, and your workflow stays intact. That's the power of Agent Experience.
You can call it Pragmatic SDD. SpecMem is on GitHub, and its documentation is available to browse.
Watch SpecMem in Action
The Hackathon Submission Drama: A Race Against Time
With only 4 days to build SpecMem, I was coding until the final hour. With 10 minutes left, I discovered the hackathon required a demo video. I rushed through recording, answered questions with "N/A" where possible, and waited as YouTube crawled through the upload. I clicked Submit. The browser spun. Then: "Sorry, this hackathon is no longer accepting submissions." My heart sank, not because I might miss an opportunity, but because the Kiro team wouldn't see the work. I wanted it to be visible so it could help improve Kiro and the agentic coding ecosystem. I immediately emailed the organizer with a screenshot showing the seconds-late submission. Within minutes, they sent a late submission link. I rushed through again and got it in.
15,000+ lines of code. 14 major features. Built in 4 days. Submitted with seconds to spare.
How Can Kiro Users Use SpecMem Right Now?
SpecMem is published on PyPI and available on GitHub. Kiro users can start using it today.
Here's what you can do:
- Visualize Your Specs: Build the SpecMem dashboard to visualize your specifications, validate them against tests, and detect drift. Host it as GitHub Pages for team collaboration. Show this dashboard to your Product Owner or Business Analyst and watch their face light up.
- Integrate with CI/CD: Add SpecMem to your GitHub pipelines to validate specs and get coverage data, just like you do for test coverage. Catch spec issues before they reach production.
- Enhance Pull Requests: Add SpecMem to your PR workflow to get insights on specification impact, coverage gaps, and potential drift with every code change.
- Index Specs as Memory: Use your favorite vector database (LanceDB, ChromaDB, Qdrant) and embedding models to index your specs as searchable memory for coding agents.
- Run Selective Tests: Use SpecMem against your code changes to identify only the tests that need to run, saving CI time and compute costs.
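To show what a spec-coverage check in a pipeline could boil down to, here is a minimal Python sketch. It is not how SpecMem computes coverage or health grades: the mapping format, the helper names, the grade thresholds, and the 0.7 gate are all assumptions made for illustration.

```python
# Illustrative sketch only: not SpecMem's real data model or scoring rules.

def coverage(criteria_to_tests: dict[str, list[str]]) -> float:
    """Fraction of acceptance criteria that have at least one linked test."""
    if not criteria_to_tests:
        return 0.0
    covered = sum(1 for tests in criteria_to_tests.values() if tests)
    return covered / len(criteria_to_tests)


def uncovered(criteria_to_tests: dict[str, list[str]]) -> list[str]:
    """Ids of criteria with no linked tests."""
    return sorted(c for c, tests in criteria_to_tests.items() if not tests)


def grade(score: float) -> str:
    """Map a 0..1 score to a letter grade. Thresholds are assumed."""
    for threshold, letter in ((0.9, "A"), (0.8, "B"), (0.7, "C"), (0.6, "D")):
        if score >= threshold:
            return letter
    return "F"


def passes_gate(score: float, threshold: float = 0.7) -> bool:
    """CI-style gate: fail the build when coverage drops below the threshold."""
    return score >= threshold


if __name__ == "__main__":
    links = {
        "AC-1": ["tests/test_login.py::test_valid_password"],
        "AC-2": ["tests/test_login.py::test_locked_account"],
        "AC-3": ["tests/test_login.py::test_rate_limit"],
        "AC-4": [],
    }
    score = coverage(links)
    print(f"coverage={score:.0%} grade={grade(score)} missing={uncovered(links)}")
```

In a pipeline, the value returned by `passes_gate` would decide the exit code, the same way a test-coverage threshold does today.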
Get Started: Browse the documentation to see where you can plug SpecMem into your existing Kiro projects.
What's Next for SpecMem in terms of Features
Since SpecMem is submitted as a hackathon project, I can't touch the codebase during the judging period. However, I'll be forking it to my personal GitHub and continuing development in parallel. The hackathon was just the beginning.
Short Term
My immediate focus is on Spec-Driven Development coding agents, specifically Kiro, GitHub SpecKit, and Tessl. I want to make those adapters more robust so users can fully leverage the specs generated by these frameworks, transforming them into living documentation, searchable memory, and actionable insights. Currently, SpecMem can run as a GitHub Action to lint specifications, detect drift, and map tests to acceptance criteria. I want to take this further by providing valuable, contextual feedback directly on GitHub Pull Requests, helping teams catch spec issues before they merge.
Additional short-term priorities include:
- Enhanced semantic search with better relevance ranking
- Support for more vector databases and embedding models
- Improved SpecMem dashboard that serves both developers and product owners
Medium Term
SpecMem Cloud: A hosted solution for teams who prefer not to self-host the SpecMem dashboard. Connect your GitHub repository containing Kiro, SpecKit, or Tessl specs, and SpecMem handles the rest.
Real-time Collaboration: Multi-user support where spec changes trigger notifications and keep teams synchronized.
Native Support for Non-SDD Agents: Bind code to specs for coding agents that don't follow Spec-Driven Development, including Claude Code, Cursor, Windsurf, and others. Bring pragmatic memory to every coding agent, regardless of their native approach.
Long Term
Yes, there's a long-term vision. Interested in where this is heading? Reach out. I'd love to connect.
The SpecMem Vision
SpecMem is redefining Agent Experience for coding agents. By introducing Pragmatic Memory, we're making coding agents smarter, more context-aware, and more effective. More importantly, SpecMem gives developers the freedom to switch between coding agents based on tasks and capabilities, without vendor lock-in. Your specifications, your memory, your choice of tools.
A Note on the Competitive Landscape
Current market leaders like Claude Code, Cursor, Codex, Windsurf, Factory, Amp, Gemini, and yes, even Kiro, may not embrace this approach enthusiastically. Agent portability could potentially disrupt their user retention strategies. Each provider has invested heavily in their proprietary formats and ecosystems. But that's precisely why SpecMem matters: it gives that freedom to the builders themselves.
Beyond the Hackathon
The goal of SpecMem was never simply to impress hackathon judges or increase my chances of winning. It was to address fundamental challenges in the current coding agent space, namely fragmentation, lock-in, and amnesia, by defining a new approach to Agent Experience for coding agents.
My goal is also to help improve Kiro through constructive feedback. I've shared honest observations about Kiro's strengths and areas for improvement because I want to see Kiro succeed in an increasingly competitive landscape. The Kiro team has built something promising, and I hope this feedback helps them come back even stronger.
An Open Invitation
To the incumbents: you're welcome to evaluate these ideas, adopt them, or build similar features. You have the funding, resources, and engineering talent to take this further. Perhaps some of these concepts will inspire new startups or product directions. For Superagentic AI, SpecMem represents a foundation we intend to build upon. We'll continue developing killer features that push the boundaries of what Agent Experience can be. I am happy to collaborate on ideas and concepts if you are interested. The future of coding agents shouldn't lock developers in. It should give their agents the best possible experience, regardless of which tools they choose.
That's the vision. That's SpecMem.
Kiro Usage Experience: Hackathon and Beyond
I used Kiro extensively during the hackathon, exploring nearly all of its features in a compressed timeframe. While the window was short, it was enough to form clear opinions about where Kiro excels and where it needs improvement.
Where Kiro Shines
- Structured Development Workflow: Kiro excelled at keeping my requirements, technical designs, and tasks organized and executable. I could always refer back to what had been implemented, feature by feature. This traceability is genuinely valuable for complex projects.
- Modularity and Steering: Kiro gave me the flexibility to modify specific features without disrupting the entire project. The steering docs allowed me to enforce my own coding standards and rules as I worked.
- Hooks and Powers: These newer features appear powerful. I used hooks effectively during the hackathon, though I didn't find an opportunity to use Powers for SpecMem since the use case didn't require them.
- CLI Launch: The Kiro CLI landed just in time, allowing me to return to my preferred lightweight coding experience rather than relying solely on the IDE.
- Collaborative Experience: It felt like a collaborative experience even though I was working solo on the project, as if I were working with a team of Product Owners, Tech Architects, and developers.
Areas for Improvement
While working with Kiro alone, I found myself wanting a few more features to improve my workflow.
- Limited Model Selection: Kiro's model choices felt restrictive. I ended up using Claude Opus 4.5 for all my work because the alternatives were limited. I wanted to switch models based on tasks, using Gemini for planning and other models for specific purposes. Neither Gemini nor GPT models were available. I avoided the "Auto" mode since there's no transparency about which models it uses under the hood, and I didn't want unexpected disruptions. Kiro also doesn't support local models hosted via Ollama, MLX, SGLang, vLLM, and similar tools. It would also be great to let developers select different models for planning, architecture, and coding, since some models are better suited than others to specific tasks.
- Workflow Friction: Kiro slowed down my workflow significantly. I had to review all generated requirements, which weren't always written properly. When I tried to amend them through the model, the results didn't match my expectations, forcing manual edits. The same applied to architecture designs and tasks. Eventually, I found myself accepting whatever Kiro generated without thorough review, just to maintain momentum.
- Response Times: Executing tasks and generating requirements, designs, and task lists was noticeably slow. I haven't experienced such latency with any other CLI or IDE to date.
- Session Management: Kiro didn't notify me when I was approaching context limits. It summarized my sessions mid-task, and subsequent sessions completely lost the flow. I had to manually copy tasks to new sessions to continue, which was disruptive.
- Forced Workflow Loops: I had no control over when Kiro would cycle through the requirements/design/task loop. When I didn't need this workflow, I resorted to overriding prompts with instructions like "PLEASE DO NOT GENERATE REQUIREMENTS/DESIGN/TASKS."
- CLI Experience: I tried the CLI as soon as it was announced, but soon realised it was at too early a stage to explore fully, since it needed too much manual configuration to get full support. I returned to the IDE.
Again, this is my personal experience of using Kiro as a solo developer on my own hackathon project, which didn't explore the full power of hooks and Kiro Powers. I understand that Kiro is still new and emerging, but it has shown a lot of potential so far. I hope feedback from this hackathon will shape the future of Kiro and of spec-driven development as a whole, and that the SpecMem project has already provided some food for thought. With the right improvements, Kiro could become a serious contender in the coding agent space. I hope this feedback helps the team prioritize what matters most to developers like me.
Kiro at AWS re:Invent: The Future Looks Promising
I recently caught up with the keynotes and talks from AWS re:Invent related to Kiro, and I'm genuinely impressed by the new feature announcements. CEO Matt Garman's keynote unveiled exciting agent announcements including Kiro Autonomous Agent, Security Agent, and DevOps Agent. I'm looking forward to seeing how these play out in real-world scenarios.
Dr. Swami's keynote also highlighted Kiro, reinforcing its strategic importance within AWS's vision. Byron Cook's talk was particularly interesting, discussing how Kiro leverages natural language specifications for both specs and tests, a concept that aligns perfectly with what SpecMem is trying to achieve. I also watched several Lab sessions demonstrating how to use Kiro and the CLI effectively. These hands-on walkthroughs showcased the practical applications and workflow improvements Kiro enables.
The future development of Kiro-specific features within AWS looks solid. The investment and roadmap are clear. I can't wait to see the next releases and how Kiro continues to evolve in the competitive coding agent landscape.
The Takeaway
Sometimes the most meaningful things happen completely by accident. I went to San Francisco to exhibit at ODSC AI and explore the AI vibes. I stumbled into a hackathon. I discovered limitations that sparked an idea. In a few days, I built something that I believe can change how developers work with AI coding agents, or at least mark the start of something new in this space. The coding agent space is messy. Every provider promotes their own file formats, prompting strategies, and context engineering approaches. Developers are drowning in markdown madness while agents forget everything between sessions.
SpecMem introduces a new paradigm: Agent Experience (AgentEx). Just as DevEx optimizes the experience for human developers, AgentEx optimizes the experience for AI coding agents. At its core is Unified Pragmatic Memory, a centralized, agent-agnostic memory layer that lets you switch between coding agents without rebuilding context or losing project knowledge.
Specs shouldn't be documents that rot. They should be memory that agents use on demand. Context engineering shouldn't be forced; it should come naturally.
That's SpecMem.
Links
Landing Page: https://super-agentic.ai/specmem
GitHub: https://github.com/SuperagenticAI/specmem
Documentation: https://superagenticai.github.io/specmem/
SpecMem is developed by Superagentic AI as part of the Kiroween Hackathon, December 2025.
A big thank you to AWS Builders Loft, the AWS Startups team, the Kiro team, and everyone at the hackathon who helped me with setup and made me feel welcome. It was an unforgettable experience.