Jérôme Van Der Linden

Originally published at jeromevdl.Medium

AI‑Assisted Software Development — 6 Pitfalls to Avoid

Generative AI isn't just another tool; it's rewriting how we build software. Tools like Cursor and Claude Code, endless LinkedIn hype threads, and YouTube "10x productivity" demos have flooded the space. But most teams chasing the hype are about to learn a hard lesson: speed without discipline creates chaos faster than it creates value.

Most people see the vibe-coding demos and think "let me trade my VS Code for an AI-powered version of it (Kiro, Cursor, etc.) and I'll be the king". They'll produce code faster, sure. But without good practices, proper specifications and design, a test harness and a review process, they'll write code 10x faster that's also 10x messier.

This blog post details 6 concrete pitfalls I've seen - and sometimes fallen into myself - and recommendations to avoid them.

Disclaimer: this blog post represents my personal thoughts and is not endorsed by AWS.

1. Vibe-coding without a plan

I wanted to title this blog post "AI-Driven Development - 6 Pitfalls to Avoid" but that's wrong: developers drive the AI, not the other way around!

Vibe-coding is a bit like enabling auto-pilot in your car: you might arrive at your destination, but there's no guarantee it will be the safest or shortest trip.

With a powerful assistant in your editor, it's tempting to jump straight into prompting and code generation: "Generate a service that does X, Y, Z." Ten minutes later, you have a shiny pull request with ten new files that introduce a new logging library, violate your coding standards, don't respect the project structure, and so on.
Then you end up in a long conversation where requirements (if we can call them that), design (maybe) and implementation details are all mixed together. The result is an endless back-and-forth between the developer and the AI.

Of course, you can also ship the code directly without any code review; it will come back sooner than expected, like a boomerang, in the form of a production bug. In the end, it's a complete waste of time, and what was supposed to make you x times faster ultimately slows you down (bug fixes, rewrites, etc.).

Recommendations:

  • You will see "spec-driven development" everywhere, as if it were the new "agile", but let's face it, it's not new! Go back 20+ years to the "V model" and you'll see nothing has really been invented: you start by defining the requirements and the design before starting any implementation. With AI-assisted development, it's the same: if you want the AI to understand what you want and follow your rules, you need to specify things upfront.
  • You also need to make sure you properly design what you want: how it integrates into the project architecture, what the inputs and outputs are, the business rules, etc.
  • And as a last step before actually implementing anything, plan the work so the AI doesn't diverge. Having a detailed list of tasks avoids the back-and-forth mentioned before and helps the AI stay focused.

If you don't know it yet, I encourage you to have a look at Kiro, a VS Code-based IDE with built-in "spec-driven development". Kiro will help you define your requirements and produce detailed specifications, design documents, and task lists.

2. Providing insufficient context

You can see this pitfall as an extension of the previous one. Most often, specifications and design aren't enough. Without well-defined rules, the AI assistant will produce code that doesn't respect your standards, system architecture, or constraints.

You will have to repeat things you've already said (when I started, I had to remind the assistant every single time to put imports at the top of Python files). You will have to ask for changes again and again before it generates the desired code. At that point, it would have been faster to write it yourself.

Recommendations:

  • Not only do you need to define your requirements, but also the constraints to work within: architecture, compliance, coding standards, non-functional requirements, etc.
  • Create steering documents to specify your constraints. Invest in them early. Treat them as code and share them in the repo; you can even share them more broadly at the enterprise level. Keep them alive: update them when the AI makes mistakes (just like you'd update unit tests to prevent regressions after a bug fix).
  • In addition to steering docs, you can give the assistant access to broader information (enterprise policies, security guidelines, compliance rules, etc.) from a knowledge base, through MCP tools or structured docs.
  • Make your code "discoverable". Models perform better when projects have a clear structure, consistent naming, and solid conventions. Note this isn't specific to AI; it's simply good practice that also helps humans maintain a project.

In Kiro, you can use steering documents to provide rules and constraints. You can also plug in MCP servers so the assistant can retrieve additional information, either third-party servers (from AWS, for example) or your own.
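
To make the "or your own" part concrete, here is a minimal sketch of a custom MCP server exposing internal guidelines as a tool, assuming the official MCP Python SDK (the mcp package); the server name, the tool and the guideline content are purely illustrative:

```python
# guidelines_server.py - a minimal sketch of a custom MCP server exposing
# internal coding guidelines to the assistant. Assumes the official MCP
# Python SDK (pip install mcp); names and content are made up.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("enterprise-guidelines")

# In a real setup, these would come from a knowledge base or internal wiki.
GUIDELINES = {
    "logging": "Use the standard logging module; never print() in production code.",
    "imports": "All imports go at the top of the file, grouped stdlib/third-party/local.",
    "secrets": "Never hardcode credentials; read them from the secrets manager.",
}

@mcp.tool()
def get_guideline(topic: str) -> str:
    """Return the enterprise guideline for a given topic (e.g. 'logging')."""
    return GUIDELINES.get(topic, f"No guideline found for '{topic}'.")

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, so the IDE/assistant can call it
```

Once registered in your assistant's MCP configuration, the model can call get_guideline("logging") whenever it needs the rule, instead of having it repeated in every prompt.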

3. Amassing too much context

At the other extreme, you sometimes provide the assistant with too much context: a very big codebase, lots of MCP servers, and an endless conversation about many different topics. The session accumulates partially related requirements, refactors, bug fixes, and side explorations. Over time, the AI starts mixing up domains and quality gradually degrades.

Some tools provide auto-summarization/compaction of the context, which can help, but it doesn't reliably reconstruct a clean, task-oriented context.

Recommendations:

  • Have one major task per session: treat each significant feature, bug, or refactor as a separate AI session. For each session, provide only the required assets (code, specs, steering docs, MCP servers) - no more. Clear or restart the session when moving to the next task, or as soon as you find yourself telling the assistant "ignore this".
  • Reduce the scope of the requirement: don't ask the AI to build a complete application. Instead, focus on a specific feature or something even smaller, such as a single service. This prevents the AI from going in too many directions and generating tons of poor, useless code.
  • Capture decisions in documentation or steering documents, not in endless conversation. These assets are saved in the repo, while the conversation will be lost at some point. As the context grows, the LLM will have trouble "remembering" and applying everything that was discussed. Steering docs, on the other hand, can be leveraged by you (and your colleagues) in future sessions. It's also a great way to document the project.

Kiro recently introduced the notion of "powers". Powers let you load MCP servers dynamically, based on the user query, rather than preloading all of them and filling the context with hundreds of tool definitions.

For steering documents, Kiro lets you specify an "inclusion mode" to define when to load them, and thus avoid adding every steering doc to the context:

  • always (for common best practices and rules)
  • based on the presence of certain files (for example "components/**/*.tsx" for React components)
  • manual (using the #steering-file-name)

4. Treating AI as a developers-only tool

Unfortunately, today, these tools generally land only in the hands of developers. Product owners still write fuzzy user stories, architects produce PowerPoints and wiki pages, security arrives at the end with a PDF of "blocking issues", and QA becomes the new bottleneck, flooded by dozens of new features released by the developers.

It's not new: organization has always been a factor in success or failure. Agile tried to bring people and skills together to build multi-skilled, autonomous teams. DevOps tried to reduce the distance between developers and ops. But with the developers' (expected) increased velocity, organization becomes even more critical. AI is central here - not leading, but supporting the process.

Recommendations:

  • As discussed above, the context provided to the AI is key to getting the right product and good code. And most of the actors listed above must participate in generating this context: specifications and validation (PO, business, QA), design (architects), security and compliance (security engineers and architects).
  • But it's not just about handing developers markdown files instead of PowerPoints or Word documents to be more AI-readable. It's about these actors using AI themselves, in close collaboration with developers, to provide the best possible input: 
    1. Product owners, business, and QA should now participate in requirements generation with the developers. Keep an iterative approach and continuously refine the requirements during sprints. Note that sprints might be shorter than before: one week or even less. 
    2. During the specification phase, prepare the validation by creating BDD scenarios ("Given / When / Then") that everyone signs off before any code is generated. Leveraging these scenarios and BDD frameworks like Cucumber will drastically help QA validation. I will cover the "test" topic in the next part.
    3. During the design phase, developers can certainly design their software with the help of well-crafted steering documents, but when integrating with other parts of the system or handling non-functional requirements, architects and security should join the developers to ensure alignment.
  • During all these steps, AI generates content, but it's the team's responsibility to review, enrich, and correct it in order to produce the most accurate, ambiguity-free context for the development phase.
  • With this "tool" comes a new organization. Don't throw away what you built with agile; reinforce it: build a strong product-oriented, multi-skilled team, co-build and share the same AI context (specs, design, steering docs, tests), and co-own the result. AI becomes an assistant to the team, not just to the developers.

5. Testing after the fact

We briefly touched on tests in the previous part. AI is quite good at generating tests… that pass against its own generated code. You will tell me it's still better than what most teams do: no tests at all. The problem is that if a business rule is missing from the code, it will also be missing from the tests. Such tests confirm potential biases in the code; they don't challenge them.

Recommendations:

  • Write tests before writing code, between the specification/design and development phases. It's not new; this is called Test-Driven Development (TDD). This practice, as good as it is, has rarely been adopted except by a few purists and craftsmen: the mental model is very different, and writing tests without code feels counterintuitive. But now, with AI, you can generate them easily. This ensures all the requirements are backed by one or more tests before development starts, and it later validates the AI-generated code against those tests.
  • Write tests even earlier, during specification. These tests, written in a dedicated language (Gherkin), can then be "implemented" (with Cucumber, for example) to test the application. These "executable specifications" can be used by QA to validate the behavior of the application; this is called Behavior-Driven Development (BDD). Gherkin and BDD have existed for a very long time, but like TDD, the practice did not spread as it should have. POs and QA often struggle to write good Gherkin, and developers struggle with the glue code. But AI can solve both (see the sketch after this list):
    1. Generate Gherkin specifications with the help/review of POs and QA during specification. 
    2. Write the glue code (with Cucumber or the framework of your choice) during development.  POs and QA get instant validation and avoid post-development bottlenecks.
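
To illustrate, here is a minimal sketch of what such a scenario and its glue code could look like with behave, a Cucumber-style BDD framework for Python. The feature, the step definitions and the OrderService are made up for the example:

```python
# steps/order_discount_steps.py - a minimal BDD glue-code sketch using behave.
# The corresponding feature file (features/order_discount.feature) would contain:
#
#   Feature: Volume discount
#     Scenario: 10% discount above 100 euros
#       Given a cart worth 120 euros
#       When the customer checks out
#       Then the total is 108 euros
from behave import given, when, then

class OrderService:
    """Toy implementation so the sketch is self-contained."""
    DISCOUNT_THRESHOLD = 100
    DISCOUNT_RATE = 0.10

    def checkout(self, amount: float) -> float:
        # business rule captured by the scenario above
        if amount > self.DISCOUNT_THRESHOLD:
            return round(amount * (1 - self.DISCOUNT_RATE), 2)
        return amount

@given("a cart worth {amount:d} euros")
def step_given_cart(context, amount):
    context.amount = amount

@when("the customer checks out")
def step_when_checkout(context):
    context.total = OrderService().checkout(context.amount)

@then("the total is {expected:d} euros")
def step_then_total(context, expected):
    assert context.total == expected, f"expected {expected}, got {context.total}"
```

The Gherkin scenario is written and reviewed with POs and QA during specification; the step definitions (and then the real OrderService) are generated and refined during development, and running behave gives everyone instant feedback.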

6. Over-trusting AI-generated code

As we saw, providing good context and writing tests first is essential to getting better AI-generated code. But that's still not enough! What? You thought that by leveraging AI you'd have almost nothing to do?! You thought "developer" was already a prehistoric job?! Not at all!

Even with solid specs, steering docs, and TDD/BDD in place, AI can still hallucinate, miss requirements, forget some edge cases or introduce unexpected changes in your codebase. Your job now is to control what gets in (context and tests) and what gets out (review and guardrails).

Recommendations:

  • Humans always review AI-generated code.
  • Not just review, but understand the generated code. If you can't explain a function to your colleagues during PR review, don't commit it.
  • Human review alone is not enough. Enforce automated guardrails:
    • CI/CD quality gates: minimum test coverage, linters, security scans, dependency checks, performance tests, documentation, no hardcoded secrets, etc. (see the sketch after this list)
    • Staged rollouts (e.g. 10% of traffic every 10 minutes) and automatic rollback.
    • Manual gates for critical paths (auth, payments, external integrations): keep a human in the loop, AI cannot decide these alone.
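
As an illustration of such gates, here is a minimal Python sketch that fails the build when coverage drops below a threshold or when an obvious secret pattern is committed. It assumes a coverage.json file produced by the coverage json command of coverage.py; the threshold and patterns are examples to adapt to your project:

```python
# ci_gate.py - a minimal quality-gate sketch (illustrative, not a full pipeline).
# Run it as a CI step after the test suite; a non-zero exit code fails the build.
import json
import re
import subprocess
import sys

MIN_COVERAGE = 80.0  # example threshold
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key id format
    re.compile(r"(?i)(api|secret)_?key\s*=\s*['\"][^'\"]{16,}"),
]

def check_coverage() -> bool:
    # coverage.json is produced by `coverage json` (coverage.py)
    with open("coverage.json") as f:
        covered = json.load(f)["totals"]["percent_covered"]
    print(f"coverage: {covered:.1f}% (minimum {MIN_COVERAGE}%)")
    return covered >= MIN_COVERAGE

def check_secrets() -> bool:
    # scan only the files tracked by git
    files = subprocess.run(
        ["git", "ls-files"], capture_output=True, text=True, check=True
    ).stdout.splitlines()
    clean = True
    for path in files:
        try:
            text = open(path, encoding="utf-8", errors="ignore").read()
        except OSError:
            continue
        for pattern in SECRET_PATTERNS:
            if pattern.search(text):
                print(f"possible hardcoded secret in {path}")
                clean = False
    return clean

if __name__ == "__main__":
    results = [check_coverage(), check_secrets()]
    sys.exit(0 if all(results) else 1)
```

In a real pipeline you would rely on dedicated tools (coverage thresholds in pytest-cov, linters, SAST and secret scanners), but the principle is the same: the gate is automated, versioned with the code, and not negotiable in the heat of a release.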

Don't forget: you're the owner of the code, even if your IDE generated it.
