Erik Hanchett for AWS

5 Things To Avoid When Working With AI Tools

As a software developer in 2026, you can't escape AI. It's everywhere, and almost every company is using some sort of AI coding tool. And as a long-time full-stack developer whose roots are in the front-end, I wasn't always convinced.

In the past two years, I've seen AI's problems and its strengths. I've seen terrible designs, and not-so-bad ones as well. I've built demo apps in minutes that used to take me hours. And I've found myself bogged down in AI slop.

So are these coding tools making me faster or not? I had to look into it more.

Want to watch a video on the subject and then read about it? Check this out!

Last year, Model Evaluation and Threat Research, or METR, ran a randomized controlled trial and found that when experienced open source developers used AI tools, it took them 19% longer to complete tasks than without. AI was actually a detriment to their productivity.

On the other hand, GitHub conducted their own controlled trial around the same time and found that developers coded up to 55% faster with AI assistance.

So which is it?

After trying these tools out myself for the last two years, I can confidently say I think both are true. And the difference comes down to the approach used. AI tools, just like any other tool, can be misused. If the only tool you have is a hammer, every problem starts to look like a nail. When I first started using AI, I used it for everything. New features, bug fixes, refactors, brainstorming, all of it. And at first it felt amazing. But I kept running into the same problems over and over. The output looked good on the surface, but I'd spend more time fixing what it gave me than if I'd just written it myself.

Here are 5 ways AI can hurt you instead of help you, especially on the frontend.

1. No Real Feature Definition

I like jumping into a coding agent right away. I start vibe coding as soon as my fingers reach the keyboard, but this actually isn't a great way to start. The AI will absolutely give me what I ask for, and it might even work, but the problem lies in the details.

The design generated is often not great (I'm looking at you GPT) and the features don't always work. Validation, error handling, responsiveness, and accessibility will often be partially implemented or skipped entirely. That's when things fall apart.

The real issue is that you need to define what success looks like. If you don't, the AI just guesses, and though it gets some of it right, it gets a lot wrong. Ambiguity isn't your friend.

Don't get me wrong, you don't need a 15-page requirements document with detailed designs for every use case. A few bullet points and some basic acceptance criteria will help. Simply defining what the feature should and shouldn't do is the minimum. Adding it all to a markdown file before you get started will dramatically improve what the AI generates for you.
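As a sketch, a minimal feature definition might look something like this (the feature and criteria here are hypothetical, just to show the shape):

```markdown
# Feature: Newsletter signup form

## Should do
- Validate email format before submitting
- Show inline error messages, not browser alerts
- Disable the submit button while the request is in flight
- Work on mobile widths down to 320px

## Should NOT do
- Store the email anywhere client-side
- Block the rest of the page while submitting

## Acceptance criteria
- Submitting a valid email shows a success message
- Submitting an invalid email shows "Please enter a valid email"
- All inputs are keyboard-reachable and labeled for screen readers
```

Ten minutes of writing this down removes most of the ambiguity the AI would otherwise fill in with guesses.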

2. Too Much Bad Context

So does that mean I should put everything into the context window? Use as many AGENTS.md and markdown files as I can before I start, right? Well, yes, we need more context, but there's more to it than that.

The second problem is putting too much into your context window. Every coding tool (I prefer Kiro, so I'll use that as an example) has a defined amount of context it can handle, and that limit is largely determined by the model you select. Some models have larger context windows, others smaller.

Either way, stuffing everything you can find into lots of markdown files causes the opposite problem from #1. Now the model has to comb through piles of useless information to find what it needs. We've recently seen research showing that more is not always better.

Recent research across over 60,000 repos found that context files are often too long, too vague, and are actively making agents worse. In one study, accuracy dropped from 87% to 54% just from context overload.

Context overload doesn't just drop accuracy, it also increases token cost. When every request sends more information than is needed, you waste tokens, which in turn hits your pocketbook.

At the end of the day it's more about quality than quantity when you're dealing with context windows. Think about it like giving directions. If someone asks you how to get to the grocery store and you hand them a 200-page atlas, that's technically more information. But it's not more helpful.

When creating an AGENTS.md file (or steering file if you're using Kiro), only include information your project actually needs. Constrain it to your coding practices, tabs vs. spaces, design system, API contracts, and rendering approach. Remove unneeded fluff, and keep it up to date. When I use Kiro, I create an agent that automatically updates my markdown files whenever I change my components, so my context files always reflect the latest state.
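For illustration, a lean AGENTS.md might be this short (every project detail below is made up, the point is the brevity and specificity):

```markdown
# AGENTS.md

## Code style
- TypeScript, strict mode; 2-space indentation
- Functional React components only; no class components

## Design system
- Use components from src/components/ui; don't hand-roll buttons or inputs
- Colors and spacing come from tokens in src/styles/tokens.ts

## API
- All requests go through src/lib/apiClient.ts; never call fetch directly
- Response types live in src/types/api.ts

## Don't
- Don't add new dependencies without asking
- Don't edit generated files under src/generated/
```

Everything in it is actionable for the agent on nearly every task; anything that isn't, like team history or aspirational architecture notes, belongs somewhere else.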

There's probably a Goldilocks zone of context. Not too little, not too much. Just the stuff that actually matters for the task.

3. Too Much in One Shot

Getting the right amount of context matters, but you also need to know the limits of what you're asking your coding agent to do. One common problem I see is developers trying to one-shot a whole application. I'll see prompts asking to build the frontend, the backend, and the tests all at once. It feels like you're making a lot of progress, and it outputs something in just a few minutes.

This can actually work for a quick prototype or demo; in production, however, it's not what you want. You'll end up with a lot of AI slop that takes a lot of rework.

I was reading the other day that a whole industry has popped up to help small teams fix their vibe-coded messes. This should really tell you something about the state of the industry.

The reality is that creating everything at once causes the AI to lose architectural consistency. It may solve the same problem in different ways in different files, creating patterns that are contradictory or nonsensical.

To fix this issue you need to scope down the tasks. Unless you're running some autonomous agent loop that's going to churn through a large detailed requirements document and tasks for hours, you'll need to break things down. Build this component, refactor these tests. Here are the edge cases for this part of the app. The key, as we'll discuss in the next section, is to check the output. It's a lot easier to catch a problem in 50 lines than 2,000.
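To make the difference concrete, here's a sketch of the same work scoped two ways (the feature itself is hypothetical):

```markdown
<!-- One-shot prompt: hard to review, easy to get slop -->
"Build a task management app with auth, a REST backend,
a React frontend, and full test coverage."

<!-- Scoped prompts: each one produces a reviewable diff -->
1. "Create a TaskList component that renders tasks from props.
   Handle the empty state with a friendly message."
2. "Add client-side validation to the new-task form:
   title is required, max 120 characters."
3. "Write unit tests for TaskList covering the empty,
   single-item, and 50-item cases."
```

Each scoped prompt gives you a natural checkpoint to read the output before moving on.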

This problem has actually been mitigated in a lot of ways by spec-driven development. I use Kiro all the time to break down complex features into requirements, design, and tasks. That way you can check the plan before the AI writes any code at all.

4. Too Much Trust

As hinted at in #3, we shouldn't trust the output of AI at face value. And of course, it does look convincing at first. When I started using Claude Opus 4.5, I really thought coding was a solved problem. The code passed my linter, the types looked okay, and it ran well.

But as I started looking at the code in more detail, I saw some issues and edge cases I didn't like. Maybe one day AI will write 100% accurate code, but we still aren't there yet.

There is no question AI makes writing code faster, but you need to spend more time checking the output. In other words, AI compresses generation time, but it often expands verification time.

I'll be the first to admit, I love writing code, debugging it, finding clever ways to abstract the logic, and seeing it run for the first time. I'm not as big of a fan of doing code reviews. Something about reviewing other people's code and giving constructive feedback isn't as fun.

However, when AI is doing the writing, reviewing becomes the most important part of your job. For example, on the frontend I've seen things like weak accessibility, brittle logic, weird abstractions, duplicated behavior, and terrible design. Let me emphasize again that AI is not great at design. It's passable if you love purple and you don't mind generic Tailwind-looking apps. You really need to prompt it to make something look good and unique.

To help mitigate these problems you can set up tests, and those help, but sometimes the AI games those too. Really, though, there's no better safeguard than having a real, live, breathing human verify everything along the way. You should read the code, understand it, gauge the design, and test the assumptions.

5. Speed Over Maintainability

As a developer I love writing code fast. I remember being a young software developer and learning as many hotkeys and keystrokes as I could. I learned Vim so I would never have to touch the mouse, because I knew that every second I touched it, it would be a second I could be typing and creating code.

With AI creating all our code, it's easy to mistake speed for maintainability. No matter how fast I can write code, I'll never beat the speed of an Anthropic Claude model. But when code gets cheap to generate, bad abstractions get cheaper too.

AI will over-generate wrappers, components, abstractions, and APIs. These tools are like eager interns: they want to show off what they can do. But if you don't monitor them closely, they'll go off the rails.

And my cognitive load can only handle so much. I remember one of the first codebases I worked on had so many levels of abstraction that it took me 15 minutes of ctrl-clicking through every class and interface to figure out what it was actually doing. (Yes, this was Java.) If you let AI run loose, it will do the same thing.

In the end, you'll end up owning code you didn't fully think through or really understand. Your job is to understand what the AI is creating, because six months from now, someone has to maintain it. And that someone is probably you.

Modern Tooling

So where are we today? We know models are getting better all the time, and maybe in a few years most of these problems will be solved by new tooling.

But we are not there yet.

Tools like Kiro have spec-driven development, and other tools have plan mode, checkpoints, and better workflows. But these tools are only as good as the person driving them, and there's still no silver bullet. You still need your judgment, your context, and your review to be successful.

Conclusion

If we go back to our two studies at the start, both things can be true at once. Some developers will see productivity gains with AI tools, many others won't. It really comes down to how you use them in your day-to-day life.

You can't vibe code every app, and every app shouldn't be vibe coded. You need the right amount of context, not too little, not so much that you get context rot. You need to scope your asks. You need to become a better code reviewer. And you need to think about whether what the AI created is something you can actually maintain six months from now.

Let me know in the comments if you agree or disagree. Until next time.
