DEV Community

Anshuman

Posted on • Originally published at Medium

Things I Wish Someone Had Told Me Before I Built an AI Agent

A year of hard lessons on planning, tool design, latency, and the $90 bill that changed everything.


I've been building AI agents seriously for about a year. I made a lot of mistakes along the way: some expensive, some embarrassing, most avoidable. Here's what I had to figure out myself, so you don't have to.


Agents are not chatbots

I spent the first month building something I genuinely thought was an agent. It was a chatbot with extra steps.

A real agent takes actions, uses tools, handles failures, and knows when to stop. That's a completely different problem than building something that just responds to messages. The mental model has to shift before anything else does.


The planning step matters more than execution

Most agent tutorials jump straight into tool calling and LLM chaining. The part they skip is how the agent decides what to do in the first place.

Bad planning produces wrong actions — confidently, and repeatedly.

I spent three weeks debugging before realizing the problem was in how the agent was breaking down tasks, not in the tools themselves. Fix the planner before you touch anything else.
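One cheap way to catch bad task breakdowns early is to validate the plan before executing anything. Here's a minimal sketch of that idea; the tool names and step shape are made up for illustration, not from any particular framework.

```python
# Minimal plan-validation sketch: reject plans that reference unknown
# tools before any step executes. Tool names here are hypothetical.
KNOWN_TOOLS = {"search_web", "read_file", "send_email"}

def validate_plan(steps: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the plan looks executable."""
    problems = []
    for i, step in enumerate(steps):
        tool = step.get("tool")
        if tool not in KNOWN_TOOLS:
            problems.append(f"step {i}: unknown tool {tool!r}")
        if not step.get("goal"):
            problems.append(f"step {i}: missing goal")
    return problems

plan = [
    {"tool": "search_web", "goal": "find the latest pricing page"},
    {"tool": "summarize", "goal": "summarize findings"},  # planner hallucinated a tool
]
```

Running `validate_plan(plan)` flags step 1 before it ever runs, which is exactly the class of bug that otherwise surfaces three weeks later as "the tools are broken."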


Tool design is everything

Your agent is only as good as the tools you give it. Vague tool descriptions produce bad tool usage — full stop.

I couldn't figure out why my agent kept calling the wrong tool. Then I rewrote the descriptions to be extremely specific about when to use each one. The problem disappeared. Write tool descriptions like you're explaining to a junior developer what the tool does and exactly when to reach for it.
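To make the contrast concrete, here are two versions of the same tool definition in the common JSON function-calling schema shape. The tool itself (`search_orders`) and its sibling (`search_catalog`) are invented examples:

```python
# A vague tool definition -- the model has no idea when to reach for it.
vague = {
    "name": "search",
    "description": "Searches for things.",
}

# The same capability, described like you'd brief a junior developer:
# what it does, when to use it, and when NOT to.
specific = {
    "name": "search_orders",
    "description": (
        "Look up a customer's past orders by email address. "
        "Use this ONLY when the user asks about order history, "
        "shipping status, or refunds. Do NOT use it for product "
        "questions; use `search_catalog` for those."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "email": {"type": "string", "description": "Customer email address"},
        },
        "required": ["email"],
    },
}
```

The second version costs a few more tokens per request and saves hours of "why did it call the wrong tool" debugging.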


Context window management will break you

Tokens cost money, and context fills up fast. Long-running agents accumulate history quickly. I hit a wall where my agent was spending 80% of its context budget on conversation history — leaving almost nothing for actual reasoning.

Implement context pruning before you need it, not after you're already hitting limits.
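A pruning pass can be simple: keep the system message, then walk the history newest-first until the budget runs out. This sketch approximates token counts as `len(text) // 4`; in practice you'd swap in a real tokenizer such as tiktoken.

```python
# Rough context-pruning sketch. Token counts are approximated as
# len(text) // 4; use a real tokenizer in production.
def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def prune_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the system message plus the most recent turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept, used = [], sum(approx_tokens(m["content"]) for m in system)
    for msg in reversed(rest):  # newest first
        cost = approx_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return system + list(reversed(kept))
```

Dropping the oldest turns is the bluntest strategy; summarizing them into a single message is the usual next step, but even this blunt version stops the 80%-on-history failure mode.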


Error handling is not optional

Tools fail. APIs go down. Rate limits get hit. A badly built agent either loops forever or stops silently with no explanation.

I learned this after a client's agent ran a loop for four hours and racked up a $90 API bill on a single failed task.

Error handling goes in before the features, not after. Treat it like load-bearing infrastructure, because it is.
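The two failure modes above (looping forever, dying silently) are both fixable with a hard attempt cap that fails loudly. A hedged sketch, where `ToolError` stands in for whatever your tools actually raise:

```python
import time

class ToolError(Exception):
    """Stand-in for whatever exception your tool layer raises."""

def call_with_retries(fn, *, max_attempts=3, base_delay=0.01):
    """Retry a tool call with exponential backoff; give up loudly, never silently."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ToolError as exc:
            if attempt == max_attempts:
                # Surface the failure instead of looping or swallowing it.
                raise RuntimeError(
                    f"tool failed after {max_attempts} attempts: {exc}"
                )
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The cap is the load-bearing part: it's what turns a four-hour $90 loop into a 30-second error message.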


Evaluation is the hardest part

How do you know if your agent is working — not just running, but actually producing correct results? Tutorials never answer this.

You can't just run it and assume it's fine. You need real test cases, expected outputs, and a way to detect subtle failures: wrong assumptions, misread intent, steps that technically completed but sent everything down the wrong path. Green dashboards lie.
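Even a tiny harness beats eyeballing transcripts. Here's a minimal sketch: named cases with expected outputs, run against the agent and scored. `run_agent` is a hypothetical entry point, stubbed here so the harness itself is runnable.

```python
# Stub standing in for your real agent's entry point.
def run_agent(task: str) -> str:
    return {"capital of France": "Paris"}.get(task, "I don't know")

CASES = [
    {"task": "capital of France", "expect": "Paris"},
    # Testing graceful refusal matters as much as testing success.
    {"task": "capital of Atlantis", "expect": "I don't know"},
]

def evaluate(cases):
    failures = [c["task"] for c in cases if run_agent(c["task"]) != c["expect"]]
    return {"passed": len(cases) - len(failures), "failed": failures}
```

Exact-match scoring is the crudest option; for open-ended outputs you'd compare against rubrics or use an LLM judge. But a failing task name in `failed` is already more signal than a green dashboard.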


Latency kills user experience

An agent that takes 45 seconds to respond feels broken, even if it's correct. Users don't wait.

Use parallel tool calling wherever possible. Cache what makes sense. Stream output so users see something happening. The best agent I built had the worst retention — because it was slow.
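Parallel tool calling is the cheapest of those wins. When two tool calls don't depend on each other, run them concurrently instead of back-to-back. A sketch with stand-in tools that just sleep:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in tools: each simulates a slow I/O-bound call.
def fetch_weather():
    time.sleep(0.1)
    return "sunny"

def fetch_calendar():
    time.sleep(0.1)
    return "free at 3pm"

def run_parallel(tools):
    """Run independent tool calls concurrently; results keep submission order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(t) for t in tools]
        return [f.result() for f in futures]

start = time.perf_counter()
results = run_parallel([fetch_weather, fetch_calendar])
elapsed = time.perf_counter() - start  # ~0.1s instead of ~0.2s sequential
```

Threads work here because tool calls are I/O-bound; the same shape translates to `asyncio.gather` if your stack is async.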


The web access problem is real

Almost every useful agent needs to read from the web at some point. The naive approach of just fetching URLs breaks immediately on modern sites that render with JavaScript.

I built a research agent that was returning empty pages for 60% of its web lookups because of this. Use purpose-built tools — Firecrawl for scraping, Exa for LLM-friendly search — rather than building an entire browser layer from scratch.
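Whatever scraping layer you use, it's worth detecting the empty-page failure instead of feeding blank text to the model. This is a crude heuristic I'm sketching from scratch (the 200-character threshold is arbitrary), not any library's API: strip scripts and tags, and check how much visible text survived.

```python
import re

def looks_js_rendered(html: str) -> bool:
    """Crude check: did the naive fetch return a page with no readable content?"""
    # Remove script/style blocks including their contents, then all tags.
    text = re.sub(r"(?s)<(script|style).*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    visible = " ".join(text.split())
    return len(visible) < 200  # almost nothing readable survived
```

Wire this in as a guard: if it returns `True`, fall back to a real scraping service instead of letting the agent "read" an empty shell.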


Don't deploy what you haven't stress tested

Real users will use your agent in ways you never imagined. Every edge case you didn't test for will be discovered in production, usually at the worst possible moment.

I now build in a human-in-the-loop checkpoint for any irreversible action before deploying. It has saved me from more than a few bad situations with clients.
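The checkpoint itself can be tiny. A sketch of the gate, with invented action names and an `approver` callback standing in for however you actually reach a human (Slack message, CLI prompt, queue):

```python
# Actions that must never run without explicit human sign-off.
IRREVERSIBLE = {"delete_records", "send_payment", "email_all_customers"}

def execute(action: str, approver, do_action):
    """Run `do_action`, unless `action` is irreversible and the human says no."""
    if action in IRREVERSIBLE and not approver(action):
        return {"status": "blocked", "action": action}
    return {"status": "done", "action": action, "result": do_action()}
```

The key design choice is that the gate sits in the execution path, not in the prompt: the model can want to send a payment all it likes, but the code won't let it through unapproved.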


Final thoughts

None of this is theoretical. These are the things that cost me time, money, or a client relationship.

Building agents is genuinely different from building products powered by LLMs — it demands different instincts, different architecture, and a much higher tolerance for things going wrong in creative ways.

Start with the hard parts. They don't get easier by ignoring them.
