Yesterday I told my AI agent, Misti: "Scrape e-commerce prices daily."
The old Misti would have immediately generated a Python script. Selenium. BeautifulSoup. Cron job. Error handling. 40 lines of code. 30 minutes of my time reviewing it.
The new Misti paused and asked: "Are you sure we need to build this?"
Then she searched. Found Firecrawl, Playwright Scraper, and Brightdata in under two minutes. Evaluated all three. Presented the trade-offs. Asked which one I preferred.
Total time: 2 minutes. Total code written: zero.
The Problem: Agents Are Too Eager to Please
AI agents have a bias toward action. When you say "build," they build. When you say "scrape," they scrape. The default mode is generate — not evaluate.
This creates a hidden tax:
- Reinvented wheels: How many agents have written their own PDF parsers instead of using pypdf?
- Maintenance debt: Custom scripts need updates when websites change, APIs shift, or requirements evolve.
- Context bloat: Every line of generated code consumes tokens. Every debug cycle burns money.
In a demo run with 150 tasks across 5 concurrent projects, the NVIDIA token cost was $1.24. Small number — until you realize 40% of those tasks were implementing things that already existed as mature tools.
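The arithmetic is simple, assuming cost spreads roughly evenly across tasks (the demo only reported the aggregate, so treat this as illustrative):

```python
# Back-of-envelope from the demo numbers above.
# ASSUMPTION: cost is roughly uniform per task; only the aggregate was measured.
total_cost = 1.24        # USD in tokens, 150 tasks across 5 projects
wasted_fraction = 0.40   # tasks that reimplemented existing tools

print(f"~${total_cost * wasted_fraction:.2f} spent rebuilding what already exists")
```

Fifty cents is nothing. Apply the same ratio to a production workload, add the review time, debug cycles, and maintenance each of those tasks drags behind it, and it is not.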
The Fix: A "Search Before Build" Habit
I built openclaw-skill-hunter — a skill that forces Misti to stop and search before writing any implementation code.
The rule is simple:
If the task involves building, scraping, deploying, converting, or automating — search for an existing skill, MCP server, CLI tool, or GitHub repo first.
Only if nothing fits do we build from scratch.
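As a minimal sketch, the gate could look like this in Python. The keyword classifier and helper names are my illustration, not skill-hunter's actual internals; only the npx skills find command comes from the skill's own workflow:

```python
import subprocess

# Verbs that signal "write an implementation" rather than "answer a question".
BUILD_TRIGGERS = {"build", "scrape", "deploy", "convert", "automate"}

def should_search_first(task: str) -> bool:
    """Classify: naive keyword check for implementation-flavored tasks."""
    return bool(BUILD_TRIGGERS & set(task.lower().split()))

def find_existing_tools(keyword: str) -> str:
    """Search the skills registry before generating any code."""
    result = subprocess.run(
        ["npx", "skills", "find", keyword],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

task = "Scrape e-commerce prices daily"
if should_search_first(task):
    print(find_existing_tools("scrape"))  # evaluate candidates before building
```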
How It Works
User: "Scrape e-commerce prices daily"
Agent (with skill_hunter):
1. Classify: coding/automation task
2. Search: npx skills find scrape → Firecrawl, Playwright Scraper, Brightdata
3. Evaluate: relevance, maintenance, security, stack fit
4. Present: "Here are 3 options. Which one?"
5. Execute: Use the chosen tool
Agent (without skill_hunter):
1. Immediately write a Python script
2. Debug for 20 minutes
3. Discover edge cases
4. Rewrite
5. Maintain forever
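Steps 3 and 4 of the with-skill path are where the judgment lives. Here is a rough sketch of evaluate-and-present, assuming a simple additive rubric; the Candidate shape and the scores below are illustrative placeholders, not skill-hunter internals:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    relevance: int    # 0-5: how directly it solves the stated task
    maintenance: int  # 0-5: release cadence, open issues, docs
    security: int     # 0-5: auth model, supply-chain posture
    stack_fit: int    # 0-5: plays well with the project's existing stack

    def score(self) -> int:
        return self.relevance + self.maintenance + self.security + self.stack_fit

# Placeholder scores for the three tools found in step 2.
candidates = [
    Candidate("Firecrawl", 5, 5, 4, 3),
    Candidate("Playwright Scraper", 4, 5, 4, 5),
    Candidate("Brightdata", 4, 4, 4, 2),
]

# Step 4: present ranked options and let the human choose;
# the agent does not silently pick a winner.
for c in sorted(candidates, key=Candidate.score, reverse=True):
    print(f"{c.name}: {c.score()}/20")
```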
What We Actually Evaluated
For the scraper task, Misti found three viable options:
| Tool | Best For | Trade-off |
|---|---|---|
| Firecrawl | Structured data from JS-rendered sites | Needs API key |
| Playwright Scraper | Browser automation, OpenClaw-native | More setup |
| Brightdata | Enterprise scale, proxy rotation | Paid tier |
None of them required writing a single line of Python.
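For contrast, this is roughly where the old path would have started: a compressed sketch of the Selenium-plus-BeautifulSoup script from the opening (the URL and selector are made up), before adding the cron job, retries, and error handling:

```python
# The road not taken: a DIY scraper.
from selenium import webdriver
from bs4 import BeautifulSoup

def scrape_prices(url: str) -> list[str]:
    driver = webdriver.Chrome()  # assumes a local Chrome install
    try:
        driver.get(url)
        soup = BeautifulSoup(driver.page_source, "html.parser")
        # Brittle: this selector breaks the moment the site redesigns.
        return [el.get_text(strip=True) for el in soup.select(".product-price")]
    finally:
        driver.quit()

print(scrape_prices("https://example.com/products"))
```

Multiply by scheduling, proxies, and the next site redesign, and you are back at 40 lines of code and permanent ownership.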
Why This Matters Beyond One Task
This is not about laziness. It is about signal vs. noise.
Agents that search before build:
- Learn the ecosystem: They discover patterns, not just solve problems.
- Compose instead of create: They chain tools instead of rebuilding them.
- Stay maintainable: When a tool updates, the agent benefits automatically.
The best agents are not the ones that write the most code. They are the ones that know when not to write code.
Try It
If you are running OpenClaw, install the skill:
npx skills add openclaw-skill-hunter
Or read the source:
github.com/mturac/skill-hunter
It is not perfect. It is day one. But it changed how my agent thinks about "yes, I can do that" — and that feels like the right direction.
What About You?
How do you stop your agent from reinventing wheels? Do you manually curate tool lists? Use MCPs? Or just deal with the mess later?
Top comments (1)
The "bias toward action" framing is the part that generalizes beyond this specific skill. AI agents are trained to be helpful, and helpful defaults to "produce output." When you say "scrape prices," producing a Python script is the most direct interpretation of helpful. It's also wrong—not because the script wouldn't work, but because writing code was never the actual need. The need was to get price data. The code was just the assumed path. Breaking that assumption requires the agent to pause and ask a meta-question: "Is building the right way to solve this, or does the solution already exist?" That's a level of reasoning that doesn't come naturally to a completion engine.
What I find myself thinking about is how this maps onto the way experienced developers actually work. A senior engineer doesn't start writing a PDF parser when they need to extract text. They search for a library. They evaluate options. They make a build-vs-buy decision. Junior engineers jump straight to implementation because they don't yet have the mental catalog of what already exists. The agent without skill-hunter is permanently a junior. The agent with it gains one specific senior behavior: the reflex to check whether the wheel has already been invented. It's a small behavior. It's also one of the highest-leverage habits in software engineering.
The 40% number—nearly half of tasks implementing things that already existed—is the kind of statistic that should change how teams think about agent productivity. You're not just paying for tokens on those tasks. You're paying for the debugging time, the maintenance burden, and the opportunity cost of the agent not doing something more valuable. A "search before build" habit isn't an optimization. It's a filter that prevents a large fraction of the agent's work from being wasted before it starts. Have you found that the skill-hunter's search quality degrades for more niche or domain-specific tasks where the existing tooling isn't as well-cataloged, or does it handle those gracefully by recognizing when nothing fits?