Sam Hartley

From Idea to Deployed Tool in 3 Hours — How AI Coding Agents Changed My Workflow

I used to think AI coding assistants were autocomplete on steroids. Fancy IntelliSense. Then I tried using one as an actual junior developer — someone who writes the first draft while I review and refine.

Two months later, my workflow is unrecognizable. I just shipped a complete B2B configuration tool — interactive maps, zone polygons, dynamic forms, the works — in under three hours. Here's what changed.

The Old Way

Before AI agents, my process for a new tool looked like this:

  1. Research — How does Leaflet.js work? What's the API for geo polygons? Stack Overflow, docs, tutorials. 45 minutes.
  2. Boilerplate — HTML structure, CSS grid, JavaScript imports, event listeners. 30 minutes.
  3. Core logic — The actual thing the tool needs to do. 2–3 hours.
  4. Debugging — Why doesn't the map render? Why is the polygon offset? 1–2 hours.
  5. Polish — Styling, responsive layout, edge cases. 1 hour.

Total: 6–8 hours for a medium-complexity tool. And that's if I know the stack. If it's something new (like Garmin's Monkey C or a mapping library I haven't used), double it.

The New Way

Last week a client asked for a heat zone map for 21 European countries. Click a country, see the heating zones, pick one, get the right configuration. With polygon boundaries, country-specific defaults, and a responsive UI.

Here's how it went:

Hour 0–0.5: Prompt engineering
I wrote a detailed spec. Not "make a map" — that's useless. I described:

  • The data structure (country → zones → polygon coordinates), sketched after this list
  • The UI flow (dropdown → map render → zone selection)
  • The tech stack (Leaflet.js, vanilla JS, no frameworks)
  • Edge cases (what happens when a country has no zones?)
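
For that first bullet, the shape I spelled out looked roughly like this. A sketch only; the country codes, zone fields, and coordinates are illustrative, not the client's real data:

```javascript
// Rough shape of the data described in the spec. Country codes, zone names,
// defaults, and coordinates are illustrative, not the client's real data.
const heatZones = {
  DE: {
    name: "Germany",
    zones: [
      {
        id: "de-zone-1",
        label: "Zone 1 (coastal)",
        defaults: { flowTemp: 35, insulation: "standard" },
        // [lat, lng] pairs, the format Leaflet's L.polygon expects
        polygon: [
          [54.5, 8.9],
          [54.1, 10.2],
          [53.6, 9.4],
        ],
      },
      // ...more zones
    ],
  },
  // ...20 more countries
};
```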

Hour 0.5–1.5: First draft from the agent
I fed the spec to the agent (Claude Code, driven from the CLI). It generated the full HTML file — 800+ lines — with Leaflet integration, zone polygons, event handlers, and the selection logic.

Was it perfect? No. The polygon coordinates were placeholder circles. The styling was bare-bones. But the architecture was right. The map rendered. The flow worked.
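
To give a sense of the architecture it landed on, here's a heavily stripped-down sketch of the same wiring. This is not the generated code itself, and showMessage and selectZone stand in for the real handlers:

```javascript
// Stripped-down sketch of the wiring the agent set up (not the actual
// generated code): render the map, draw one country's zones, handle selection.
const map = L.map("map").setView([50.0, 10.0], 4);
L.tileLayer("https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png", {
  attribution: "&copy; OpenStreetMap contributors",
}).addTo(map);

let zoneLayer = null;

function renderZones(countryCode) {
  if (zoneLayer) map.removeLayer(zoneLayer); // clear the previous country's zones
  const country = heatZones[countryCode];
  if (!country || country.zones.length === 0) {
    showMessage("No heating zones defined for this country"); // the edge case from the spec
    return;
  }
  zoneLayer = L.featureGroup(
    country.zones.map((zone) =>
      L.polygon(zone.polygon, { color: "#0069d9" }).on("click", () =>
        selectZone(countryCode, zone.id)
      )
    )
  ).addTo(map);
}
```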

Hour 1.5–2.5: My turn — polish and fix
I replaced the placeholder polygons with real GeoJSON-ish coordinates for all 21 countries. Tweaked the CSS for mobile. Added validation. Fixed a bug where the map didn't re-center when switching countries.
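
The re-centering fix came down to a few lines. Roughly this, reusing the names from the sketch above, where countrySelect is my name for the country dropdown:

```javascript
// When the country changes, fit the map to the new zones instead of
// leaving it wherever the previous country was.
countrySelect.addEventListener("change", (event) => {
  renderZones(event.target.value);
  if (zoneLayer) {
    map.fitBounds(zoneLayer.getBounds(), { padding: [20, 20] });
  }
});
```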

Hour 2.5–3: Integration and deploy
Hooked it into the existing project structure. Git commit. Done.

Total: 3 hours. And the heavy lifting — the Leaflet setup, the polygon rendering logic, the event wiring — was handled by the agent. I did the creative/problem-solving work: defining the problem, validating the output, fixing the edge cases.

What Actually Works (and What Doesn't)

After two months of daily use, here's my honest assessment:

✅ What works brilliantly:

  • Boilerplate and plumbing — Setting up projects, imports, basic structure. The agent is faster and makes fewer typos than me.
  • API integration patterns — "Here's an endpoint, here's the expected response, write the fetch and parse logic." It gets this right 90% of the time; a typical example follows this list.
  • Refactoring — "Rename this function and update all callers across 5 files." Instant, error-free.
  • Exploring unfamiliar territory — I hadn't used Leaflet in years. The agent got me to a working state without reading docs for an hour.
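
For the API-integration point, this is the kind of task I mean. The endpoint and field names here are invented for illustration:

```javascript
// "Endpoint + expected response → fetch and parse" is the sweet spot.
async function fetchZoneDefaults(countryCode) {
  const response = await fetch(`/api/zones?country=${encodeURIComponent(countryCode)}`);
  if (!response.ok) {
    throw new Error(`Zone lookup failed: ${response.status}`);
  }
  const data = await response.json();
  // Normalize into the shape the UI expects
  return data.zones.map((zone) => ({
    id: zone.id,
    label: zone.label,
    defaults: zone.defaults ?? {},
  }));
}
```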

❌ What doesn't work (yet):

  • Complex business logic — Anything with nuanced rules, edge cases, or domain-specific constraints. The agent generates something plausible that breaks in production.
  • UI/UX design — It makes functional UIs. They look like a developer made them (because one did). You'll still need a human eye for polish.
  • Debugging its own mistakes — When the agent writes a bug, it's often subtle. You need to understand the code to catch it. This is not a replacement for knowing how to code.
  • Large-scale architecture — It works file-by-file. Designing a system with proper separation of concerns, caching strategies, and scalability? That's still on you.

The Mindset Shift

The biggest change isn't speed. It's how I think about problems.

Before: "How do I implement this?" → Research → Code → Debug.
Now: "How do I describe this so an agent can implement a first draft?" → Spec → Review → Refine.

I'm the architect and editor now, not the typist. The agent is the junior dev who writes fast but needs supervision.

This matters because it scales. I can explore 3 approaches in the time it used to take to build 1. I can say "what if we used a canvas instead of Leaflet?" and get a working comparison in 10 minutes. The cost of experimentation dropped to near-zero.

The Setup

If you want to try this:

  1. Use Claude Code or Codex CLI — The terminal interface lets you iterate fast. Chat-based tools (ChatGPT, etc.) are too slow for code generation.
  2. Write detailed specs — The agent is only as good as your prompt. Include examples, expected outputs, and constraints.
  3. Review every line — Don't blindly commit. The agent writes plausible-looking bugs.
  4. Keep a "golden test" — A known input/output pair you can run after every change. Catches regressions instantly (example after this list).
  5. Iterate in small chunks — "Add the map" → review → "Add zone polygons" → review. Don't ask for 500 lines at once.
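
A golden test doesn't need a framework. Something like this is enough, assuming your tool exposes a pure function; zonesForCountry and zones.js here are hypothetical names, swap in your own:

```javascript
// One known input, one expected output, run after every agent change.
import { test } from "node:test";
import assert from "node:assert/strict";
import { zonesForCountry } from "./zones.js";

test("Germany returns its zones in the expected order", () => {
  const zones = zonesForCountry("DE");
  assert.deepEqual(
    zones.map((zone) => zone.id),
    ["de-zone-1", "de-zone-2", "de-zone-3"]
  );
});
```

Run it with node --test after every agent pass; if it goes red, the agent broke something.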

What I'd Do Differently

Two months in, my main learning: the spec is everything.

My fastest sessions happen when I spend 10 minutes writing a bullet-point spec before touching the agent. My slowest sessions happen when I vague-prompt my way through and spend an hour fixing misunderstandings.

The second learning: agents excel at breadth, humans at depth. Use the agent to explore options. Use your brain to pick the right one.


If you're using AI coding agents, what's your experience? I'm curious if the "architect + editor" model resonates, or if you've found a different pattern that works. Drop your thoughts below — I'm still figuring this out too.
