DEV Community

Vilius

Posted on • Originally published at blog.workswithagents.dev

The 10 Patterns: What 5 Months of Breaking AI Agents Taught Me About Making Them Actually Work


In late 2025 I started experimenting with AI coding agents. Not casually — I gave them autonomous infrastructure and let them run. They broke. A lot. But patterns emerged.

Not "prompt engineering tricks." Not "unlock your potential." Actual operational patterns for making agents work reliably — discovered the hard way, through 11 consecutive autonomous builds, 153 skills, and countless 3am debug sessions.

Here they are.


Pattern 1: Boot — The First Session Shapes Everything

An agent's first session is like its childhood. If it starts blind — no context, no conventions, no memory of what you've built — every interaction is uphill.

What I do: Every project has an AGENTS.md. Python version. Project structure. Conventions. Key decisions. The agent reads this before anything else.

What happened without it: Recommending npm for a pnpm project. Suggesting Python 3.9 when we're on 3.11. Hours of corrections that a 50-line file would have prevented.
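A minimal AGENTS.md might look like this. The specifics below are illustrative, not the author's actual file:

```markdown
# AGENTS.md

## Environment
- Python 3.11 (not 3.9)
- Package manager: pnpm, never npm

## Structure
- src/      application code
- tests/    pytest suite
- scripts/  deployment helpers

## Key decisions
- Project uses pytest with xdist for parallel test runs
- Deploys go through scripts/deploy.sh, never manual uploads
```

Fifty lines of this, read at boot, replaces hours of corrections.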


Pattern 2: Skills — Stop Re-Explaining the Same Things

Building an SPFx web part has specific gotchas: @fluentui imports break without SCSS alias config. The Yeoman generator ignores --component-type if .yo-rc.json exists. Node 22 + native modules = pain.

Instead of re-explaining these each time, I saved them as skills. 153 of them now. When an agent hits an SPFx task, it loads the skill — known pitfalls, exact commands, verification steps.

The compounding effect: Each skill makes future sessions faster. Five months in, the agent has institutional knowledge.
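The mechanics can be sketched as keyword-triggered skill lookup. This is a minimal illustration, not the author's implementation; in practice skills would live as files, and the trigger convention here is hypothetical:

```python
def match_skills(skills: dict[str, str], task: str) -> list[str]:
    """Return every saved skill whose trigger keywords appear in the task.

    `skills` maps a comma-separated trigger list to the skill text, e.g.
    {"spfx,fluentui,yeoman": "Known pitfalls: @fluentui imports break..."}.
    """
    matched = []
    for triggers, text in skills.items():
        # A skill loads if any of its trigger keywords appears in the task.
        if any(t.strip().lower() in task.lower() for t in triggers.split(",")):
            matched.append(text)
    return matched
```

The point is that the lookup is automatic: the agent never has to be told a skill exists, it just fires when the task matches.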


Pattern 3: Memory — Never Re-Answer the Same Question

"What Python version are we using?" "Where's the project?" "What's the deployment command?"

Without persistent memory, you answer these every. single. session. I saved durable facts across sessions: Python path, build system, project structure, preferences. Now the agent just knows.

Critical rule: Write declarative facts, not instructions. "Project uses pytest with xdist" — not "Always run tests with pytest -n 4." Instructions get re-read as orders.


Pattern 4: Decision Protocols — Autonomy Without Chaos

The biggest time-sink? Approval loops. "Should I proceed?" "Want me to fix this?" "OK to deploy?"

I set boundaries: what the agent decides alone, what needs approval. Destructive actions = ask. Recoverable actions = just do it. Hours saved per session.
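The boundary can be encoded directly: one list of destructive verbs, one gate function. The verb list below is a hypothetical example, not the author's actual policy:

```python
# Destructive (hard-to-recover) actions require human approval;
# everything else the agent just does.
DESTRUCTIVE = {"delete", "deploy", "drop", "force_push", "rotate_keys"}

def needs_approval(action: str) -> bool:
    """Return True if the described action is destructive and must be
    confirmed by a human before the agent runs it."""
    return any(verb in action.lower() for verb in DESTRUCTIVE)
```

The agent checks the gate once per action instead of asking "should I proceed?" every time.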


Pattern 5: Tool Composition — The Right Tool for Each Job

Agents have many tools. Knowing which to use is the difference between a 2-second operation and a 2-minute slog.

| Task | Tool | Why |
| --- | --- | --- |
| Create new file | write_file | One call |
| Edit existing file | patch | Targeted, no rewrite risk |
| Build/install/deploy | terminal | It's a shell command |
| Read a file | read_file | Don't cat/head/tail |
| Search content | search_files | Not grep/find |
| Research/debug | delegate_task | Parallel, isolated |

The anti-pattern: Delegating coding tasks to subagents. They lose context, hallucinate, and burn tokens. Use write_file and patch directly.
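The table above amounts to a routing rule, which can be sketched as a plain lookup. The task categories here are illustrative labels, not a real agent API:

```python
# Hypothetical mapping from task category to the right tool.
TOOL_FOR = {
    "create_file": "write_file",   # one call, no shell round-trip
    "edit_file": "patch",          # targeted, no rewrite risk
    "build": "terminal",           # it's a shell command
    "read": "read_file",           # not cat/head/tail
    "search": "search_files",      # not grep/find
    "research": "delegate_task",   # parallel, isolated
}

def pick_tool(task_kind: str) -> str:
    """Route a task category to its tool; unknown work falls back to the shell.
    Note coding tasks (create_file, edit_file) never route to delegate_task."""
    return TOOL_FOR.get(task_kind, "terminal")
```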


Pattern 6: Orchestration — Parallel Specialists

Complex tasks are rarely a single thread. Market research? Let a subagent run while the main agent builds. Code review? Spin up a reviewer in parallel.

Real result: 3x throughput on multi-stream tasks. Research and build completed independently, merged at the end.
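The shape of this orchestration is fork-join: independent streams run in parallel and merge at the end. A minimal sketch using threads to stand in for subagents:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def run_parallel(streams: dict[str, Callable[[], object]]) -> dict[str, object]:
    """Run independent work streams (e.g. 'research' and 'build') as
    parallel specialists, then merge their results at the end."""
    with ThreadPoolExecutor() as pool:
        # Fork: submit every stream at once.
        futures = {name: pool.submit(fn) for name, fn in streams.items()}
        # Join: collect all results before returning.
        return {name: f.result() for name, f in futures.items()}
```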


Pattern 7: Pipelines — Agents That Run While You Sleep

Cron jobs. Builds. Monitoring. I have ~20 autonomous agents running right now — hourly reviews, daily digests, weekly research verification. They wake up, do their job, and only notify me if something's broken.

The silent-unless-broken pattern: I never see successful runs. I only hear about failures. That's the point.
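The silent-unless-broken wrapper is small: run the scheduled job, and only call the notification hook when it fails. A sketch, with `notify` standing in for whatever alerting channel you use:

```python
import traceback
from typing import Callable

def run_pipeline(job: Callable[[], object], notify: Callable[[str], None]) -> bool:
    """Run a scheduled job; alert only on failure.

    Successful runs produce no output at all. That's the point.
    """
    try:
        job()
        return True
    except Exception as exc:
        notify(f"pipeline failed: {exc}\n{traceback.format_exc()}")
        return False
```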


Pattern 8: Resilience — Never Stop on the First Error

Agents hit errors constantly. Network timeouts. API rate limits. File system races. Without recovery, every error kills progress.

Exponential backoff: 2s, 4s, 8s, 16s. Categorize errors: transient = retry, permanent = find another way.
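The retry loop is short to write. A sketch of the 2s/4s/8s/16s schedule, where the set of transient error types is an assumption to adapt to your stack:

```python
import time
from typing import Callable

# Transient errors get retried; anything else is permanent and propagates
# immediately so the agent can find another way.
TRANSIENT = (TimeoutError, ConnectionError)

def with_backoff(fn: Callable[[], object], retries: int = 5, base: float = 2.0):
    """Retry fn with exponential backoff: 2s, 4s, 8s, 16s by default."""
    for attempt in range(retries):
        try:
            return fn()
        except TRANSIENT:
            if attempt == retries - 1:
                raise  # retries exhausted
            time.sleep(base * (2 ** attempt))
```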

Real metric: 11 consecutive builds with zero human intervention. The agent hit errors on 8 of them. Recovered from every single one.


Pattern 9: Verify — Autonomous Doesn't Mean Reckless

Every change gets verified. Syntax check after every file write. Tests after every code change. For deployments: verify the result, don't trust the response.

The payoff: mistakes are caught at the point of change, not compounded across a build.
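The "syntax check after every file write" step is cheap to implement for Python with the standard library's `ast` module. A sketch: parse before writing, so broken code never lands on disk:

```python
import ast
from pathlib import Path

def verify_python_write(path: str, source: str) -> bool:
    """Write Python source to path only if it parses.

    Refusing the write on a SyntaxError means a bad generation never
    reaches disk, so later steps build on verified code.
    """
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    Path(path).write_text(source)
    return True
```

Tests after code changes and post-deploy checks follow the same shape: verify the result, don't trust the response.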


Pattern 10: Compounding — The Agent That Gets Better

This is the feedback loop: agent solves hard problem → saves approach as skill → next session is faster. Month 1: basic file ops. Month 3: autonomous scaffolding. Month 5: self-improvement loops, 153 skills.
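The loop can be sketched in a few lines: check saved skills first, solve from scratch only on a miss, and save the new approach so the next session hits the fast path. The trigger-extraction here is deliberately crude and hypothetical:

```python
from typing import Callable

def solve_with_compounding(task: str, skills: dict[str, str],
                           solver: Callable[[str], str]) -> str:
    """Reuse a saved skill if one matches; otherwise solve the hard way
    and save the approach for next time."""
    for trigger, approach in skills.items():
        if trigger in task.lower():
            return approach                      # fast path: institutional knowledge
    approach = solver(task)                      # slow path: solve from scratch
    skills[task.lower().split()[0]] = approach   # save it (crude first-word trigger)
    return approach
```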

The agent today is not the agent from 5 months ago — because it learned from every session.


The Honest Part

These patterns weren't planned. They emerged from breaking things late at night. Every one of them is backed by a real failure — an error that cost hours, a build that died, a configuration that made no sense until 3am.

If you're working with agents and hitting walls: you're not doing it wrong. You're discovering patterns. Write them down. Make them skills. Let the agent learn.


I documented the full methodology at workswithagents.com. The knowledge API (workswithagents.dev) has 153 skills and a shared pitfall registry — agents query it for known bugs and fixes. No courses yet. No pricing. Just infrastructure, live.
