The 10 Patterns: What 5 Months of Breaking AI Agents Taught Me About Making Them Actually Work
In late 2025 I started experimenting with AI coding agents. Not casually — I gave them autonomous infrastructure and let them run. They broke. A lot. But patterns emerged.
Not "prompt engineering tricks." Not "unlock your potential." Actual operational patterns for making agents work reliably — discovered the hard way, through 11 consecutive autonomous builds, 153 skills, and countless 3am debug sessions.
Here they are.
Pattern 1: Boot — The First Session Shapes Everything
An agent's first session is like its childhood. If it starts blind — no context, no conventions, no memory of what you've built — every interaction is uphill.
What I do: Every project has an AGENTS.md. Python version. Project structure. Conventions. Key decisions. The agent reads this before anything else.
What happened without it: Recommending npm for a pnpm project. Suggesting Python 3.9 when we're on 3.11. Hours of corrections that a 50-line file would have prevented.
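For the curious, a minimal sketch of the shape. The structure is what matters; the folder names below are placeholders, the rest comes straight from the corrections above:

```markdown
# AGENTS.md

## Environment
- Python 3.11 (not 3.9)
- Package manager: pnpm (never suggest npm)

## Project structure
- src/ for application code, tests/ for the pytest suite

## Conventions
- Tests run under pytest with xdist
- The deployment command lives here, so nobody has to ask

## Key decisions
- One line per decision, with the date and the reason
```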
Pattern 2: Skills — Stop Re-Explaining the Same Things
Building an SPFx web part has specific gotchas: @fluentui imports break without SCSS alias config. The Yeoman generator ignores --component-type if .yo-rc.json exists. Node 22 + native modules = pain.
Instead of re-explaining these each time, I saved them as skills. 153 of them now. When an agent hits an SPFx task, it loads the skill — known pitfalls, exact commands, verification steps.
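Structurally, a skill is just a short document the agent loads on demand. A sketch of the SPFx one, with the gotchas above folded in (the layout is illustrative):

```markdown
# Skill: SPFx web part scaffolding

## Known pitfalls
- @fluentui imports break unless the SCSS alias is configured
- The Yeoman generator ignores --component-type when .yo-rc.json already exists; remove or edit it first
- Node 22 plus native modules is trouble; pin a known-good Node version before installing

## Exact commands
- The scaffold, build, and package commands that actually work, copy-paste ready

## Verification
- Build completes cleanly, then the web part loads in the workbench
```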
The compounding effect: Each skill makes future sessions faster. Five months in, the agent has institutional knowledge.
Pattern 3: Memory — Never Re-Answer the Same Question
"What Python version are we using?" "Where's the project?" "What's the deployment command?"
Without persistent memory, you answer these every. single. session. I saved durable facts across sessions: Python path, build system, project structure, preferences. Now the agent just knows.
Critical rule: Write declarative facts, not instructions. "Project uses pytest with xdist" — not "Always run tests with pytest -n 4." Instructions get re-read as orders.
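A quick sketch of the difference, with placeholder values:

```markdown
# Durable facts (good)
- Python lives at /usr/local/bin/python3.11
- The project uses pytest with xdist
- Builds go through pnpm

# Instructions masquerading as facts (avoid)
- Always run tests with pytest -n 4
- Never touch the legacy/ folder
```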
Pattern 4: Decision Protocols — Autonomy Without Chaos
The biggest time-sink? Approval loops. "Should I proceed?" "Want me to fix this?" "OK to deploy?"
I set boundaries: what the agent decides alone, what needs approval. Destructive actions = ask. Recoverable actions = just do it. Hours saved per session.
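A minimal sketch of that boundary as code. The action names and the needs_approval helper are illustrative, not a real agent API:

```python
# Destructive or hard-to-undo actions require a human; everything else proceeds.
DESTRUCTIVE = {"delete_branch", "drop_table", "deploy_production", "rotate_secrets"}
RECOVERABLE = {"write_file", "run_tests", "create_branch", "install_dependency"}

def needs_approval(action: str) -> bool:
    """Return True if the agent must pause and ask before performing `action`."""
    if action in DESTRUCTIVE:
        return True
    if action in RECOVERABLE:
        return False
    # Unknown actions default to asking. That's the safe failure mode.
    return True
```

The exact sets matter less than writing them down: the agent stops asking about recoverable work and only interrupts for the genuinely dangerous cases.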
Pattern 5: Tool Composition — The Right Tool for Each Job
Agents have many tools. Knowing which to use is the difference between a 2-second operation and a 2-minute token burn.
| Task | Tool | Why |
|---|---|---|
| Create new file | write_file | One call |
| Edit existing file | patch | Targeted, no rewrite risk |
| Build/install/deploy | terminal | It's a shell command |
| Read a file | read_file | Don't cat/head/tail |
| Search content | search_files | Not grep/find |
| Research/debug | delegate_task | Parallel, isolated |
The anti-pattern: Delegating coding tasks to subagents. They lose context, hallucinate, and burn tokens. Use write_file and patch directly.
Pattern 6: Orchestration — Parallel Specialists
Complex tasks are rarely a single thread. Market research? Let a subagent run while the main agent builds. Code review? Spin up a reviewer in parallel.
Real result: 3x throughput on multi-stream tasks. Research and build completed independently, merged at the end.
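A sketch of the shape in Python, pretending delegate_task from the tool table above is an async call. The stubs stand in for the real tools:

```python
import asyncio

# Stand-ins for the real tools: delegate_task spins up an isolated subagent,
# build_feature is the main agent's own coding loop.
async def delegate_task(prompt: str) -> str:
    await asyncio.sleep(1)              # pretend a subagent is researching
    return f"result of: {prompt}"

async def build_feature() -> str:
    await asyncio.sleep(2)              # pretend the main agent is writing code
    return "feature built"

async def ship_feature() -> list[str]:
    # Research and review run in parallel; the build keeps going in the foreground.
    research = asyncio.create_task(delegate_task("summarize competitor pricing"))
    review = asyncio.create_task(delegate_task("review the open PR"))
    built = await build_feature()
    return [built, *await asyncio.gather(research, review)]

if __name__ == "__main__":
    print(asyncio.run(ship_feature()))
```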
Pattern 7: Pipelines — Agents That Run While You Sleep
Cron jobs. Builds. Monitoring. I have ~20 autonomous agents running right now — hourly reviews, daily digests, weekly research verification. They wake up, do their job, and only notify me if something's broken.
The silent-unless-broken pattern: I never see successful runs. I only hear about failures. That's the point.
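A sketch of that wrapper, with a placeholder notify helper standing in for whatever actually pings me:

```python
import subprocess
import sys

def notify(message: str) -> None:
    # Placeholder: in reality this posts to chat or email; here it just prints.
    print(f"[ALERT] {message}", file=sys.stderr)

def run_job(command: list[str]) -> None:
    """Run a scheduled job. Stay silent on success, notify only on failure."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        notify(f"{' '.join(command)} failed:\n{result.stderr[-2000:]}")
    # On success: no message, no log spam, nothing. That's the point.

if __name__ == "__main__":
    # Example cron entry (illustrative): 0 * * * * /usr/bin/python3 /opt/agents/hourly_review.py
    run_job([sys.executable, "-c", "print('hourly review ok')"])
```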
Pattern 8: Resilience — Never Stop on the First Error
Agents hit errors constantly. Network timeouts. API rate limits. File system races. Without recovery, every error kills progress.
Exponential backoff: 2s, 4s, 8s, 16s. Categorize errors: transient = retry, permanent = find another way.
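In code, the loop looks roughly like this; the transient error classes are placeholders for whatever your stack actually raises:

```python
import time

TRANSIENT = (TimeoutError, ConnectionError)   # retry these; anything else is permanent

def with_backoff(operation, max_attempts: int = 5):
    """Retry transient failures with exponential backoff: 2s, 4s, 8s, 16s."""
    delay = 2
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TRANSIENT:
            if attempt == max_attempts:
                raise                 # out of retries, surface the failure
            time.sleep(delay)
            delay *= 2                # 2 -> 4 -> 8 -> 16
        # Permanent errors are not caught here: they propagate immediately,
        # and the agent looks for another way instead of looping.
```

Wrap only the flaky edges, like network calls and API requests. Deterministic failures should fail loudly so the agent pivots instead of retrying the impossible.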
Real metric: 11 consecutive builds with zero human intervention. The agent hit errors on 8 of them. Recovered from every single one.
Pattern 9: Verify — Autonomous Doesn't Mean Reckless
Every change gets verified. Syntax check after every file write. Tests after every code change. For deployments: verify the result, don't trust the response.
The payoff: syntax checks, test runs, and quality gates catch errors where they happen, instead of letting them compound into something that ships.
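A sketch of the gate that runs after each Python file write, standard library only; the test command is whatever the project's AGENTS.md says it is:

```python
import py_compile
import subprocess
import sys

def verify_change(path: str, test_command: list[str]) -> bool:
    """Syntax-check the file just written, then run the tests. Never trust silence."""
    try:
        py_compile.compile(path, doraise=True)        # catches syntax errors immediately
    except py_compile.PyCompileError as exc:
        print(f"Syntax check failed for {path}: {exc}", file=sys.stderr)
        return False

    result = subprocess.run(test_command, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Tests failed after editing {path}:\n{result.stdout[-2000:]}", file=sys.stderr)
        return False
    return True

# Example: verify_change("src/service.py", ["pytest", "-n", "4"])
```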
Pattern 10: Compounding — The Agent That Gets Better
This is the feedback loop: agent solves hard problem → saves approach as skill → next session is faster. Month 1: basic file ops. Month 3: autonomous scaffolding. Month 5: self-improvement loops, 153 skills.
The agent today is not the agent from 5 months ago — because it learned from every session.
The Honest Part
These patterns weren't planned. They emerged from breaking things late at night. Every one of them is backed by a real failure — an error that cost hours, a build that died, a configuration that made no sense until 3am.
If you're working with agents and hitting walls: you're not doing it wrong. You're discovering patterns. Write them down. Make them skills. Let the agent learn.
I documented the full methodology at workswithagents.com. The knowledge API (workswithagents.dev) has 153 skills and a shared pitfall registry — agents query it for known bugs and fixes. No courses yet. No pricing. Just infrastructure, live.