Five months ago, I couldn't get one AI agent to finish a build without breaking. Today, I have 20 autonomous agents running cron jobs, self-improving, and catching each other's bugs — all on a €4.57/month VPS.
Here's what I built, what broke along the way, and why I'm not charging for any of it yet.
What's Actually Running
- workswithagents.dev — A knowledge API that agents query for facts, skills, and known bugs. Think "Stack Overflow but agents are the users."
- workswithagents.io — A blueprint registry of verified LLM configurations. Hardware-matched. "This Qwen model on this M4 Mac gets 22 tokens/sec — verified."
- workswithagents.com — The education side. Teaching the methodology that emerged from all this experimentation.
- bastiongateway.com — Operations infrastructure. License server, heartbeat monitoring, agent proxy.
All five domains live. All three APIs healthy. Total monthly cost: €3.99 for a Hetzner CX23 + €0.58 for IPv4. That's less than a coffee subscription.
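To make "agents query for facts, skills, and known bugs" a bit more concrete, here's a minimal sketch of what a client call could look like. The route, query parameter, and response shape below are illustrative placeholders, not the production workswithagents.dev contract.

```typescript
// Illustrative client for a knowledge API -- the route and response shape are
// placeholders, not the actual workswithagents.dev contract.
interface Skill {
  id: string;
  title: string;
  steps: string[];          // procedural knowledge an agent can replay
  knownPitfalls: string[];  // failure modes already documented for this skill
}

async function findSkills(topic: string): Promise<Skill[]> {
  const url = `https://workswithagents.dev/api/skills?q=${encodeURIComponent(topic)}`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Knowledge API returned ${res.status}`);
  return (await res.json()) as Skill[];
}

// Usage: an agent about to scaffold an SPFx web part checks for prior art first.
// const prior = await findSkills("spfx web part scaffold");
```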
The 10 Patterns That Emerged
I didn't plan to build a methodology. I was just trying to make agents work. But over 5 months of breaking and fixing, patterns emerged:
1. Boot — First session setup. AGENTS.md, environment, initial memory. Without this, every agent starts blind.
2. Skills — Reusable procedural knowledge. I now have 153 skills. When an agent needs to build an SPFx web part, it loads the skill — no re-explaining.
3. Memory — Durable facts across sessions. What Python version? Where's the project? Never re-answer these questions. (A small Skills + Memory sketch follows this list.)
4. Decision Protocols — When the agent decides vs when it asks. Hours saved by eliminating approval loops.
5. Tool Composition — The right tool for each job. Delegating a coding task to a subagent burns tokens and produces garbage. Use write_file directly.
6. Orchestration — Parallel specialist agents. Research runs while the build runs. 3x throughput. (Sketch after this list.)
7. Pipelines — Agents that run while you sleep. Cron jobs, builds, monitoring. Silent unless broken.
8. Resilience — Never-stop loops. 11 consecutive builds with zero human intervention. The agent hit errors on 8 of them and recovered from every single one. (Loop sketch after this list.)
9. Verify — Trust but verify. Syntax checks, test runs, linting after every change. 77% test pass rate across 61 tests.
10. Compounding — Agents that get better. Each solved problem becomes a skill. The agent today is qualitatively different from 5 months ago.
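To make Skills and Memory (patterns 2 and 3) concrete, here's a minimal sketch of durable facts persisted to disk so a fresh session never re-asks them. The file path and field names are illustrative, not the format my registry actually uses.

```typescript
// Minimal sketch of durable agent memory (patterns 2 and 3).
// The file path and field names are illustrative only.
import { existsSync, readFileSync, writeFileSync } from "node:fs";

interface MemoryStore {
  facts: Record<string, string>; // e.g. { "python.version": "3.12", "project.root": "/srv/app" }
}

const MEMORY_PATH = "./memory.json"; // placeholder location

function loadMemory(): MemoryStore {
  return existsSync(MEMORY_PATH)
    ? (JSON.parse(readFileSync(MEMORY_PATH, "utf8")) as MemoryStore)
    : { facts: {} };
}

function remember(key: string, value: string): void {
  const memory = loadMemory();
  memory.facts[key] = value;
  writeFileSync(MEMORY_PATH, JSON.stringify(memory, null, 2));
}

// Once a fact is written, no future session has to re-ask it.
// remember("python.version", "3.12");
```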
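Orchestration (pattern 6) in miniature: two specialists that don't depend on each other run concurrently instead of in sequence. The agent functions here are stand-ins for real agent invocations.

```typescript
// Pattern 6 sketch: independent specialist agents run in parallel.
// researchAgent and buildAgent are stand-ins for real agent invocations.
async function researchAgent(task: string): Promise<string> { return `notes for ${task}`; }
async function buildAgent(task: string): Promise<string> { return `artifact for ${task}`; }

async function runSpecialistsInParallel(task: string) {
  // Neither specialist waits for the other, which is where the throughput gain comes from.
  const [notes, artifact] = await Promise.all([researchAgent(task), buildAgent(task)]);
  return { notes, artifact };
}
```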
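And this is the shape of a never-stop loop (pattern 8) wrapped around verification (pattern 9). The step functions and retry budget are placeholders for whatever a given pipeline actually runs.

```typescript
// Sketch of a never-stop loop (pattern 8) that only accepts verified work (pattern 9).
// buildOnce() and verify() are placeholders for the real build and check steps.
async function buildOnce(): Promise<void> { /* run the build step */ }
async function verify(): Promise<boolean> { /* syntax check, lint, run tests */ return true; }

async function resilientBuild(maxAttempts = 10): Promise<void> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await buildOnce();
      if (await verify()) return;            // only a verified build counts as done
      console.warn(`Attempt ${attempt}: build finished but verification failed`);
    } catch (err) {
      console.warn(`Attempt ${attempt} failed:`, err); // log, recover, keep going
    }
  }
  throw new Error("Build did not verify within the retry budget"); // escalate to a human
}
```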
The Unglamorous Truth
The 3-day weekend experiment gets the attention — agents scaffolded 111 web parts and 5 backend services autonomously. But the real work was the months before and after:
- Fixing macOS permissions so agents can read files
- Tracing why the model config was broken (an empty model name meant nothing worked for two hours)
- Rewriting SCSS configuration because it was written for Gulp, not Heft
- Discovering the Yeoman generator silently ignores CLI flags when .yo-rc.json exists
- Hunting why C++ native modules won't compile on Node 22
None of this is in a tutorial. You live through it — late at night, no shortcut.
What's NOT Ready (Honest Part)
| Thing | Status |
|---|---|
| 5 domains live | ✅ |
| 3 APIs serving real data | ✅ |
| 153 skills queryable | ✅ |
| 10-pattern methodology documented | ✅ |
| Courses | ❌ Content written, not launched |
| Workshops | ❌ Materials in planning |
| Consulting | ❌ 0 clients |
| Paying customers | ❌ 0 |
Everything with a price tag says "Coming Soon." I'm not selling anything until the methodology is proven with real users. This is pre-revenue, pre-launch, pre-everything commercial. I'm shipping infrastructure, not promises.
The Bigger Play
Everyone's building agents. I'm building infrastructure FOR agents.
Agents need shared knowledge (FactBase). They need verified configurations (Blueprints). They need to know what breaks and how to fix it (Pitfalls). They need to hand off work without losing context (Handoff Protocol). They need a way to discover documentation (llms.txt).
These are the picks and shovels of the agent gold rush. And most people haven't realised the gold rush needs picks and shovels yet.
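As one concrete illustration, here's roughly what a handoff that doesn't lose context might carry between agents. The fields are illustrative, not the actual Handoff Protocol schema.

```typescript
// Illustrative handoff payload -- not the actual Handoff Protocol schema.
interface Handoff {
  fromAgent: string;
  toAgent: string;
  goal: string;                    // what the receiving agent is expected to finish
  state: Record<string, string>;   // durable facts the next agent must not re-derive
  openPitfalls: string[];          // known failure modes already hit on this task
  artifacts: string[];             // paths or URLs to work produced so far
}

const exampleHandoff: Handoff = {
  fromAgent: "research",
  toAgent: "build",
  goal: "Scaffold the web part from the verified blueprint",
  state: { "node.version": "22", "project.root": "/srv/webparts" },
  openPitfalls: ["Yeoman generator ignores CLI flags when .yo-rc.json exists"],
  artifacts: ["./notes/blueprint-choice.md"],
};
```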
What's Next
If this resonates — if you're building agent infrastructure too, or if you've hit the same walls — I'd like to hear about it. The pitfall registry is live and open. The skill registry is queryable. The methodology is documented.
Everything at workswithagents.com, workswithagents.dev, workswithagents.io.
Built in Cardiff. Running in Nuremberg. €4.57/month.
No launch announcement. No pricing page. No "revolutionise your workflow." Just infrastructure, live, and honest about what's not ready yet.
Top comments (1)
This is one of the more grounded takes I’ve seen in the “agent hype cycle” lately.
What stands out isn’t the number of agents—it’s the system thinking behind it. Most people are still trying to make one agent behave; you’re already thinking in terms of orchestration, feedback loops, and compounding knowledge.
The “skills + memory + verification” combo is basically what separates demos from something that can survive real workloads.
Also appreciate the honesty on what’s not ready. A lot of projects skip that part and jump straight to “enterprise-ready AI platform” after a weekend build 😄
The €4.57/month detail is wild too—it’s a good reminder that architecture decisions matter more than throwing GPUs at the problem.
Your “picks and shovels” framing feels accurate. Infrastructure layers usually look boring early… right until everyone realizes they depend on them.
That said, I’d be curious how you’re thinking about failure isolation as this scales—20 agents catching each other’s bugs is great, until they all agree on the same wrong thing.
Also, “never-stop loops” sounds impressive and slightly terrifying at the same time 😅
Overall, this feels less like a project and more like the early version of an ecosystem.