I've been building software with AI agents every day for months. The biggest thing I've learned has nothing to do with which model or which tool is hot this week. It's this: most people's AI output is flat. Every task starts from zero. The agent is just as sharp and just as forgetful as it was last week.
Kieran Klaassen and Dan Shipper at Every gave the fix a name - compound engineering. The idea is almost stupidly simple: each piece of work should make the next one easier, not harder. They run several products with basically single-person eng teams on the back of it. Once you feel the difference, it's really hard to go back.
It's usually drawn as a four-step loop: plan, work, assess, compound.
Plan - before a line of code, the agents go read the codebase, the conventions, the framework versions, and hand you a plan you correct.
Work - the plan drives the build. The agent writes the feature and the tests.
Assess - review agents check it. Correctness, security, architecture.
Compound - whatever you learned this time gets captured into durable rules and docs the system actually reads next time.
Everyone does the first three. Almost nobody does the fourth.
And the fourth is the whole point. It's the only step that actually compounds. Skip it and you don't have compound engineering - you have a very fast autocomplete with extra steps.
Every ships this as an open-source Claude Code plugin: a set of /ce: commands that scaffold the plan, run the work, and run the review. It's a genuinely good on-ramp. If you live in the terminal, go grab it.
But the plugin isn't the loop. The loop is a discipline. The hard part was never typing /ce:plan - it's doing the compound step honestly, every single time, when you're tired and the feature already works and you just want to merge and go to bed. That's exactly where it dies.
I ran this by hand for a while and the same leaks show up every time.
The compound step is optional, so it doesn't happen. Writing down what you learned is boring and nothing breaks when you skip it. So you skip it. And the system quietly stops compounding.
Your memory is scattered. Lessons end up as random edits to a CLAUDE.md that slowly turns into a swamp, or worse, they live in your head. No record of why a decision got made, or that you corrected the agent on this exact thing last week.
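As a rough illustration of the alternative to a swampy CLAUDE.md, here is a minimal sketch of an append-only lesson log that keeps the "why" next to each rule and lets you check whether you already corrected the agent on something. Everything in it is hypothetical: the `Lesson` fields, the function names and the JSON Lines file are assumptions made for the example, not part of Every's plugin or any other tool.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import List


@dataclass
class Lesson:
    """One captured lesson. Field names are illustrative assumptions."""
    learned_on: str  # ISO date the lesson was captured
    rule: str        # what the agent should do next time
    why: str         # the reason behind the decision, so it can be revisited
    source: str      # where it came from, e.g. a review or a plan correction


def record_lesson(log_path: Path, lesson: Lesson) -> None:
    """Append one lesson as a JSON line; earlier entries are never rewritten."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(lesson)) + "\n")


def load_lessons(log_path: Path) -> List[Lesson]:
    """Read every lesson back, oldest first. A missing file means no lessons."""
    if not log_path.exists():
        return []
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return [Lesson(**json.loads(line)) for line in lines if line.strip()]


def find_lessons(lessons: List[Lesson], keyword: str) -> List[Lesson]:
    """Case-insensitive search over rule and why: 'did I already fix this?'"""
    k = keyword.lower()
    return [l for l in lessons if k in l.rule.lower() or k in l.why.lower()]
```

The point of the append-only shape is that each entry keeps its date and its reason, so a rule can be questioned later instead of sitting in the file as unexplained folklore.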
Nothing is gated. The loop is a convention, not a wall. The second you're moving fast, work jumps from plan straight to merge, and assess and compound fall off the back of the truck.
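To make "a wall, not a convention" concrete, here is a minimal sketch of a gate: a task can only complete steps in loop order, and merge is refused until all four are done. The `Task`, `Step` and `GateError` names are made up for this example and assume nothing about how any real plugin or CI system works.

```python
from enum import Enum
from typing import List


class Step(Enum):
    PLAN = "plan"
    WORK = "work"
    ASSESS = "assess"
    COMPOUND = "compound"


# The loop order the gate enforces.
ORDER = [Step.PLAN, Step.WORK, Step.ASSESS, Step.COMPOUND]


class GateError(Exception):
    """Raised when a step is attempted out of order or merge is blocked."""


class Task:
    def __init__(self, name: str) -> None:
        self.name = name
        self.done: List[Step] = []

    def complete(self, step: Step) -> None:
        """Mark a step done, but only if it is the next one in ORDER."""
        if len(self.done) == len(ORDER):
            raise GateError(f"{self.name}: all steps already completed")
        expected = ORDER[len(self.done)]
        if step is not expected:
            raise GateError(
                f"{self.name}: expected {expected.value}, got {step.value}"
            )
        self.done.append(step)

    def can_merge(self) -> bool:
        """True only when every step, including compound, is done."""
        return self.done == ORDER

    def merge(self) -> str:
        """Refuse to merge while any step of the loop is still missing."""
        if not self.can_merge():
            missing = [s.value for s in ORDER[len(self.done):]]
            raise GateError(f"{self.name}: cannot merge, missing {missing}")
        return f"{self.name}: merged"
```

With something like this in the path to merge, skipping assess or compound stops being a quiet omission and becomes an error you have to look at.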
The deeper version of this - how decisions and context drift across a pile of parallel agents - is its own rabbit hole. I went into that one separately.
So here's the only question I actually care about now, every time I wire up an agent loop: a month from now, will this system be better at building my product than it is today? Or just as fluent, and just as forgetful?
If there's no compound step, you already know the answer.
Build the fourth step. That's the whole game 🙌