No production traffic. No customers. Just me testing a few simple workflows.
The culprit wasn't one big request. It was four things compounding:
Context creep. I kept adding "just one more thing" to prompts: prior decisions, more context, more detail. Each run got bigger without feeling bigger.
Tool output bloat. Logs, diffs, API responses flowing straight into the next step. Output becomes input becomes output. It adds up fast.
Scheduled job overhead. Cron jobs re-establishing a large prompt footprint on every run. Not catastrophic — just quietly expensive, repeatedly.
Duplicate triggers. A couple of retries running the same bloated job twice.
The fix was kinda boring: use smaller context windows, trim tool outputs aggressively, start a fresh session on every scheduled job, and, most importantly, stop using the most expensive model by default for everything.
That last one alone cut most of the cost.
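Here's a rough sketch of what two of those fixes can look like in code: trimming tool output before it feeds the next step, and defaulting to a cheaper model. This is not my actual setup. The model names, the character limit, and the function names are all placeholders I made up for illustration, so swap in whatever your provider and workflow use.

```python
# Hypothetical model names and limit; replace with your provider's values.
CHEAP_MODEL = "small-model"
EXPENSIVE_MODEL = "large-model"
MAX_TOOL_OUTPUT_CHARS = 2000


def trim_tool_output(text: str, limit: int = MAX_TOOL_OUTPUT_CHARS) -> str:
    """Keep the head and tail of a long tool output and drop the middle."""
    if len(text) <= limit:
        return text
    half = max(1, limit // 2)  # guard so the tail slice is never text[-0:]
    dropped = len(text) - 2 * half
    return f"{text[:half]}\n[... {dropped} chars trimmed ...]\n{text[-half:]}"


def pick_model(needs_deep_reasoning: bool = False) -> str:
    """Default to the cheap model; escalate only when explicitly requested."""
    return EXPENSIVE_MODEL if needs_deep_reasoning else CHEAP_MODEL
```

Keeping the head and tail is a judgment call: logs and diffs tend to put the useful parts at the start and end, but your outputs may differ. The point of `pick_model` is just that the expensive model becomes something you opt into per task rather than the default.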
After a while I got bored of switching models and decided to turn this into a research opportunity. I published RoBC on GitHub, then trained it and put it into production on clawpane.co.
Let me know if you check the repo out or if you need help with your AI spending.