I was testing a small agent script last week and it ended up getting stuck retrying a step while streaming responses.
It kept calling the API again and again and I didn’t notice for a while.
By the time I checked the usage it had jumped about $147 in under an hour.
After that I wrote a small Python “circuit breaker” that stops requests if a rolling token or cost limit is exceeded. It can also interrupt streaming responses mid-generation.
It’s just a single Python file that runs locally. No cloud and no keys leaving your machine.
If this might help someone, I shared it here:
https://gist.github.com/alexdirochian-star/730122004e289bbc6f6b42d4f431656f
Curious how others are protecting their agents from runaway API usage?
Top comments (0)