Discussion on: Are you paying attention to your token use?

View post

This question hits differently after you've watched an agentic workflow silently burn through tokens in a retry loop.

I used to not think about token usage at all until I started building with agents. A single misconfigured workflow can trigger cascading retries where each step costs multiple LLM calls. What looked like a $0.10 task becomes a $5 surprise by the time you check your dashboard.

Now I treat token budgeting the same way I treat error handling you don't think about it until something breaks, and then you think about nothing else.

The most underrated optimization isn't model choice it's context hygiene. Keeping prompts lean and not stuffing unnecessary history into every call saves more than switching to a cheaper model ever will.