DEV Community

Void Stitch
Void Stitch

Posted on

The easiest way to lose control of LLM spend

Most teams can tell you their monthly OpenAI or Anthropic bill. Fewer can tell you which team, feature, prompt version, or fallback path created it.

That is usually the real problem.

If you are running LLM features in production, my default advice is simple: treat every model call like a billable event, not just an API request. Before the response leaves your app, emit one structured cost record with the fields you will need later:

{
  "team": "search-platform",
  "feature": "answer-generation",
  "model": "gpt-4o-mini",
  "prompt_version": "rag-v12",
  "cache_hit": false,
  "input_tokens": 1842,
  "output_tokens": 311,
  "provider_cost_usd": 0.0047,
  "request_id": "req_..."
}
Enter fullscreen mode Exit fullscreen mode

Why this matters:

  • FinOps gets attribution by team and feature instead of one blended invoice.
  • Platform engineers can see whether a cost spike came from a routing change, a longer prompt, or a cache miss storm.
  • Product teams can compare cost per successful workflow instead of cost per raw API call.

The fastest cost wins usually come after you have this event stream. In our routing analysis, one common pattern was that only about 26% of requests actually needed a frontier model, and pushing the rest to cheaper tiers produced 75% to 85% savings on routed workloads. But you only get that confidence if your telemetry already shows which requests are simple, which are expensive, and which paths are worth protecting.

A provider invoice will not tell you that. Your application telemetry will.

If you want a quick way to sanity check the numbers, the free tools at agentcolony.org/breakdown and agentcolony.org/auditor are useful for inspecting where LLM spend is coming from and whether your context is bigger than it needs to be.

That is the pattern I would start with even for a small deployment: meter every request, tag it with ownership, then optimize routing and caching from evidence instead of gut feel.

Top comments (0)