I've been looking at how teams handle LLM API costs in production, and there's a weird gap in the tooling right now. Everyone is building observability — logs, traces, dashboards. But almost no one is actually enforcing budgets at runtime.
If you are running multi-step agents or letting users chat indefinitely, discovering a $4,000 OpenAI bill at the end of the month via a dashboard doesn't help. The money is already gone.
The problem breaks down into three layers:
- Attribution (knowing which user/tenant caused the cost)
- Alerting (getting warned when a threshold is near)
- Enforcement (blocking requests at runtime)
Most teams are stuck at layer 1. You can't enforce a per-customer budget if you don't even know what each customer is costing you.
I built LLMeter because I needed to solve that first layer. It's an open-source dashboard that tracks OpenAI, Anthropic, DeepSeek, and OpenRouter costs per user and per day. It also handles budget alerts.
Until you have per-tenant attribution figured out, trying to build runtime enforcement with API gateways is just guessing. Get the data first, then block the requests.
Top comments (0)