
John Medina

LLM prices dropped 80% — but are you actually saving money?

Everyone is cheering about Anthropic and OpenAI dropping API prices by 80%.
It sounds great on Twitter. But look at your actual billing dashboard: your costs probably haven't moved much.

Why? Because cheaper tokens usually just mean you start wasting more tokens.

Here is the thing:

1- Context bloat
When GPT-4 was expensive, we carefully truncated histories and compressed prompts. Now that tokens are cheap, devs throw the entire 128k context window at the model on every single retry. The cost per token dropped, but you're sending 10x more tokens per request.
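The fix is the same as it always was: keep trimming history even when tokens are cheap. Here's a minimal sketch, assuming a rough 4-characters-per-token heuristic (a real tokenizer like tiktoken is more accurate, but the shape of the logic is the same):

```python
def approx_tokens(text: str) -> int:
    # Crude heuristic: ~4 chars per token. Good enough for budgeting,
    # not for billing. Swap in a real tokenizer for accuracy.
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], max_tokens: int = 4000) -> list[dict]:
    """Keep the system prompt plus the most recent messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(approx_tokens(m["content"]) for m in system)

    kept = []
    for m in reversed(rest):  # walk newest-first
        cost = approx_tokens(m["content"])
        if budget - cost < 0:
            break
        budget -= cost
        kept.append(m)

    return system + list(reversed(kept))
```

Ten lines of trimming before each call is the difference between sending 4k tokens and sending 128k tokens on every retry.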

2- Agent loops
Cheaper models make agentic workflows viable, but a poorly configured while loop can still burn through your budget in minutes. When an agent gets stuck and retries 40 times, cheaper tokens don't save you—you still bleed cash.
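The cheapest guardrail is a hard cap on both iterations and spend. A sketch, where `step` is a placeholder for whatever your agent does per turn (it returns a result, or `None` to keep looping, plus the cost of that turn in USD — both names and the return shape are assumptions for illustration):

```python
def run_agent(step, max_iterations: int = 10, max_spend_usd: float = 1.00):
    """Run an agent loop with hard caps on iteration count and total spend."""
    spent = 0.0
    for i in range(max_iterations):
        result, cost = step(i)
        spent += cost
        if spent > max_spend_usd:
            # Kill the loop the moment the budget is blown, not 40 retries later.
            raise RuntimeError(f"budget exceeded after {i + 1} steps (${spent:.2f})")
        if result is not None:
            return result
    raise RuntimeError(f"no result after {max_iterations} iterations (${spent:.2f})")
```

The point isn't the specific numbers — it's that the cap exists at all. An uncapped `while True` around a retrying agent is how "cheap" models produce expensive invoices.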

3- Lack of per-customer attribution
It's easy to see your total OpenAI bill. But if you don't know which specific tenant or user is driving the cost, you can't optimize it. You just eat the cost.
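The core idea is simple: tag every call with a tenant or user ID and accumulate cost against a price table. A minimal sketch — the model name and per-million-token prices below are illustrative placeholders, not real pricing:

```python
from collections import defaultdict

# Hypothetical price table, USD per 1M tokens. Real prices vary by model
# and change over time — load these from config, don't hardcode them.
PRICES = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

class CostTracker:
    """Accumulates LLM spend per user so you can see who drives the bill."""

    def __init__(self):
        self.by_user = defaultdict(float)

    def record(self, user_id: str, model: str,
               input_tokens: int, output_tokens: int) -> float:
        p = PRICES[model]
        cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
        self.by_user[user_id] += cost
        return cost
```

Once every request is attributed, "our OpenAI bill doubled" turns into "tenant X's new feature doubled our bill" — and that's something you can actually act on.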

tbh, the raw price per token is only half the story. If you can't attribute cost per user or per model, you're still flying blind.

fwiw I built LLMeter to fix this for my own projects. It tracks costs per model and per user, and sets budget alerts—without a proxy in the middle. It's open-source (AGPL).

Check it out if you're tired of guessing your AI bills: https://llmeter.org?utm_source=devto&utm_medium=article&utm_campaign=2026-04-16-llm-prices-dropped-are-you-saving
