DEV Community

Cover image for How a Missing Config Line Cost Me 38x More for the Same Model
Dmitry Platonov
Dmitry Platonov

Posted on

How a Missing Config Line Cost Me 38x More for the Same Model

I'm building a custom coding agent harness that uses DeepSeek models through OpenRouter. The models offer a good price/performance balance (especially with the current Pro discount).

While editing the config, I accidentally removed the provider preference. My OpenRouter workspace had only two providers enabled: the Official and ProviderA. And, of course, OpenRouter picked ProviderA! My harness displays session cost in real time, and within about 30 minutes the session cost jumped up enough to catch my attention. I pulled the usage breakdown:

Model Provider Calls Prompt tokens Cached tokens Cache % Completion tokens Cost
flash ProviderA 115 3,150,998 2,790,656 88.6% 66,176 $0.1471
flash Official 718 28,198,126 27,187,072 96.4% 378,485 $0.3236
pro ProviderA 11 306,246 93,952 30.7% 8,796 $0.3994
pro Official 1341 54,105,295 51,849,600 95.8% 737,680 $1.8110

Normalized per 1M total tokens:

  • Flash: ProviderA $0.046, Official $0.011 -> 4x more expensive
  • Pro: ProviderA $1.27, Official $0.033 -> 38x more expensive

Why a "small" cache gap hurts so much

Prompt caching makes uncached input tokens an order of magnitude pricier (two orders in case of Official provider). Even the 8-point cache difference for Flash (88.6% vs 96.4%) means ProviderA processed 3x more uncached tokens. Combine that with a higher base price and you get the 4x multiplier. For Pro the cache gap is extreme (30.7% vs 95.8%) -- that's the core of the 38x blow-up.

Currently, official pricing for DeepSeek V4 Pro cached tokens is just $0.0036/M (yes, that's right, decimal point is in the right place!). For agentic workloads it massively drives cost down.

The fix: explicitly set provider preference

Always include the provider object in your HTTP request:

{
  "model": "deepseek/deepseek-v4-pro",
  "messages": [...],
  "provider": {
    "order": ["deepseek"],
    "allow_fallbacks": false
  }
}
Enter fullscreen mode Exit fullscreen mode

Or at least create separate workspace/key guardrails on the OpenRouter for each model family.

Did something like that ever happen to you?

Top comments (0)