4663437Mehdi

Originally published at 4663437mehdi.github.io

Token Ledger Digest – 2026-05-31

Cost‑impacting change

Qwen: Qwen3 235B A22B Thinking 2507

  • Prompt: fell from $0.1495/1M to $0.10/1M (‑33%).
  • Completion: fell from $1.495/1M to $0.10/1M (‑93%).
  • Who should care: Teams running large‑scale generation workloads where output‑token cost dominates. The cut lowers completion cost by about $1.40 per 1M tokens ($1.495 → $0.10).
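
The percentages above follow from the usual relative-change formula, (new - old) / old. A minimal Python sketch using the prices quoted in this entry; the function name `pct_change` is illustrative and not part of any Token Ledger tooling:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a price cut)."""
    return (new - old) / old * 100.0


# Prices in $ per 1M tokens, copied from the digest entry above.
prompt_old, prompt_new = 0.1495, 0.10
completion_old, completion_new = 1.495, 0.10

prompt_delta = pct_change(prompt_old, prompt_new)
completion_delta = pct_change(completion_old, completion_new)

# Absolute saving in dollars per 1M completion tokens.
completion_saving = completion_old - completion_new

print(f"Prompt: {prompt_delta:.0f}%")
print(f"Completion: {completion_delta:.0f}%")
print(f"Completion saving: ${completion_saving:.3f} per 1M tokens")
```

Rounded to whole percentages, this reproduces the ‑33% and ‑93% figures, and the absolute saving of $1.395 is the "about $1.40" cited above.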

Other price changes

MiniMax: MiniMax M2.7

  • Prompt: $0.279/1M → $0.26/1M (‑7%). Completion unchanged at $1.20/1M.
  • Relevant for users prioritizing prompt‑heavy tasks.

OpenAI: gpt-oss-20b

  • Prompt: $0.03/1M → $0.029/1M (‑3%). Completion unchanged at $0.14/1M.
  • Minor saving for latency‑sensitive apps using this model.
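
To gauge what prompt-only cuts like these two mean in dollars, one can price a fixed workload before and after the change. The sketch below assumes a hypothetical monthly mix of 900M prompt tokens and 100M completion tokens; that mix and the helper `workload_cost` are illustrative assumptions, while the prices come from the entries above.

```python
def workload_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price: float, completion_price: float) -> float:
    """Cost in dollars, given prices in $ per 1M tokens."""
    return (prompt_tokens * prompt_price
            + completion_tokens * completion_price) / 1_000_000


# Hypothetical monthly workload (an assumption, not data from the digest).
PROMPT_TOKENS = 900_000_000
COMPLETION_TOKENS = 100_000_000

# model -> (old prompt price, new prompt price, unchanged completion price)
changes = {
    "MiniMax: MiniMax M2.7": (0.279, 0.26, 1.20),
    "OpenAI: gpt-oss-20b": (0.03, 0.029, 0.14),
}

savings = {}
for model, (old_prompt, new_prompt, completion) in changes.items():
    before = workload_cost(PROMPT_TOKENS, COMPLETION_TOKENS, old_prompt, completion)
    after = workload_cost(PROMPT_TOKENS, COMPLETION_TOKENS, new_prompt, completion)
    savings[model] = before - after
    print(f"{model}: ${before:.2f} -> ${after:.2f} (saves ${savings[model]:.2f})")
```

Under this assumed mix the MiniMax M2.7 cut is worth about $17.10 a month and the gpt-oss-20b cut about $0.90. Workloads with a larger share of completion tokens would see proportionally less benefit, since completion prices did not change.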

Model removals (6)

| Model | Context (tokens) | Prompt ($/1M) | Completion ($/1M) |
|---|---|---|---|
| MiniMax: MiniMax M2.5 (free) | 204,800 | 0.00 | 0.00 |
| Upstage: Solar Pro 3 | 128,000 | 0.15 | 0.60 |
| Baidu: ERNIE 4.5 21B A3B Thinking | 131,072 | 0.07 | 0.28 |
| Baidu: ERNIE 4.5 21B A3B | 131,072 | 0.07 | 0.28 |
| AlfredPros: CodeLLaMa 7B Instruct Solidity | 4,096 | 0.80 | 1.20 |
| Mistral: Mistral 7B Instruct v0.1 | 4,096 | 0.11 | 0.19 |

Developers relying on any of these models must migrate to alternatives. Only one of the six, MiniMax M2.5 (free), was a free‑tier entry; the other five were paid.
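
A quick way to check whether a deployment is affected is to intersect the models it uses with the removal list. This is a minimal sketch: it matches on the display names shown in the table, whereas a real configuration would more likely reference provider-specific model IDs, which this digest does not list. The `in_use` list is hypothetical.

```python
# Display names of the six removed models, copied from the table above.
REMOVED = {
    "MiniMax: MiniMax M2.5 (free)",
    "Upstage: Solar Pro 3",
    "Baidu: ERNIE 4.5 21B A3B Thinking",
    "Baidu: ERNIE 4.5 21B A3B",
    "AlfredPros: CodeLLaMa 7B Instruct Solidity",
    "Mistral: Mistral 7B Instruct v0.1",
}


def removed_in_use(models_in_use):
    """Return, sorted, the models in use that appear in the removal list."""
    return sorted(set(models_in_use) & REMOVED)


# Hypothetical list of models an application depends on.
in_use = ["Upstage: Solar Pro 3", "Meta: Llama 3.1 8B Instruct"]

affected = removed_in_use(in_use)
if affected:
    print("Needs migration:", ", ".join(affected))
else:
    print("No removed models in use.")
```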

Cheapest models today ($ per 1M tokens)

  1. inclusionAI: Ling-2.6-flash – Prompt $0.01, Completion $0.03
  2. IBM: Granite 4.0 Micro – Prompt $0.017, Completion $0.112
  3. Meta: Llama 3.1 8B Instruct – Prompt $0.02, Completion $0.05
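
The order of this list is consistent with sorting by prompt price alone, and the ranking can shift once completion tokens are weighted in. The sketch below compares the two orderings under a hypothetical 50/50 prompt-to-completion mix; `blended_price` and `COMPLETION_SHARE` are illustrative assumptions, not a metric the digest defines.

```python
# (name, prompt $/1M, completion $/1M), copied from the list above.
cheapest = [
    ("inclusionAI: Ling-2.6-flash", 0.01, 0.03),
    ("IBM: Granite 4.0 Micro", 0.017, 0.112),
    ("Meta: Llama 3.1 8B Instruct", 0.02, 0.05),
]


def blended_price(prompt: float, completion: float, completion_share: float) -> float:
    """Price per 1M tokens when completion_share of all tokens are completions."""
    return prompt * (1 - completion_share) + completion * completion_share


COMPLETION_SHARE = 0.5  # hypothetical mix: half of all tokens are completions

by_prompt = [m[0] for m in sorted(cheapest, key=lambda m: m[1])]
by_blended = [
    m[0]
    for m in sorted(cheapest, key=lambda m: blended_price(m[1], m[2], COMPLETION_SHARE))
]

print("By prompt price: ", by_prompt)
print("By blended price:", by_blended)
```

With this mix, Llama 3.1 8B Instruct moves ahead of Granite 4.0 Micro because its completion price ($0.05) is less than half of Granite's ($0.112). Ling-2.6-flash stays first under either ordering.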

Total models tracked: 350. No other meaningful changes recorded.


Originally published at The Token Ledger. Subscribe for the daily digest.
