The Token Ledger Digest – 2026-06-12
Most cost‑impacting change
- Qwen: Qwen2.5 VL 72B Instruct – Prompt price rose from $0.25/1M to $0.80/1M (+$0.55/1M); completion price rose from $0.75/1M to $1.00/1M (+$0.25/1M). Who should care: Teams using this vision‑language model for high‑volume inference will see the prompt price more than triple (+220%) and the completion price rise by about 33%.
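Relative changes can be derived directly from the listed old and new prices. The following is a minimal sketch; the helper name `pct_change` is illustrative and not part of any Token Ledger tooling.

```python
def pct_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("old price must be non-zero")
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Prices in $/1M tokens, taken from the Qwen2.5 VL 72B Instruct entry.
    print(f"prompt:     {pct_change(0.25, 0.80):+.1f}%")
    print(f"completion: {pct_change(0.75, 1.00):+.1f}%")
```

Computing the prompt and completion changes separately matters here because the two moved by very different proportions.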
Model removed
- NVIDIA: Nemotron Nano 9B V2 – Deleted from the catalog. Prior pricing: prompt $0.04/1M, completion $0.16/1M. Who should care: Anyone relying on this model must migrate to an alternative.
Other price changes (prompt/completion, $/1M)
| Model | Old → New Prompt | Δ Prompt | Old → New Completion | Δ Completion |
|---|---|---|---|---|
| Qwen: Qwen3.7 Plus | 0.40 → 0.32 | –0.08 | 1.60 → 1.28 | –0.32 |
| MoonshotAI Kimi Latest | 0.68 → 0.67 | –0.01 | 3.41 → 3.39 | –0.02 |
| Qwen: Qwen3.6 27B | 0.289 → 0.287 | –0.002 | 2.40 → 3.10 | +0.70 |
| DeepSeek V4 Flash | 0.0983 → 0.0980 | –0.0003 | 0.1966 → 0.1960 | –0.0006 |
| MoonshotAI Kimi K2.6 | 0.68 → 0.67 | –0.01 | 3.41 → 3.39 | –0.02 |
| Google Gemma 4 31B | 0.12 → 0.12 | 0.00 | 0.36 → 0.35 | –0.01 |
| MiniMax M2.7 | 0.27 → 0.25 | –0.02 | 1.08 → 1.00 | –0.08 |
| MoonshotAI Kimi K2.5 | 0.40 → 0.35 | –0.05 | 1.90 → 1.89 | –0.01 |
| DeepSeek R1 Distill Llama 70B | 0.70 → 0.80 | +0.10 | 0.80 → 0.80 | 0.00 |
Who should care: Developers optimizing cost for batch or real‑time workloads should review the above adjustments, especially the completion‑price increase for Qwen3.6 27B (+$0.70/1M, about 29%) and the prompt‑price rise for DeepSeek R1 Distill Llama 70B (+$0.10/1M, about 14%). Most of the remaining changes are small decreases.
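To see how a per-token change translates into a bill, the sketch below applies the old and new prices from the table to a workload. The monthly volumes (100M prompt tokens, 20M completion tokens) and the names `Price` and `workload_cost` are assumptions chosen for illustration; substitute your own traffic figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    prompt: float      # $ per 1M prompt tokens
    completion: float  # $ per 1M completion tokens


def workload_cost(price: Price, prompt_tokens_m: float, completion_tokens_m: float) -> float:
    """Cost in dollars for a workload measured in millions of tokens."""
    return price.prompt * prompt_tokens_m + price.completion * completion_tokens_m


# (old, new) prices from the table, in $/1M tokens.
CHANGES = {
    "Qwen: Qwen3.6 27B": (Price(0.289, 2.40), Price(0.287, 3.10)),
    "DeepSeek R1 Distill Llama 70B": (Price(0.70, 0.80), Price(0.80, 0.80)),
}

# Hypothetical monthly volume, in millions of tokens.
PROMPT_M, COMPLETION_M = 100.0, 20.0

if __name__ == "__main__":
    for model, (old, new) in CHANGES.items():
        before = workload_cost(old, PROMPT_M, COMPLETION_M)
        after = workload_cost(new, PROMPT_M, COMPLETION_M)
        print(f"{model}: ${before:.2f} -> ${after:.2f} ({after - before:+.2f})")
```

Under these assumed volumes, the Qwen3.6 27B bill rises by about $13.80 per month even though its prompt price fell slightly, because the completion increase dominates. Workloads with a different prompt-to-completion ratio will see a different result.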
Total models tracked: 337.
Originally published at The Token Ledger. Subscribe for the daily digest.