The Token Ledger — May 17, 2026
Four providers raised completion prices today; NVIDIA’s Nemotron 3 Super saw the largest absolute increase. No new models were added or removed.
NVIDIA: Nemotron 3 Super (120B A12B)
Prompt: $0.09/1M → $0.10/1M (+11.1%)
Completion: $0.45/1M → $0.50/1M (+$0.05, +11.1%)
Impact: Largest absolute increase today (+$0.05/1M on completion). Relevant for agents and reasoning workflows.
Mistral: Mistral Nemo
Prompt: unchanged at $0.02/1M
Completion: $0.03/1M → $0.04/1M (+33.3%)
The relative jump is steep, but the absolute increase is only $0.01/1M. Relevant for lightweight, local-style deployments.
Google: Gemma 4 26B A4B
Prompt: $0.06/1M → $0.07/1M (+16.7%)
Completion: $0.33/1M → $0.34/1M (+3.0%)
Smaller absolute impact than Nemotron (+$0.01/1M on both prompt and completion), but still a 16.7% prompt increase.
OpenAI: gpt-oss-120b
Prompt: unchanged at $0.039/1M
Completion: $0.18/1M → $0.19/1M (+5.6%)
Marginal: a $0.01/1M increase on completion only, easy to miss unless volume is high.
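For readers who want to check the deltas above, here is a minimal sketch that recomputes the absolute and percent changes from the quoted old and new prices. The `changes` table and the `price_delta` helper are illustrative names, not part of any provider API; prices are kept as `Decimal` so that cent-level differences do not pick up floating-point noise.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the entries above: (old, new).
changes = {
    "Nemotron 3 Super prompt": ("0.09", "0.10"),
    "Nemotron 3 Super completion": ("0.45", "0.50"),
    "Mistral Nemo completion": ("0.03", "0.04"),
    "Gemma 4 26B A4B prompt": ("0.06", "0.07"),
    "Gemma 4 26B A4B completion": ("0.33", "0.34"),
    "gpt-oss-120b completion": ("0.18", "0.19"),
}


def price_delta(old: str, new: str) -> tuple[Decimal, Decimal]:
    """Return (absolute change in USD per 1M, percent change to one decimal)."""
    old_d, new_d = Decimal(old), Decimal(new)
    absolute = new_d - old_d
    percent = (absolute / old_d * 100).quantize(Decimal("0.1"))
    return absolute, percent


for label, (old, new) in changes.items():
    absolute, percent = price_delta(old, new)
    print(f"{label}: +${absolute}/1M (+{percent}%)")
```

Ranking the same table by the first element of `price_delta` gives the largest absolute increase; ranking by the second gives the largest relative one.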
Cheapest models today (by prompt price):
- inclusionAI: Ling-2.6-flash — $0.01/1M prompt, $0.03/1M completion
- IBM: Granite 4.0 Micro — $0.017/1M prompt, $0.112/1M completion
- Meta: Llama 3.1 8B Instruct — $0.02/1M prompt, $0.05/1M completion
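The ranking above is a plain sort on prompt price. A minimal sketch, assuming the three rows are held as tuples (the `models` list and `cheapest_by_prompt` helper are hypothetical names used only for illustration):

```python
# (model, prompt USD per 1M, completion USD per 1M), copied from the list above.
# Deliberately stored out of order so the sort has something to do.
models = [
    ("Meta: Llama 3.1 8B Instruct", 0.02, 0.05),
    ("inclusionAI: Ling-2.6-flash", 0.01, 0.03),
    ("IBM: Granite 4.0 Micro", 0.017, 0.112),
]


def cheapest_by_prompt(rows, n=3):
    """Return the n rows with the lowest prompt price, cheapest first."""
    return sorted(rows, key=lambda row: row[1])[:n]


for name, prompt, completion in cheapest_by_prompt(models):
    print(f"{name}: ${prompt}/1M prompt, ${completion}/1M completion")
```

Sorting on completion price instead (`row[2]`) would move Llama 3.1 8B Instruct ahead of Granite 4.0 Micro, so the choice of key matters for output-heavy workloads.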
Total tracked models: 356.
Originally published at The Token Ledger.