The Token Ledger Digest – 2026-06-05
Most cost-impacting change: Meta: Llama 3.1 8B Instruct completion price fell from $0.05 to $0.03 per 1M tokens, a 40% cut (prompt unchanged at $0.02/1M). Generation-heavy workloads save close to 40%; prompt-heavy workloads save less, because the prompt price did not move.
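How much a given workload actually saves depends on its mix of prompt and completion tokens. The sketch below is one way to work that out from the prices quoted above; the function names and the sample token counts are illustrative assumptions, not part of any provider's API.

```python
# Prices in USD per 1M tokens, as quoted in this digest for
# Meta: Llama 3.1 8B Instruct.
PROMPT_PRICE = 0.02
OLD_COMPLETION_PRICE = 0.05
NEW_COMPLETION_PRICE = 0.03


def workload_cost(prompt_tokens: int, completion_tokens: int,
                  completion_price: float) -> float:
    """Return the USD cost of a workload at the given completion price."""
    total = prompt_tokens * PROMPT_PRICE + completion_tokens * completion_price
    return total / 1_000_000


def savings_fraction(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the fractional cost reduction from the completion price cut."""
    old = workload_cost(prompt_tokens, completion_tokens, OLD_COMPLETION_PRICE)
    new = workload_cost(prompt_tokens, completion_tokens, NEW_COMPLETION_PRICE)
    return (old - new) / old


if __name__ == "__main__":
    # Hypothetical token mixes (prompt tokens, completion tokens).
    for prompt, completion in [(100_000, 900_000), (500_000, 500_000), (900_000, 100_000)]:
        pct = savings_fraction(prompt, completion) * 100
        print(f"prompt={prompt:>7} completion={completion:>7} savings={pct:.1f}%")
```

A workload made up entirely of completion tokens gets the full 40% reduction, and the saving shrinks as the prompt share grows.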
Added models:
- NVIDIA: Nemotron 3.5 Content Safety (free) – prompt $0.00/1M, completion $0.00/1M, context 128k. Ideal for zero‑cost safety filtering.
- NVIDIA: Nemotron 3 Ultra (free) – prompt $0.00/1M, completion $0.00/1M, context 1M. Suitable for applications needing massive context at no cost.
- NVIDIA: Nemotron 3 Ultra – prompt $0.50/1M, completion $2.50/1M, context 1M. Paid option for enterprises that need ultra-large context.
Cheapest models today (USD per 1M tokens, listed by prompt price, free tiers excluded):
- inclusionAI: Ling-2.6-flash – prompt $0.01/1M, completion $0.03/1M.
- IBM: Granite 4.0 Micro – prompt $0.017/1M, completion $0.112/1M.
- Meta: Llama 3.1 8B Instruct – prompt $0.02/1M, completion $0.03/1M.
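Which of these three is cheapest in practice depends on how much of a workload is completion tokens. The sketch below ranks them by a blended per-1M price; the `completion_share` parameter and the function names are assumptions introduced for illustration.

```python
# (prompt price, completion price) in USD per 1M tokens, from the list above.
PRICES = {
    "inclusionAI: Ling-2.6-flash": (0.01, 0.03),
    "IBM: Granite 4.0 Micro": (0.017, 0.112),
    "Meta: Llama 3.1 8B Instruct": (0.02, 0.03),
}


def blended_price(prompt_price: float, completion_price: float,
                  completion_share: float) -> float:
    """Average USD price per 1M tokens when `completion_share` (0.0 to 1.0)
    of all tokens are completion tokens."""
    return prompt_price * (1 - completion_share) + completion_price * completion_share


def rank_by_blended_price(completion_share: float) -> list[str]:
    """Return model names ordered from cheapest to most expensive."""
    return sorted(
        PRICES,
        key=lambda name: blended_price(*PRICES[name], completion_share),
    )


if __name__ == "__main__":
    # Hypothetical mixes: prompt-only, balanced, completion-heavy.
    for share in (0.0, 0.5, 0.9):
        print(share, rank_by_blended_price(share))
```

At these prices Ling-2.6-flash is cheapest at any mix. Granite 4.0 Micro beats Llama 3.1 8B Instruct only when completion tokens are under roughly 3.5% of the total, since its completion price is much higher.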
Total models tracked: 346.
Originally published at The Token Ledger. Subscribe for the daily digest.