
4663437Mehdi

Posted on • Originally published at 4663437mehdi.github.io

The Token Ledger Digest – 2026-06-05


Most cost‑impacting change: Meta: Llama 3.1 8B Instruct completion price fell from $0.05 to $0.03 per 1M tokens, a 40% cut (prompt unchanged at $0.02/1M). Generation‑heavy workloads see savings approaching 40%; the larger the share of prompt tokens, the smaller the overall reduction.
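To see how the saving depends on the prompt/completion mix, here is a minimal sketch using only the prices quoted above. The function names and the example token counts are illustrative assumptions, not part of any provider API.

```python
# Prices in USD per 1M tokens, taken from the digest entry above.
PROMPT_PRICE = 0.02
OLD_COMPLETION_PRICE = 0.05
NEW_COMPLETION_PRICE = 0.03


def request_cost(prompt_tokens, completion_tokens, prompt_price, completion_price):
    """Return the cost in USD for the given token counts and per-1M-token prices."""
    return (prompt_tokens * prompt_price + completion_tokens * completion_price) / 1_000_000


def savings_pct(prompt_tokens, completion_tokens):
    """Percentage saved on a workload after the completion price change."""
    old = request_cost(prompt_tokens, completion_tokens, PROMPT_PRICE, OLD_COMPLETION_PRICE)
    new = request_cost(prompt_tokens, completion_tokens, PROMPT_PRICE, NEW_COMPLETION_PRICE)
    if old == 0:
        return 0.0  # empty workload: nothing to save
    return 100 * (old - new) / old


if __name__ == "__main__":
    # Hypothetical workloads: balanced, completion-only, prompt-only.
    print(round(savings_pct(1_000_000, 1_000_000), 1))  # about 28.6
    print(round(savings_pct(0, 1_000_000), 1))          # about 40.0
    print(round(savings_pct(1_000_000, 0), 1))          # 0.0
```

A balanced 1:1 workload saves roughly 29%, and the figure only reaches 40% when the bill is entirely completion tokens.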

Added models:

  • NVIDIA: Nemotron 3.5 Content Safety (free) – prompt $0.00/1M, completion $0.00/1M, context 128k. Ideal for zero‑cost safety filtering.
  • NVIDIA: Nemotron 3 Ultra (free) – prompt $0.00/1M, completion $0.00/1M, context 1M. Suitable for applications needing massive context at no cost.
  • NVIDIA: Nemotron 3 Ultra – prompt $0.50/1M, completion $2.50/1M, context 1M. Targets enterprises requiring paid ultra‑large‑context capability.

Cheapest models today (price per 1M tokens):

  1. inclusionAI: Ling-2.6-flash – prompt $0.01/1M, completion $0.03/1M.
  2. IBM: Granite 4.0 Micro – prompt $0.017/1M, completion $0.112/1M.
  3. Meta: Llama 3.1 8B Instruct – prompt $0.02/1M, completion $0.03/1M.
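The order above matches a sort by prompt price. Which model is cheapest for a given workload depends on the prompt/completion mix, so the following sketch re-ranks the same three models by a blended price. The `prompt_share` parameter and the function name are assumptions introduced for illustration; the prices are the ones listed above.

```python
# (prompt price, completion price) in USD per 1M tokens, from the list above.
MODELS = {
    "inclusionAI: Ling-2.6-flash": (0.01, 0.03),
    "IBM: Granite 4.0 Micro": (0.017, 0.112),
    "Meta: Llama 3.1 8B Instruct": (0.02, 0.03),
}


def rank_by_blended_cost(models, prompt_share):
    """Sort model names by blended per-1M-token price.

    prompt_share is the fraction of tokens that are prompt tokens (0.0 to 1.0);
    the remainder is billed at the completion price.
    """
    def blended(prices):
        prompt_price, completion_price = prices
        return prompt_share * prompt_price + (1 - prompt_share) * completion_price

    return sorted(models, key=lambda name: blended(models[name]))


if __name__ == "__main__":
    # Prompt-only traffic reproduces the digest's order.
    print(rank_by_blended_cost(MODELS, 1.0))
    # An even split moves Llama 3.1 8B Instruct ahead of Granite 4.0 Micro.
    print(rank_by_blended_cost(MODELS, 0.5))
```

With these prices, Granite 4.0 Micro only beats Llama 3.1 8B Instruct when prompt tokens make up more than roughly 96% of the traffic.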

Total models tracked: 346.


Originally published at The Token Ledger. Subscribe for the daily digest.
