DEV Community: 4663437Mehdi

The Token Ledger – 2026-07-17

4663437Mehdi — Fri, 17 Jul 2026 09:20:27 +0000

Most cost‑impacting change

MoonshotAI Kimi Latest (~moonshotai/kimi-latest) – Prompt price rose from $0.66 to $3.00 per 1M tokens (+$2.34); completion price rose from $3.41 to $15.00 per 1M tokens (+$11.59). Who should care: Anyone using this model for long‑form generation; per‑million cost jumps roughly 4.5× for prompts and 4.4× for completions.

Added models

MoonshotAI: Kimi K3 (moonshotai/kimi-k3) – Prompt $3.00/1M, Completion $15.00/1M, 1M‑token context.
Meta: Muse Spark 1.1 (meta/muse-spark-1.1) – Prompt $1.25/1M, Completion $4.25/1M, 1M‑token context. Who should care: Teams needing large‑context models; compare pricing against existing 1M‑context options.

Price changes (per‑million token values)

Z.ai: GLM 5.2 (z-ai/glm-5.2) – Prompt $0.96 → $0.96 (−$0.001); Completion $3.01 → $3.01 (−$0.004). Both decreases are too small to show at the two‑decimal rounding used here.
MoonshotAI: Kimi K2.7 Code (moonshotai/kimi-k2.7-code) – Prompt $0.72 → $0.75 (+$0.03); Completion $3.49 → $3.50 (+$0.01).
MiniMax: MiniMax M2.7 (minimax/minimax-m2.7) – Prompt $0.30 → $0.25 (−$0.05); Completion $1.20 → $1.00 (−$0.20).
Qwen: Qwen3.5 397B A17B (qwen/qwen3.5-397b-a17b) – Prompt $0.45 → $0.39 (−$0.06); Completion $3.00 → $2.34 (−$0.66).
Qwen: Qwen3 14B (qwen/qwen3-14b) – Prompt $0.12 → $0.10 (−$0.02); Completion unchanged $0.24.
Google: Gemma 3 27B (google/gemma-3-27b-it) – Prompt $0.08 → $0.10 (+$0.02); Completion $0.45 → $0.30 (−$0.15).
Meta: Llama 3.2 3B Instruct (meta-llama/llama-3.2-3b-instruct) – Prompt $0.05 → $0.051 (+$0.001); Completion $0.33 → $0.335 (+$0.005).
Mistral: Mistral Nemo (mistralai/mistral-nemo) – Prompt $0.02 → $0.019 (−$0.001); Completion $0.04 → $0.03 (−$0.01).

Who should care: Developers monitoring cost trends; note the modest decreases for several efficient models (MiniMax, Qwen families) and slight increases for others. No other meaningful changes reported.

Originally published at The Token Ledger. Subscribe for the daily digest.

The Token Ledger Digest – 2026-07-16

4663437Mehdi — Thu, 16 Jul 2026 09:26:06 +0000

Lead change – biggest cost impact:

Qwen: Qwen3.7 Max – Prompt rose from $1.25/M to $1.475/M (+$0.225/M); completion rose from $3.75/M to $4.425/M (+$0.675/M).

Who should care: Teams running high‑volume inference on this model will see ~18% higher per‑million‑token spend on both prompt and completion.
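
To see what the Qwen3.7 Max prices listed above mean for a bill, here is a minimal sketch that costs one workload under the old and new rates. The token volumes are hypothetical and only illustrate the arithmetic; `workload_cost` is a helper name introduced here, not part of any provider API.

```python
def workload_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price: float, completion_price: float) -> float:
    """Return the USD cost of a workload; prices are quoted per 1M tokens."""
    return (prompt_tokens * prompt_price
            + completion_tokens * completion_price) / 1_000_000


# Hypothetical monthly volume (an assumption, not taken from the digest).
PROMPT_TOKENS = 2_000_000
COMPLETION_TOKENS = 500_000

# Old and new per-1M prices for Qwen3.7 Max as listed in this digest.
old_cost = workload_cost(PROMPT_TOKENS, COMPLETION_TOKENS, 1.25, 3.75)
new_cost = workload_cost(PROMPT_TOKENS, COMPLETION_TOKENS, 1.475, 4.425)
increase_pct = (new_cost - old_cost) / old_cost * 100

print(f"old ${old_cost:.4f}, new ${new_cost:.4f}, change {increase_pct:+.1f}%")
```

Because the prompt and completion prices rose by the same proportion, the percentage increase comes out the same for any mix of prompt and completion tokens.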

Other price changes

MoonshotAI: Kimi K2.6 – Prompt $0.66/M → $0.95/M (+$0.29/M); Completion $3.41/M → $4.00/M (+$0.59/M). Who should care: Cost‑sensitive apps using long completions.
Z.ai: GLM 5.2 – Prompt $0.8876/M → $0.959/M (+$0.0714/M); Completion $2.7896/M → $3.014/M (+$0.2244/M). Who should care: Moderate‑traffic workloads.
Google: Gemma 4 31B – Prompt $0.12/M → $0.22/M (+$0.10/M); Completion $0.37/M → $0.55/M (+$0.18/M). Who should care: Users of this mid‑size model.
Qwen: Qwen3 Coder Next – Prompt $0.12/M → $0.11/M (−$0.01/M); Completion unchanged $0.80/M. Who should care: Slight savings for prompt‑heavy tasks.
Z.ai: GLM 4.7 Flash – Prompt $0.0605/M → $0.0600/M (−$0.0005/M); Completion unchanged $0.40/M. Who should care: Negligible impact.
Qwen: Qwen3 14B – Prompt $0.10/M → $0.12/M (+$0.02/M); Completion unchanged $0.24/M. Who should care: Small uptick for prompt‑driven use.
Meta: Llama 3.2 3B Instruct – Prompt $0.0509/M → $0.0500/M (−$0.0009/M); Completion $0.335/M → $0.330/M (−$0.005/M). Who should care: Minor cost reduction.

Removed model

Arcee AI: Coder Large – Prompt $0.50/M, Completion $0.80/M, 32k context.

Who should care: Anyone relying on this model must migrate to an alternative.

No other changes reported. Total models tracked: 342.

The Token Ledger Digest – 2026-07-15

4663437Mehdi — Wed, 15 Jul 2026 09:20:40 +0000

Most cost‑impacting change

NVIDIA: Nemotron 3 Ultra – Completion price rose from $2.20 / 1M to $3.60 / 1M (+$1.40); prompt price rose from $0.50 / 1M to $0.60 / 1M (+$0.10). Who should care: Users running long‑form generation workloads on this model will see ~64% higher completion cost and 20% higher prompt cost.

Other notable price moves

Model	Change	Old ($/1M)	New ($/1M)	Δ ($/1M)
Z.ai: GLM 5	Prompt ↑	0.60	0.95	+0.35
	Completion ↑	1.92	3.15	+1.23
MoonshotAI: Kimi K2.5	Prompt ↑	0.38	0.57	+0.19
	Completion ↑	2.03	2.85	+0.83
Qwen: Qwen3.5 397B A17B	Prompt ↑	0.39	0.45	+0.07
	Completion ↑	2.45	3.00	+0.55
Tencent: Hy3	Prompt ↑	0.14	0.20	+0.06
	Completion ↑	0.58	0.80	+0.22
MiniMax: MiniMax M2.7	Prompt ↑	0.24	0.30	+0.06
	Completion ↑	0.96	1.20	+0.24
Qwen: Qwen3.6 27B	Prompt ↑	0.29	0.45	+0.16
	Completion ↑	2.40	2.70	+0.30
Z.ai: GLM 5.2	Prompt ↓	0.93	0.89	–0.04
	Completion ↓	2.92	2.79	–0.13
Google: Gemma 4 26B A4B	Prompt ↑	0.06	0.10	+0.04
	Completion ↓	0.33	0.30	–0.03

Removed model

Sao10K: Llama 3.1 70B Hanami x1 – Deleted; previously $3.00 / 1M for both prompt and completion. Who should care: Anyone relying on this specific 70B variant must migrate to an alternative.

Cheapest models today (per‑million‑token rates)

inclusionAI: Ling-2.6-flash – $0.01 prompt / $0.03 completion
IBM: Granite 4.0 Micro – $0.017 prompt / $0.112 completion
Mistral: Mistral Nemo – $0.02 prompt / $0.04 completion

Total models tracked: 343.

Token Ledger Digest – 2026-07-14

4663437Mehdi — Tue, 14 Jul 2026 09:17:38 +0000

Added Models

Kwaipilot: KAT-Coder-Air V2.5 – New offering. Prompt $0.15/1M, Completion $0.60/1M. Who should care: Teams needing a large‑context (256k) coding assistant at mid‑range cost.
Kwaipilot: KAT-Coder-Pro V2.5 – New offering. Prompt $0.74/1M, Completion $2.96/1M. Who should care: Users prioritizing higher throughput and willing to pay a premium for the same 256k context.

Removed Models (Free Tier)

LiquidAI: LFM2.5-1.2B-Thinking (free) – Removed.
LiquidAI: LFM2.5-1.2B-Instruct (free) – Removed.
OpenAI: gpt-oss-120b (free) – Removed. Who should care: Anyone relying on zero‑cost experimentation; must migrate to paid alternatives or other free models.

Price Changes

Model	Change	Old Prompt ($/1M)	New Prompt ($/1M)	Old Completion ($/1M)	New Completion ($/1M)	Who should care
Z.ai: GLM 5.2	↓ Completion –$0.0828; slight Prompt ↓ –$0.0018	0.93	0.9282	3.00	2.9172	Cost‑sensitive users of GLM 5.2; savings on long completions.
Qwen: Qwen3.6 27B	↑ Prompt +$0.004; Completion unchanged	0.285	0.289	2.40	2.40	Developers using Qwen3.6 for prompt‑heavy workloads.
DeepSeek: V4 Flash	↑ Prompt +$0.013; ↑ Completion +$0.026	0.077	0.090	0.154	0.18	Teams balancing low‑cost flash inference; modest cost rise.
Google: Gemma 4 31B	↓ Prompt –$0.060; Completion unchanged	0.12	0.06	0.35	0.35	Users of Gemma 4 seeing halved prompt cost.
DeepSeek: V3.1	↑ Prompt +$0.040; ↑ Completion +$0.160	0.21	0.25	0.79	0.95	Heavy‑usage applications; notable increase for completion‑intensive tasks.
OpenAI: gpt-oss-120b	↓ Prompt –$0.006; ↓ Completion –$0.030	0.036	0.030	0.18	0.15	Existing users benefit from lower overall cost.

Note: Per‑million‑token prices are derived by multiplying the per‑token rate by 1,000,000 and rounding to five decimal places for readability.
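
The conversion described in the note can be sketched as follows. This is a minimal illustration, assuming the per-token rate arrives as a decimal string and that half-up rounding is used; the digest does not state its rounding mode, and `per_million` is a name introduced here. The sample rate is back-computed from the GLM 5.2 prompt price in the table above.

```python
from decimal import Decimal, ROUND_HALF_UP

TOKENS_PER_MILLION = Decimal(1_000_000)
FIVE_PLACES = Decimal("0.00001")


def per_million(per_token_rate: str) -> Decimal:
    """Convert a per-token USD rate to a per-1M-token price, rounded to 5 places."""
    rate = Decimal(per_token_rate)
    # Rounding mode is an assumption; the digest only says "five decimal places".
    return (rate * TOKENS_PER_MILLION).quantize(FIVE_PLACES, rounding=ROUND_HALF_UP)


# Hypothetical per-token rate, chosen so it matches the $0.9282/1M listed above.
print(per_million("0.0000009282"))
```

Using `Decimal` rather than `float` avoids binary rounding noise at these very small magnitudes.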

Cheapest models today (for reference): inclusionAI: Ling-2.6-flash ($0.01/1M prompt, $0.03/1M completion), IBM: Granite 4.0 Micro ($0.017/1M, $0.112/1M), Meta: Llama 3.1 8B Instruct ($0.02/1M, $0.03/1M).

The Token Ledger – 2026-07-13

4663437Mehdi — Mon, 13 Jul 2026 10:34:56 +0000

The most cost‑impacting change today is the completion price jump for Z.ai: GLM 5.2, which more than doubles its cost per million tokens.

Price changes

Model	What changed	Old price (/1M)	New price (/1M)	Δ (/1M)	Who should care
Z.ai: GLM 5.2	Prompt	$0.42	$0.93	+$0.51	Teams using GLM 5.2 for prompt‑heavy tasks
	Completion	$1.32	$3.00	+$1.68	Anyone generating long completions; cost impact is largest of the day
MoonshotAI: Kimi K2.7 Code	Prompt	$0.72	$0.72	–$0.001	Minor savings for prompt‑heavy workloads
	Completion	$3.50	$3.49	–$0.01	Negligible effect
Z.ai: GLM 4.6	Prompt	$0.43	$0.43	$0.00	No change
	Completion	$1.74	$1.75	+$0.01	Slight uptick for completion‑focused use
Qwen: Qwen3 235B A22B Instruct 2507	Prompt	$0.09	$0.09	$0.00	No change
	Completion	$0.10	$0.55	+$0.45	Significant cost rise for completion‑heavy applications
Meta: Llama 4 Maverick	Prompt	$0.15	$0.20	+$0.05	Moderate increase for prompt‑heavy tasks
	Completion	$0.60	$0.80	+$0.20	Noticeable rise for completion‑heavy tasks

Cheapest models today (per‑million‑token cost)

inclusionAI: Ling-2.6-flash – Prompt $0.01, Completion $0.03
IBM: Granite 4.0 Micro – Prompt $0.02, Completion $0.11
Meta: Llama 3.1 8B Instruct – Prompt $0.02, Completion $0.03

No models were added or removed today.

The Token Ledger – 2026-07-12

4663437Mehdi — Sun, 12 Jul 2026 09:14:39 +0000

Price Changes

Qwen: Qwen2.5 VL 72B Instruct – Prompt price dropped from $0.80/1M to $0.25/1M; completion price dropped from $1.00/1M to $0.75/1M. Who should care: Developers building vision‑language applications see roughly a 69% reduction in input token cost and a 25% cut in output token cost.
Z.ai: GLM 5.2 – Prompt price increased from $0.41/1M to $0.42/1M; completion price increased from $1.28/1M to $1.32/1M. Who should care: Users of this model will notice a modest (~3‑4%) rise in per‑token expenses.

The Token Ledger Digest – 2026-07-11

4663437Mehdi — Sat, 11 Jul 2026 09:00:09 +0000

Price Changes

Z.ai: GLM 5.2 – Prompt price ↓ 52% (0.84 → 0.406 $/1M); Completion price ↓ 52% (2.64 → 1.276 $/1M). Who should care: Developers using GLM 5.2 for long‑form generation see per‑token cost cut roughly in half.
MoonshotAI: Kimi K2.7 Code – Prompt unchanged (0.72 $/1M); Completion price ↑ 0.3% (3.49 → 3.50 $/1M). Who should care: Negligible impact; mainly relevant for cost‑sensitive completion workloads.
DeepSeek: DeepSeek V4 Flash – Prompt price ↓ 14% (0.09 → 0.077 $/1M); Completion price ↓ 14% (0.18 → 0.154 $/1M). Who should care: Users of V4 Flash benefit from modest savings on both prompt and completion.

Model Removals

Arcee AI: Trinity Mini – Removed. Was 0.045 $/1M prompt, 0.15 $/1M completion. Who should care: Anyone relying on this model must migrate to an alternative.
Meta: Llama 3 8B Instruct – Removed. Was 0.14 $/1M prompt and completion. Who should care: Users of this 8B instruct variant need to switch to other Llama 3 versions or comparable models.

Cheapest Models Today (reference)

inclusionAI: Ling-2.6-flash – 0.01 $/1M prompt, 0.03 $/1M completion
IBM: Granite 4.0 Micro – 0.017 $/1M prompt, 0.112 $/1M completion
Meta: Llama 3.1 8B Instruct – 0.02 $/1M prompt, 0.03 $/1M completion

Total models tracked: 345.

The Token Ledger Digest – 2026-07-10

4663437Mehdi — Fri, 10 Jul 2026 10:29:29 +0000

Most cost‑impacting change

MiniMax M2.5 – completion price rose from $0.48 to $0.90 per 1M tokens (prompt $0.12→$0.15).

Who should care: Teams running high‑volume completions on this model will see per‑token costs nearly double; consider alternatives or usage caps.

Other price changes

Model	Change	Old ($/1M)	New ($/1M)	Note
Tencent Hy3	Prompt ↓	0.20	0.14	–30%
	Completion ↓	0.80	0.58	–27.5%
Z.ai GLM 5.2	Prompt ↓	0.93	0.84	–9.7%
	Completion ↓	3.00	2.64	–12%
MoonshotAI Kimi K2.7 Code	Prompt ↓	0.74	0.72	–2.7%
	Completion ↓	3.50	3.49	–0.3%
MiniMax M2.7	Prompt ↑	0.18	0.24	+33%
	Completion ↑	0.72	0.96	+33%
DeepSeek V3.2	Prompt ↓	0.2288	0.2145	–6.3%
	Completion ↓	0.3432	0.32175	–6.3%
OpenAI gpt-oss-120b	Prompt ↑	0.03	0.036	+20%
	Completion ↑	0.15	0.18	+20%
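
The percentages in the Note column can be reproduced from the old and new prices. The sketch below uses a few rows from the table above; `pct_change` is a helper name introduced here for illustration.

```python
def pct_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero old price")
    return (new - old) / old * 100


# (label, old $/1M, new $/1M) taken from the table above.
rows = [
    ("Tencent Hy3 prompt", 0.20, 0.14),
    ("Z.ai GLM 5.2 completion", 3.00, 2.64),
    ("MiniMax M2.7 prompt", 0.18, 0.24),
    ("OpenAI gpt-oss-120b completion", 0.15, 0.18),
]

for label, old, new in rows:
    print(f"{label}: {pct_change(old, new):+.1f}%")
```

The zero check matters for free-tier models, where a move from $0 to any paid price has no meaningful percentage.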

Added models

Model	Prompt ($/1M)	Completion ($/1M)	Context
OpenAI GPT-5.6 Luna Pro	1.00	6.00	1,050,000
OpenAI GPT-5.6 Luna	1.00	6.00	1,050,000
OpenAI GPT-5.6 Terra Pro	2.50	15.00	1,050,000
OpenAI GPT-5.6 Terra	2.50	15.00	1,050,000
OpenAI GPT-5.6 Sol Pro	5.00	30.00	1,050,000
OpenAI GPT-5.6 Sol	5.00	30.00	1,050,000
xAI Grok 4.5	2.00	6.00	500,000
xAI Grok Latest	2.00	6.00	500,000
Venice Uncensored	0.20	0.90	128,000

Removed models

Poolside Laguna XS.2 (free) – was $0/$0.
Poolside Laguna XS.2 – Prompt $0.10, Completion $0.20.
LiquidAI LFM2-24B-A2B – Prompt $0.03, Completion $0.12.
Google Gemini 2.5 Flash Lite Preview 09-2025 – Prompt $0.10, Completion $0.40.
Switchpoint Router – Prompt $0.85, Completion $3.40.

Who should care: Users of the removed models must migrate to alternatives; the added models offer new options, with Venice Uncensored the cheapest among them.
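
A digest like this one is essentially a diff between two daily price snapshots. The sketch below shows one way such a diff could be computed; it is an assumption about the approach, not the Token Ledger's actual pipeline, and `diff_snapshots` and the snapshot layout are hypothetical. The sample entries use prices from this digest.

```python
from typing import Dict, List, Tuple

Prices = Tuple[float, float]  # (prompt $/1M, completion $/1M)


def diff_snapshots(old: Dict[str, Prices], new: Dict[str, Prices]):
    """Return (added, removed, changed) between two model -> prices snapshots."""
    added: List[str] = sorted(new.keys() - old.keys())
    removed: List[str] = sorted(old.keys() - new.keys())
    changed = {
        model: (old[model], new[model])
        for model in sorted(old.keys() & new.keys())
        if old[model] != new[model]
    }
    return added, removed, changed


# Tiny illustrative snapshots built from entries in this digest.
yesterday = {
    "Switchpoint Router": (0.85, 3.40),
    "MiniMax M2.5": (0.12, 0.48),
    "Tencent Hy3": (0.20, 0.80),
}
today = {
    "MiniMax M2.5": (0.15, 0.90),
    "Tencent Hy3": (0.14, 0.58),
    "Venice Uncensored": (0.20, 0.90),
}

added, removed, changed = diff_snapshots(yesterday, today)
print("added:", added)
print("removed:", removed)
for model, (before, after) in changed.items():
    print(f"changed: {model} {before} -> {after}")
```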

Token Ledger Digest – 2026-07-08

4663437Mehdi — Wed, 08 Jul 2026 09:34:54 +0000

AionLabs model swap (largest cost impact)

Removed: AionLabs: Aion-1.0 (prompt $4.00/1M, completion $8.00/1M) and Aion-1.0‑Mini (prompt $0.70/1M, completion $1.40/1M).
Added: AionLabs: Aion-3.0 (prompt $3.00/1M, completion $6.00/1M) and Aion-3.0‑Mini (prompt $0.70/1M, completion $1.40/1M).
Impact: Aion‑3.0 is $1.00/1M cheaper on prompt and $2.00/1M cheaper on completion than the retired Aion‑1.0; the Mini models are priced the same as before.
Who should care: Teams migrating from Aion‑1.0 to the newer Aion‑3.0 series will see lower per‑token costs, especially for completion‑heavy workloads.

Price update

Model: Z.ai: GLM 5.2
Prompt: rose from $0.90/1M to $0.93/1M (+$0.03/1M).
Completion: rose from $2.86/1M to $3.00/1M (+$0.14/1M).
Impact: Modest cost increase; relevant for users tightly budgeting on this model.

No other meaningful changes were recorded today.

Token Ledger — 2026-07-07

4663437Mehdi — Tue, 07 Jul 2026 10:31:27 +0000

AI API Pricing Digest – 2026-07-07

Most cost‑impacting change: Z.ai’s GLM 5.2 prompt price dropped, while its completion price rose slightly.

Z.ai: GLM 5.2
- What changed: Prompt price ↓, completion price ↑.
- Numbers: Prompt $0.9086 → $0.90/1M tokens (−$0.0086/1M). Completion $2.8556 → $2.86/1M tokens (+$0.0044/1M).
- Who should care: Teams using GLM 5.2 for prompt‑heavy workloads see modest savings; completion‑heavy jobs incur a tiny extra cost.

Other price adjustments:

Meta: Llama 3.2 3B Instruct
- What changed: Both prompt and completion prices decreased.
- Numbers: Prompt $0.0509 → $0.05/1M (−$0.0009/1M). Completion $0.335 → $0.33/1M (−$0.005/1M).
- Who should care: Cost‑conscious users of this model benefit from lower per‑token fees.

Newly added models:

Tencent: Hy3 (free) – Prompt $0.00/1M, Completion $0.00/1M.
Tencent: Hy3 – Prompt $0.20/1M, Completion $0.80/1M.
Nex AGI: Nex‑N2‑Mini – Prompt $0.025/1M, Completion $0.10/1M.
- Who should care: Developers evaluating zero‑cost options or ultra‑cheap alternatives for high‑volume tasks.

Cheapest models today (per‑million tokens):

inclusionAI: Ling‑2.6‑flash – Prompt $0.01, Completion $0.03
IBM: Granite 4.0 Micro – Prompt $0.017, Completion $0.112
Meta: Llama 3.1 8B Instruct – Prompt $0.02, Completion $0.03
Mistral: Mistral Nemo – Prompt $0.02, Completion $0.03
Nex AGI: Nex‑N2‑Mini – Prompt $0.025, Completion $0.10
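
The ordering of the list above is consistent with sorting paid models by prompt price, then completion price. That criterion is inferred from the list rather than stated by the digest, so the sketch below is only one plausible reading; `cheapest` is a name introduced here.

```python
from typing import List, Tuple

Model = Tuple[str, float, float]  # (name, prompt $/1M, completion $/1M)

# Prices from the list above, deliberately given out of order.
models: List[Model] = [
    ("Nex AGI: Nex-N2-Mini", 0.025, 0.10),
    ("Meta: Llama 3.1 8B Instruct", 0.02, 0.03),
    ("IBM: Granite 4.0 Micro", 0.017, 0.112),
    ("inclusionAI: Ling-2.6-flash", 0.01, 0.03),
    ("Mistral: Mistral Nemo", 0.02, 0.03),
]


def cheapest(candidates: List[Model], n: int = 3) -> List[Model]:
    """Return the n cheapest paid models by prompt price, then completion price."""
    paid = [m for m in candidates if m[1] > 0 or m[2] > 0]  # skip free tiers
    # The name is the final tie-breaker so the result is deterministic.
    return sorted(paid, key=lambda m: (m[1], m[2], m[0]))[:n]


for name, prompt, completion in cheapest(models, n=5):
    print(f"{name}: prompt ${prompt}, completion ${completion}")
```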

Total models tracked: 343.

Token Ledger — 2026-07-06

4663437Mehdi — Mon, 06 Jul 2026 11:43:12 +0000

The Token Ledger Digest – 2026-07-06

No meaningful changes today.

Cheapest models (per‑million tokens):

inclusionAI: Ling-2.6-flash – Prompt $0.01 / 1M, Completion $0.03 / 1M. Who should care: Cost‑sensitive applications needing low‑latency flash inference.
IBM: Granite 4.0 Micro – Prompt $0.02 / 1M, Completion $0.11 / 1M. Who should care: Enterprises seeking a balanced micro‑model with cheap prompts.
Meta: Llama 3.1 8B Instruct – Prompt $0.02 / 1M, Completion $0.03 / 1M. Who should care: Developers needing a capable 8B model at minimal cost.

Token Ledger — 2026-07-05

4663437Mehdi — Sun, 05 Jul 2026 09:46:23 +0000

The Token Ledger Digest – 2026-07-05

No meaningful changes today (no additions, removals, or price updates).

Three cheapest models (per‑million‑token pricing):

inclusionai/ling-2.6-flash
- Prompt: $0.01 / 1M tokens
- Completion: $0.03 / 1M tokens
meta-llama/llama-3.1-8b-instruct
- Prompt: $0.02 / 1M tokens
- Completion: $0.03 / 1M tokens
mistralai/mistral-nemo
- Prompt: $0.02 / 1M tokens
- Completion: $0.03 / 1M tokens

Total models tracked: 340.
