4663437Mehdi

Posted on Jun 2 • Originally published at 4663437mehdi.github.io

The Token Ledger Digest – 2026-06-02

#ai #llm #api #news

Most Impactful Change

Qwen: Qwen3 30B A3B Instruct 2507 – Prompt price fell from $0.0900 to $0.0428 /1M tokens; completion price fell from $0.3000 to $0.1716 /1M tokens. Who should care: Teams running cost‑sensitive inference on this model see ~52% lower prompt and ~43% lower completion costs.
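
The percentage figures follow directly from the listed prices. As a minimal sketch (the helper name `pct_change` is ours, not part of any provider API), the relative change can be checked like this:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means cheaper)."""
    return (new - old) / old * 100


# Prices in USD per 1M tokens, copied from the Qwen3 30B A3B entry above.
print(f"prompt:     {pct_change(0.0900, 0.0428):+.1f}%")  # -52.4%
print(f"completion: {pct_change(0.3000, 0.1716):+.1f}%")  # -42.8%
```

The same helper works for any entry in this digest; a positive result indicates a price increase.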

Price Changes

Tencent: Hy3 preview – Prompt $0.0660 → $0.0630 /1M; Completion $0.2600 → $0.2100 /1M. Who should care: Users of Hy3 preview save ~4.5% on prompt and ~19% on completion tokens.
MiniMax: MiniMax M2.7 – Prompt $0.2600 → $0.2790 /1M (↑$0.0190, ~7%); Completion unchanged at $1.2000 /1M. Who should care: Prompt‑heavy workloads pay slightly more; completion cost is stable.
DeepSeek: DeepSeek V3.2 – Prompt $0.2520 → $0.2288 /1M; Completion $0.3780 → $0.3432 /1M. Who should care: Both prompt and completion prices fall by ~9%.
DeepSeek: DeepSeek V3 – Prompt $0.2288 → $0.2002 /1M; Completion $0.9144 → $0.8001 /1M. Who should care: Both prompt and completion prices drop ~12.5%, which matters most for completion‑heavy chat workloads.
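
To see what these changes mean in dollars, here is a rough sketch that applies the old and new prices to a monthly token volume. The volume used (1,000M prompt tokens and 200M completion tokens) is a made-up illustration, and the names `PRICES` and `cost_delta` are ours; substitute your own usage numbers.

```python
# (old_prompt, new_prompt, old_completion, new_completion), USD per 1M tokens,
# copied from the price changes listed above.
PRICES = {
    "Tencent: Hy3 preview": (0.0660, 0.0630, 0.2600, 0.2100),
    "MiniMax: MiniMax M2.7": (0.2600, 0.2790, 1.2000, 1.2000),
    "DeepSeek: DeepSeek V3.2": (0.2520, 0.2288, 0.3780, 0.3432),
    "DeepSeek: DeepSeek V3": (0.2288, 0.2002, 0.9144, 0.8001),
}


def cost_delta(prices, prompt_mtok: float, completion_mtok: float) -> float:
    """Change in cost (USD) for a volume given in millions of tokens."""
    old_p, new_p, old_c, new_c = prices
    old_cost = old_p * prompt_mtok + old_c * completion_mtok
    new_cost = new_p * prompt_mtok + new_c * completion_mtok
    return new_cost - old_cost


# Hypothetical monthly workload: 1,000M prompt tokens, 200M completion tokens.
for model, prices in PRICES.items():
    print(f"{model}: {cost_delta(prices, 1000, 200):+.2f} USD/month")
```

A negative result is a saving; a positive one is an increase. Under this assumed workload only MiniMax M2.7 gets more expensive, since its change affects prompt tokens only.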

Removed Models

Baidu: ERNIE 4.5 300B A47B – No longer available; previously $0.2800 prompt /1M, $1.1000 completion /1M. Who should care: Users needing this large Baidu model must migrate to alternatives.
Google: Gemini 2.0 Flash Lite – Removed; previously $0.0750 prompt /1M, $0.3000 completion /1M. Who should care: Applications relying on this low‑latency model need replacement.
Google: Gemini 2.0 Flash – Removed; previously $0.1000 prompt /1M, $0.4000 completion /1M. Who should care: Users of the standard Flash model must adjust their provider list.
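
If you keep a list of models your application may call, a quick check against the removals above can flag what needs replacing. This is only a sketch: it matches on the display names used in this digest, which is an assumption, since the identifiers your provider's API expects will likely differ. The `configured` list is a hypothetical example.

```python
# Display names of the models listed as removed in this digest.
REMOVED = {
    "Baidu: ERNIE 4.5 300B A47B",
    "Google: Gemini 2.0 Flash Lite",
    "Google: Gemini 2.0 Flash",
}


def find_removed(configured: list[str]) -> list[str]:
    """Return the configured models that appear in the removed set, in order."""
    return [model for model in configured if model in REMOVED]


# Hypothetical application config.
configured = ["Google: Gemini 2.0 Flash", "DeepSeek: DeepSeek V3.2"]
print(find_removed(configured))  # ['Google: Gemini 2.0 Flash']
```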

Originally published at The Token Ledger. Subscribe for the daily digest.
