DeepSeek V4 Price: Pro vs Flash API Costs

#deepseek #ai #llm #pricing

DeepSeek V4 pricing is split across two API models: deepseek-v4-pro and deepseek-v4-flash.

The official pricing page lists separate rates for cache-hit input, cache-miss input, and output tokens. That matters because repeated system prompts, reused context, and stable templates can make cache-hit pricing materially cheaper.

Think of Flash and Pro as two pricing lanes: Flash handles volume, while Pro is reserved for prompts where failure cost is higher.

Official API prices

Model	Cache-hit input	Cache-miss input	Output
DeepSeek V4 Flash	$0.028 / 1M tokens	$0.14 / 1M tokens	$0.28 / 1M tokens
DeepSeek V4 Pro	$0.145 / 1M tokens	$1.74 / 1M tokens	$3.48 / 1M tokens

Source: DeepSeek API pricing.

How to choose

Use DeepSeek V4 Flash when the workload is high-volume: chat, summaries, extraction, classification, routing, and first-pass analysis.

Use DeepSeek V4 Pro when the task has a higher failure cost: difficult code repair, long reasoning, advanced math, agent planning, or final answer synthesis after cheaper models have prepared context.

Credit mapping on this site

This site uses a simple credit layer above the official API:

Flash chat: 1 credit
Pro chat: 4 credits
Thinking: +1 credit
Web search: +2 credits

This is not DeepSeek's official billing model. It is a product-level abstraction so users can compare Flash, Pro, Thinking, and web search in one interface.

Practical cost advice

Keep reusable instructions stable so prompt caching can work. Route cheap, repetitive prompts to Flash. Escalate to Pro only when the answer needs the stronger reasoning ceiling.

Source article: Read the original post

Homepage: Visit the site

Model pages: