Per-Request Cost Breakdown
LLM pricing is typically quoted per 1 million tokens, which makes it hard to intuit what a single API call actually costs. This table breaks it down for a typical request: 1,000 input tokens (a moderate system prompt + user message) and 500 output tokens (a paragraph-length response).
| Model | Input Cost | Output Cost | Total Per Request |
|---|---|---|---|
| Mistral Nemo | $0.00002 | $0.00002 | $0.00004 |
| GPT-4o-mini | $0.00015 | $0.00030 | $0.00045 |
| Gemini 2.5 Flash | $0.00015 | $0.00030 | $0.00045 |
| DeepSeek V3 | $0.00028 | $0.00021 | $0.00049 |
| GPT-5-mini | $0.00025 | $0.00100 | $0.00125 |
| Claude Haiku 3.5 | $0.00080 | $0.00200 | $0.00280 |
| Gemini 2.5 Pro | $0.00125 | $0.00500 | $0.00625 |
| GPT-4o | $0.00250 | $0.00500 | $0.00750 |
| Mistral Large | $0.00200 | $0.00300 | $0.00500 |
| Claude Sonnet 4 | $0.00300 | $0.00750 | $0.01050 |
| Claude Opus 4.6 | $0.00500 | $0.01250 | $0.01750 |
| GPT-5 | $0.01000 | $0.01500 | $0.02500 |
| Token Landing Hybrid | ~$0.00080–$0.00150 | ~$0.00150–$0.00300 | ~$0.00230–$0.00450 |
Based on 1,000 input + 500 output tokens per request. Actual costs vary with prompt length. Prices are approximate as of April 2026.
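The table's arithmetic is easy to reproduce yourself. Here is a minimal sketch (the `request_cost` helper is illustrative, not part of any provider SDK) that converts per-million-token pricing into a per-request cost:

```python
def request_cost(input_price_per_m: float, output_price_per_m: float,
                 input_tokens: int = 1_000, output_tokens: int = 500) -> float:
    """USD cost of one request, given per-1M-token rates and token counts."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# GPT-4o: the table's $0.00250 input / $0.00500 output figures imply
# rates of $2.50/M input and $10.00/M output.
print(round(request_cost(2.50, 10.00), 5))  # 0.0075
```

Plugging in your own prompt and completion lengths instead of the 1,000/500 defaults gives per-request figures for your actual traffic shape.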
What These Numbers Mean in Practice
At the budget end, a single GPT-4o-mini call costs less than a twentieth of a cent. At the premium end, a GPT-5 call costs two and a half cents. These sound trivially small, but they add up fast at scale:
- 1,000 requests/day: GPT-4o costs $7.50/day ($225/month). DeepSeek V3 costs $0.49/day ($15/month).
- 10,000 requests/day: GPT-4o costs $75/day ($2,250/month). Claude Sonnet 4 costs $105/day ($3,150/month).
- 100,000 requests/day: Even GPT-4o-mini costs $45/day ($1,350/month). GPT-5 would cost $2,500/day ($75,000/month).
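The scaling figures above are just per-request cost multiplied by volume. A quick sketch, assuming a 30-day month as the bullets do:

```python
def monthly_cost(cost_per_request: float, requests_per_day: int,
                 days: int = 30) -> float:
    """Projected monthly spend in USD for a given request volume."""
    return cost_per_request * requests_per_day * days

# GPT-4o at 10,000 requests/day ($0.0075/request from the table)
print(round(monthly_cost(0.0075, 10_000), 2))  # 2250.0
```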
Why Per-Request Thinking Matters
Thinking about cost per request (rather than per million tokens) helps you make better architectural decisions:
- Is this request worth a premium model? A simple classification does not need a $0.025 GPT-5 call when a $0.00045 GPT-4o-mini call works.
- What is the cost of a retry? If a cheap model fails and requires a retry on a premium model, the total cost might exceed just using the premium model once.
- Where are the cost hotspots? That one endpoint doing 50,000 requests/day at $0.0105 each is costing $15,750/month. Routing it through a hybrid router could save $6,000-10,000/month.
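The retry question in the second bullet can be made concrete with expected-value arithmetic. A sketch, where the failure rate is a hypothetical parameter you would measure for your own workload:

```python
def cheap_with_fallback(cheap: float, premium: float,
                        failure_rate: float) -> float:
    """Expected cost per request: every request pays the cheap call,
    and failures additionally pay a premium retry."""
    return cheap + failure_rate * premium

cheap, premium = 0.00045, 0.025  # GPT-4o-mini vs GPT-5, from the table
for p in (0.05, 0.50, 0.99):
    fallback = cheap_with_fallback(cheap, premium, p)
    print(f"failure rate {p:.0%}: fallback ${fallback:.5f} "
          f"vs premium-only ${premium:.5f}")
```

With a large enough price gap, the cheap-first strategy stays ahead even at high failure rates; it only loses once nearly every request has to be retried on the premium model.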
Optimizing Per-Request Costs
The most effective way to reduce per-request cost is routing each request to the right model tier. Token Landing does this automatically. Simple requests go to budget models ($0.0005/request), medium complexity goes to mid-tier ($0.003-0.006/request), and only complex tasks use premium models ($0.01-0.025/request).
Combined with prompt caching and batch processing, most teams can achieve an effective per-request cost of $0.002-0.005: premium-quality results at budget prices.
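That blended figure can be sanity-checked as a traffic-weighted average. A sketch with hypothetical tier shares (the 70/25/5 split is an assumption for illustration, not measured Token Landing data; per-tier costs use the midpoints of the ranges quoted above):

```python
tiers = {
    "budget":  {"share": 0.70, "cost": 0.0005},   # simple requests
    "mid":     {"share": 0.25, "cost": 0.0045},   # midpoint of $0.003-0.006
    "premium": {"share": 0.05, "cost": 0.0175},   # midpoint of $0.01-0.025
}

# Effective per-request cost = sum of (traffic share x tier cost)
effective = sum(t["share"] * t["cost"] for t in tiers.values())
print(f"${effective:.5f} per request")  # prints "$0.00235 per request"
```

Even with a quarter of traffic on mid-tier models, the blend lands near the bottom of the $0.002-0.005 band, because the cheap tier dominates the average.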
Originally published on Token Landing