Per-Request Cost Breakdown
LLM pricing is typically quoted per 1 million tokens, which makes it hard to intuit what a single API call actually costs. This table breaks it down for a typical request: 1,000 input tokens (a moderate system prompt + user message) and 500 output tokens (a paragraph-length response).
| Model | Input Cost | Output Cost | Total Per Request |
|---|---|---|---|
| Mistral Nemo | $0.00002 | $0.00002 | $0.00004 |
| GPT-4o-mini | $0.00015 | $0.00030 | $0.00045 |
| Gemini 2.5 Flash | $0.00015 | $0.00030 | $0.00045 |
| DeepSeek V3 | $0.00028 | $0.00021 | $0.00049 |
| GPT-5-mini | $0.00025 | $0.00100 | $0.00125 |
| Claude Haiku 3.5 | $0.00080 | $0.00200 | $0.00280 |
| Gemini 2.5 Pro | $0.00125 | $0.00500 | $0.00625 |
| GPT-4o | $0.00250 | $0.00500 | $0.00750 |
| Mistral Large | $0.00200 | $0.00300 | $0.00500 |
| Claude Sonnet 4 | $0.00300 | $0.00750 | $0.01050 |
| Claude Opus 4.6 | $0.00500 | $0.01250 | $0.01750 |
| GPT-5 | $0.01000 | $0.01500 | $0.02500 |
| Token Landing Hybrid | ~$0.00080–$0.00150 | ~$0.00150–$0.00300 | ~$0.00230–$0.00450 |
Based on 1,000 input + 500 output tokens per request. Actual costs vary with prompt length. Prices are approximate as of April 2026.
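The table's arithmetic is easy to reproduce yourself. Here is a minimal sketch (the `request_cost` helper is illustrative, not part of any provider SDK) that converts per-million-token pricing into a per-request cost:

```python
def request_cost(input_price_per_m: float, output_price_per_m: float,
                 input_tokens: int = 1_000, output_tokens: int = 500) -> float:
    """USD cost of one request, given per-1M-token rates and token counts."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# GPT-4o: the table's $0.00250 input / $0.00500 output figures imply
# rates of $2.50/M input and $10.00/M output.
print(round(request_cost(2.50, 10.00), 5))  # 0.0075
```

Plugging in your own prompt and completion lengths instead of the 1,000/500 defaults gives per-request figures for your actual traffic shape.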
What These Numbers Mean in Practice
At the budget end, a single GPT-4o-mini call costs less than a twentieth of a cent. At the premium end, a GPT-5 call costs two and a half cents. These sound trivially small, but they add up fast at scale:
- 1,000 requests/day: GPT-4o costs $7.50/day ($225/month). DeepSeek V3 costs $0.49/day ($15/month).
- 10,000 requests/day: GPT-4o costs $75/day ($2,250/month). Claude Sonnet 4 costs $105/day ($3,150/month).
- 100,000 requests/day: Even GPT-4o-mini costs $45/day ($1,350/month). GPT-5 would cost $2,500/day ($75,000/month).
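The scaling figures above are just per-request cost multiplied by volume. A quick sketch, assuming a 30-day month as the bullets do:

```python
def monthly_cost(cost_per_request: float, requests_per_day: int,
                 days: int = 30) -> float:
    """Projected monthly spend in USD for a given request volume."""
    return cost_per_request * requests_per_day * days

# GPT-4o at 10,000 requests/day ($0.0075/request from the table)
print(round(monthly_cost(0.0075, 10_000), 2))  # 2250.0
```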
Why Per-Request Thinking Matters
Thinking about cost per request (rather than per million tokens) helps you make better architectural decisions:
- Is this request worth a premium model? A simple classification does not need a $0.025 GPT-5 call when a $0.00045 GPT-4o-mini call works.
- What is the cost of a retry? If a cheap model fails and requires a retry on a premium model, the total cost might exceed just using the premium model once.
- Where are the cost hotspots? That one endpoint doing 50,000 requests/day at $0.0105 each is costing $15,750/month. Routing it through a hybrid router could save $6,000-10,000/month.
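The retry question in the second bullet can be made concrete with expected-value arithmetic. A sketch, where the failure rate is a hypothetical parameter you would measure for your own workload:

```python
def cheap_with_fallback(cheap: float, premium: float,
                        failure_rate: float) -> float:
    """Expected cost per request: every request pays the cheap call,
    and failures additionally pay a premium retry."""
    return cheap + failure_rate * premium

cheap, premium = 0.00045, 0.025  # GPT-4o-mini vs GPT-5, from the table
for p in (0.05, 0.50, 0.99):
    fallback = cheap_with_fallback(cheap, premium, p)
    print(f"failure rate {p:.0%}: fallback ${fallback:.5f} "
          f"vs premium-only ${premium:.5f}")
```

With a large enough price gap, the cheap-first strategy stays ahead even at high failure rates; it only loses once nearly every request has to be retried on the premium model.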
Optimizing Per-Request Costs
The most effective way to reduce per-request cost is routing each request to the right model tier. Token Landing does this automatically. Simple requests go to budget models ($0.0005/request), medium complexity goes to mid-tier ($0.003-0.006/request), and only complex tasks use premium models ($0.01-0.025/request).
Combined with prompt caching and batch processing, most teams can achieve an effective per-request cost of $0.002-0.005: premium-quality results at budget prices.
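That blended figure can be sanity-checked as a traffic-weighted average. A sketch with hypothetical tier shares (the 70/25/5 split is an assumption for illustration, not measured Token Landing data; per-tier costs use the midpoints of the ranges quoted above):

```python
tiers = {
    "budget":  {"share": 0.70, "cost": 0.0005},   # simple requests
    "mid":     {"share": 0.25, "cost": 0.0045},   # midpoint of $0.003-0.006
    "premium": {"share": 0.05, "cost": 0.0175},   # midpoint of $0.01-0.025
}

# Effective per-request cost = sum of (traffic share x tier cost)
effective = sum(t["share"] * t["cost"] for t in tiers.values())
print(f"${effective:.5f} per request")  # prints "$0.00235 per request"
```

Even with a quarter of traffic on mid-tier models, the blend lands near the bottom of the $0.002-0.005 band, because the cheap tier dominates the average.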
Originally published on Token Landing