I kept losing track of what my LLM API calls actually cost. Claude, GPT, Gemini, Grok, DeepSeek all price per token, all split input from output, and every provider lays its pricing page out differently. So before I shipped anything to production I wanted one screen that answered a simple question: for MY token volume, what does each model actually cost per month?
I could not find a clean, no-signup version of that. So I built one, and I am giving away the embed code at the bottom so you can drop it into your own docs or blog.
The thing I kept getting wrong
Per-token pricing hides two traps:
- Output costs more than input. Usually 2x to 6x more. A model with a cheap input price that loves to write long answers can quietly cost more than a "pricier" model that answers tersely. So the headline number on a pricing page tells you almost nothing on its own.
- A token is not a word. One token is roughly 4 characters of English, so 1,000 tokens is about 750 words. People eyeball "1,000 calls a day" and forget that the prompt, the system message, and any conversation history you resend all bill on the input side too.
The only honest way to compare is to plug in your own input/output token volume and let the tool do the multiplication. That is the whole tool.
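The multiplication itself is small enough to sketch in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names, the workload, and the prices ($1.00 in, $4.00 out per million tokens) are all made-up placeholders, and the 4-characters-per-token figure is only the rule of thumb from above.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count using the ~4 characters per token rule of thumb."""
    return round(len(text) / chars_per_token)


def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Monthly bill in dollars, given list prices per million tokens."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload: 1,000 calls a day for 30 days. Each call sends
# 1,500 input tokens (system message + prompt + history) and gets 500 back.
calls = 1_000 * 30
input_tokens = calls * 1_500   # 45,000,000
output_tokens = calls * 500    # 15,000,000

# Placeholder prices, not real quotes: $1.00 in / $4.00 out per million.
bill = monthly_cost(input_tokens, output_tokens, 1.00, 4.00)
print(f"${bill:.2f}")  # $105.00
```

Note that in this made-up case output is a quarter of the token volume but more than half of the bill, which is the first trap in action.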
What it does
A sortable table of current per-million-token list prices across the Claude, GPT, Gemini, Grok and DeepSeek families, plus a live estimator: type your monthly input and output tokens and it ranks every model by your actual monthly bill, cheapest first.
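The ranking step can be sketched like this. It is an assumption about how an estimator of this kind might work, not a copy of the tool's source, and the two models and their prices are invented so the example stays self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price_per_m: float   # dollars per million input tokens
    output_price_per_m: float  # dollars per million output tokens


def monthly_bill(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Dollars per month for one model at the given token volume."""
    return ((input_tokens / 1_000_000) * model.input_price_per_m
            + (output_tokens / 1_000_000) * model.output_price_per_m)


def rank_models(models: list[Model], input_tokens: int,
                output_tokens: int) -> list[tuple[str, float]]:
    """Return (name, monthly bill) pairs, cheapest first."""
    bills = [(m.name, monthly_bill(m, input_tokens, output_tokens))
             for m in models]
    return sorted(bills, key=lambda pair: pair[1])


# Hypothetical models with placeholder prices, not real quotes.
MODELS = [
    Model("model-a", input_price_per_m=0.25, output_price_per_m=5.00),
    Model("model-b", input_price_per_m=1.00, output_price_per_m=3.00),
]

# Input-heavy workload: 100M tokens in, 5M tokens out.
print(rank_models(MODELS, 100_000_000, 5_000_000))
# Output-heavy workload: 10M tokens in, 40M tokens out.
print(rank_models(MODELS, 10_000_000, 40_000_000))
```

With these placeholder numbers model-a wins the input-heavy workload ($50 vs $115) and model-b wins the output-heavy one ($130 vs $202.50), so the "cheapest" model depends on your mix, not on the headline price.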
A few things that surprised me building it:
- On raw list price the budget tiers (open-weight DeepSeek and the proprietary "Flash"/"mini"/"Fast" variants) are cheaper by a wide margin than the frontier models.
- But cost per token is the wrong metric. What you care about is cost per solved task. A frontier model that one-shots the job can be cheaper end to end than a budget model you have to call three times. The estimator is most useful once you measure your real token-per-task numbers, not before.
- Prompt caching is the most underused lever. If your workload reuses a big fixed system prompt, cached input can run up to ~90% cheaper. Batch jobs are often ~50% off. List price is the ceiling, not the bill.
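The cost-per-solved-task point is easy to check with arithmetic. The sketch below uses invented prices and token counts, and it assumes every retry costs the same as the first call, which is optimistic because retries often resend a longer context. The useful number is the break-even: how many budget calls per task you can afford before the frontier model wins.

```python
def cost_per_call(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of a single API call at list price."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)


def cost_per_solved_task(call_cost: float, calls_per_task: float) -> float:
    """Average dollars per solved task, assuming every call costs the same."""
    return call_cost * calls_per_task


# Placeholder numbers, not real quotes. Both models see the same
# 2,000 input tokens and write 800 output tokens per call.
frontier_call = cost_per_call(2_000, 800, 2.50, 10.00)
budget_call = cost_per_call(2_000, 800, 1.00, 4.00)

frontier_task = cost_per_solved_task(frontier_call, calls_per_task=1)
budget_task = cost_per_solved_task(budget_call, calls_per_task=3)

# Calls per task at which the budget model stops being cheaper.
break_even_calls = frontier_call / budget_call

print(f"frontier: ${frontier_task:.4f} per task")
print(f"budget:   ${budget_task:.4f} per task")
print(f"break-even at {break_even_calls:.1f} budget calls per task")
```

Here the frontier model is 2.5x the price per call, so three budget calls per task cost more than one frontier call. If the real price gap is 10x, the budget model still wins at three calls, which is why you need your own measured calls-per-task number before trusting either conclusion.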
It is client-side, nothing is uploaded, no login, no email gate:
LLM API Pricing Comparison + token cost estimator
Always confirm the live number on the provider's own page before you commit a budget. Prices move and caching/batch discounts are not in the list price.
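To see how far those discounts can pull the bill below list price, here is a rough sketch. The ~90% and ~50% figures are the approximate numbers mentioned earlier, used as defaults; the list price and cached fraction are placeholders. It ignores any cache-write surcharge or cache expiry rules a provider may apply, and it does not assume the two discounts can be combined.

```python
def blended_input_price(list_price_per_m: float, cached_fraction: float,
                        cache_discount: float = 0.90) -> float:
    """Average input price per million tokens when part of the input is cached."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_price = list_price_per_m * (1.0 - cache_discount)
    return (cached_fraction * cached_price
            + (1.0 - cached_fraction) * list_price_per_m)


def batch_price(list_price_per_m: float, batch_discount: float = 0.50) -> float:
    """Price per million tokens when the job runs through a batch endpoint."""
    return list_price_per_m * (1.0 - batch_discount)


# Placeholder workload: 50M input tokens a month at a $2.00 list price,
# where 80% of every prompt is the same fixed system prompt.
LIST_PRICE = 2.00
input_millions = 50

list_bill = input_millions * LIST_PRICE
cached_bill = input_millions * blended_input_price(LIST_PRICE, cached_fraction=0.8)
batch_bill = input_millions * batch_price(LIST_PRICE)

print(f"list:   ${list_bill:.2f}")    # $100.00
print(f"cached: ${cached_bill:.2f}")  # $28.00
print(f"batch:  ${batch_bill:.2f}")   # $50.00
```

The cached figure depends almost entirely on how much of each prompt is actually reused, so measure that fraction rather than guessing it.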
Embed it on your own site (free)
This is the part I actually want to share. If you write docs or a blog and want a live pricing table that updates when I update it, paste this in:
<iframe src="https://aitoolsinsiderhq.com/llm-api-pricing.html?embed=1" width="100%" height="900" style="border:1px solid #e8e1d2;border-radius:14px;max-width:760px" loading="lazy" title="LLM API Pricing Comparison"></iframe>
The ?embed=1 query parameter strips the site chrome and leaves just the table and estimator, with a small credit link back. Free to use.
A few other free no-signup tools I built alongside it
Same rule for all of these: client-side, no account, no email wall.
- AI Stack Cost Calculator - tick the SaaS AI tools you pay for, get your monthly and annual spend plus where the free-tier savings are.
- Schema Markup Generator (JSON-LD) - Article/FAQ/Product/Organization/HowTo/Breadcrumb, copy-paste output.
- On-Page SEO Content Analyzer - paste a page, get a /100 score and a priority fix list.
- AI Search Visibility Checker - scores how citable your page is for ChatGPT/Perplexity/Gemini.
If you build LLM features, the pricing tool is the one I reach for weekly. Embed it, fork the idea, or just steal the layout. Happy to answer questions on how any of it is built.