LLM Token Costs: Why Your Prompt Might Cost 33x More Than You Think
If you're building with LLM APIs, you've probably wondered: how many tokens is this prompt actually using?
I built a free tool to answer that: LLM Token Counter
What it does
- Paste any text and instantly see the token count
- Compares costs across GPT-4o, GPT-3.5 Turbo, Claude 3 Haiku, and Gemini 1.5 Flash
- No login, no backend — runs entirely in your browser
- Uses the actual cl100k_base tokenizer (the same encoding OpenAI uses for GPT-3.5 and GPT-4) for accurate GPT counts; GPT-4o's newer o200k_base encoding differs slightly, so its counts are close approximations (see the sketch after this list)
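For the curious, counting tokens with js-tiktoken takes only a few lines. A minimal sketch (the `countTokens` helper is my own name for illustration; the tool's actual code may be organized differently):

```typescript
import { getEncoding } from "js-tiktoken";

// cl100k_base is the encoding used by GPT-3.5 Turbo and GPT-4.
const enc = getEncoding("cl100k_base");

// Hypothetical helper: encode the text and count the resulting tokens.
function countTokens(text: string): number {
  return enc.encode(text).length;
}

console.log(countTokens("How many tokens is this prompt actually using?"));
```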
The cost gap is real
Consider a typical system prompt + user message (about 500 tokens):
| Model | Input price per 1M tokens | Cost for a 500-token prompt |
|---|---|---|
| GPT-4o | $2.50 | $0.00125 |
| GPT-3.5 Turbo | $0.50 | $0.00025 |
| Claude 3 Haiku | $0.25 | $0.000125 |
| Gemini 1.5 Flash | $0.075 | $0.0000375 |
That's a 33x cost difference between GPT-4o and Gemini 1.5 Flash for the same prompt.
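The arithmetic behind the table is just tokens ÷ 1,000,000 × price. A quick sketch using the input prices above (hardcoded here for illustration; always check each provider's current pricing page):

```typescript
// Input price per 1M tokens (USD), matching the table above.
const pricePerMillion: Record<string, number> = {
  "GPT-4o": 2.5,
  "GPT-3.5 Turbo": 0.5,
  "Claude 3 Haiku": 0.25,
  "Gemini 1.5 Flash": 0.075,
};

// Cost of a single prompt of the given token count.
function promptCost(model: string, tokens: number): number {
  return (tokens / 1_000_000) * pricePerMillion[model];
}

for (const model of Object.keys(pricePerMillion)) {
  console.log(`${model}: $${promptCost(model, 500).toFixed(7)}`);
}
```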
Why client-side matters
Most token counters require you to send your prompt to a server. This one runs entirely in your browser using js-tiktoken — your prompts never leave your machine.
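Wiring that up in the browser amounts to re-running the encoder on every edit. A rough sketch, with made-up element IDs (`#prompt`, `#token-count`); the tool's actual UI code may differ:

```typescript
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("cl100k_base");
const input = document.querySelector<HTMLTextAreaElement>("#prompt")!;
const output = document.querySelector<HTMLSpanElement>("#token-count")!;

// Re-tokenize locally on every keystroke; no network request is made.
input.addEventListener("input", () => {
  output.textContent = String(enc.encode(input.value).length);
});
```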
Try it
👉 https://code-two-delta.vercel.app
Source code: github.com/SolvoHQ/llm-token-counter
Would love feedback — what models or features would make this more useful for your workflow?