
Midhun Sekhar

Stop Wasting Tokens: How to Cut Your LLM Costs by 97%

The hidden tax in your AI pipeline

If you're building with GPT or Claude, you’ve probably done this:

  • Call an API

  • Get a big JSON response

  • Send the whole thing to your LLM

Seems harmless, right?

It’s not.

You’re quietly burning money on something you don’t even use.


💸 The "metadata tax"

Let’s say your API returns this:

{
  "order": {
    "id": 123,
    "user": {
      "name": "Midhun",
      "email": "midhun@email.com"
    },
    "items": [ ... 100 objects ... ],
    "metadata": { ... tons of fields ... }
  }
}

Now ask yourself:

👉 What does your LLM actually need?

Probably just this:

{
  "name": "Midhun",
  "email": "midhun@email.com"
}

🤯 Here’s the problem

LLMs don’t care what’s useful.

You’re billed for every token you send.

  • Full JSON → ~1500 tokens

  • Useful data → ~60 tokens

👉 You’re paying ~25x more than necessary.

And this happens on every request.
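You can get a feel for the tax yourself. The sketch below uses a crude chars-per-token heuristic (about 4 characters per token) rather than a real tokenizer like tiktoken, so the numbers are approximate, and the `rough_tokens` helper and the sample payload are just illustrations:

```python
import json

def rough_tokens(payload: dict) -> int:
    """Very rough token estimate: ~4 characters per token.
    For exact counts, run the text through a real tokenizer (e.g. tiktoken)."""
    return len(json.dumps(payload)) // 4

# A payload shaped like the example above: 100 item objects the LLM never needs
full = {"order": {"id": 123,
                  "user": {"name": "Midhun", "email": "midhun@email.com"},
                  "items": [{"sku": f"SKU-{i}", "qty": 1} for i in range(100)]}}
trimmed = {"name": "Midhun", "email": "midhun@email.com"}

print(rough_tokens(full), rough_tokens(trimmed))
```

Even with this blunt estimate, the full payload comes out more than 20x larger than the trimmed one.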


🧠 “I’ll just parse it manually”

Sure… you can do this:

user = data.get("order", {}).get("user", {})
email = user.get("email")

Now imagine:

  • 10+ fields

  • deeply nested structures

  • multiple APIs

You end up writing:

  • defensive null checks

  • brittle parsing logic

  • repeated boilerplate everywhere

It’s not hard… just annoying and error-prone.
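You can wrap the null checks in a helper, but you still end up writing and maintaining it in every service. A minimal sketch (the `dig` helper is hypothetical, not from any library):

```python
def dig(data: dict, path: str, default=None):
    """Walk a dotted path like 'order.user.email', returning default
    if any level is missing -- the defensive checks you'd otherwise repeat."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

data = {"order": {"user": {"name": "Midhun", "email": "midhun@email.com"}}}
print(dig(data, "order.user.email"))   # midhun@email.com
print(dig(data, "order.user.phone"))   # None
```

It works, but it’s exactly the boilerplate the list above describes: every team reinvents some version of it.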


⚡ The smarter approach: preprocess your data

Instead of sending raw JSON to your LLM:

👉 clean it first

Use a small extraction step to pull only what you need.

For example:

{
  "data": {...},
  "queries": {
    "email": ".order.user.email",
    "name": ".order.user.name"
  }
}

Output:

{
  "email": "midhun@email.com",
  "name": "Midhun"
}
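The extraction step itself is simple. Here’s an illustrative stdlib-only sketch of the pattern (not the actual service implementation): it takes jq-style dotted queries like the ones above and returns only the requested fields.

```python
def extract(data: dict, queries: dict) -> dict:
    """Run each jq-style dotted query (e.g. '.order.user.email')
    against data and return only the requested fields."""
    result = {}
    for name, query in queries.items():
        current = data
        for key in query.strip(".").split("."):
            if not isinstance(current, dict):
                current = None
                break
            current = current.get(key)
        result[name] = current
    return result

payload = {"order": {"id": 123,
                     "user": {"name": "Midhun", "email": "midhun@email.com"}}}
queries = {"email": ".order.user.email", "name": ".order.user.name"}
print(extract(payload, queries))
# {'email': 'midhun@email.com', 'name': 'Midhun'}
```

The trimmed dict is what goes into your prompt; the raw payload never touches the LLM.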

💰 Why this matters more than you think

Let’s do rough math:

Payload        Tokens   Cost (per 1k calls)
Raw JSON       ~1500    ~$45
Cleaned JSON   ~60      ~$1

👉 That’s a 97% reduction

Now multiply that by:

  • daily requests

  • production scale

This is not optimization.

This is cost control.


🔧 So how do you actually do this?

You have two options:

Option 1: Local parsing

  • Use JSONPath libraries

  • Write custom logic

  • Maintain it across services

Option 2: Use a preprocessing layer

I ended up building a small tool for this after getting tired of repeating the same parsing logic.

It’s basically:

“JSON query engine as a service”

You send:

  • raw JSON

  • a query

You get:

  • clean, minimal payload

No setup, no dependencies.


🔗 Real-world use cases

This pattern is surprisingly useful:

🤖 AI pipelines

Reduce token usage before sending data to LLMs

🔔 Webhooks

Clean payloads from:

  • Stripe

  • Shopify

  • GitHub
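The same trimming idea applies to webhooks. Here’s a sketch against a simplified Stripe-style event (an illustrative shape, not the full Stripe schema):

```python
# A simplified Stripe-style webhook event (illustrative, not the real schema)
event = {
    "id": "evt_123",
    "type": "payment_intent.succeeded",
    "data": {"object": {"id": "pi_456", "amount": 2000, "currency": "usd",
                        "receipt_email": "midhun@email.com",
                        "metadata": {"internal_ref": "..."}}}
}

# Keep only what a downstream LLM (or handler) actually needs
obj = event.get("data", {}).get("object", {})
summary = {"event": event.get("type"),
           "amount": obj.get("amount"),
           "email": obj.get("receipt_email")}
print(summary)
# {'event': 'payment_intent.succeeded', 'amount': 2000, 'email': 'midhun@email.com'}
```

Three fields instead of a payload that, for real Stripe events, can run to hundreds of lines.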

📊 Logs & analytics

Extract only relevant fields from large datasets


⚡ A small shift, big impact

Most developers optimize:

  • prompts

  • model selection

But ignore:

the data they send

That’s where the real waste is.


🚀 Final thought

In the AI era:

Efficiency = profit

Before optimizing your prompts,

try optimizing your input.

You might be surprised how much you save.


👇 If you're curious

I turned this idea into a simple API you can try:

JSON PowerExtract (available on RapidAPI)

I included a Free Tier (500 requests/month) so you can test the token savings in your own pipeline today.

