Manas Mishra
I Built a Token Compressor That Cuts LLM Context Size by 60%

Every token you send to an LLM costs money and eats into your context window. If you're stuffing structured data - JSON arrays, database records, API responses - into your prompts, you're probably wasting more than half your tokens on repeated keys, redundant values, and verbose formatting.

I built [ctx-compressor](https://www.npmjs.com/package/ctx-compressor) to fix that.

The Problem

Say you have 100 user records that look like this:

[
  {
    "name": "Adeel Solangi",
    "language": "Sindhi",
    "id": "V59OF92YF627HFY0",
    "bio": "Donec lobortis eleifend condimentum...",
    "version": 6.1
  },
  {
    "name": "Afzal Ghaffar",
    "language": "Sindhi",
    "id": "ENTOCR13RSCLZ6KU",
    "bio": "...",
    "version": 3.2
  }
  // ... 98 more records
]

That JSON blob eats up 19,338 tokens and 63,732 characters when you feed it to a model. Every single object repeats "name", "language", "id", "bio", "version" - that's pure waste.

The Solution

ctx-compressor analyzes your data, extracts a schema, deduplicates repeated values, and encodes everything into a compact format that LLMs can still understand:

CTX/2
$name:s|language:s|id:s|bio:t|version:n2
&0=Sindhi
&1=Uyghur
&2=Galician
&3=Maltese
&4=Sesotho sa Leboa
&5=isiZulu
&6=Icelandic
Adeel Solangi|&0|V59OF92YF627HFY0|Donec lobortis...|6.1
Afzal Ghaffar|&0|ENTOCR13RSCLZ6KU|...|3.2

The same data compresses down to 7,442 tokens and 13,746 characters.

That's a 61.5% reduction in tokens and a 78.4% reduction in characters.
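Those percentages follow directly from the counts above. A quick sanity check, using the article's own measurements:

```typescript
// Reduction = 1 - compressed / original, expressed as a percentage.
const tokenReduction = (1 - 7442 / 19338) * 100;
const charReduction = (1 - 13746 / 63732) * 100;

console.log(tokenReduction.toFixed(1)); // "61.5"
console.log(charReduction.toFixed(1));  // "78.4"
```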

How It Works

The compression relies on a few key ideas:

Schema extraction: Instead of repeating key names in every object, define them once at the top with type hints (s for string, n for number, t for text, etc.).

Value deduplication: Values that appear frequently (like "Sindhi" across many records) get assigned short aliases (&0, &1, ...) and referenced by alias instead of repeated in full.

Pipe-delimited rows: Each record becomes a single line with values separated by |, positionally mapped to the schema. No braces, no quotes, no colons.

The result is a format that's dense for the tokenizer but still perfectly readable by the LLM.
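The three steps above can be sketched in a few lines of TypeScript. This is a simplified illustration that mimics the CTX/2 header syntax from the example; `sketchCompress`, its type inference, and its alias threshold are my own simplifications, not the package's actual implementation:

```typescript
type Row = { [key: string]: string | number };

function sketchCompress(records: Row[]): string {
  if (records.length === 0) return "CTX/2";

  // 1. Schema extraction: key order and types taken from the first record.
  const keys = Object.keys(records[0]);
  const header = keys
    .map((k) => `${k}:${typeof records[0][k] === "number" ? "n" : "s"}`)
    .join("|");

  // 2. Value deduplication: count string values, alias any that repeat.
  const counts = new Map<string, number>();
  for (const r of records) {
    for (const k of keys) {
      const v = r[k];
      if (typeof v === "string") counts.set(v, (counts.get(v) ?? 0) + 1);
    }
  }
  const aliases = new Map<string, string>();
  for (const [v, n] of counts) {
    if (n > 1) aliases.set(v, `&${aliases.size}`);
  }

  // 3. Pipe-delimited rows: values in schema order, aliases substituted.
  const rows = records.map((r) =>
    keys
      .map((k) => {
        const v = r[k];
        const s = String(v);
        return typeof v === "string" ? aliases.get(v) ?? s : s;
      })
      .join("|")
  );

  const aliasLines = [...aliases].map(([v, a]) => `${a}=${v}`);
  return ["CTX/2", `$${header}`, ...aliasLines, ...rows].join("\n");
}
```

Running this over two records that share a `language` value produces the same shape as the CTX/2 example: a magic line, a `$`-prefixed schema, `&N=value` alias definitions, then one pipe-delimited line per record.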

Usage

npm install ctx-compressor
import { compress, decompress } from 'ctx-compressor';

const data = [
  { name: "Adeel Solangi", language: "Sindhi", id: "V59OF92YF627HFY0", bio: "...", version: 6.1 },
  // ... more records
];

const compressed = compress(data);
// Send `compressed` to your LLM instead of JSON.stringify(data)

// Need the original structure back?
const original = decompress(compressed);
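To make the round trip concrete, here is a self-contained decoder for the CTX/2-style text shown earlier. It assumes the simplified header and alias syntax from this article; `sketchDecompress` is my own illustration, not the library's `decompress` implementation, which may handle more type codes and edge cases:

```typescript
function sketchDecompress(text: string): Array<{ [k: string]: string | number }> {
  const lines = text.split("\n").filter((l) => l.length > 0);

  // lines[0] is the "CTX/2" magic; lines[1] is the "$" schema line.
  const fields = lines[1]
    .slice(1) // drop the leading "$"
    .split("|")
    .map((f) => {
      const [name, type] = f.split(":");
      return { name, numeric: type.startsWith("n") };
    });

  // Alias table: "&0=value" lines until the first data row.
  const aliases = new Map<string, string>();
  let i = 2;
  for (; i < lines.length && lines[i].startsWith("&"); i++) {
    const eq = lines[i].indexOf("=");
    aliases.set(lines[i].slice(0, eq), lines[i].slice(eq + 1));
  }

  // Remaining lines are pipe-delimited rows, positionally mapped to the schema.
  return lines.slice(i).map((row) => {
    const obj: { [k: string]: string | number } = {};
    row.split("|").forEach((cell, j) => {
      const raw = aliases.get(cell) ?? cell;
      obj[fields[j].name] = fields[j].numeric ? Number(raw) : raw;
    });
    return obj;
  });
}
```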

When To Use This

ctx-compressor shines when you're sending arrays of similarly shaped objects to an LLM: database query results, API responses, CSV-style data, user lists, product catalogs, log entries, and so on.

It's particularly effective when:

  • Your data has repeated field names across many objects
  • Certain field values repeat frequently (categories, statuses, languages)
  • You're hitting context window limits or trying to reduce API costs
  • You're doing RAG and want to pack more retrieved documents into context

Benchmark Results

Tested on a dataset of 100 user records with name, language, ID, bio, and version fields:

  • Raw JSON: 19,338 tokens
  • ctx-compressor: 7,442 tokens
  • Savings: 61.5%

LLM Compatibility

The compressed format includes the schema definition and alias table right in the output, so the LLM has everything it needs to interpret the data. In my testing, models handle it well; you can ask questions about the data, filter, aggregate, and reason over it just like you would with raw JSON.

You can also include a brief instruction in your system prompt, like: "The following data is in CTX/2 compressed format. The schema and alias definitions are included at the top." - though most modern models figure it out without any extra prompting.

What's Next

I'm actively working on this and would love feedback. Some directions I'm exploring:

  • Nested object support
  • Streaming compression for large datasets
  • Built-in integrations with popular LLM SDKs
  • A CLI tool for quick compression from the terminal
  • Handling text prompt compression

Check out the package on npm and let me know what you think. If you're spending too much on tokens for structured data, give it a try; that 60% savings adds up fast.


If you found this useful, consider giving the repo a star, as it helps more people discover the project.
