Tags: #ai #webdev #javascript #opensource #ChatGPT #Claude #Copilot #Gemini
Every time I pasted a spreadsheet or XML file into Claude or ChatGPT, something bothered me.
I was burning tokens on things the AI didn't need.
A 120-row sales CSV with 10 verbose columns spends thousands of tokens on long identifiers like transaction_date and repetitive values. An XML response from an enterprise API can spend 40% of its tokens on namespace declarations like xmlns:ns0="http://schemas.example.com/data" before a single byte of real data appears. A JSON array of 200 objects repeats every key 200 times.
The AI doesn't need any of that. It needs the data.
So I built TokenPinch — a tool that converts files to a compact .pinch format before you paste them into any AI. And in the process, I ended up designing what I think could be a useful open format for token-efficient data exchange with LLMs.
The problem in numbers
Take a typical sales CSV with verbose headers:
transaction_date,customer_identifier,product_category,product_name,unit_price,quantity_ordered,discount_percentage,revenue_total,warehouse_location,sales_representative
2024-01-01,CUST1000,Electronics,Laptop,999.99,2,0,1999.98,Miami-FL,Maria Lopez
2024-01-02,CUST1001,Apparel,Jacket,59.99,1,10,53.99,Dallas-TX,Carlos Rivera
...120 more rows
That's approximately 3,400 tokens, and a large share of it goes to verbose identifiers and repetitive values rather than information the model actually needs.
Now look at the same data in .pinch format:
[AIX-FORMAT v1.0]
Source: sales_data.csv | Rows: 120 | Strategy: TOON tabular + header aliasing
Decode: expand [ALIASES] to reconstruct column names, then parse [DATA] rows in header order.
Generated-by: TokenPinch (tokenpinch.com)
[/AIX-FORMAT]
[ALIASES]
transaction_date:td | customer_identifier:ci | product_category:pc | product_name:pn | unit_price:up | quantity_ordered:qo | discount_percentage:dp | revenue_total:rt | warehouse_location:wl | sales_representative:sr
[/ALIASES]
[DATA]
sales_data[120]{td,ci,pc,pn,up,qo,dp,rt,wl,sr}:
2024-01-01,CUST1000,Electronics,Laptop,999.99,2,0,1999.98,Miami-FL,Maria Lopez
2024-01-02,CUST1001,Apparel,Jacket,59.99,1,10,53.99,Dallas-TX,Carlos Rivera
...
[/DATA]
That's approximately 900 tokens. Same data, 74% fewer tokens.
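To make the round trip concrete, here is a minimal decoder sketch for the tabular layout shown above. It assumes the exact block structure from the example (no quoting or escape rules) and the function name is mine, not part of the spec:

```python
import re

def decode_pinch_table(text):
    """Parse the [ALIASES] and [DATA] blocks of a tabular .pinch file
    back into a list of dicts keyed by full column names.
    Sketch only: assumes the layout shown above, no quoting/escapes."""
    # Map short aliases back to full column names.
    aliases_block = re.search(r"\[ALIASES\]\n(.*?)\n\[/ALIASES\]", text, re.S).group(1)
    alias_to_full = {}
    for pair in aliases_block.split("|"):
        full, alias = pair.strip().split(":")
        alias_to_full[alias] = full

    # Read the "name[N]{a,b,...}:" header line, then the comma-separated rows.
    data_block = re.search(r"\[DATA\]\n(.*?)\n\[/DATA\]", text, re.S).group(1)
    lines = data_block.strip().splitlines()
    cols = re.search(r"\{(.*?)\}", lines[0]).group(1).split(",")
    rows = []
    for line in lines[1:]:
        if line.strip() == "...":
            continue  # truncation marker, not data
        values = line.split(",")
        rows.append({alias_to_full[c]: v for c, v in zip(cols, values)})
    return rows
```

Any LLM follows the same steps from the plain-English Decode instruction; the Python version just makes the contract explicit.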
At GPT-4o prices ($2.50/1M tokens), that's $0.006 saved per file. If you're processing 1,000 files a month, that's $6 back. If you're building an application that feeds files to an LLM at scale, the savings compound fast.
How .pinch works
The format has three compression strategies, applied automatically based on the source file type.
Strategy 1: TOON tabular + header aliasing (CSV, Excel, JSON arrays)
Inspired by TOON (Token-Oriented Object Notation), this strategy:
- Declares column headers once at the top with short aliases
- Uses aliases in all data rows instead of full names
- Wraps everything in a self-describing block the LLM can decode
Alias generation follows a priority chain:
- First, try initials: `transaction_date` → `td`, `customer_full_name` → `cfn`
- Then, first 2 chars of word 1 + first char of word 2: `product_name` → `prn`
- Then, first 2 characters: `status` → `st`
- Fallback: letter + number: `a1`, `a2`
All aliases are guaranteed unique within a file.
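The priority chain above can be sketched in a few lines. This is my reading of the rules, not the tool's actual implementation:

```python
def make_alias(name, used):
    """Generate a short alias for a column name using the priority chain
    described above; `used` tracks aliases already taken in this file."""
    words = [w for w in name.split("_") if w]
    candidates = ["".join(w[0] for w in words)]      # 1. initials: transaction_date -> td
    if len(words) >= 2:
        candidates.append(words[0][:2] + words[1][0])  # 2. product_name -> prn
    candidates.append(name[:2])                       # 3. first two chars: status -> st
    for c in candidates:
        if c not in used:
            used.add(c)
            return c
    # 4. Fallback: letter + number, guaranteed unique within the file
    n = 1
    while f"a{n}" in used:
        n += 1
    used.add(f"a{n}")
    return f"a{n}"
```

Because every candidate is checked against the set of aliases already issued, collisions fall through to the next rule, which is what guarantees uniqueness.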
Strategy 2: XML namespace stripping (XML, SOAP)
Enterprise XML is brutal for token consumption. A SOAP response can have 8 namespace declarations adding hundreds of tokens before the actual data starts.
.pinch strips:
- `<?xml ?>` declarations
- All `xmlns:*` attribute declarations
- Namespace prefixes from element names: `<ns0:Response>` → `<Response>`
- Namespace prefixes from attributes: `ns1:ID="X"` → `ID="X"`
- Empty elements and redundant whitespace
The result is clean, standard XML with all content and attributes preserved.
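A regex sketch of Strategy 2, for illustration only; a production version would want a real XML parser rather than pattern matching:

```python
import re

def strip_namespaces(xml):
    """Strip the XML declaration, xmlns attributes, and namespace prefixes,
    as described in Strategy 2. Regex sketch, not a full XML parser."""
    xml = re.sub(r"<\?xml[^?]*\?>", "", xml)           # <?xml ... ?> declaration
    xml = re.sub(r'\s+xmlns(:\w+)?="[^"]*"', "", xml)  # xmlns:* declarations
    xml = re.sub(r"(</?)\w+:", r"\1", xml)             # prefixes on element names
    xml = re.sub(r"\s(\w+):(\w+=)", r" \2", xml)       # prefixes on attributes
    return xml.strip()
```

Element content and attribute values pass through untouched, which is why the output stays valid, parseable XML.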
Real example: I tested this with a large enterprise XML API response containing 8 namespace declarations. The .pinch version was 60% smaller, and when I uploaded it to ChatGPT as a file attachment, the model interpreted it perfectly, identifying the data structure, fields, and values without any special instructions.
Strategy 3: Text normalization (Word, PDF)
For .docx and .pdf files, the strategy is simpler:
- Extract plain text (ignoring images and formatting)
- Normalize whitespace: collapse multiple spaces, remove trailing spaces, max 2 consecutive blank lines
- For PDFs: prefix each page with `[Page N]`
This typically saves 15–30% on documents with irregular whitespace — which is almost every PDF converted from a scanned or formatted source.
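The whitespace rules can be expressed in a few lines. A sketch of my reading of the strategy, assuming spaces and tabs are what get collapsed:

```python
import re

def normalize_text(text):
    """Strategy 3 whitespace normalization: collapse runs of spaces/tabs,
    drop trailing spaces, and cap consecutive blank lines at two."""
    lines = [re.sub(r"[ \t]+", " ", line).rstrip() for line in text.splitlines()]
    out, blanks = [], 0
    for line in lines:
        blanks = blanks + 1 if line == "" else 0
        if blanks <= 2:  # allow at most 2 consecutive blank lines
            out.append(line)
    return "\n".join(out)
```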
The self-describing header
The most important design decision in .pinch is the [AIX-FORMAT] header.
Every .pinch file starts with a block that tells any LLM exactly how to decode it:
[AIX-FORMAT v1.0]
Source: filename.ext | Strategy: <strategy used>
Decode: <plain English instruction>
Generated-by: TokenPinch (tokenpinch.com)
[/AIX-FORMAT]
This means .pinch works without any system prompt changes, without native platform support, and without any configuration. You paste it or upload it, and the AI reads the instructions and works with the data normally.
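Generating the header is deliberately trivial, which is the point: any pipeline can emit it. A sketch following the field layout of the example above (the function name is mine):

```python
def make_header(source, strategy, decode_hint):
    """Build the self-describing [AIX-FORMAT] block that opens every
    .pinch file, following the field layout shown above."""
    return (
        "[AIX-FORMAT v1.0]\n"
        f"Source: {source} | Strategy: {strategy}\n"
        f"Decode: {decode_hint}\n"
        "Generated-by: TokenPinch (tokenpinch.com)\n"
        "[/AIX-FORMAT]"
    )
```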
Claude, ChatGPT, Gemini, and Copilot have all handled it correctly in testing.
The no-compression rule
One thing I learned building this: not every file benefits from compression.
If the compressed output would consume more tokens than the original, TokenPinch skips the conversion and tells the user to paste the original file directly. This happens with very small files where the format overhead outweighs the savings, or with files that already have short column names and short values.
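The check itself is a one-liner once you have a token estimate. The ~4-characters-per-token heuristic below is my stand-in; the actual tool may use a real tokenizer:

```python
def should_compress(original, pinched):
    """The no-compression rule: skip conversion when the .pinch output
    would not be smaller. Uses a crude ~4-chars-per-token estimate."""
    est = lambda text: max(1, len(text) // 4)  # rough token count
    return est(pinched) < est(original)
```

On a tiny CSV, the fixed `[AIX-FORMAT]` and `[ALIASES]` overhead alone can push the comparison the wrong way, which is exactly the case the rule catches.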
Honesty about when the tool doesn't help is part of the value.
Implementation
TokenPinch is a single-page web app built with the following stack:
- SheetJS for Excel parsing
- mammoth.js for Word document text extraction
- pdf.js for PDF text extraction
- Vanilla JS for everything else
The entire tool is ~900 lines of HTML/CSS/JS.
The .pinch spec is open
I've published the format specification on GitHub. The spec covers:
- Complete file structure and block syntax
- All three compression strategies with examples
- Alias generation algorithm
- Decoder instructions for LLMs
- A reference Python implementation
The goal is for .pinch to become a standard interchange format for feeding structured data to LLMs — something any tool or pipeline can implement, regardless of language or platform.
If you're building something that sends files to LLMs and want to reduce token costs, you can implement the format from the spec without using the TokenPinch tool at all.
What's next
- Developer API — integrate .pinch compression into pipelines programmatically
- More formats — YAML, SQL dumps, Parquet
- Prompt templates — pre-built prompts for common tasks (summarize, find anomalies, convert to JSON)
- Compression strategy improvements — review the errors reported against the current beta's algorithms and fix them
Try it
Tool: tokenpinch.com — free, no signup, runs entirely in your browser
Spec: github.com/javiercast/pinch-format
If you work with structured data and LLMs regularly, I'd love to hear whether this solves a real problem for you — or what's missing.
Built over a few sessions with Claude's help. Yes, I used an AI to build a tool for using AI more efficiently. The irony is not lost on me.