<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Om Kawale</title>
    <description>The latest articles on DEV Community by Om Kawale (@om_kawale_b6627244a50e4b6).</description>
    <link>https://dev.to/om_kawale_b6627244a50e4b6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2614249%2F3d26e5ca-40ee-40c0-8ca8-4e2080a9df80.jpg</url>
      <title>DEV Community: Om Kawale</title>
      <link>https://dev.to/om_kawale_b6627244a50e4b6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/om_kawale_b6627244a50e4b6"/>
    <language>en</language>
    <item>
      <title>KODA Format: A Schema-First Data Format to Reduce LLM Token Usage (~40%)</title>
      <dc:creator>Om Kawale</dc:creator>
      <pubDate>Mon, 04 May 2026 08:28:00 +0000</pubDate>
      <link>https://dev.to/om_kawale_b6627244a50e4b6/koda-a-schema-first-data-format-to-reduce-llm-token-usage-40-30mf</link>
      <guid>https://dev.to/om_kawale_b6627244a50e4b6/koda-a-schema-first-data-format-to-reduce-llm-token-usage-40-30mf</guid>
      <description>&lt;p&gt;When building applications with large language models (LLMs), one of the most overlooked costs is how structured data is represented.&lt;/p&gt;

&lt;p&gt;Most systems use JSON.&lt;/p&gt;

&lt;p&gt;And JSON is inefficient for LLM input.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is KODA?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;KODA (Knowledge-Oriented Data Abstraction)&lt;/strong&gt; is a schema-first data format designed to reduce token usage when sending structured data to LLMs.&lt;/p&gt;

&lt;p&gt;It works by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Defining structure once (schema-first)&lt;/li&gt;
&lt;li&gt;Encoding values positionally&lt;/li&gt;
&lt;li&gt;Eliminating repeated keys found in JSON&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;KODA is optimized for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RAG pipelines&lt;/li&gt;
&lt;li&gt;Tool calling systems&lt;/li&gt;
&lt;li&gt;Agent workflows&lt;/li&gt;
&lt;li&gt;High-volume structured LLM input&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Problem with JSON in LLM Pipelines
&lt;/h2&gt;

&lt;p&gt;JSON repeats field names for every record.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bug"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"open"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Fix"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"closed"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each object repeats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;id&lt;/li&gt;
&lt;li&gt;title&lt;/li&gt;
&lt;li&gt;state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you send 1000 records:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;those keys are repeated 1000 times&lt;/li&gt;
&lt;li&gt;tokens are wasted&lt;/li&gt;
&lt;li&gt;costs increase&lt;/li&gt;
&lt;li&gt;context window shrinks&lt;/li&gt;
&lt;/ul&gt;
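
&lt;p&gt;The key overhead is easy to verify with the standard library; counting key occurrences in the serialized payload is a rough but honest proxy:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

# 1000 records sharing the same three keys
records = [{"id": i, "title": "Bug", "state": "open"} for i in range(1000)]
payload = json.dumps(records)

# Every record serializes its own copy of every key
print(payload.count('"title"'))  # 1000 -- one copy per record
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;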




&lt;h2&gt;
  
  
  KODA Equivalent
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;KODA/1
@META
schemas:issue
counts:issue=2

@SCHEMA
issue:id title state

@DATA:issue
1|Bug|open
2|Fix|closed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No repeated keys.&lt;/p&gt;

&lt;p&gt;Only structure + values.&lt;/p&gt;
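
&lt;p&gt;The idea is simple enough to sketch in a few lines. This is only an illustration of schema-first, positional encoding, not the koda library itself (the real API appears later in this post):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def encode_koda(name, fields, records):
    # Keys appear once in the schema; rows carry only positional values.
    lines = ["KODA/1", "@META", "schemas:" + name,
             "counts:" + name + "=" + str(len(records)), "",
             "@SCHEMA", name + ":" + " ".join(fields), "",
             "@DATA:" + name]
    for r in records:
        lines.append("|".join(str(r[f]) for f in fields))
    return "\n".join(lines)

issues = [
    {"id": 1, "title": "Bug", "state": "open"},
    {"id": 2, "title": "Fix", "state": "closed"},
]
print(encode_koda("issue", ["id", "title", "state"], issues))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;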




&lt;h2&gt;
  
  
  Token Reduction Benchmark
&lt;/h2&gt;

&lt;p&gt;Measured using a gpt-4o-mini tokenizer on real datasets.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Case&lt;/th&gt;
&lt;th&gt;JSON Tokens&lt;/th&gt;
&lt;th&gt;KODA Tokens&lt;/th&gt;
&lt;th&gt;Reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Repetitive Logs&lt;/td&gt;
&lt;td&gt;3202&lt;/td&gt;
&lt;td&gt;1233&lt;/td&gt;
&lt;td&gt;61.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Issues&lt;/td&gt;
&lt;td&gt;4137&lt;/td&gt;
&lt;td&gt;2576&lt;/td&gt;
&lt;td&gt;37.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Small Dataset&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;35&lt;/td&gt;
&lt;td&gt;-34.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key insight
&lt;/h3&gt;

&lt;p&gt;KODA performs best on large, repetitive structured data.&lt;/p&gt;

&lt;p&gt;For small datasets, schema overhead can outweigh benefits.&lt;/p&gt;
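
&lt;p&gt;The trade-off is plain arithmetic: the schema header is a fixed cost, while per-record savings scale with the dataset. With illustrative token counts (assumed for the sketch, not measured):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative numbers only -- not benchmark results
JSON_TOKENS_PER_RECORD = 15   # keys + values + punctuation
KODA_TOKENS_PER_RECORD = 7    # values + separators only
KODA_SCHEMA_OVERHEAD = 25     # header + schema block, paid once

def tokens_saved(n_records):
    json_total = n_records * JSON_TOKENS_PER_RECORD
    koda_total = KODA_SCHEMA_OVERHEAD + n_records * KODA_TOKENS_PER_RECORD
    return json_total - koda_total

print(tokens_saved(2))     # -9: overhead outweighs savings on tiny data
print(tokens_saved(1000))  # 7975: savings dominate at scale
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;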




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;In LLM systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tokens = cost&lt;/li&gt;
&lt;li&gt;Tokens = latency&lt;/li&gt;
&lt;li&gt;Tokens = context capacity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reducing tokens by ~30–40%:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;lowers API costs&lt;/li&gt;
&lt;li&gt;increases usable context&lt;/li&gt;
&lt;li&gt;improves system efficiency&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How KODA Works
&lt;/h2&gt;

&lt;p&gt;KODA separates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Schema → defined once&lt;/li&gt;
&lt;li&gt;Data → streamed positionally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This removes structural redundancy.&lt;/p&gt;
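
&lt;p&gt;Decoding is the mirror image: each positional row is zipped back onto the schema's ordered field list. A minimal sketch (not the library's actual parser; values come back as strings here):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def decode_rows(schema_line, data_lines):
    # "issue:id title state" -- schema name, then ordered field names
    name, field_str = schema_line.split(":", 1)
    fields = field_str.split()
    return [dict(zip(fields, row.split("|"))) for row in data_lines]

rows = decode_rows("issue:id title state", ["1|Bug|open", "2|Fix|closed"])
print(rows[1]["state"])  # closed
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;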




&lt;h2&gt;
  
  
  Quick Python Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;koda&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encode&lt;/span&gt;

&lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;optional&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bob&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;koda_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;koda_str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  KODA vs JSON vs YAML vs TOON
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Token Efficiency&lt;/th&gt;
&lt;th&gt;Readability&lt;/th&gt;
&lt;th&gt;Best Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Config files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TOON&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;LLM structured data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KODA&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;LLM pipelines&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  When to Use KODA
&lt;/h2&gt;

&lt;p&gt;Use KODA if you are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sending large structured datasets to LLMs&lt;/li&gt;
&lt;li&gt;building RAG pipelines&lt;/li&gt;
&lt;li&gt;working with tool calls or agents&lt;/li&gt;
&lt;li&gt;optimizing token usage in production systems&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  When NOT to Use KODA
&lt;/h2&gt;

&lt;p&gt;Do not use KODA for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;small datasets (1–2 records)&lt;/li&gt;
&lt;li&gt;irregular or deeply nested JSON&lt;/li&gt;
&lt;li&gt;human-authored configuration files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;JSON is better in those cases.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Principles
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Schema-first design&lt;/li&gt;
&lt;li&gt;Positional encoding&lt;/li&gt;
&lt;li&gt;Deterministic parsing&lt;/li&gt;
&lt;li&gt;No repeated keys&lt;/li&gt;
&lt;li&gt;Optimized for LLM input&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Is KODA a JSON Replacement?
&lt;/h2&gt;

&lt;p&gt;No.&lt;/p&gt;

&lt;p&gt;KODA is a transport format for LLM pipelines.&lt;/p&gt;

&lt;p&gt;Typical workflow:&lt;/p&gt;

&lt;p&gt;JSON → KODA → LLM&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is KODA?
&lt;/h3&gt;

&lt;p&gt;KODA is a schema-first data format that reduces token usage for structured data in LLM systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is KODA better than JSON?
&lt;/h3&gt;

&lt;p&gt;For LLM input, yes. For general use, JSON is still better.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does KODA always reduce tokens?
&lt;/h3&gt;

&lt;p&gt;No. It works best on large structured datasets.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where should I use KODA?
&lt;/h3&gt;

&lt;p&gt;RAG pipelines, tool calls, and structured LLM input.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/Om7035/koda" rel="noopener noreferrer"&gt;https://github.com/Om7035/koda&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;koda
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;If you're sending structured data to LLMs, you're likely wasting tokens.&lt;/p&gt;

&lt;p&gt;KODA is a simple way to reduce that overhead.&lt;/p&gt;

&lt;p&gt;It’s not a replacement for JSON; it’s an optimization layer for LLM pipelines.&lt;/p&gt;




&lt;p&gt;Feedback and contributions are welcome.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>python</category>
      <category>json</category>
      <category>ai</category>
    </item>
    <item>
      <title>Stop Building Stale RAG: Meet Sentinel, the "Self-Healing" Knowledge Graph</title>
      <dc:creator>Om Kawale</dc:creator>
      <pubDate>Tue, 06 Jan 2026 19:28:14 +0000</pubDate>
      <link>https://dev.to/om_kawale_b6627244a50e4b6/stop-building-stale-rag-meet-sentinel-the-self-healing-knowledge-graph-25hl</link>
      <guid>https://dev.to/om_kawale_b6627244a50e4b6/stop-building-stale-rag-meet-sentinel-the-self-healing-knowledge-graph-25hl</guid>
      <description>&lt;p&gt;We all know the dirty secret of RAG (Retrieval-Augmented Generation) applications: &lt;strong&gt;They are great on Day 1, and broken on Day 30.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why? &lt;strong&gt;Data Staleness.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You scrape your documentation, embed it into a Vector DB, and build a chatbot. It works perfectly. But two weeks later, the documentation changes. A price updates. A policy is rewritten.&lt;/p&gt;

&lt;p&gt;Your Vector DB doesn't know. It happily retrieves the old chunks, and your LLM confidently hallucinates an answer based on outdated facts.&lt;/p&gt;

&lt;p&gt;Re-indexing everything is expensive and slow. Building custom "update scripts" is boring.&lt;/p&gt;

&lt;p&gt;I got tired of this problem, so I built &lt;strong&gt;Sentinel&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🛡️ What is Sentinel?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/Om7035/Sentinel-The-Self-Healing-Knowledge-Graph" rel="noopener noreferrer"&gt;&lt;strong&gt;Sentinel&lt;/strong&gt;&lt;/a&gt; is an open-source, autonomous ETL pipeline that treats your RAG data as a &lt;strong&gt;Living Knowledge Graph&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of a "snapshot" vector store, Sentinel:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Watches&lt;/strong&gt; your source URLs for changes.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Detects&lt;/strong&gt; differences (byte-level hashing).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Heals&lt;/strong&gt; the graph by extracting &lt;em&gt;only&lt;/em&gt; the new facts using LLMs.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Maintains History&lt;/strong&gt; using "Time Travel" edges.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It is &lt;strong&gt;pip-installable&lt;/strong&gt;, &lt;strong&gt;model-agnostic&lt;/strong&gt; (works with Ollama, OpenAI, Anthropic), and runs locally or in the cloud.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ Real-World Example: SaaS Pricing Update
&lt;/h2&gt;

&lt;p&gt;Imagine you are tracking a competitor's pricing page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Day 1:&lt;/strong&gt; The page says "Pro Plan is $29/mo".&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Sentinel Graph:&lt;/em&gt; &lt;code&gt;(Pro Plan) --[COSTS {valid_from: Day 1}]--&amp;gt; ($29)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Day 15:&lt;/strong&gt; They silently raise the price to $49/mo.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Sentinel Graph:&lt;/em&gt;

&lt;ul&gt;
&lt;li&gt;Sentinel detects the hash change.&lt;/li&gt;
&lt;li&gt;It &lt;strong&gt;retires&lt;/strong&gt; the old edge: &lt;code&gt;(Pro Plan) --[COSTS {valid_to: Day 15}]--&amp;gt; ($29)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;It &lt;strong&gt;creates&lt;/strong&gt; a new edge: &lt;code&gt;(Pro Plan) --[COSTS {valid_from: Day 15}]--&amp;gt; ($49)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Result:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard RAG:&lt;/strong&gt; Returns both $29 and $49, confusing the user.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sentinel:&lt;/strong&gt; Knows exactly which price is current, AND knows the price history.&lt;/li&gt;
&lt;/ul&gt;
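
&lt;p&gt;The retire-and-create step above can be sketched with plain dicts standing in for graph edges (illustrative only; Sentinel does this with temporal properties in Neo4j):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def upsert_cost(edges, plan, price, day):
    # Retire any still-live COSTS edge, then create the new one.
    for e in edges:
        if e["plan"] == plan and "valid_to" not in e:
            e["valid_to"] = day
    edges.append({"plan": plan, "price": price, "valid_from": day})

edges = []
upsert_cost(edges, "Pro Plan", 29, 1)    # Day 1
upsert_cost(edges, "Pro Plan", 49, 15)   # Day 15
current = [e for e in edges if "valid_to" not in e]
print(current[0]["price"])  # 49 -- and the $29 edge survives as history
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;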




&lt;h2&gt;
  
  
  ⏳ The Killer Feature: "Time Travel"
&lt;/h2&gt;

&lt;p&gt;Most RAG systems overwrite old data. Sentinel uses &lt;strong&gt;Bitemporal Versioning&lt;/strong&gt; in Neo4j. This unlocks a whole new class of questions your AI can answer:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"How has the pricing structure changed since last month?"&lt;/em&gt;&lt;br&gt;
&lt;em&gt;"What were the safety guidelines before the 2024 update?"&lt;/em&gt;&lt;br&gt;
&lt;em&gt;"Show me the evolution of this compliance policy."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🎯 Top Use Cases
&lt;/h2&gt;

&lt;p&gt;I built Sentinel for developers who need &lt;strong&gt;high-accuracy&lt;/strong&gt; retrieval over changing data.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Legal &amp;amp; Compliance Tech
&lt;/h3&gt;

&lt;p&gt;Laws and company policies change constantly. Sentinel ensures your bot never cites a repealed law or an outdated HR policy.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Market Intelligence
&lt;/h3&gt;

&lt;p&gt;Track competitor websites, earnings reports, or news feeds. Sentinel builds a timeline of events automatically, allowing you to query "What happened to Competitor X in Q3?"&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Developer Documentation Bots
&lt;/h3&gt;

&lt;p&gt;APIs change. If a library deprecates a function, Sentinel updates the graph so your bot stops recommending broken code.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚙️ How It Works (The Loop)
&lt;/h2&gt;

&lt;p&gt;Sentinel runs an autonomous "Healing Loop" in the background:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Monitor:&lt;/strong&gt; It checks the content hash of watched URLs. If the hash matches the database, it sleeps. &lt;strong&gt;Cost: $0.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Diff:&lt;/strong&gt; If the hash changes, it scrapes the new content.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Extract:&lt;/strong&gt; It uses an LLM (via &lt;code&gt;LiteLLM&lt;/code&gt; + &lt;code&gt;Instructor&lt;/code&gt;) to extract nodes and relationships.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Upsert:&lt;/strong&gt; It updates the Graph Database, handling the temporal logic automatically.&lt;/li&gt;
&lt;/ol&gt;
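
&lt;p&gt;Step 1 is where the savings come from, and it can be sketched with the standard library: hash the fetched content and only continue down the loop when the digest changes (the URL and content strings here are stand-ins):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import hashlib

seen_hashes = {}

def needs_healing(url, content):
    digest = hashlib.sha256(content.encode()).hexdigest()
    if seen_hashes.get(url) == digest:
        return False   # unchanged: sleep, cost $0
    seen_hashes[url] = digest
    return True        # changed: diff, extract, upsert

print(needs_healing("https://docs.example.com/pricing", "Pro Plan is $29/mo"))  # True
print(needs_healing("https://docs.example.com/pricing", "Pro Plan is $29/mo"))  # False
print(needs_healing("https://docs.example.com/pricing", "Pro Plan is $49/mo"))  # True
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;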

&lt;h2&gt;
  
  
  🚀 Quick Start
&lt;/h2&gt;

&lt;p&gt;You can add this to your existing Python project in minutes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;sentinel-core

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentinel_core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Sentinel&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize (uses standard env vars for Neo4j &amp;amp; LLM)
&lt;/span&gt;&lt;span class="n"&gt;sentinel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Sentinel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Start watching a URL
# Sentinel will scrape, extract, and build the initial graph
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;sentinel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[https://docs.example.com/pricing](https://docs.example.com/pricing)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run the autonomous healing loop
# It will check for updates every 24 hours
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;sentinel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_healing_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;interval_hours&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. You now have a self-updating knowledge graph.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤝 Open Source &amp;amp; Roadmap
&lt;/h2&gt;

&lt;p&gt;I built the core engine, but there is so much potential here. I am looking for contributors to help with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Entity Resolution:&lt;/strong&gt; Smarter merging of duplicate nodes (e.g., "Tesla" vs "Tesla Inc").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UI Dashboard:&lt;/strong&gt; We have a basic API, but a visualization of the graph "healing" in real-time would be epic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More Scrapers:&lt;/strong&gt; Adding support for Playwright or Selenium.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🌟 Support the Project
&lt;/h2&gt;

&lt;p&gt;If you think "Self-Healing RAG" is a cool concept, please consider starring the repo! It helps us gain visibility and attracts more contributors to make the tool better for everyone.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://github.com/Om7035/Sentinel-The-Self-Healing-Knowledge-Graph" rel="noopener noreferrer"&gt;Star Sentinel on GitHub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm active in the comments—let me know what you think about the "Temporal Graph" approach vs standard Vector Stores!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>rag</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
    <item>
      <title>Stop Building Auth From Scratch: Try AuthKit Instead</title>
      <dc:creator>Om Kawale</dc:creator>
      <pubDate>Tue, 21 Oct 2025 08:54:11 +0000</pubDate>
      <link>https://dev.to/om_kawale_b6627244a50e4b6/stop-building-auth-from-scratch-try-authkit-instead-44k8</link>
      <guid>https://dev.to/om_kawale_b6627244a50e4b6/stop-building-auth-from-scratch-try-authkit-instead-44k8</guid>
      <description>&lt;p&gt;Spend 30 seconds setting up production-ready authentication instead of 3+ weeks of headaches.&lt;/p&gt;

&lt;p&gt;Auth is a pain. JWT token rotation, OAuth flows, secure password hashing, CORS issues, XSS vulnerabilities… it never ends. AuthKit fixes all of that in one setup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/Om7035/AuthKit.git
cd AuthKit
docker-compose up -d
# Visit http://localhost:3000
# Login: demo@authkit.com / password
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why AuthKit?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enterprise security: JWT + httpOnly cookies, bcrypt, token refresh, rate limiting, SQL injection protection&lt;/li&gt;
&lt;li&gt;Developer-friendly: One-command Docker setup, React+Tailwind UI, Google OAuth &amp;amp; Firebase ready, PostgreSQL included&lt;/li&gt;
&lt;li&gt;Flexible backend: Choose Google OAuth for simplicity or Firebase Auth for full features&lt;/li&gt;
&lt;li&gt;Complete API docs and live demo&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Who should use AuthKit?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Solo devs &amp;amp; startups: Ship MVPs faster&lt;/li&gt;
&lt;li&gt;Enterprise teams: Secure, scalable, production-ready&lt;/li&gt;
&lt;li&gt;Learning devs: Study, contribute, and build on real auth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Get involved! AuthKit is open source and thrives on community support.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Report bugs
&lt;/li&gt;
&lt;li&gt;Suggest features
&lt;/li&gt;
&lt;li&gt;Improve documentation
&lt;/li&gt;
&lt;li&gt;Submit pull requests
&lt;/li&gt;
&lt;li&gt;Star the repo to show your support
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Visit and contribute:&lt;br&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/Om7035/AuthKit" rel="noopener noreferrer"&gt;https://github.com/Om7035/AuthKit&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;AuthKit — Stop building, start shipping.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
