DEV Community

Dor Amir

NadirClaw: Getting Started in 5 Minutes

NadirClaw is an open-source LLM router that cuts your AI API costs by 40-70%. It routes simple prompts to cheap models and complex ones to premium models, automatically. Zero code changes.

This guide gets you running in under 5 minutes.

What You'll Need

  • Python 3 with pip
  • A Gemini API key (the free tier works)

That's it. No Docker, no database, no extra services.

Install

pip install nadirclaw

Or via the install script:

curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/install.sh | sh

Configure

Set your Gemini API key:

nadirclaw auth add --provider google --key AIza...

Or export it:

export GEMINI_API_KEY=AIza...

Start the Router

nadirclaw serve --verbose

NadirClaw starts on http://localhost:8856 with sensible defaults:

  • Simple prompts → Gemini 2.5 Flash (cheap, fast)
  • Complex prompts → Gemini 2.5 Pro (powerful)

You'll see logs like this:

[NadirClaw] Starting on http://localhost:8856
[NadirClaw] Simple model: gemini-2.5-flash
[NadirClaw] Complex model: gemini-2.5-pro
[NadirClaw] Ready.

Test It

Send a request:

curl http://localhost:8856/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is 2+2?"}]}'

NadirClaw classifies the prompt in ~10ms and routes it to the right model. Simple question? Gemini Flash. Complex refactoring? Gemini Pro.

Use It with Your Tools

NadirClaw is OpenAI-compatible. Point any tool at it:

Claude Code:

export ANTHROPIC_BASE_URL=http://localhost:8856/v1
export ANTHROPIC_API_KEY=local
claude

Cursor:

In Cursor settings, add a custom model:

  • Base URL: http://localhost:8856/v1
  • Model: auto

OpenClaw:

nadirclaw openclaw onboard

Python:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8856/v1",
    api_key="local",
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain async/await"}],
)
print(response.choices[0].message.content)

Check Your Savings

After using NadirClaw for a bit, run:

nadirclaw report

You'll see:

  • Total requests
  • Tier distribution (how many were simple vs. complex)
  • Cost breakdown
  • Token usage
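
Under the hood, a report like this is just an aggregation over logged requests. Here's a minimal sketch of that kind of rollup — the log format and field names are hypothetical illustrations, not NadirClaw's actual storage schema:

```python
from collections import Counter

# Hypothetical request log; NadirClaw's real format may differ.
log = [
    {"tier": "simple", "tokens": 1200, "cost": 0.0006},
    {"tier": "simple", "tokens": 800, "cost": 0.0004},
    {"tier": "complex", "tokens": 3000, "cost": 0.0450},
]

tiers = Counter(entry["tier"] for entry in log)
total_tokens = sum(entry["tokens"] for entry in log)
total_cost = sum(entry["cost"] for entry in log)

print(f"Total requests: {len(log)}")
print(f"Tier distribution: {dict(tiers)}")
print(f"Tokens: {total_tokens}, cost: ${total_cost:.4f}")
```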

Then:

nadirclaw savings

This shows exactly how much money you saved compared to routing everything to the expensive model.

What Just Happened?

Every request goes through a lightweight classifier:

  1. Prompt comes in
  2. NadirClaw computes a sentence embedding (~10ms)
  3. Routes to the right model based on complexity
  4. Forwards the request and returns the response

Simple prompts (reading files, quick questions, small edits) hit the cheap model. Complex prompts (refactoring, architecture, multi-step changes) hit the premium model. You get the savings without compromising quality.
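
The real classifier scores a sentence embedding; as a rough illustration of the routing decision itself, here is a toy stand-in. The keyword list and length threshold are made up for the example and are not NadirClaw's actual logic:

```python
# Toy stand-in for NadirClaw's embedding-based classifier.
# The real router scores a ~10ms sentence embedding; this just
# checks prompt length and a few "complex work" keywords.
COMPLEX_HINTS = {"refactor", "architecture", "design", "migrate", "debug"}

def route(prompt: str) -> str:
    words = prompt.lower().split()
    if len(words) > 40 or COMPLEX_HINTS & set(words):
        return "complex"  # -> premium model (e.g. gemini-2.5-pro)
    return "simple"       # -> cheap model (e.g. gemini-2.5-flash)

print(route("What is 2+2?"))                        # simple
print(route("Refactor this module into services"))  # complex
```

The point is the shape of the decision, not the heuristic: one cheap classification step in front of every request, then a forward to whichever upstream model the tier maps to.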

Next Steps

  • Add more providers: nadirclaw auth add --provider anthropic --key sk-ant-...
  • Use better models: Set NADIRCLAW_COMPLEX_MODEL=claude-sonnet-4-5 in ~/.nadirclaw/.env
  • Go fully local: Install Ollama, then NADIRCLAW_SIMPLE_MODEL=ollama/llama3.1:8b nadirclaw serve
  • Monitor in real time: Run nadirclaw dashboard for a live terminal dashboard

Troubleshooting

"Rate limit exceeded" on Gemini free tier?

You hit the 20 requests/day limit. Either:

  • Wait a day
  • Add another provider: nadirclaw auth add --provider openai --key sk-...
  • Use Ollama (local, free): NADIRCLAW_SIMPLE_MODEL=ollama/llama3.1:8b nadirclaw serve

Classifier taking too long on first request?

The first request downloads the embedding model (~80 MB) and loads it into memory. Takes 2-3 seconds once. After that, classification is ~10ms per request.

Want to force a specific model for a request?

Set model in your request:

  • model: "premium" → always use the complex model
  • model: "eco" → always use the simple model
  • model: "sonnet" → use Claude Sonnet (model alias)
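
In practice that means the request body is identical to the earlier curl example except for the model field. A sketch of the pinned-tier payload you'd send (the tier names come from the list above; everything else is the standard OpenAI-style body):

```python
import json

# Same /v1/chat/completions body as before, but with the tier
# pinned via "model" instead of letting "auto" classify.
payload = {
    "model": "premium",  # or "eco", or an alias like "sonnet"
    "messages": [{"role": "user", "content": "Design a caching layer"}],
}
body = json.dumps(payload)
print(body)
```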

Why NadirClaw?

Most LLM usage doesn't need a $15/M-token model. 60-70% of prompts in typical coding sessions are simple enough for a $0.50/M-token model. But without classification, everything hits the expensive default.
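
The arithmetic behind that claim, using the figures above (65% is an assumed midpoint of the 60-70% range):

```python
# Blended cost per million tokens when 65% of prompts go to a
# $0.50/M model and 35% go to a $15/M model.
simple_share, simple_price = 0.65, 0.50
complex_share, complex_price = 0.35, 15.00

blended = simple_share * simple_price + complex_share * complex_price
savings = 1 - blended / complex_price

print(f"Blended: ${blended:.2f}/M tokens")
print(f"Savings vs. all-premium: {savings:.0%}")  # 63%
```

That 63% sits inside the 40-70% range quoted at the top; the exact number depends on your own simple/complex split and the price gap between your two models.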

NadirClaw fixes that. It's a local proxy, not a middleman service. Your API keys never leave your machine. No third-party tokens, no subsidized pricing that disappears in six months, no platform risk.

You keep control. You cut costs.


Full disclosure: I'm the author. NadirClaw is open source (MIT) and lives at https://github.com/doramirdor/NadirClaw. If you find it useful, give it a star.
