chenxiao5580-cmd

Posted on Jun 23

Use a flat-priced, auto-routing LLM API in Aider or Cline — one npx command

#ai #opensource #tutorial #llm

Coding assistants like Aider, Cline, and Continue all speak the OpenAI wire protocol — point them at a base_url, give them an API key, done. That makes swapping in a different LLM backend trivial... if that backend uses Authorization: Bearer.

The flat-priced, auto-routing API I'd been using doesn't. It's distributed through RapidAPI, which authenticates with an X-RapidAPI-Key header instead of Bearer. So I couldn't just drop it into Aider. The fix turned out to be ~120 lines, so I open-sourced it.

modelis-openai

A zero-dependency local proxy (MIT, Node 18+). It listens on 127.0.0.1, speaks plain OpenAI, rewrites the auth header, and forwards to the upstream gateway. Streaming (stream: true) is piped straight through, so token-by-token output works exactly as with the OpenAI API.

your tool ──OpenAI(Bearer)──▶ modelis-openai (localhost) ──X-RapidAPI-Key──▶ upstream ──▶ best model

Quickstart

npx modelis-openai

Then point any OpenAI-compatible tool at it:

Setting	Value
Base URL	`http://127.0.0.1:8787/v1`
API key	your RapidAPI key
Model	`modelis-auto`

Drop it into your tool

Aider

export OPENAI_API_BASE=http://127.0.0.1:8787/v1
export OPENAI_API_KEY=<your-rapidapi-key>
aider --model openai/modelis-auto

Cline / Roo Code: set API Provider to OpenAI Compatible, Base URL to http://127.0.0.1:8787/v1, API Key to your RapidAPI key, and Model ID to modelis-auto.

Continue (~/.continue/config.yaml)

models:
  - name: Modelis
    provider: openai
    model: modelis-auto
    apiBase: http://127.0.0.1:8787/v1
    apiKey: <your-rapidapi-key>

Any OpenAI SDK

from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:8787/v1", api_key="<your-rapidapi-key>")
print(client.chat.completions.create(
    model="modelis-auto",
    messages=[{"role": "user", "content": "Hello"}],
).choices[0].message.content)
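Because the proxy pipes stream: true responses straight through, a client sees the usual OpenAI-style server-sent events. The sketch below shows one way to pull the text out of such a stream using only the standard library. It assumes the common `data: {...}` / `data: [DONE]` framing of OpenAI chat-completion streams; the function name `iter_stream_text` and the sample chunks are made up for illustration and are not part of modelis-openai.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        # Only "data:" lines carry payloads; skip blank separators and comments.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Some chunks carry only a role or a finish reason, no content.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample chunks standing in for a live response body.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In practice the OpenAI SDK does this parsing for you when you pass stream=True; the sketch is only meant to show what travels over the wire.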

How it works

Reads the key from Authorization: Bearer <key> (or MODELIS_RAPIDAPI_KEY).
Rewrites the request model to modelis-auto (configurable).
Forwards to the RapidAPI gateway with X-RapidAPI-Key / X-RapidAPI-Host.
Relays the response — including SSE streams and rate-limit headers — unchanged.
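The first three steps above can be sketched in a few lines. This is a Python illustration of the idea, not the proxy's actual Node source. The upstream host is a placeholder because the post does not name it, the function names are hypothetical, and the choice to prefer the Bearer token over MODELIS_RAPIDAPI_KEY when both are present is an assumption.

```python
import json
import os
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical placeholder: the real RapidAPI host is not given in this post.
UPSTREAM_HOST = "example-gateway.p.rapidapi.com"


def extract_key(headers: Mapping[str, str], env: Mapping[str, str]) -> Optional[str]:
    """Return the key from Authorization: Bearer <key>, else MODELIS_RAPIDAPI_KEY."""
    auth = ""
    for name, value in headers.items():
        if name.lower() == "authorization":  # header names are case-insensitive
            auth = value
    scheme, _, token = auth.partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    # Assumed precedence: fall back to the environment only without a Bearer token.
    return env.get("MODELIS_RAPIDAPI_KEY") or None


def rewrite_request(
    headers: Mapping[str, str],
    body: bytes,
    env: Mapping[str, str] = os.environ,
    model: str = "modelis-auto",
) -> Tuple[Dict[str, str], bytes]:
    """Build the upstream headers and body for one chat-completion request."""
    key = extract_key(headers, env)
    if key is None:
        raise ValueError("no API key in Authorization header or environment")
    payload = json.loads(body)
    payload["model"] = model  # override whatever model the tool asked for
    upstream_headers = {
        "Content-Type": "application/json",
        "X-RapidAPI-Key": key,
        "X-RapidAPI-Host": UPSTREAM_HOST,
    }
    return upstream_headers, json.dumps(payload).encode("utf-8")


hdrs, out = rewrite_request(
    {"Authorization": "Bearer demo-key"},
    b'{"model": "gpt-4o", "messages": []}',
    env={},
)
print(hdrs["X-RapidAPI-Key"], json.loads(out)["model"])
```

The caller's Bearer token never reaches the upstream as an Authorization header; only the rewritten RapidAPI headers are sent on.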

It also answers GET /v1/models and GET /health so tools that probe on startup don't choke.
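If a tool refuses to connect, those two endpoints make a quick sanity check. Here is a minimal sketch that probes them with the standard library. It assumes the default address from the quickstart table and treats an HTTP 200 as success, which is a guess on my part since the response bodies aren't specified here. The `probe` helper is hypothetical, not something the package ships.

```python
import urllib.request
from typing import Callable, Dict


def _http_status(url: str) -> int:
    """Return the HTTP status of a GET request (raises OSError if unreachable)."""
    with urllib.request.urlopen(url, timeout=3) as resp:
        return resp.status


def probe(
    base_url: str = "http://127.0.0.1:8787",
    get_status: Callable[[str], int] = _http_status,
) -> Dict[str, bool]:
    """Map each startup endpoint to whether it answered with HTTP 200."""
    results: Dict[str, bool] = {}
    for path in ("/health", "/v1/models"):
        try:
            results[path] = get_status(base_url.rstrip("/") + path) == 200
        except OSError:
            # URLError, HTTPError and connection-refused are all OSError subclasses.
            results[path] = False
    return results
```

Call probe() while npx modelis-openai is running; a False for either path suggests the proxy is not listening on that address.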

Honest notes

It routes to a paid API (there's a free tier to start). The point of the proxy is to remove the integration friction, not to give anything away.
Cursor isn't supported — it sends requests from its own servers, so a localhost endpoint can't be reached. This is for tools that call the API from your machine.

Top comments (1)

Luis • Jun 23

This setup is basically about routing Aider / Cline through a cheap or unified LLM gateway instead of calling OpenAI/Anthropic directly.

A clean, practical version of what that post is likely showing:

You can run Aider or Cline through an OpenAI-compatible proxy (like OpenRouter-style routing, LiteLLM, RelayPlane, or similar) by setting:

export OPENAI_API_BASE="http://localhost:PORT/v1"
export OPENAI_API_KEY="your-key-or-proxy-key"

Then launch tools normally:

aider --model gpt-4o

or for Cline, just point the API settings to the same base URL in VS Code settings.

What this enables:

One endpoint → multiple models (Claude, GPT, Gemini, DeepSeek, etc.)
Automatic routing (cheap model for simple edits, stronger model for refactors)
Cost control per task instead of per tool
Works with CLI tools like Aider and IDE tools like Cline without changes

If you want a more “production-grade” version of this idea, people usually combine:

OpenRouter or LiteLLM proxy (model aggregation layer)
Aider / Cline (client tools)
optional routing rules (cost/quality-based)

So the real value here isn’t the npx command itself — it’s the abstraction layer that makes all coding agents model-agnostic and cheaper to run.

Good practical post overall—this is the direction most AI dev tooling is heading 🤝
