Coding assistants like Aider, Cline, and Continue all speak the OpenAI wire protocol — point them at a base_url, give them an API key, done. That makes swapping in a different LLM backend trivial... if that backend uses Authorization: Bearer.
The flat-priced, auto-routing API I'd been using doesn't. It's distributed through RapidAPI, which authenticates with an X-RapidAPI-Key header instead of Bearer. So I couldn't just drop it into Aider. The fix turned out to be ~120 lines, so I open-sourced it.
modelis-openai
A zero-dependency local proxy (MIT, Node 18+). It listens on 127.0.0.1, speaks plain OpenAI, rewrites the auth header, and forwards to the upstream gateway. Streaming (stream: true) is piped straight through, so token-by-token output works exactly as with the OpenAI API.
your tool ──OpenAI(Bearer)──▶ modelis-openai (localhost) ──X-RapidAPI-Key──▶ upstream ──▶ best model
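To make the streaming part concrete: with stream: true, an OpenAI-compatible endpoint replies with Server-Sent Events, one data: line per chunk, ending with data: [DONE]. A pass-through proxy never has to parse these, but the client on the other side does. Below is a minimal sketch of that client-side parsing. The helper name iter_stream_text and the sample payload are made up for illustration and are not taken from modelis-openai.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blanks and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Made-up sample of a streamed chat completion as it appears on the wire.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_stream_text(sample)))
```

Because the proxy forwards these bytes as they arrive, a tool that already handles OpenAI streaming should not need any changes on its side.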
Quickstart
npx modelis-openai
Then point any OpenAI-compatible tool at it:
| Setting | Value |
|---|---|
| Base URL | http://127.0.0.1:8787/v1 |
| API key | your RapidAPI key |
| Model | modelis-auto |
Drop it into your tool
Aider
export OPENAI_API_BASE=http://127.0.0.1:8787/v1
export OPENAI_API_KEY=<your-rapidapi-key>
aider --model openai/modelis-auto
Cline / Roo Code — API Provider OpenAI Compatible, Base URL http://127.0.0.1:8787/v1, Model ID modelis-auto.
Continue (~/.continue/config.yaml)
models:
  - name: Modelis
    provider: openai
    model: modelis-auto
    apiBase: http://127.0.0.1:8787/v1
    apiKey: <your-rapidapi-key>
Any OpenAI SDK
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8787/v1", api_key="<your-rapidapi-key>")
print(client.chat.completions.create(
    model="modelis-auto",
    messages=[{"role": "user", "content": "Hello"}],
).choices[0].message.content)
How it works
- Reads the key from Authorization: Bearer <key> (or MODELIS_RAPIDAPI_KEY).
- Rewrites the request model to modelis-auto (configurable).
- Forwards to the RapidAPI gateway with X-RapidAPI-Key / X-RapidAPI-Host.
- Relays the response, including SSE streams and rate-limit headers, unchanged.
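The first three steps can be sketched as one pure function. This is an illustration in Python, not the project's Node source. The function name build_upstream_request, the UPSTREAM_HOST placeholder, and the choice to prefer the Bearer token over the environment variable are all assumptions. Relaying the response is left out.

```python
import json
import os
from typing import Mapping, Optional, Tuple

# Placeholder values: the real gateway host comes from the project's own
# configuration, not from this sketch.
UPSTREAM_HOST = "example-gateway.p.rapidapi.com"  # hypothetical
DEFAULT_MODEL = "modelis-auto"


def build_upstream_request(
    headers: Mapping[str, str],
    body: bytes,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[dict, bytes]:
    """Translate an OpenAI-style request into RapidAPI-style headers and body."""
    env = os.environ if env is None else env
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}

    # 1. Key from "Authorization: Bearer <key>", else from the environment.
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    key = token.strip() if scheme.lower() == "bearer" else ""
    key = key or env.get("MODELIS_RAPIDAPI_KEY", "")
    if not key:
        raise ValueError("no key in Authorization header or MODELIS_RAPIDAPI_KEY")

    # 2. Force the model name, whatever the client asked for.
    payload = json.loads(body)
    payload["model"] = DEFAULT_MODEL

    # 3. Swap the Bearer header for the RapidAPI pair.
    upstream_headers = {
        "Content-Type": "application/json",
        "X-RapidAPI-Key": key,
        "X-RapidAPI-Host": UPSTREAM_HOST,
    }
    return upstream_headers, json.dumps(payload).encode("utf-8")


hdrs, out = build_upstream_request(
    {"Authorization": "Bearer abc123"},
    b'{"model": "gpt-4o", "messages": []}',
    env={},
)
print(hdrs["X-RapidAPI-Key"], json.loads(out)["model"])
```

Keeping the translation free of I/O makes it easy to test without touching the network. The actual forwarding and streaming would sit around it.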
It also answers GET /v1/models and GET /health so tools that probe on startup don't choke.
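For a sense of what those probe routes might return, here is a small sketch. The /v1/models body follows the shape of OpenAI's list-models response. The /health payload, the owned_by value, and the name handle_probe are assumptions rather than the proxy's actual output.

```python
import json
from typing import Tuple


def handle_probe(method: str, path: str, model_id: str = "modelis-auto") -> Tuple[int, str]:
    """Return (status, JSON body) for the two probe routes, 404 otherwise."""
    if method == "GET" and path == "/health":
        # The exact health payload is an assumption.
        return 200, json.dumps({"status": "ok"})
    if method == "GET" and path == "/v1/models":
        # Same envelope as OpenAI's "list models" response.
        listing = {
            "object": "list",
            "data": [{"id": model_id, "object": "model", "owned_by": "modelis"}],
        }
        return 200, json.dumps(listing)
    return 404, json.dumps({"error": {"message": "not found"}})


status, body = handle_probe("GET", "/v1/models")
print(status, body)
```

A tool that lists models on startup would then see a single entry and offer it as the only choice.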
Honest notes
- It routes to a paid API (there's a free tier to start). The point of the proxy is to remove the integration friction, not to give anything away.
- Cursor isn't supported: it sends requests from its own servers, so a localhost endpoint can't be reached. This is for tools that call the API from your machine.
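The localhost limitation comes down to the address being a loopback one, which only resolves to the proxy on the machine running it. A quick way to check whether a given base URL falls into that category is sketched below. The helper name is_loopback_url is made up for this example.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """True if the URL's host is localhost or a loopback IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A DNS name other than localhost; this sketch does not resolve it.
        return False


for base in ("http://127.0.0.1:8787/v1", "https://api.openai.com/v1"):
    print(base, is_loopback_url(base))
```

If a tool sends its requests from a remote server, any base URL for which this returns True will point at that server's own loopback interface, not at your machine.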
Links
- GitHub: https://github.com/modelishub/modelis-openai
- npm: https://www.npmjs.com/package/modelis-openai
- The API it bridges: https://rapidapi.com/chenxiao5580/api/modelis-auto-chat
If you try it in a tool I didn't list, I'd love to hear how it goes.
Top comments (1)
This setup is basically about routing Aider / Cline through a cheap or unified LLM gateway instead of calling OpenAI/Anthropic directly.
A clean, practical version of what that post is likely showing:
You can run Aider or Cline through an OpenAI-compatible proxy (like OpenRouter-style routing, LiteLLM, RelayPlane, or similar) by setting:
export OPENAI_API_BASE="http://localhost:PORT/v1"
export OPENAI_API_KEY="your-key-or-proxy-key"
Then launch tools normally:
aider --model gpt-4o
or for Cline, just point the API settings to the same base URL in VS Code settings.
What this enables:
- One endpoint → multiple models (Claude, GPT, Gemini, DeepSeek, etc.)
- Automatic routing (cheap model for simple edits, stronger model for refactors)
- Cost control per task instead of per tool
- Works with CLI tools like Aider and IDE tools like Cline without changes
If you want a more “production-grade” version of this idea, people usually combine:
- OpenRouter or LiteLLM proxy (model aggregation layer)
- Aider / Cline (client tools)
- Optional routing rules (cost/quality-based)
So the real value here isn’t the npx command itself — it’s the abstraction layer that makes all coding agents model-agnostic and cheaper to run.
Good practical post overall—this is the direction most AI dev tooling is heading 🤝