Codex custom provider: a practical base_url setup for cheaper AI coding runs

#ai #llm #api #productivity

There is a very practical reason developers care about custom providers in Codex-style workflows:

Cost.

Not because it is fun to collect API providers. Not because every team wants another dashboard. The reason is simpler: once an AI coding agent becomes useful, people use it more, and then the bill starts to matter.

The best custom provider setup should not force you to rewrite your tooling. It should preserve the same OpenAI-compatible shape and only change the route.

The minimum useful config

For most experiments, I want something this boring:

OPENAI_BASE_URL=https://incat.ai/v1
OPENAI_API_KEY=sk_incat_your_key_here
OPENAI_MODEL=incat-smarter

That is the whole idea:

keep the client shape
keep the coding workflow
swap the backend route
measure whether the bill gets smaller

JavaScript example

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://incat.ai/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: "incat-smarter",
  messages: [
    { role: "user", content: "Review this small refactor." }
  ],
});

console.log(response.choices[0].message.content);

Python example

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://incat.ai/v1",
    api_key=os.getenv("OPENAI_API_KEY"),
)

response = client.chat.completions.create(
    model="incat-smarter",
    messages=[{"role": "user", "content": "Explain this stack trace."}],
)

print(response.choices[0].message.content)

What to route cheaper

Do not send everything to the cheapest model and call it optimization. That usually backfires.

Good cheaper-route candidates:

boilerplate generation
test scaffolding
log and stack trace explanation
simple code summaries
low-risk refactors
first drafts of scripts

Keep expensive models for:

final review
security work
complex architecture
high-risk migrations
ambiguous product logic

Why prepaid matters

For AI coding, prepaid credits are underrated.

Monthly subscriptions feel clean until usage patterns get weird. A busy coding week, a runaway agent loop, or a few large repo scans can make the real cost hard to see.

With prepaid routing, you get a simple constraint: when the balance moves, something actually ran. That makes experiments easier to trust.