There is a very practical reason developers care about custom providers in Codex-style workflows:
Cost.
Not because it is fun to collect API providers. Not because every team wants another dashboard. The reason is simpler: once an AI coding agent becomes useful, people use it more, and then the bill starts to matter.
The best custom provider setup should not force you to rewrite your tooling. It should preserve the same OpenAI-compatible shape and only change the route.
The minimum useful config
For most experiments, I want something this boring:
OPENAI_BASE_URL=https://incat.ai/v1
OPENAI_API_KEY=sk_incat_your_key_here
OPENAI_MODEL=incat-smarter
That is the whole idea:
- keep the client shape
- keep the coding workflow
- swap the backend route
- measure whether the bill gets smaller
JavaScript example
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://incat.ai/v1",
apiKey: process.env.OPENAI_API_KEY,
});
const response = await client.chat.completions.create({
model: "incat-smarter",
messages: [
{ role: "user", content: "Review this small refactor." }
],
});
console.log(response.choices[0].message.content);
Python example
from openai import OpenAI
import os
client = OpenAI(
base_url="https://incat.ai/v1",
api_key=os.getenv("OPENAI_API_KEY"),
)
response = client.chat.completions.create(
model="incat-smarter",
messages=[{"role": "user", "content": "Explain this stack trace."}],
)
print(response.choices[0].message.content)
What to route cheaper
Do not send everything to the cheapest model and call it optimization. That usually backfires.
Good cheaper-route candidates:
- boilerplate generation
- test scaffolding
- log and stack trace explanation
- simple code summaries
- low-risk refactors
- first drafts of scripts
Keep expensive models for:
- final review
- security work
- complex architecture
- high-risk migrations
- ambiguous product logic
Why prepaid matters
For AI coding, prepaid credits are underrated.
Monthly subscriptions feel clean until usage patterns get weird. A busy coding week, a runaway agent loop, or a few large repo scans can make the real cost hard to see.
With prepaid routing, you get a simple constraint: when the balance moves, something actually ran. That makes experiments easier to trust.
A useful way to test it
Take one workflow you already run:
- Ask your current setup to generate tests for a small module.
- Run a similar task through a cheaper OpenAI-compatible route.
- Compare output quality and cost.
- Keep the expensive model for final review if needed.
If the cheaper route saves money without creating extra cleanup work, it is useful. If not, skip it.
The point is not ideology. The point is the receipt.
Tooling
I made a small config generator for this pattern:
https://incat.ai/codex-config-generator.html
And if you want to estimate whether this is even worth trying:
https://incat.ai/codex-cost.html
inCat is the gateway behind these examples. The positioning is intentionally narrow:
Keep Codex-style workflows. Route suitable work cheaper.
Top comments (0)