DEV Community

inCat.ai
inCat.ai

Posted on

Codex custom provider: a practical base_url setup for cheaper AI coding runs

There is a very practical reason developers care about custom providers in Codex-style workflows:

Cost.

Not because it is fun to collect API providers. Not because every team wants another dashboard. The reason is simpler: once an AI coding agent becomes useful, people use it more, and then the bill starts to matter.

The best custom provider setup should not force you to rewrite your tooling. It should preserve the same OpenAI-compatible shape and only change the route.

The minimum useful config

For most experiments, I want something this boring:

OPENAI_BASE_URL=https://incat.ai/v1
OPENAI_API_KEY=sk_incat_your_key_here
OPENAI_MODEL=incat-smarter
Enter fullscreen mode Exit fullscreen mode

That is the whole idea:

  • keep the client shape
  • keep the coding workflow
  • swap the backend route
  • measure whether the bill gets smaller

JavaScript example

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://incat.ai/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: "incat-smarter",
  messages: [
    { role: "user", content: "Review this small refactor." }
  ],
});

console.log(response.choices[0].message.content);
Enter fullscreen mode Exit fullscreen mode

Python example

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://incat.ai/v1",
    api_key=os.getenv("OPENAI_API_KEY"),
)

response = client.chat.completions.create(
    model="incat-smarter",
    messages=[{"role": "user", "content": "Explain this stack trace."}],
)

print(response.choices[0].message.content)
Enter fullscreen mode Exit fullscreen mode

What to route cheaper

Do not send everything to the cheapest model and call it optimization. That usually backfires.

Good cheaper-route candidates:

  • boilerplate generation
  • test scaffolding
  • log and stack trace explanation
  • simple code summaries
  • low-risk refactors
  • first drafts of scripts

Keep expensive models for:

  • final review
  • security work
  • complex architecture
  • high-risk migrations
  • ambiguous product logic

Why prepaid matters

For AI coding, prepaid credits are underrated.

Monthly subscriptions feel clean until usage patterns get weird. A busy coding week, a runaway agent loop, or a few large repo scans can make the real cost hard to see.

With prepaid routing, you get a simple constraint: when the balance moves, something actually ran. That makes experiments easier to trust.

A useful way to test it

Take one workflow you already run:

  1. Ask your current setup to generate tests for a small module.
  2. Run a similar task through a cheaper OpenAI-compatible route.
  3. Compare output quality and cost.
  4. Keep the expensive model for final review if needed.

If the cheaper route saves money without creating extra cleanup work, it is useful. If not, skip it.

The point is not ideology. The point is the receipt.

Tooling

I made a small config generator for this pattern:

https://incat.ai/codex-config-generator.html

And if you want to estimate whether this is even worth trying:

https://incat.ai/codex-cost.html

inCat is the gateway behind these examples. The positioning is intentionally narrow:

Keep Codex-style workflows. Route suitable work cheaper.

Top comments (0)