OpenAI's Codex CLI is one of the best terminal-based coding agents available. It reads your codebase, runs commands, edits files, and iterates on code -- all from your terminal.
But here is what most developers miss: you are not locked into GPT models. Codex CLI supports custom OpenAI-compatible API endpoints, which means you can route it through any provider that speaks the OpenAI wire protocol. Claude, DeepSeek, Gemini, Mistral -- all fair game.
This guide shows you exactly how to set it up.
Why Use Other Models with Codex?
Different models have different strengths. Sticking to one model for every task leaves performance (and money) on the table:
- Claude Sonnet 4.6 / Opus 4.7 -- Superior at multi-step reasoning, complex refactors, and understanding large codebases. Fewer hallucinated function calls.
- DeepSeek V3 -- Extremely cost-effective for bulk operations: test generation, boilerplate, documentation, translations. ~90% cheaper than GPT-5.5.
- Gemini 2.5 Pro -- Strong at multimodal tasks and long-context analysis. 1M+ token context window.
- GPT-5.5 -- Still the default and a solid all-rounder. Best Codex integration since it is the native model.
The play: use a gateway that gives you one API key for all models, then swap models in Codex depending on the task.
Setup: Three Methods
Method 1: Environment Variables (Quick)
The fastest way. Set two environment variables and launch Codex:
# In your ~/.zshrc or ~/.bashrc
export OPENAI_API_KEY="your-gateway-api-key"
export OPENAI_BASE_URL="https://api.futurmix.ai/v1"
Reload your shell and run Codex with a specific model:
source ~/.zshrc
codex --model claude-sonnet-4-6 "refactor this function to use async/await"
That is it. Codex sends requests to your gateway instead of OpenAI directly.
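Before launching Codex, it is worth a quick smoke test to confirm the gateway actually answers. A minimal sketch, assuming the gateway implements the standard OpenAI-compatible /v1/chat/completions route and exposes the model ID shown:
# Send one tiny request through the gateway; a JSON completion back means routing works
curl -s https://api.futurmix.ai/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-6", "messages": [{"role": "user", "content": "ping"}]}'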
Method 2: config.toml (Recommended for Multiple Providers)
For a more permanent setup, edit ~/.codex/config.toml. This lets you define named providers and switch between them:
# ~/.codex/config.toml
# Default model
model = "claude-sonnet-4-6"
# Custom provider pointing to your gateway
model_provider = "gateway"
[model_providers.gateway]
name = "FuturMix Gateway"
base_url = "https://api.futurmix.ai/v1"
wire_api = "responses"
env_key = "FUTURMIX_API_KEY"
Then set the API key in your shell:
export FUTURMIX_API_KEY="sk-your-key-here"
Now Codex uses Claude Sonnet 4.6 by default. Override per-session with --model:
codex --model deepseek-chat "generate unit tests for src/utils/"
codex --model claude-opus-4-7 "find and fix the race condition in the worker pool"
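Named providers can coexist, which is the point of this method. A sketch of adding a second one; the backup endpoint and env var name here are hypothetical, and the -c key=value override assumes a recent Codex release:
# Append a second provider block (hypothetical endpoint and key name)
cat >> ~/.codex/config.toml <<'EOF'
[model_providers.backup]
name = "Backup Gateway"
base_url = "https://backup-gateway.example.com/v1"
wire_api = "responses"
env_key = "BACKUP_API_KEY"
EOF
# Select it for a single session via a config override
codex -c model_provider=backup --model deepseek-chat "summarize the latest diff"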
Method 3: Quick Override Without Editing Config
If you just want to try it once without changing any config files:
OPENAI_BASE_URL="https://api.futurmix.ai/v1" \
OPENAI_API_KEY="sk-your-key" \
codex --model claude-sonnet-4-6 "explain this codebase"
Best Models for Different Codex Tasks
Not every task needs the most expensive model. Here is a practical breakdown:
| Task | Recommended Model | Input/Output Cost | Why |
|---|---|---|---|
| Complex refactoring | Claude Opus 4.7 | $4.50 / $22.50 | Best multi-step reasoning |
| General coding | Claude Sonnet 4.6 | $2.70 / $13.50 | Strong balance of speed + quality |
| Quick fixes, linting | Claude Haiku 4.5 | $0.90 / $4.50 | Fast and cheap |
| Bulk test generation | DeepSeek V3 | $0.19 / $0.77 | 90%+ cheaper, good enough quality |
| Boilerplate / docs | DeepSeek V3 | $0.19 / $0.77 | No need to pay premium for templates |
| Code review | GPT-5.5 | $2.10 / $8.40 | Solid all-rounder |
| Long file analysis | Gemini 2.5 Pro | Varies | 1M+ context window |
Prices shown are per million tokens through a gateway (discounted).
Cost Comparison: Direct vs. Gateway
Using models through a gateway like FuturMix is cheaper than going direct to each provider. Here is the math:
| Model | Direct (In/Out per 1M) | Gateway (In/Out per 1M) | Savings |
|---|---|---|---|
| Claude Sonnet 4.6 | $3.00 / $15.00 | $2.70 / $13.50 | 10% off |
| Claude Opus 4.7 | $5.00 / $25.00 | $4.50 / $22.50 | 10% off |
| Claude Haiku 4.5 | $1.00 / $5.00 | $0.90 / $4.50 | 10% off |
| GPT-5.5 | $3.00 / $12.00 | $2.10 / $8.40 | 30% off |
| DeepSeek V3 | $0.27 / $1.10 | $0.19 / $0.77 | 30% off |
On a typical coding session burning 500K input + 100K output tokens, switching from GPT-5.5 direct ($1.50 + $1.20 = $2.70) to DeepSeek V3 via gateway ($0.095 + $0.077 = $0.17) saves you 94%.
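You can reproduce that arithmetic in one line if you want to check other token mixes:
# cost = (input_tokens / 1M) * input_price + (output_tokens / 1M) * output_price
awk 'BEGIN {
  gpt = 0.5 * 3.00 + 0.1 * 12.00;  # GPT-5.5 direct: 500K in, 100K out
  ds  = 0.5 * 0.19 + 0.1 * 0.77;   # DeepSeek V3 via gateway
  printf "direct: $%.2f  gateway: $%.2f  savings: %.0f%%\n", gpt, ds, (1 - ds / gpt) * 100
}'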
Pro Tips for Cost Optimization
1. Use model aliases in your shell
# Add to ~/.zshrc
alias codex-cheap='codex --model deepseek-chat'
alias codex-smart='codex --model claude-sonnet-4-6'
alias codex-max='codex --model claude-opus-4-7'
Now run codex-cheap "add docstrings to all functions in src/" for bulk tasks.
2. Match model to task complexity
Do not use Opus for generating boilerplate. Do not use DeepSeek for complex architectural decisions. The order-of-magnitude price difference exists for a reason.
3. Use sandbox mode for safety
When running with less-tested models, tighten the sandbox:
codex --model deepseek-chat --sandbox read-only "analyze this codebase"
4. Set a budget-friendly default
In config.toml, set your default to a mid-tier model and only escalate when needed:
model = "claude-sonnet-4-6"
Works With Other AI Coding Tools Too
The same gateway setup works across the entire AI coding tool ecosystem. One API key, every tool:
| Tool | Config Method | What to Set |
|---|---|---|
| Codex CLI | config.toml or env vars | OPENAI_BASE_URL + OPENAI_API_KEY |
| Aider | --openai-api-base flag | OPENAI_API_BASE env var |
| Claude Code | Direct API key | ANTHROPIC_API_KEY + ANTHROPIC_BASE_URL |
| Cursor | Settings > Models | Custom OpenAI-compatible endpoint |
| Continue | config.json provider block | apiBase field in provider config |
| Roo Code | Settings > Provider | Custom API URL + key |
| Cline | Settings > API Provider | OpenAI-compatible endpoint |
Set up the gateway once, use it everywhere.
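Concretely, that can be one block in your shell rc. The variable names follow the table above; whether ANTHROPIC_BASE_URL can point at the same gateway depends on it also speaking the Anthropic wire protocol, which is an assumption here:
# One gateway key, exported under each tool's expected variable name
export FUTURMIX_API_KEY="sk-your-key"
export OPENAI_API_KEY="$FUTURMIX_API_KEY"             # Codex CLI, Continue
export OPENAI_BASE_URL="https://api.futurmix.ai/v1"   # Codex CLI
export OPENAI_API_BASE="$OPENAI_BASE_URL"             # Aider
export ANTHROPIC_API_KEY="$FUTURMIX_API_KEY"          # Claude Code
export ANTHROPIC_BASE_URL="https://api.futurmix.ai"   # Claude Code, if the gateway supports it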
Troubleshooting
"Model not found" error
The model name you pass to --model must match the gateway's model ID exactly. Check your provider's model list. Common mistake: using claude-3.5-sonnet instead of the correct identifier like claude-sonnet-4-6.
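Most OpenAI-compatible gateways expose the standard model listing, so you can read the exact IDs instead of guessing (jq is optional here):
# Print every model ID the gateway exposes
curl -s -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.futurmix.ai/v1/models | jq -r '.data[].id'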
"Authentication failed"
Make sure OPENAI_API_KEY (or the env_key you defined in config.toml) is set and exported in your current shell session. Run echo $OPENAI_API_KEY to verify.
Responses API vs. Chat Completions
Codex CLI prefers the Responses API (/v1/responses). If your gateway only supports Chat Completions, set wire_api = "chat" in your provider config:
[model_providers.gateway]
base_url = "https://api.futurmix.ai/v1"
wire_api = "chat"
env_key = "FUTURMIX_API_KEY"
Slow responses with large codebases
Some models have lower throughput than GPT. If Codex feels slow, try a faster model for the initial scan and switch to a smarter model for the actual edit.
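A concrete version of that two-stage pattern, reusing the flags shown earlier:
# Cheap, read-only pass for orientation; stronger model for the actual change
codex --model deepseek-chat --sandbox read-only "map the modules involved in request handling"
codex --model claude-opus-4-7 "fix the race condition in the worker pool"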
Config not loading
Codex reads config from ~/.codex/config.toml. Make sure the directory exists:
mkdir -p ~/.codex
Configuration priority: CLI flags > profile settings > config.toml defaults.
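Those profile settings live in the same config.toml. A sketch, assuming your Codex build supports [profiles] sections (recent releases do); the profile names are invented:
# Define two profiles that pin a model + provider combination
cat >> ~/.codex/config.toml <<'EOF'
[profiles.cheap]
model = "deepseek-chat"
model_provider = "gateway"

[profiles.max]
model = "claude-opus-4-7"
model_provider = "gateway"
EOF
# Select one at launch
codex --profile cheap "add docstrings to all functions in src/"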
Get Started
FuturMix gives you one API key for 22+ models -- Claude, GPT, DeepSeek, Gemini, Mistral, and more. OpenAI-compatible endpoint, so it works with Codex CLI out of the box. Models are 10-30% cheaper than going direct.
- Sign up at futurmix.ai
- Grab your API key
- Set OPENAI_BASE_URL=https://api.futurmix.ai/v1 and your key
- Run codex --model claude-sonnet-4-6 "your task here"
Stop paying full price for one model. Use the right model for every task.