FuturMix
How to Use OpenAI Codex CLI with Multiple AI Models (Not Just GPT)

OpenAI's Codex CLI is one of the best terminal-based coding agents available. It reads your codebase, runs commands, edits files, and iterates on code -- all from your terminal.

But here is what most developers miss: you are not locked into GPT models. Codex CLI supports custom OpenAI-compatible API endpoints, which means you can route it through any provider that speaks the OpenAI wire protocol. Claude, DeepSeek, Gemini, Mistral -- all fair game.

This guide shows you exactly how to set it up.

Why Use Other Models with Codex?

Different models have different strengths. Sticking to one model for every task leaves performance (and money) on the table:

  • Claude Sonnet 4.6 / Opus 4.7 -- Superior at multi-step reasoning, complex refactors, and understanding large codebases. Fewer hallucinated function calls.
  • DeepSeek V3 -- Extremely cost-effective for bulk operations: test generation, boilerplate, documentation, translations. ~90% cheaper than GPT-5.5.
  • Gemini 2.5 Pro -- Strong at multimodal tasks and long-context analysis. 1M+ token context window.
  • GPT-5.5 -- Still the default and a solid all-rounder. Best Codex integration since it is the native model.

The play: use a gateway that gives you one API key for all models, then swap models in Codex depending on the task.

Setup: Two Methods

Method 1: Environment Variables (Quick)

The fastest way. Set two environment variables and launch Codex:

```shell
# In your ~/.zshrc or ~/.bashrc
export OPENAI_API_KEY="your-gateway-api-key"
export OPENAI_BASE_URL="https://api.futurmix.ai/v1"
```

Reload your shell and run Codex with a specific model:

```shell
source ~/.zshrc
codex --model claude-sonnet-4-6 "refactor this function to use async/await"
```

That is it. Codex sends requests to your gateway instead of OpenAI directly.

Method 2: config.toml (Recommended for Multiple Providers)

For a more permanent setup, edit ~/.codex/config.toml. This lets you define named providers and switch between them:

```toml
# ~/.codex/config.toml

# Default model
model = "claude-sonnet-4-6"

# Custom provider pointing to your gateway
model_provider = "gateway"

[model_providers.gateway]
name = "FuturMix Gateway"
base_url = "https://api.futurmix.ai/v1"
wire_api = "responses"
env_key = "FUTURMIX_API_KEY"
```

Then set the API key in your shell:

```shell
export FUTURMIX_API_KEY="sk-your-key-here"
```

Now Codex uses Claude Sonnet 4.6 by default. Override per-session with --model:

```shell
codex --model deepseek-chat "generate unit tests for src/utils/"
codex --model claude-opus-4-7 "find and fix the race condition in the worker pool"
```
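If you switch models often, Codex CLI also supports named profiles in config.toml, selectable with --profile (check your Codex version's docs; the profile names below are made up for illustration, and both reuse the gateway provider defined above):

```toml
# ~/.codex/config.toml -- hypothetical profiles on top of the gateway provider
[profiles.cheap]
model = "deepseek-chat"
model_provider = "gateway"

[profiles.max]
model = "claude-opus-4-7"
model_provider = "gateway"
```

Then codex --profile cheap "generate unit tests" picks the whole bundle instead of you remembering model IDs.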

Method 3: Quick Override Without Editing Config

If you just want to try it once without changing any config files:

```shell
OPENAI_BASE_URL="https://api.futurmix.ai/v1" \
OPENAI_API_KEY="sk-your-key" \
codex --model claude-sonnet-4-6 "explain this codebase"
```

Best Models for Different Codex Tasks

Not every task needs the most expensive model. Here is a practical breakdown:

| Task | Recommended Model | Input/Output Cost | Why |
| --- | --- | --- | --- |
| Complex refactoring | Claude Opus 4.7 | $4.50 / $22.50 | Best multi-step reasoning |
| General coding | Claude Sonnet 4.6 | $2.70 / $13.50 | Strong balance of speed + quality |
| Quick fixes, linting | Claude Haiku 4.5 | $0.90 / $4.50 | Fast and cheap |
| Bulk test generation | DeepSeek V3 | $0.19 / $0.77 | 90%+ cheaper, good enough quality |
| Boilerplate / docs | DeepSeek V3 | $0.19 / $0.77 | No need to pay premium for templates |
| Code review | GPT-5.5 | $2.10 / $8.40 | Solid all-rounder |
| Long file analysis | Gemini 2.5 Pro | Varies | 1M+ context window |

Prices shown are per million tokens through a gateway (discounted).

Cost Comparison: Direct vs. Gateway

Using models through a gateway like FuturMix is cheaper than going direct to each provider. Here is the math:

| Model | Direct (In/Out per 1M) | Gateway (In/Out per 1M) | Savings |
| --- | --- | --- | --- |
| Claude Sonnet 4.6 | $3.00 / $15.00 | $2.70 / $13.50 | 10% off |
| Claude Opus 4.7 | $5.00 / $25.00 | $4.50 / $22.50 | 10% off |
| Claude Haiku 4.5 | $1.00 / $5.00 | $0.90 / $4.50 | 10% off |
| GPT-5.5 | $3.00 / $12.00 | $2.10 / $8.40 | 30% off |
| DeepSeek V3 | $0.27 / $1.10 | $0.19 / $0.77 | 30% off |

On a typical coding session burning 500K input + 100K output tokens, switching from GPT-5.5 direct ($1.50 + $1.20 = $2.70) to DeepSeek V3 via gateway ($0.095 + $0.077 = $0.17) saves you 94%.
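You can sanity-check that arithmetic with a quick awk one-liner (rates taken from the tables in this post; swap in your gateway's current pricing):

```shell
# Session cost: 500K input + 100K output tokens, rates are per 1M tokens
awk 'BEGIN {
  in_tok = 500000; out_tok = 100000
  gpt = in_tok/1e6 * 3.00 + out_tok/1e6 * 12.00   # GPT-5.5 direct
  ds  = in_tok/1e6 * 0.19 + out_tok/1e6 * 0.77    # DeepSeek V3 via gateway
  printf "GPT-5.5 direct: $%.2f\n", gpt
  printf "DeepSeek V3 via gateway: $%.3f\n", ds
  printf "Savings: %.0f%%\n", (1 - ds/gpt) * 100
}'
```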

Pro Tips for Cost Optimization

1. Use model aliases in your shell

```shell
# Add to ~/.zshrc
alias codex-cheap='codex --model deepseek-chat'
alias codex-smart='codex --model claude-sonnet-4-6'
alias codex-max='codex --model claude-opus-4-7'
```

Now run codex-cheap "add docstrings to all functions in src/" for bulk tasks.
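If you would rather have one command than three aliases, a small shell function can map a tier name to a model. The tier names and model IDs here are assumptions; match them to your gateway's model list:

```shell
# Route a task to a model tier: codex_tier {cheap|smart|max} "task"
codex_tier() {
  local model
  case "$1" in
    cheap) model="deepseek-chat" ;;
    smart) model="claude-sonnet-4-6" ;;
    max)   model="claude-opus-4-7" ;;
    *)
      echo "usage: codex_tier {cheap|smart|max} \"task\"" >&2
      return 1
      ;;
  esac
  shift
  codex --model "$model" "$@"
}
```

Then codex_tier cheap "add docstrings to src/" expands to the matching codex invocation.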

2. Match model to task complexity

Do not use Opus for generating boilerplate. Do not use DeepSeek for complex architectural decisions. The 10x price difference exists for a reason.

3. Use sandbox mode for safety

When running with less-tested models, tighten the sandbox:

```shell
codex --model deepseek-chat --sandbox read-only "analyze this codebase"
```

4. Set a budget-friendly default

In config.toml, set your default to a mid-tier model and only escalate when needed:

```toml
model = "claude-sonnet-4-6"
```

Works With Other AI Coding Tools Too

The same gateway setup works across the entire AI coding tool ecosystem. One API key, every tool:

| Tool | Config Method | What to Set |
| --- | --- | --- |
| Codex CLI | config.toml or env vars | OPENAI_BASE_URL + OPENAI_API_KEY |
| Aider | --openai-api-base flag | OPENAI_API_BASE env var |
| Claude Code | Direct API key | ANTHROPIC_API_KEY + ANTHROPIC_BASE_URL |
| Cursor | Settings > Models | Custom OpenAI-compatible endpoint |
| Continue | config.json provider block | apiBase field in provider config |
| Roo Code | Settings > Provider | Custom API URL + key |
| Cline | Settings > API Provider | OpenAI-compatible endpoint |

Set up the gateway once, use it everywhere.
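In practice that can be a single block in your shell profile. The variable names below come from the table above; double-check each tool's docs, since names occasionally change between versions:

```shell
# One gateway key, exported under each tool's expected variable name
GATEWAY_KEY="sk-your-key"                 # placeholder
GATEWAY_URL="https://api.futurmix.ai/v1"

export OPENAI_API_KEY="$GATEWAY_KEY"      # Codex CLI
export OPENAI_BASE_URL="$GATEWAY_URL"     # Codex CLI
export OPENAI_API_BASE="$GATEWAY_URL"     # Aider
export ANTHROPIC_API_KEY="$GATEWAY_KEY"   # Claude Code
export ANTHROPIC_BASE_URL="$GATEWAY_URL"  # Claude Code
```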

Troubleshooting

"Model not found" error
The model name you pass to --model must match the gateway's model ID exactly. Check your provider's model list. Common mistake: using claude-3.5-sonnet instead of the correct identifier like claude-sonnet-4-6.

"Authentication failed"
Make sure OPENAI_API_KEY (or the env_key you defined in config.toml) is set and exported in your current shell session. Run echo $OPENAI_API_KEY to verify.
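To check every relevant variable at once, a short bash loop works (this uses bash-specific indirect expansion; the variable list is taken from this guide's examples):

```shell
# Report which gateway-related variables are set in the current shell
for var in OPENAI_API_KEY OPENAI_BASE_URL FUTURMIX_API_KEY; do
  if [ -n "${!var}" ]; then
    echo "$var is set"
  else
    echo "$var is NOT set"
  fi
done
```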

Responses API vs. Chat Completions
Codex CLI prefers the Responses API (/v1/responses). If your gateway only supports Chat Completions, set wire_api = "chat" in your provider config:

```toml
[model_providers.gateway]
base_url = "https://api.futurmix.ai/v1"
wire_api = "chat"
env_key = "FUTURMIX_API_KEY"
```

Slow responses with large codebases
Some models have lower throughput than GPT. If Codex feels slow, try a faster model for the initial scan and switch to a smarter model for the actual edit.

Config not loading
Codex reads config from ~/.codex/config.toml. Make sure the directory exists:

```shell
mkdir -p ~/.codex
```

Configuration priority: CLI flags > profile settings > config.toml defaults.

Get Started

FuturMix gives you one API key for 22+ models -- Claude, GPT, DeepSeek, Gemini, Mistral, and more. OpenAI-compatible endpoint, so it works with Codex CLI out of the box. Models are 10-30% cheaper than going direct.

  1. Sign up at futurmix.ai
  2. Grab your API key
  3. Set OPENAI_BASE_URL=https://api.futurmix.ai/v1 and your key
  4. Run codex --model claude-sonnet-4-6 "your task here"

Stop paying full price for one model. Use the right model for every task.
