DEV Community

FuturMix
FuturMix

Posted on • Edited on

How to Use Claude Code with Any API Provider (Custom Base URL Guide)

Claude Code is one of the best AI coding tools available — but it's locked to Anthropic's API by default. What if you want cheaper rates, automatic failover, or access to GPT and Gemini alongside Claude?

Here's how to configure Claude Code with a custom API endpoint in under 2 minutes.

Why Use a Custom API Base?

There are several reasons to route Claude Code through a third-party API gateway:

  1. Cost savings — Some platforms negotiate volume discounts with Anthropic (10-30% off)
  2. Automatic failover — If Anthropic's API goes down, your coding session doesn't stop
  3. Unified billing — One bill instead of juggling multiple provider accounts
  4. Rate limit pooling — Higher effective rate limits through load balancing
  5. Usage tracking — Per-project cost breakdown that Anthropic doesn't provide

Setup: Claude Code with Custom Base URL

Method 1: Environment Variable

The simplest approach — set ANTHROPIC_BASE_URL in your shell config:

# Add to ~/.bashrc, ~/.zshrc, or ~/.config/fish/config.fish
export ANTHROPIC_BASE_URL="https://futurmix.ai/v1"
export ANTHROPIC_API_KEY="your-api-key"
Enter fullscreen mode Exit fullscreen mode

Restart your terminal, and Claude Code will route all requests through the custom endpoint.

Method 2: Project-Level Config

Create a .claude config file in your project root:

{
  "apiBaseUrl": "https://futurmix.ai/v1",
  "apiKey": "your-api-key"
}
Enter fullscreen mode Exit fullscreen mode

This keeps the configuration scoped to specific projects.

Method 3: VS Code Settings (if using the extension)

{
  "claude.apiBaseUrl": "https://futurmix.ai/v1",
  "claude.apiKey": "your-api-key"
}
Enter fullscreen mode Exit fullscreen mode

Verifying the Connection

After configuring, test that Claude Code is using the custom endpoint:

# Start Claude Code and check the connection
claude --version

# Run a simple test
claude "Write a hello world function in Python"
Enter fullscreen mode Exit fullscreen mode

If you see a response, the custom endpoint is working. If you get an authentication error, double-check your API key.

What About Other AI Coding Tools?

The same approach works with most AI coding tools that support custom base URLs:

Tool Config Method
Claude Code ANTHROPIC_BASE_URL env var
Cursor Settings → Models → Custom API Base
Aider --openai-api-base flag or .aider.conf.yml
Continue config.jsonapiBase field
Roo Code Settings → API Configuration
Cline Settings → API Provider → Custom

Example for Cursor:

# In Cursor Settings
API Base URL: https://futurmix.ai/v1
API Key: your-api-key
Model: claude-sonnet-4-6
Enter fullscreen mode Exit fullscreen mode

Example for Aider:

aider --openai-api-base https://futurmix.ai/v1 \
      --openai-api-key your-api-key \
      --model claude-sonnet-4-6
Enter fullscreen mode Exit fullscreen mode

Multi-Model Coding: Using GPT and Claude Together

One advantage of a multi-model API gateway is switching between models for different tasks:

# Use Claude for refactoring (better code quality)
export ANTHROPIC_BASE_URL="https://futurmix.ai/v1"
claude "Refactor this module to use async/await"

# Use GPT for generating test cases (good at structured output)
# Same API key, different model
curl https://futurmix.ai/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -d '{"model": "gpt-5.5", "messages": [{"role": "user", "content": "Generate pytest tests for this function..."}]}'
Enter fullscreen mode Exit fullscreen mode

Cost Comparison

Here's what you save by routing through a multi-model platform vs. direct API access:

Model Direct Price Via Gateway Savings
Claude Sonnet 4.6 $3 / $15 per 1M tokens $2.70 / $13.50 10%
Claude Opus 4.7 $5 / $25 $4.50 / $22.50 10%
Claude Haiku 4.5 $1 / $5 $0.90 / $4.50 10%
GPT-5.5 $3 / $12 $2.10 / $8.40 30%

For a developer using ~500K tokens/day with Claude Code, that's roughly $50-100/month in savings.

Troubleshooting

"Authentication failed"

  • Make sure you're using the gateway's API key, not your Anthropic key
  • Check that the base URL doesn't have a trailing slash

"Model not found"

  • Verify the model name matches exactly (e.g., claude-sonnet-4-6, not claude-sonnet-4.6)
  • Check the gateway's supported model list

"Connection refused"

  • Ensure the base URL is correct (https, not http)
  • Check if your firewall or VPN is blocking the connection

Slow responses

  • A good API gateway adds <10ms overhead. If responses are significantly slower, try a different region or check the gateway's status page

Getting Started

If you want to try this setup, FuturMix offers an OpenAI-compatible API with 22+ models including Claude, GPT, Gemini, and DeepSeek. Pay-as-you-go, no minimum commitment.

export ANTHROPIC_BASE_URL="https://futurmix.ai/v1"
export ANTHROPIC_API_KEY="your-futurmix-key"
Enter fullscreen mode Exit fullscreen mode

That's it. Two lines and you're saving 10-30% on every Claude Code session.


Using Claude Code with a custom API? Share your setup in the comments.


Related Guides

Top comments (1)

Collapse
 
kcarriedo profile image
Kyle Carriedo

Useful walkthrough — ANTHROPIC_BASE_URL as the single env-var pivot is the right abstraction, and the failover framing is genuinely underdiscussed.

One thing I'd be curious about from your setup — and it's where I keep tripping multi-provider Claude Code rigs in practice: observability through the intermediary. When the same long-running workflow makes 200+ requests across a mix of Anthropic-direct and proxy-routed providers, the per-request artifact that lets you reconstruct "which agent / which session / which orchestration step burned which tokens on which backend" disappears at the boundary. Anthropic's own usage endpoint gives you the half you can see; the proxy gives you the half it sees; neither gives you the join key.

A couple of questions that might be worth adding to the guide as a "you'll want this soon" section:

  1. Does your preferred routing layer surface a per-request provider_id / backend_route in the response headers (or in a sidecar log) that you can correlate against the Claude Code session's ~/.claude/projects/**/*.jsonl transcript?
  2. For automatic-failover specifically — when a request fails over from Provider A to Provider B mid-stream, do you have a way to know retrospectively that the failover happened, or does it look like one clean request from the Claude Code side?
  3. The "10-30% off via volume discount" benefit is real but conditional on the intermediary's pricing transparency — has your setup hit cases where the effective per-token cost diverged from the headline rate due to retries / proxy overhead / cache-miss accounting differences?

Asking because in my own multi-process workflows the cost-savings story is straightforwardly true at small N, but the observability story becomes the load-bearing concern at the N where you'd actually need failover — and that's the surface where the existing tooling thins out fastest.