
Gerus Lab
How to Migrate From Anthropic's Direct API to a Managed Proxy Without Breaking Your Stack


You've been calling Anthropic's API directly. It works. But now you're running multiple projects, costs are creeping up, your team needs access, and you're drowning in API key management. A managed proxy layer solves all of this — but the migration has real gotchas if you don't plan it right.

This guide covers the practical path from direct Anthropic calls to a proxy setup, using ShadoClaw as the reference implementation. Same principles apply to any Claude proxy.


What Actually Changes When You Add a Proxy

Before migrating, understand what's different at the protocol level.

Direct API call:

Your app → https://api.anthropic.com/v1/messages

Through a proxy:

Your app → https://api.shadoclaw.com/v1/messages → Anthropic

The request/response format is identical — the proxy is transparent to your application code. The only things that change:

  1. Base URL — points to the proxy, not Anthropic directly
  2. Auth token — your proxy-issued key, not your Anthropic key
  3. Rate limiting behavior — the proxy may have different limits or pooling logic
  4. Error codes — proxies add their own error layer on top of Anthropic's

That's it. If a migration requires more changes than this, something's wrong with the proxy design.


Step 1: Swap the Base URL (The Easy Part)

Most SDKs accept a custom base URL. Here's how to do it without touching anything else:

Python (anthropic SDK):

import anthropic

# Before
client = anthropic.Anthropic(api_key="sk-ant-...")

# After
client = anthropic.Anthropic(
    api_key="your-shadoclaw-key",
    base_url="https://api.shadoclaw.com"
)

Node.js / TypeScript:

import Anthropic from "@anthropic-ai/sdk";

// Before
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// After
const client = new Anthropic({
  apiKey: process.env.SHADOCLAW_API_KEY,
  baseURL: "https://api.shadoclaw.com",
});

Direct HTTP (curl / fetch):

# Before
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: sk-ant-..." \
  -H "anthropic-version: 2023-06-01" \
  -d '...'

# After — same headers, different URL
curl https://api.shadoclaw.com/v1/messages \
  -H "x-api-key: your-shadoclaw-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '...'

The anthropic-version header stays the same. The proxy passes it through to Anthropic. Don't remove it.
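To make the "same headers, different URL" point concrete, here is a minimal Python sketch that builds the request pieces for either endpoint using only the standard library. The helper name build_messages_request is hypothetical, and nothing is sent over the network; it only shows that the URL and the key are the sole differences.

```python
import json

ANTHROPIC_VERSION = "2023-06-01"  # same value for direct and proxied calls


def build_messages_request(base_url, api_key, payload):
    """Return (url, headers, body) for a POST to the Messages endpoint.

    Hypothetical helper for illustration; it does not send anything.
    """
    url = base_url.rstrip("/") + "/v1/messages"
    headers = {
        "x-api-key": api_key,
        "anthropic-version": ANTHROPIC_VERSION,
        "content-type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return url, headers, body


payload = {
    "model": "claude-sonnet-4-5",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "Hello"}],
}
direct = build_messages_request("https://api.anthropic.com", "sk-ant-...", payload)
proxied = build_messages_request("https://api.shadoclaw.com", "your-shadoclaw-key", payload)
```

Comparing direct and proxied shows identical bodies and identical headers apart from x-api-key.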


Step 2: Handle Auth Without Leaking Keys

This is where teams get sloppy. A proxy is supposed to centralize auth — don't undermine it.

Common mistake: hardcoding keys in multiple places

# Don't do this
client = Anthropic(api_key="sc-live-abc123...", base_url="https://api.shadoclaw.com")

Do this instead:

import os
from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["SHADOCLAW_API_KEY"],
    base_url=os.environ.get("CLAUDE_API_BASE", "https://api.shadoclaw.com")
)

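Keep CLAUDE_API_BASE as an env var so you can switch between proxy and direct without code changes, which is useful for local dev vs production. The key has to match the endpoint, though: a proxy-issued key won't authenticate against Anthropic directly, so switch the key along with the base URL.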
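A small resolver can keep the base URL and the key in step when you flip environments. This is a sketch under assumptions: the helper name resolve_client_config is hypothetical, and it assumes the direct key lives in ANTHROPIC_API_KEY and the proxy key in SHADOCLAW_API_KEY.

```python
import os

ANTHROPIC_BASE = "https://api.anthropic.com"
PROXY_BASE = "https://api.shadoclaw.com"


def resolve_client_config(env=None):
    """Pick the base URL and the matching key from environment variables.

    Hypothetical helper: defaults to the proxy, and uses the Anthropic key
    only when CLAUDE_API_BASE points at Anthropic directly.
    """
    env = os.environ if env is None else env
    base_url = env.get("CLAUDE_API_BASE", PROXY_BASE)
    is_direct = base_url.rstrip("/") == ANTHROPIC_BASE
    key_var = "ANTHROPIC_API_KEY" if is_direct else "SHADOCLAW_API_KEY"
    api_key = env.get(key_var)
    if not api_key:
        # Fail at startup rather than on the first request
        raise RuntimeError(f"{key_var} is not set for base URL {base_url}")
    return {"api_key": api_key, "base_url": base_url}
```

The returned dict can be passed straight to the SDK constructor, for example Anthropic(**resolve_client_config()).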

For teams: ShadoClaw's Pro and Team plans let you issue separate keys per project or team member. Issue project-specific keys so you can rotate or revoke them independently.


Step 3: Rate Limits — They're Not the Same

This is the #1 source of surprises post-migration.

Anthropic applies rate limits at the organization level (shared by every key in the org), not per individual API key. A proxy introduces a different rate limit layer:

  • Proxy-level limits: Set by your proxy plan (ShadoClaw Solo/Pro/Team)
  • Anthropic-level limits: Still exist, handled by the proxy's pooling logic
  • Per-key limits: The proxy may give each issued key its own sublimit

What this means for your retry logic:

Anthropic returns 529 (overloaded) or 429 (rate limited). A proxy might return these same codes or wrap them differently. Check your proxy's docs — ShadoClaw passes through Anthropic's status codes directly so your existing retry handlers still work.
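One way to keep retry decisions in a single place is a small status classifier. The sketch below treats 429 and 529 as retryable, as described above, plus 502/503/504, which an intermediate proxy or load balancer might emit. That last set is an assumption for illustration, not documented ShadoClaw behavior, and the function name is hypothetical.

```python
RATE_LIMITED = 429
OVERLOADED = 529
# Assumption: gateway-style errors an intermediate proxy may produce.
GATEWAY_ERRORS = frozenset({502, 503, 504})


def is_retryable(status_code):
    """Return True when a failed request is worth retrying after a wait."""
    if status_code in (RATE_LIMITED, OVERLOADED):
        return True
    return status_code in GATEWAY_ERRORS
```

Client errors such as 400 or 401 are deliberately not retried: repeating a malformed or unauthenticated request only burns quota.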

Recommended retry pattern:

import time
from anthropic import APIStatusError, RateLimitError

# Keyword-only options must come before **kwargs in the signature
def call_with_retry(client, max_retries=3, **kwargs):
    for attempt in range(max_retries):
        try:
            return client.messages.create(**kwargs)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: 2s, then 4s (the final attempt re-raises)
            wait = 2 ** (attempt + 1)
            print(f"Rate limited. Retrying in {wait}s...")
            time.sleep(wait)
        except APIStatusError as e:
            # Retry only 529 (overloaded); re-raise everything else,
            # and re-raise 529 too once retries are exhausted
            if e.status_code != 529 or attempt == max_retries - 1:
                raise
            time.sleep(5 * (attempt + 1))  # Linear backoff: 5s, then 10s

This works identically whether you're hitting Anthropic directly or through a proxy.
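If many workers share one proxy key, fixed waits can make them retry in lockstep. A common variation, sketched here, is exponential backoff with full jitter that defers to a Retry-After value when the server supplies one in seconds. The function name and the base and cap defaults are illustrative assumptions.

```python
import random


def backoff_delay(attempt, retry_after=None, base=2.0, cap=30.0, rng=random.random):
    """Return how many seconds to wait before retry number `attempt` (0-based).

    Prefers a numeric Retry-After value; otherwise picks a random delay
    between 0 and min(cap, base * 2**attempt) ("full jitter").
    """
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            # Retry-After may also be an HTTP date; fall back to backoff
            pass
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

With the Python SDK, the header value can be read from the exception as e.response.headers.get("retry-after") and passed in as retry_after.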


Step 4: Model Routing — Don't Assume Availability

Direct API: you call claude-opus-4-5 and it either works or it doesn't.

Through a proxy: the proxy might route to a fallback model if the primary is unavailable, or it might enforce which models your plan allows.

Always specify the model explicitly. Never assume a default.

# Explicit is better than implicit
response = client.messages.create(
    model="claude-sonnet-4-5",  # Always specify
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}]
)
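Because a proxy might route to a fallback, it can help to compare the model you asked for with the model field on the response. The check below is a heuristic sketch: the function name is hypothetical, and it assumes a dated snapshot ID simply extends the requested alias with a hyphenated suffix.

```python
def served_model_matches(requested, served):
    """Return True if `served` is the requested model or a snapshot of it.

    Heuristic: strips a trailing "-latest" from the request, then accepts
    an exact match or the alias followed by a hyphenated suffix.
    """
    alias = requested.removesuffix("-latest")  # Python 3.9+
    return served == requested or served == alias or served.startswith(alias + "-")
```

In application code this would be called as served_model_matches("claude-sonnet-4-5", response.model), logging a warning on a mismatch rather than failing the request.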

If you're building a system where model selection matters for cost or capability, add a config layer:

import os

# Defaults can be overridden per environment via env vars
MODEL_CONFIG = {
    "fast": os.environ.get("FAST_MODEL", "claude-3-5-haiku-latest"),
    "balanced": os.environ.get("BALANCED_MODEL", "claude-sonnet-4-5"),
    "powerful": os.environ.get("POWERFUL_MODEL", "claude-opus-4-5"),
}

def get_model(tier="balanced"):
    return MODEL_CONFIG[tier]

This lets you override models via env vars per environment without code changes.


Step 5: Logging and Observability

This is where a proxy actually adds value compared to direct API calls.

ShadoClaw provides request-level logging out of the box: token counts, latency, model used, status codes. You get a dashboard instead of having to wire up your own telemetry.

But you'll still want structured logging in your app for correlation:

import logging
import uuid

logger = logging.getLogger(__name__)

def tracked_completion(client, prompt, context=None):
    request_id = str(uuid.uuid4())[:8]

    logger.info(f"[{request_id}] Starting completion", extra={
        "request_id": request_id,
        "prompt_length": len(prompt),
        "context": context
    })

    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}]
    )

    logger.info(f"[{request_id}] Completed", extra={
        "request_id": request_id,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
    })

    return response

Combining your app-level logs with ShadoClaw's dashboard gives you full visibility without building your own analytics pipeline.


Common Migration Gotchas

1. SSL/TLS certificate pinning
If you've got cert pinning in your HTTP client, it will reject the proxy's certificate, and a restrictive custom CA bundle may do the same. Either update the pinned cert or (better) remove pinning if you weren't intentionally using it.

2. Streaming responses
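If you use stream=True or the SDK's messages.stream() helper, test this explicitly post-migration. Claude streams responses as server-sent events, so the proxy has to forward chunks as they arrive rather than buffering the whole response. Most proxies handle this fine, but verify with a quick test: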

with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Count to 10"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
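Eyeballing the output works, but a number is easier to compare before and after cutover. This sketch measures time to first chunk against total time for any iterable of text chunks; if a proxy buffers the stream, the two values end up nearly equal. The function name is hypothetical and the clock parameter exists only to make it testable.

```python
import time


def time_to_first_chunk(chunks, clock=time.perf_counter):
    """Consume `chunks` and return (first_chunk_seconds, total_seconds, count).

    first_chunk_seconds is None when the iterable yields nothing.
    """
    start = clock()
    first = None
    count = 0
    for _ in chunks:
        count += 1
        if first is None:
            # Record the delay until the first chunk arrives
            first = clock() - start
    total = clock() - start
    return first, total, count
```

Inside the with block above, pass stream.text_stream as chunks and compare the result for the direct and proxied endpoints.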

3. Timeout settings
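Your HTTP client's timeout now covers the proxy hop plus Anthropic's response time, not Anthropic's alone. If you've set a tight timeout and the proxy has cold-start latency or is under load, you might hit timeouts you didn't hit before. Set the value explicitly and leave headroom (for reference, the Python SDK's default is 10 minutes, so 60 seconds is already stricter than the default):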

client = Anthropic(
    api_key=os.environ["SHADOCLAW_API_KEY"],
    base_url="https://api.shadoclaw.com",
    timeout=60.0  # Give it room
)

4. Response headers
Your code might be reading Anthropic-specific response headers (like request-id for debugging). Proxy headers may differ. Audit any code that reads response headers directly.
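If debugging code depends on the request ID, a tolerant lookup keeps it working whichever header name arrives. A minimal sketch: the function name is hypothetical, and x-request-id is only a guess at what a proxy might use, so check your proxy's docs for the real name.

```python
def find_request_id(headers, candidates=("request-id", "x-request-id")):
    """Return the first matching header value, ignoring header-name case.

    `candidates` is ordered by preference; "x-request-id" is an assumed
    alternative name, not a documented proxy header.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in candidates:
        if name in lowered:
            return lowered[name]
    return None
```

It accepts any mapping with an items() method, so a plain dict of response headers works as input.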


Migration Checklist

Before you cut over:

  • [ ] Base URL updated to proxy endpoint
  • [ ] Auth token replaced with proxy-issued key
  • [ ] Old Anthropic key removed from all env configs
  • [ ] Streaming tested end-to-end
  • [ ] Retry logic tested against proxy error codes
  • [ ] Timeout values reviewed and adjusted
  • [ ] Logging/observability verified
  • [ ] Rollback plan ready (env var swap, not code deploy)

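The rollback plan deserves emphasis: if you've set CLAUDE_API_BASE as an env var, rolling back is a config change and a restart. Point the base URL back at Anthropic and restore the Anthropic key, which means keeping that key retrievable from your secret store even after you remove it from app configs. No code, no deploy.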
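The "old Anthropic key removed" item is easy to verify mechanically. This sketch scans an environment mapping for values that start with the sk-ant- prefix used in the examples above; the function name is hypothetical, and it only inspects what you pass in, not files or secret stores.

```python
import os


def find_leftover_anthropic_keys(env=None):
    """Return sorted names of variables whose value looks like an Anthropic key."""
    env = os.environ if env is None else env
    return sorted(
        name
        for name, value in env.items()
        if isinstance(value, str) and value.startswith("sk-ant-")
    )
```

Running it in each deployed environment after cutover should return an empty list; anything else names the variable still holding a direct key.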


Why Do This at All?

If you're a solo dev on one project, direct API access is fine. The moment you have:

  • Multiple projects sharing Claude costs
  • Team members who need API access
  • Usage you need to track and cap per project
  • The need to rotate keys without touching code

...a managed proxy pays for itself immediately.

ShadoClaw, built by Gerus-lab, handles all of this. Solo plan at $29/mo covers one user. Pro at $79/mo handles 5 accounts. Team at $179/mo scales to 20. There's a free 3-day trial — worth running your current usage through it before committing.

The migration takes 20 minutes if you follow the steps above. The hard part isn't the code, it's making sure your retry logic and error handling are actually tested. Do that part before you call it done.


ShadoClaw is a managed Claude API proxy built by Gerus-lab. Start your free 3-day trial at shadoclaw.com.
