DEV Community

cauqjbwkerl

How to Fix AI API Failures That Look Like Rate Limits but Are Actually Network Issues

TL;DR

If your OpenAI, Claude, or Gemini API calls are failing with cryptic errors that look like rate limits, the real culprit is often your network: ISP routing, DNS pollution, or TCP RST injection. A real 429 arrives as an HTTP response with a JSON error body and usually a Retry-After header; a network failure gives you an empty response, a connection reset, or a timeout. Here's how to tell them apart and fix it systematically.


I spent two frustrating days last month convinced I'd somehow blown through my OpenAI quota. My Python script kept dying with RateLimitError, but the OpenAI dashboard showed I'd barely touched my limits. Sound familiar? If you're working from Southeast Asia, mainland China, or certain parts of the Middle East, this is a surprisingly common trap.

Let me walk you through exactly how I diagnosed it and what I did to fix it.

Real 429 vs. Network Failure — Know the Difference

This is the first thing to nail down, because the fix is completely different depending on which one you're dealing with.

A genuine rate limit (HTTP 429) typically has:

  • An HTTP status code of 429 in the response
  • A Retry-After header telling you how long to wait (usually present, but not guaranteed)
  • A JSON body like {"error": {"type": "rate_limit_error", "message": "..."}}

A network-level failure looks like one or more of these:

  • ConnectionResetError or ConnectionRefusedError
  • requests.exceptions.ConnectionError with an empty response body
  • TimeoutError or ReadTimeout with no HTTP status at all
  • The Python OpenAI SDK raising APIConnectionError instead of RateLimitError

The SDK wraps both into similar-looking exceptions, which is why they're easy to confuse. The key is to look at the exception class and the response body, not just the error message string.
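
The distinction above can be written down as a small check. This is a minimal sketch, not part of any SDK: classify_response is a hypothetical helper, and it takes the status, headers, and body as plain values so it works with whichever HTTP client you use. Pass status=None when no HTTP response arrived at all.

```python
import json
from typing import Mapping, Optional


def classify_response(status: Optional[int],
                      headers: Mapping[str, str],
                      body: str) -> str:
    """Label a failed call as 'network', 'rate_limit', 'unconfirmed_429', or 'other'.

    Hypothetical helper: status is None when no HTTP response arrived.
    """
    if status is None:
        # Reset, refused connection, or timeout: nothing came back over HTTP.
        return "network"
    if status != 429:
        return "other"
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    has_error_json = isinstance(payload, dict) and "error" in payload
    has_retry_after = any(name.lower() == "retry-after" for name in headers)
    if has_error_json or has_retry_after:
        # Matches the signature of a genuine rate limit described above.
        return "rate_limit"
    # A 429 with neither marker deserves a closer look before you back off.
    return "unconfirmed_429"
```

The "unconfirmed_429" label is a design choice for this sketch: it keeps a bare 429 from being silently treated as a quota problem.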

Step 1: Verbose curl to the API Endpoint

Before touching any code, go raw. Run this from your terminal:

curl -v --max-time 15 \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"ping"}],"max_tokens":5}' \
  https://api.openai.com/v1/chat/completions 2>&1 | head -80

Watch the output carefully:

  • If you see * Connected to api.openai.com followed by TLS handshake lines and then an HTTP response — your network path is basically working.
  • If you see * Connection reset by peer or curl: (35) OpenSSL SSL_connect — you have a network-level block, likely TCP RST injection.
  • If it just hangs until timeout — routing issue or DNS resolution is hitting a poisoned record.

Do the same for Anthropic:

curl -v --max-time 15 https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-3-haiku-20240307","max_tokens":5,"messages":[{"role":"user","content":"ping"}]}' 2>&1 | head -80
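
If you'd rather run the same connectivity check from Python, here is a rough standard-library sketch. The names probe and classify_probe_error are hypothetical, and the labels are my own shorthand for the three curl outcomes above. It only attempts the TCP connect and TLS handshake; it sends no API request and needs no key.

```python
import socket
import ssl


def classify_probe_error(exc: BaseException) -> str:
    """Map an exception from a TCP/TLS connection attempt to a rough cause."""
    if isinstance(exc, socket.gaierror):
        return "dns_failure"          # the name did not resolve at all
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"              # the "just hangs" case
    if isinstance(exc, ConnectionResetError):
        return "connection_reset"     # the "reset by peer" case
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    if isinstance(exc, ssl.SSLError):
        return "tls_failure"          # handshake started but did not complete
    return "other"


def probe(host: str, port: int = 443, timeout: float = 15.0) -> str:
    """Try a TCP connect plus TLS handshake; return 'ok' or a failure label."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return "ok"
    except OSError as exc:
        return classify_probe_error(exc)
```

Calling probe("api.openai.com") and probe("api.anthropic.com") from the affected machine gives you one label per endpoint to compare against the curl output.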

Step 2: Check DNS Resolution

DNS pollution is a real thing in several regions. Your ISP may be returning a bogus IP for api.openai.com.

# What does your default resolver say?
nslookup api.openai.com

# Compare against a clean resolver
nslookup api.openai.com 8.8.8.8
nslookup api.openai.com 1.1.1.1

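If the IPs are different, look closer before drawing conclusions: hosts behind a CDN can legitimately hand out different addresses to different resolvers. But if your default resolver returns a private IP range or a local redirect, you've found your DNS problem.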
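
The "private IP range" check is easy to automate. Below is a minimal sketch using only the standard library; resolve_with_system_resolver and suspicious_addresses are hypothetical names. One limitation to be aware of: Python's socket module always uses the system resolver, so the comparison against 8.8.8.8 or 1.1.1.1 still needs nslookup.

```python
import ipaddress
import socket
from typing import Iterable, List


def resolve_with_system_resolver(host: str) -> List[str]:
    """Return the unique addresses the OS resolver gives for host."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def suspicious_addresses(addresses: Iterable[str]) -> List[str]:
    """Return addresses that are not globally routable.

    Private, loopback, link-local, and reserved ranges all fail is_global,
    and none of them can be a real public API endpoint.
    """
    flagged = []
    for text in addresses:
        if not ipaddress.ip_address(text).is_global:
            flagged.append(text)
    return flagged
```

A typical call would be suspicious_addresses(resolve_with_system_resolver("api.openai.com")). An empty list does not prove the answer is correct, since a poisoned record can also point at a public address; a non-empty list is a strong sign something is wrong.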

Step 3: Traceroute to See Where Traffic Dies

# Linux/macOS
traceroute -n api.openai.com

# Windows
tracert api.openai.com

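If the trace stops at a hop inside your ISP's network (typically hops 3–8) and never reaches the destination, that points to ISP-level routing interference. If it reaches international exchange points but then drops, it's more likely a peering or transit issue. Keep in mind that some routers simply don't answer traceroute probes, so treat a dead trace as evidence to weigh alongside the curl result, not as proof on its own.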
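
If you collect traces from several machines, a tiny parser saves you from eyeballing them. This is a sketch under assumptions: last_responding_hop is a hypothetical helper, and it expects the numeric IPv4 output of traceroute -n, where each hop line starts with the hop number and silent hops print only asterisks.

```python
import re
from typing import Optional

# A hop line starts with the hop number, e.g. " 3  10.0.0.1  4.2 ms ..."
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def last_responding_hop(trace_output: str) -> Optional[int]:
    """Return the number of the last hop that printed an address, if any."""
    last = None
    for line in trace_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or blank line
        hop, rest = int(match.group(1)), match.group(2)
        if IPV4.search(rest):
            last = hop
    return last
```

Feed it the captured output of the traceroute command and compare the result with the total hop count: a trace that goes silent early and stays silent is the pattern described above.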

Step 4: Enable Python SDK Debug Logging

Once you suspect a network issue, enable verbose logging in the OpenAI Python SDK to see exactly what's happening at the HTTP layer:

import logging
import httpx
import openai

# Enable full HTTP request/response logging
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)

client = openai.OpenAI()

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "hello"}],
        max_tokens=10
    )
    print(response.choices[0].message.content)
except openai.APIConnectionError as e:
    print(f"Network-level failure (not a rate limit): {e}")
except openai.RateLimitError as e:
    print(f"Genuine rate limit: {e}")
except openai.APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")

The APIConnectionError vs RateLimitError distinction is your smoking gun.

The Fix: Route Through a Reliable Tunnel

Once I confirmed it was a network issue (my traceroute was dying at hop 6 inside my ISP), the solution was to route API traffic through a tunnel with better international connectivity.

You have a few options:

  1. Configure your system proxy if you already have a VPN or proxy running locally
  2. Use a purpose-built tunnel optimized for this kind of traffic

For option 1, here's how to configure the proxy in both Python and Node.js:

Python (OpenAI SDK):

import httpx
import openai

# If your local proxy is running on port 7890.
# Note: older httpx releases expect proxies= rather than proxy=.
proxy_url = "http://127.0.0.1:7890"

client = openai.OpenAI(
    http_client=httpx.Client(
        proxy=proxy_url,
        timeout=30.0
    )
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}]
)
print(response.choices[0].message.content)

Or via environment variable (works for most SDKs):

export HTTPS_PROXY="http://127.0.0.1:7890"
export HTTP_PROXY="http://127.0.0.1:7890"
python your_script.py
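
A common failure with the environment-variable route is that the script runs in a shell or service that never saw the export. Here is a small sketch for confirming what the Python process actually sees; proxy_settings is a hypothetical helper, and the variable list is the usual convention, not a guarantee that every SDK reads all four.

```python
import os
from typing import Dict, Mapping, Optional


def proxy_settings(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect proxy-related variables, accepting upper- or lower-case names."""
    env = os.environ if environ is None else environ
    found = {}
    for name in ("HTTPS_PROXY", "HTTP_PROXY", "ALL_PROXY", "NO_PROXY"):
        value = env.get(name) or env.get(name.lower())
        if value:
            found[name] = value
    return found
```

Printing proxy_settings() at the top of your_script.py shows whether the export reached the process. Also check that NO_PROXY does not list the API host, since that would bypass the proxy for exactly the traffic you want to tunnel.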

Node.js (using the official OpenAI package):

import OpenAI from 'openai';
import { HttpsProxyAgent } from 'https-proxy-agent';

const proxyAgent = new HttpsProxyAgent('http://127.0.0.1:7890');

const client = new OpenAI({
  httpAgent: proxyAgent,
});

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'hello' }],
});
console.log(response.choices[0].message.content);

For option 2, after trying a few generic VPN services that added latency or had unreliable uptime, I ended up using TonBoVPN, which is specifically tuned for AI API traffic. The difference in latency and connection stability to api.openai.com and api.anthropic.com was noticeable — it also handles the DNS resolution cleanly, which matters if your ISP is doing DNS poisoning. You still configure it the same way via the local proxy port.

Putting It All Together: A Diagnostic Checklist

Here's the exact order I run through now whenever an AI API starts misbehaving:

  1. Check the exception type: APIConnectionError = network, RateLimitError = real quota
  2. Check your dashboard — OpenAI, Anthropic, and Google all show real-time usage
  3. Run verbose curl to the endpoint and look for TLS handshake vs connection reset
  4. Compare DNS resolution between your default resolver and 8.8.8.8
  5. Run traceroute and see where packets stop
  6. Enable SDK debug logging for the full HTTP picture
  7. Configure a proxy and test again

One more thing: if you're building a production service that serves users in regions with connectivity issues, consider implementing a retry wrapper that distinguishes between these error types. Retrying a genuine rate limit with exponential backoff makes sense. Retrying a TCP RST injection 10 times in a row just wastes time: the request usually never reaches the API, so no amount of backoff will help.

import time
import openai

def call_with_smart_retry(client, max_retries=3, **kwargs):
    """Retry a chat completion, handling rate limits and network failures separately."""
    for attempt in range(max_retries):
        last_attempt = attempt == max_retries - 1
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError as e:
            if last_attempt:
                raise  # don't sleep if there is no attempt left
            # Retry-After is optional and is not always a plain integer
            try:
                wait = float(e.response.headers.get("retry-after", 60))
            except ValueError:
                wait = 60.0
            print(f"Rate limited. Waiting {wait}s...")
            time.sleep(wait)
        except openai.APIConnectionError as e:
            print(f"Connection error on attempt {attempt + 1}: {e}")
            if last_attempt:
                raise
            time.sleep(2 ** attempt)  # brief backoff, then check your tunnel
    raise RuntimeError("Max retries exceeded")
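
The wrapper above reads the wait time straight from the Retry-After header. The HTTP spec allows that header to carry either a number of seconds or an HTTP date, and it may be absent entirely, so a more defensive parser can be worth having. Here is a minimal sketch; parse_retry_after is a hypothetical helper, and the 60-second default simply mirrors the fallback used above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str],
                      default: float = 60.0,
                      now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value to a non-negative wait in seconds."""
    if not value:
        return default
    try:
        # Most common form: a number of seconds, possibly fractional.
        return max(0.0, float(value))
    except ValueError:
        pass
    try:
        # Alternative form: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

The now parameter exists only to make the date branch testable; in the wrapper you would call parse_retry_after(e.response.headers.get("retry-after")).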

Conclusion

Most "rate limit" errors I've seen from developers in Asia are actually network failures in disguise — and the fix is completely different from what you'd do for a real 429. The diagnostic flow (curl, DNS check, traceroute, SDK logging) takes about five minutes and tells you exactly where the problem lives. Once you know it's a routing or DNS issue, configuring an HTTPS_PROXY in your SDK is a one-liner that usually solves it immediately. Don't spend hours tweaking retry logic or downgrading your API tier when the problem is three hops into your ISP's network.
