Shib™ 🚀

Posted on • Originally published at apistatuscheck.com

Building a Multi-Provider AI Fallback System (OpenAI, Anthropic, Google)

When OpenAI went down on December 11, 2025, thousands of AI applications stopped working. Chatbots froze. Content generators failed. Customer support systems crashed. If your entire business depends on a single AI provider, you're one outage away from disaster.

But it doesn't have to be that way.

In this guide, you'll learn how to build a production-ready multi-provider AI system that automatically fails over between OpenAI, Anthropic, and Google when one goes down—with complete code examples.

Why You Need Multi-Provider AI

The Outage Reality

AI APIs go down more often than you think:

  • OpenAI (December 2025): 4-hour outage affecting ChatGPT and API
  • Anthropic (November 2025): Degraded performance for 6+ hours
  • Google Gemini (October 2025): Complete API outage for 2 hours
  • OpenAI (March 2024): 3+ hour outage during business hours
  • Anthropic (June 2024): Rate limit issues affecting production apps

If you're only using one provider, your availability is capped at their uptime. With three providers and automatic failover, you can achieve 99.99%+ uptime even when individual providers fail.

The Business Case

Cost savings: Route to cheaper providers first, use premium as fallback
Reliability: assuming independent failures, 3 providers with 99.9% uptime each give roughly 99.9999999% combined uptime (nine nines)
Performance: Route to fastest provider based on real-time latency
Compliance: Some regions require data to stay local; multi-provider enables geo-routing
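The reliability figure is easy to verify: the system is only down when every provider is down at once, so the failure probabilities multiply. A small sketch (this assumes provider failures are statistically independent, which correlated outages — shared cloud regions, upstream DNS — can violate in practice):

```typescript
// Combined availability of independent providers:
// the system fails only when all providers fail simultaneously.
function combinedUptime(uptimes: number[]): number {
  const probabilityAllDown = uptimes.reduce((p, uptime) => p * (1 - uptime), 1);
  return 1 - probabilityAllDown;
}

console.log(combinedUptime([0.999, 0.999, 0.999])); // ≈ 0.999999999 (nine nines)
```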

Architecture: Provider Abstraction Layer

The key to multi-provider AI is a unified interface that abstracts away provider differences:

```
Your Application
       ↓
AI Client (Unified Interface)
       ↓
Provider Router
    ↙  ↓  ↘
OpenAI  Anthropic  Google
```

Core Components

  1. Unified Interface: Same method signatures regardless of provider
  2. Provider Adapters: Translate between your interface and each provider's API
  3. Router: Decides which provider to use and handles fallback
  4. Health Monitor: Tracks which providers are healthy
  5. Prompt Translator: Adapts prompts for provider-specific quirks

Building the System: Complete Code

Let's build a production-ready multi-provider AI client in TypeScript.

1. Define the Unified Interface

```typescript
// types.ts
export interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export interface CompletionRequest {
  messages: Message[];
  maxTokens?: number;
  temperature?: number;
  stream?: boolean;
}

export interface CompletionResponse {
  content: string;
  provider: string;
  model: string;
  usage: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

export interface AIProvider {
  name: string;
  complete(request: CompletionRequest): Promise<CompletionResponse>;
  isHealthy(): Promise<boolean>;
}
```

(Full code continues in the original article at apistatuscheck.com)
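Since the full adapter and router implementation is elided here, a minimal, self-contained sketch of the fallback router gives the idea: try providers in order and fall through on any error. The type declarations repeat the unified interface above so the snippet stands alone, and the simple try/catch loop is an assumption about the approach, not the article's exact code.

```typescript
// Minimal types, repeated from types.ts so this sketch is self-contained.
interface Message { role: 'system' | 'user' | 'assistant'; content: string; }
interface CompletionRequest { messages: Message[]; maxTokens?: number; temperature?: number; }
interface CompletionResponse {
  content: string;
  provider: string;
  model: string;
  usage: { promptTokens: number; completionTokens: number; totalTokens: number };
}
interface AIProvider {
  name: string;
  complete(request: CompletionRequest): Promise<CompletionResponse>;
  isHealthy(): Promise<boolean>;
}

// Router sketch: attempt each provider in priority order; any thrown
// error (timeout, 5xx, rate limit) triggers fallback to the next one.
class AIRouter {
  private providers: AIProvider[];

  constructor(options: { providers: AIProvider[] }) {
    this.providers = options.providers;
  }

  async complete(request: CompletionRequest): Promise<CompletionResponse> {
    let lastError: unknown;
    for (const provider of this.providers) {
      try {
        return await provider.complete(request);
      } catch (error) {
        lastError = error; // Remember the failure and try the next provider.
      }
    }
    throw new Error(`All providers failed; last error: ${String(lastError)}`);
  }
}
```

A production version would layer health checks, per-provider timeouts, and circuit breakers on top of this loop.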

Cost Optimization

Route to the cheapest provider first, premium as fallback:

```typescript
interface ProviderPricing {
  inputPer1M: number;  // USD per 1M input tokens
  outputPer1M: number; // USD per 1M output tokens
}

const pricing: Record<string, ProviderPricing> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.00 },      // Gemini 1.5 Pro
  anthropic: { inputPer1M: 3.00, outputPer1M: 15.00 },  // Claude 3.5 Sonnet
  openai: { inputPer1M: 10.00, outputPer1M: 30.00 },    // GPT-4 Turbo
};
```

Result: You use Google (cheapest) by default, fail over to Anthropic if Google is down, and only use OpenAI (most expensive) as a last resort.
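That priority order can be derived from the pricing table instead of hard-coded, by sorting providers on a blended per-token cost. The 3:1 input-to-output token ratio below is an illustrative assumption, not a figure from the article:

```typescript
interface ProviderPricing { inputPer1M: number; outputPer1M: number; }

const pricing: Record<string, ProviderPricing> = {
  google: { inputPer1M: 1.25, outputPer1M: 5.0 },
  anthropic: { inputPer1M: 3.0, outputPer1M: 15.0 },
  openai: { inputPer1M: 10.0, outputPer1M: 30.0 },
};

// Blend input and output prices, assuming ~75% of tokens are input
// (typical for chat workloads; tune the ratio to your own traffic).
function blendedCost(p: ProviderPricing, inputRatio = 0.75): number {
  return p.inputPer1M * inputRatio + p.outputPer1M * (1 - inputRatio);
}

// Cheapest-first provider order for the router.
const orderedByCost = Object.entries(pricing)
  .sort(([, a], [, b]) => blendedCost(a) - blendedCost(b))
  .map(([name]) => name);

console.log(orderedByCost); // → ['google', 'anthropic', 'openai']
```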

Provider Comparison Table

| Provider | Best For | Strengths | Weaknesses | Cost (1M tokens) |
|---|---|---|---|---|
| OpenAI GPT-4 | Complex reasoning, coding | Highest quality, best tool use | Most expensive, frequent outages | $10 input / $30 output |
| Anthropic Claude | Long context, analysis | 200K context, reliable | Slower responses | $3 input / $15 output |
| Google Gemini | Cost efficiency, speed | 1M context, cheapest | Less capable reasoning | $1.25 input / $5 output |
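The per-million prices become concrete when translated into a single request. Using the table's rates and an assumed typical chat turn of 1,000 input and 300 output tokens (an illustrative workload, not a figure from the article):

```typescript
// Estimated USD cost of one chat turn from per-1M-token prices.
function requestCost(
  inputPer1M: number,
  outputPer1M: number,
  inputTokens = 1_000,
  outputTokens = 300,
): number {
  return (inputPer1M * inputTokens + outputPer1M * outputTokens) / 1_000_000;
}

console.log(requestCost(1.25, 5));  // Gemini 1.5 Pro:    $0.00275
console.log(requestCost(3, 15));    // Claude 3.5 Sonnet: $0.0075
console.log(requestCost(10, 30));   // GPT-4 Turbo:       $0.019
```

At these rates a GPT-4-only deployment costs roughly 7x a Gemini-first one per request, which is where the fallback ordering earns its keep.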

Real-World Example: E-commerce Chatbot

Here's how a real e-commerce company uses multi-provider AI:

```typescript
const ai = new AIRouter({
  providers: [
    new GoogleProvider(process.env.GOOGLE_KEY!, 'gemini-1.5-flash'), // Fast, cheap
    new AnthropicProvider(process.env.ANTHROPIC_KEY!), // Reliable
    new OpenAIProvider(process.env.OPENAI_KEY!), // Last resort
  ],
});

// Customer support chatbot
app.post('/api/chat', async (req, res) => {
  const { message, history } = req.body;

  try {
    const response = await ai.complete({
      messages: [
        {
          role: 'system',
          content: 'You are a helpful e-commerce support assistant. Be concise and friendly.',
        },
        ...history,
        { role: 'user', content: message },
      ],
      maxTokens: 300,
      temperature: 0.7,
    });

    res.json({ reply: response.content });
  } catch (error) {
    // All three providers failed — degrade gracefully instead of erroring out.
    res.json({
      reply: "I'm having trouble right now. Please try again in a moment.",
    });
  }
});
```

Result: 99.95% uptime, average cost reduced by 60% (by routing to Gemini first), and zero customer-facing outages in 6 months.

Get Alerted Before Your Users Notice

The best failover system is one that triggers before you even need it. API Status Check monitors OpenAI, Anthropic, Google, and 200+ other APIs in real-time.

Get notified the second an outage begins. Set up intelligent alerts at apistatuscheck.com and never be caught off guard again.


Ready to build bulletproof AI applications? Read the full implementation guide with complete code examples at apistatuscheck.com.
