A simple way to test model fallbacks with RouterBase

#ai #llm #api #tutorial

Fallback logic is easier to reason about when the application sends one request shape and keeps the model choice configurable. That matters most for teams comparing AI providers, latency profiles, or cost budgets.

RouterBase gives developers an OpenAI-compatible API surface at https://routerbase.com/v1, which makes it a good place to prototype fallback behavior before changing a larger production system.

A tiny fallback wrapper

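// Full chat completions endpoint under RouterBase's OpenAI-compatible base URL.
// The global fetch used below requires Node.js 18 or later.
const baseUrl = "https://routerbase.com/v1/chat/completions";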

async function runPrompt(model, prompt) {
  const response = await fetch(baseUrl, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.ROUTERBASE_API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }]
    })
  });

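  // Surface non-2xx responses as errors so the caller can decide to fall back.
  if (!response.ok) {
    throw new Error(`Model ${model} failed with ${response.status}`);
  }

  // Resolves to the parsed JSON response body.
  return response.json();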
}

async function runWithFallback(prompt) {
  const primary = process.env.ROUTERBASE_PRIMARY_MODEL || "google/gemini-2.5-flash";
  const fallback = process.env.ROUTERBASE_FALLBACK_MODEL || "openai/gpt-4.1-mini";

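  try {
    return await runPrompt(primary, prompt);
  } catch (primaryError) {
    // Any failure (HTTP error or network error) triggers the fallback.
    console.warn(primaryError.message);
    // If the fallback also fails, its error propagates to the caller.
    return runPrompt(fallback, prompt);
  }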
}
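
The wrapper above falls back on every error, including ones a second model is unlikely to fix, such as a rejected API key. One possible refinement is to fall back only on statuses that look transient. The sketch below is an assumption layered on the wrapper, not part of RouterBase: ModelRequestError, isRetryableStatus, and shouldFallback are hypothetical names, and the choice of 408, 429, and 5xx as "worth retrying on another model" is a common convention rather than documented RouterBase behavior.

```javascript
// Hypothetical error type that keeps the HTTP status available to callers.
class ModelRequestError extends Error {
  constructor(model, status) {
    super(`Model ${model} failed with ${status}`);
    this.name = "ModelRequestError";
    this.model = model;
    this.status = status;
  }
}

// Assumption: timeouts, rate limits, and server-side errors are worth a
// fallback; 400/401/403 usually mean the request or key is wrong, so a
// second model would likely fail the same way.
function isRetryableStatus(status) {
  return status === 408 || status === 429 || (status >= 500 && status <= 599);
}

// Errors without a status (for example network failures thrown by fetch)
// are treated as retryable in this sketch.
function shouldFallback(error) {
  if (!(error instanceof ModelRequestError)) {
    return true;
  }
  return isRetryableStatus(error.status);
}

console.log(shouldFallback(new ModelRequestError("example/model", 503)));
console.log(shouldFallback(new ModelRequestError("example/model", 401)));
```

To use it, runPrompt would throw a ModelRequestError instead of a plain Error, and the catch block in runWithFallback would rethrow primaryError whenever shouldFallback(primaryError) is false.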

Keep the first test small

Start with a workflow where a fallback is useful but not risky:

Drafting internal release notes
Summarizing long tickets for triage
Creating first-pass documentation outlines
Classifying messages into broad support categories

For each run, record which model responded, whether fallback was used, total latency (including any failed primary attempt), and whether the output needed manual correction. That turns a routing experiment into something the team can evaluate instead of a vague model-preference debate.
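
One way to capture those fields is a thin recording layer around the fallback call. This is a minimal sketch under stated assumptions: runWithFallbackRecorded and toLogLine are hypothetical names, the request function is passed in as a parameter so the sketch runs without network access, and neededCorrection starts as null because a reviewer has to fill it in afterwards.

```javascript
// Sketch: run the primary model, fall back on failure, and return the
// fields needed for evaluation alongside the response.
// `runPrompt` is injected; in the article's code it would be the
// runPrompt(model, prompt) function defined earlier.
async function runWithFallbackRecorded(prompt, runPrompt, models) {
  const startedAt = Date.now();
  let model = models.primary;
  let usedFallback = false;
  let data;

  try {
    data = await runPrompt(model, prompt);
  } catch (primaryError) {
    console.warn(primaryError.message);
    model = models.fallback;
    usedFallback = true;
    // A second failure propagates to the caller.
    data = await runPrompt(model, prompt);
  }

  return {
    model, // the model that actually responded
    usedFallback,
    latencyMs: Date.now() - startedAt, // total, including a failed primary attempt
    neededCorrection: null, // set by a reviewer after reading the output
    data
  };
}

// One JSON object per line is easy to append to a file and analyze later.
// The full response body is dropped to keep each log line small.
function toLogLine(record) {
  const { data, ...fields } = record;
  return JSON.stringify(fields);
}
```

A call such as runWithFallbackRecorded(prompt, runPrompt, { primary, fallback }) returns the record, and toLogLine(record) produces a line that can be appended to a log file for later review.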