DEV Community

David C Cavalcante

KeyMesh: Zero-Runtime-Dependency API Key Rotation, Circuit Breaker and Failover for Production LLM Applications in Node.js

As a solo LLMOps engineer with over 25 years of experience building production AI systems, I constantly faced the same critical failure point: API key rate limits and transient errors breaking LLM-powered applications.

KeyMesh was created to solve exactly this problem.

The Problem

When a single OpenAI, Anthropic, or Gemini API key returns 429 Too Many Requests (or a transient 408 or 5xx error), most applications fail immediately for the user, and recovery depends on manual key rotation or on-call intervention. Existing gateway solutions add network hops, latency, and operational complexity.
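
To make the failure class concrete, here is a minimal sketch of how a client might decide that a response is worth retrying. The function name isTransientStatus is hypothetical and is not taken from KeyMesh; it only encodes the status codes named above.

```typescript
// Hypothetical helper (not KeyMesh's API): treat 429, 408, and any 5xx
// status as transient, i.e. worth retrying or sending through another key.
function isTransientStatus(status: number): boolean {
  return status === 429 || status === 408 || (status >= 500 && status <= 599);
}

// Usage: a 503 is retryable, a 400 is a caller bug and is not.
console.log(isTransientStatus(503), isTransientStatus(400));
```

Anything outside this set (a 400 or a 404, for example) is usually a problem with the request itself, so retrying it with a different key would not help.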

I needed a solution that lives inside the application code itself.

The Solution

KeyMesh (@takk/keymesh) is a universal, zero-runtime-dependency Node.js library and CLI that provides intelligent API key rotation, per-key circuit breakers, smart retries, health scoring, and automatic failover.

It works as a drop-in replacement for official SDKs and supports any HTTP-based API.

KeyMesh is fully TypeScript-first, has 93% test coverage (145 tests), zero runtime dependencies, and ships with SLSA provenance for supply-chain security.

Core Features

  • Automatic key rotation using multiple selection strategies (round-robin, least-used, weighted, sequential-then-rotate, and custom)
  • Per-key circuit breaker with three states (closed, open, half-open)
  • Smart retry with AWS full-jitter exponential backoff and Retry-After support
  • Health scoring system (0-100) that decays on failure and recovers on success
  • In-process telemetry with 8 typed events (no external OpenTelemetry dependency)
  • Pluggable state backends (memory by default, file backend included; Redis/Postgres planned)
  • Auth-failure cooldown (a 401 response disables the key for 24 hours)
  • Official adapters for OpenAI, Anthropic, Gemini, and a generic HTTP adapter
  • CLI proxy mode for easy testing and non-Node.js environments
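
As an illustration of the per-key circuit breaker and health score listed above, the following is a minimal sketch and not KeyMesh's actual implementation. The class name KeyBreaker, the penalty of 25 points per failure, and the recovery of 10 points per success are assumptions chosen for the example; threshold and cooldownMs mirror the option names used in the quickstart.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

// Hypothetical per-key breaker: opens after `threshold` consecutive failures,
// allows a probe (half-open) once `cooldownMs` has elapsed, closes on success.
class KeyBreaker {
  health = 100; // 0-100; the step sizes below are assumptions
  private failures = 0;
  private openedAt = 0;
  private current: BreakerState = 'closed';
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // Resolve the state at time `now` (ms); open becomes half-open after cooldown.
  state(now: number): BreakerState {
    if (this.current === 'open' && now - this.openedAt >= this.cooldownMs) {
      this.current = 'half-open';
    }
    return this.current;
  }

  canAttempt(now: number): boolean {
    return this.state(now) !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.current = 'closed';
    this.health = Math.min(100, this.health + 10);
  }

  recordFailure(now: number): void {
    this.health = Math.max(0, this.health - 25);
    this.failures += 1;
    // A failed probe in half-open re-opens the breaker immediately.
    if (this.current === 'half-open' || this.failures >= this.threshold) {
      this.current = 'open';
      this.openedAt = now;
    }
  }
}

// Usage: three failures open the breaker; after 30 s a probe is allowed.
const breaker = new KeyBreaker(3, 30_000);
for (let i = 0; i < 3; i++) breaker.recordFailure(0);
console.log(breaker.state(0), breaker.state(30_000));
```

Passing the clock in as an argument keeps the sketch deterministic and easy to test; a real implementation would more likely read the time itself.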

Quickstart Examples

1. OpenAI SDK Adapter

import { createKeymesh } from '@takk/keymesh';
import { openaiAdapter } from '@takk/keymesh/openai';

const client = createKeymesh({
  provider: openaiAdapter,
  keys: process.env.OPENAI_API_KEYS?.split(',') ?? [],
  strategy: 'least-used',
  circuitBreaker: { threshold: 3, cooldownMs: 30_000 },
  retry: { max: 5, baseMs: 200, jitter: true },
  telemetry: { enabled: true },
});

// Use exactly like the official OpenAI client
const response = await client.chat.completions.create({
  model: 'gpt-4.1',
  messages: [{ role: 'user', content: 'Hello.' }],
});

2. Generic HTTP Adapter (any API)

import { createKeymesh } from '@takk/keymesh';
import { httpAdapter } from '@takk/keymesh/http';

const tavily = createKeymesh({
  provider: httpAdapter({
    baseUrl: 'https://api.tavily.com',
    authHeader: (key) => ({ Authorization: `Bearer ${key}` }),
  }),
  keys: process.env.TAVILY_API_KEYS?.split(',') ?? [],
  strategy: 'round-robin',
});

const result = await tavily.post('/search', { query: 'AI infrastructure 2026' });

3. CLI Proxy Mode

OPENAI_API_KEYS=key1,key2,key3 npx @takk/keymesh start \
  --port 8787 \
  --adapter openai \
  --strategy round-robin

Then call it like a normal OpenAI endpoint on http://localhost:8787.

How It Works (Request Flow)

  1. Pick key using selected strategy
  2. Dispatch request through the provider adapter
  3. Classify response/error
  4. Update health score and circuit breaker state
  5. Retry with backoff or rotate to next healthy key
  6. Emit telemetry events
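
The flow above can be sketched roughly as follows. This is an illustrative reduction and not KeyMesh's source: the names sendWithFailover, fullJitterDelay, retryAfterMs, and the capMs option are assumptions, key selection is plain round-robin, and steps 4 and 6 (health updates and telemetry) are left out for brevity. The delay uses the full-jitter formula, a uniform random wait between 0 and min(cap, base × 2^attempt), unless the server supplied a Retry-After value.

```typescript
interface Reply { status: number; retryAfter: string | null }
type Send = (key: string) => Promise<Reply>;

const isTransient = (s: number): boolean =>
  s === 429 || s === 408 || (s >= 500 && s <= 599);

// Full jitter: uniform random delay in [0, min(capMs, baseMs * 2^attempt)).
function fullJitterDelay(
  attempt: number, baseMs: number, capMs: number,
  random: () => number = Math.random,
): number {
  return Math.floor(random() * Math.min(capMs, baseMs * 2 ** attempt));
}

// Retry-After is either delay-seconds or an HTTP date; returns ms or null.
function retryAfterMs(header: string | null, now: number = Date.now()): number | null {
  if (header === null || header.trim() === '') return null;
  const seconds = Number(header);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const date = Date.parse(header);
  return Number.isNaN(date) ? null : Math.max(0, date - now);
}

async function sendWithFailover(
  keys: string[],
  send: Send,
  opts: { max: number; baseMs: number; capMs: number },
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<Reply> {
  if (keys.length === 0) throw new Error('no API keys configured');
  let last: Reply | undefined;
  for (let attempt = 0; attempt < opts.max; attempt++) {
    const key = keys[attempt % keys.length]!;      // step 1: pick key (round-robin)
    const reply = await send(key);                 // step 2: dispatch
    if (!isTransient(reply.status)) return reply;  // step 3: classify
    last = reply;
    if (attempt + 1 < opts.max) {                  // step 5: back off, then rotate
      const wait = retryAfterMs(reply.retryAfter)
        ?? fullJitterDelay(attempt, opts.baseMs, opts.capMs);
      await sleep(wait);
    }
  }
  if (last === undefined) throw new Error('opts.max must be at least 1');
  return last; // every attempt failed with a transient status
}
```

Injecting send and sleep keeps the sketch independent of any HTTP client and lets a test run it without real delays.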

In stored state, keys are identified only by their hash. Raw credentials are never logged or persisted.
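
One way to keep raw credentials out of state is to index per-key records by a digest of the key. The sketch below assumes SHA-256 truncated to 16 hex characters; the source does not say which hash KeyMesh uses, and the names fingerprint, KeyStats, and recordUse are hypothetical.

```typescript
import { createHash } from 'node:crypto';

// Derive a stable, non-reversible identifier for an API key.
// SHA-256 and the 16-character truncation are assumptions for illustration.
function fingerprint(apiKey: string): string {
  return createHash('sha256').update(apiKey, 'utf8').digest('hex').slice(0, 16);
}

interface KeyStats { uses: number; failures: number }

// State is keyed by fingerprint, so the raw key never appears in it.
const state = new Map<string, KeyStats>();

function recordUse(apiKey: string, failed: boolean): void {
  const id = fingerprint(apiKey);
  const stats = state.get(id) ?? { uses: 0, failures: 0 };
  stats.uses += 1;
  if (failed) stats.failures += 1;
  state.set(id, stats);
}

// Usage: the logged state contains only the fingerprint and counters.
recordUse('sk-example-not-a-real-key', true);
console.log(JSON.stringify([...state]));
```

API keys are long random strings, so a plain digest is adequate as an identifier here; this would not be appropriate for low-entropy secrets such as passwords.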

Installation

pnpm add @takk/keymesh
# or
npm install @takk/keymesh
# or
yarn add @takk/keymesh
# or
bun add @takk/keymesh

The provider SDKs are optional; install them only if you use the typed adapters.

Why KeyMesh Exists

I built KeyMesh because I got tired of production incidents caused by rate limits. It turns a common point of failure into silent, automatic self-healing.

It is the first piece of a larger family of high-reliability open-source libraries for the AI infrastructure stack that I plan to maintain long-term.

Links

If you work with LLM applications in Node.js, Bun, Deno, or Edge runtimes, I would love your feedback and contributions.

Try KeyMesh today and let me know how it performs in your production environment.
