Production AI applications fail in ways traditional software doesn't. Models go down, tokens run out, responses hallucinate, and rate limits hit at the worst moments. Here's how to build reliable AI-powered systems.
The AI Reliability Problem
Traditional APIs return consistent responses or clear errors. AI APIs introduce new failure modes:
- Model outages: the provider's model goes down
- Rate limits: you exhaust your quota mid-request
- Token limits: your prompt exceeds the context window
- Hallucinations: the model returns plausible but wrong answers
- Timeouts: a request takes too long and hangs
- Invalid JSON: the model returns malformed structured data
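To make these failure modes actionable, it can help to map each one to a named category that the rest of the code can branch on. The sketch below is illustrative only: the `FailureMode` names, the `FailureInfo` shape, and the choice of status 413 for an oversized prompt are assumptions (providers report this differently). Hallucinations are left out because they cannot be detected from transport-level signals.

```typescript
// Hypothetical taxonomy mirroring the list above; all names are illustrative.
type FailureMode =
  | 'model_outage'
  | 'rate_limit'
  | 'token_limit'
  | 'timeout'
  | 'invalid_output'
  | 'unknown';

interface FailureInfo {
  status?: number;       // HTTP status, if the provider returned one
  timedOut?: boolean;    // set by the caller when its own deadline fired
  parseFailed?: boolean; // set by the caller when JSON parsing or validation failed
}

function classifyFailure(info: FailureInfo): FailureMode {
  if (info.timedOut) return 'timeout';
  if (info.parseFailed) return 'invalid_output';
  if (info.status === 429) return 'rate_limit';
  // Assumption: the provider signals an oversized prompt with 413.
  if (info.status === 413) return 'token_limit';
  if (info.status !== undefined && info.status >= 500) return 'model_outage';
  return 'unknown';
}
```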
Retry Logic with Exponential Backoff
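```typescript
// Assumed error shape: the API client throws this with the HTTP status attached.
class AIAPIError extends Error {
  constructor(
    message: string,
    readonly status: number,
    readonly retryAfter?: number // suggested wait in milliseconds, if provided
  ) {
    super(message);
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  fn: () => Promise<T>,
  options: {
    maxRetries?: number;
    baseDelay?: number;
    maxDelay?: number;
    onRetry?: (attempt: number, error: Error) => void;
  } = {}
): Promise<T> {
  const { maxRetries = 3, baseDelay = 1000, maxDelay = 30000, onRetry } = options;

  // One initial attempt plus up to maxRetries retries.
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const isLastAttempt = attempt > maxRetries;
      if (isLastAttempt || !isRetryableError(error)) {
        throw error;
      }
      // Delay doubles on each attempt (1s, 2s, 4s, ...) and is capped at maxDelay.
      const delay = Math.min(baseDelay * Math.pow(2, attempt - 1), maxDelay);
      onRetry?.(attempt, error as Error);
      await sleep(delay);
    }
  }
  throw new Error('Unreachable');
}

function isRetryableError(error: unknown): boolean {
  if (error instanceof AIAPIError) {
    // 429 = rate limit, 500 = server error, 502 = bad gateway, 503 = service unavailable
    return [429, 500, 502, 503].includes(error.status);
  }
  // Network timeout
  return (error as NodeJS.ErrnoException | undefined)?.code === 'ETIMEDOUT';
}
```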
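The backoff above is deterministic, so clients that fail at the same moment also retry at the same moment. A common refinement is to randomize the delay ("full jitter"), which is a swapped-in variation rather than what the code above does. A minimal sketch, where `backoffDelay` and its injectable `random` parameter are hypothetical names added for illustration:

```typescript
// Full-jitter backoff: pick a random delay between 0 and the capped
// exponential ceiling for this attempt.
function backoffDelay(
  attempt: number,                    // 1-based attempt number
  baseDelay = 1000,                   // milliseconds
  maxDelay = 30000,                   // milliseconds
  random: () => number = Math.random  // injectable so tests can be deterministic
): number {
  const ceiling = Math.min(baseDelay * 2 ** (attempt - 1), maxDelay);
  return Math.floor(random() * ceiling);
}
```

Inside `withRetry`, this would replace the `Math.min(...)` expression that computes `delay`.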
Handling Rate Limits Gracefully
```typescript
// Assumes AIAPIError exposes `status` and `retryAfter` (milliseconds) for 429 responses.
type ChatResult = Awaited<ReturnType<ClaudeClient['chat']>>;

async function chatWithRateLimit(
  client: ClaudeClient,
  messages: ChatMessage[],
  options: { maxWait?: number } = {}
): Promise<ChatResult> {
  const { maxWait = 60000 } = options;
  const startTime = Date.now();

  while (true) {
    try {
      return await client.chat(messages);
    } catch (error) {
      // Only rate-limit errors are handled here; everything else propagates.
      if (!(error instanceof AIAPIError) || error.status !== 429) throw error;

      const retryAfter = error.retryAfter || 1000;
      const elapsed = Date.now() - startTime;
      // Give up rather than wait past the caller's total budget.
      if (elapsed + retryAfter > maxWait) {
        throw new Error('Rate limit exceeded, max wait time reached');
      }
      console.log(`Rate limited, waiting ${retryAfter}ms...`);
      await sleep(retryAfter);
    }
  }
}
```
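The code above reads `error.retryAfter` as a millisecond value, but it does not show where that value comes from. If the client has to derive it from the standard HTTP `Retry-After` header, note that the header carries either a number of seconds or an HTTP date. A minimal sketch of that conversion, where `parseRetryAfter` is a hypothetical helper name:

```typescript
// Convert a Retry-After header value to milliseconds.
// Returns undefined when the header is missing or unparseable.
function parseRetryAfter(
  header: string | null,
  now: number = Date.now() // injectable so tests can be deterministic
): number | undefined {
  if (header === null) return undefined;
  const value = header.trim();

  // Form 1: delay in whole seconds, e.g. "120"
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const date = Date.parse(value);
  if (Number.isNaN(date)) return undefined;

  // A date in the past means the caller can retry immediately.
  return Math.max(0, date - now);
}
```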
Structured Output Validation
Models can return malformed JSON, or valid JSON that doesn't match the shape you asked for. Always validate before using the result:
```typescript
import { z } from 'zod';

const CodeReviewSchema = z.object({
  bugs: z.array(z.object({
    line: z.number(),
    severity: z.enum(['low', 'medium', 'high']),
    description: z.string()
  })),
  suggestions: z.array(z.string()),
  score: z.number().min(0).max(10)
});

type CodeReview = z.infer<typeof CodeReviewSchema>;

async function reviewCode(code: string): Promise<CodeReview> {
  const response = await client.chat([
    { role: 'user', content: `Review this code:\n\n${code}\n\nReturn valid JSON.` }
  ]);
  try {
    // JSON.parse catches malformed JSON; the schema catches a wrong shape.
    const parsed = JSON.parse(response.choices[0].message.content);
    return CodeReviewSchema.parse(parsed);
  } catch {
    // Fallback: retry with stricter prompting (reviewCodeWithFallback is not shown here)
    return reviewCodeWithFallback(code);
  }
}
```
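One frequent cause of a failed `JSON.parse` is that the model wraps otherwise valid JSON in a markdown code fence or surrounds it with prose. Before falling back to a second request, it may be worth trying to extract the JSON from the text. A rough sketch, where `extractJson` is a hypothetical helper and the extraction heuristic is an assumption rather than a robust parser:

```typescript
// Three backticks, built at runtime so this example can sit inside a fenced block.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE, 'i');

// Pull a JSON value out of a model response that may contain extra text.
function extractJson(text: string): unknown {
  // Prefer the contents of a fenced block when one is present.
  const fenced = text.match(FENCED_BLOCK);
  const candidate = fenced ? fenced[1] : text;

  // Otherwise take everything from the first opening bracket
  // to the last closing bracket.
  const start = candidate.search(/[{[]/);
  const end = Math.max(candidate.lastIndexOf('}'), candidate.lastIndexOf(']'));
  if (start === -1 || end < start) {
    throw new SyntaxError('No JSON found in model response');
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

The result is still `unknown`, so it should go through `CodeReviewSchema.parse` like any other parsed response.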
Circuit Breaker Pattern
Prevent cascading failures when the AI API is degraded:
```typescript
class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: 'closed' | 'open' | 'half-open' = 'closed';

  constructor(
    private readonly threshold: number = 5,
    private readonly timeout: number = 60000
  ) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailure > this.timeout) {
        // Cool-down elapsed: allow requests through again on a trial basis.
        this.state = 'half-open';
      } else {
        // Fail fast without calling the API.
        throw new Error('Circuit breaker is open');
      }
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess() {
    this.failures = 0;
    this.state = 'closed';
  }

  private onFailure() {
    this.failures++;
    this.lastFailure = Date.now();
    // The count is not reset while half-open, so a failed trial reopens the circuit.
    if (this.failures >= this.threshold) {
      this.state = 'open';
    }
  }
}

// Usage
const breaker = new CircuitBreaker(5, 60000);
const result = await breaker.execute(() => client.chat(messages));
```
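When the breaker is open, `execute` throws immediately, so the caller still needs something to return. One option is a small wrapper that substitutes a degraded response. This is a sketch under assumptions: `withFallback` is a hypothetical helper, and what counts as an acceptable fallback (a cached reply, a canned message) depends on the application.

```typescript
// Run the primary operation; on any failure, return the fallback value instead.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: (error: unknown) => T | Promise<T>
): Promise<T> {
  try {
    return await primary();
  } catch (error) {
    // Any failure lands here, including "Circuit breaker is open".
    return fallback(error);
  }
}
```

With the breaker above, a call might look like `withFallback(() => breaker.execute(() => client.chat(messages)), () => cachedReply)`, where `cachedReply` is a hypothetical stored response.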
Timeout Strategy
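Put an explicit deadline on every request so a stalled call cannot hang indefinitely. `fetch` has no separate connection timeout, so a single abort signal bounds the whole request: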
```typescript
const response = await fetch(url, {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify(payload),
  signal: AbortSignal.timeout(120000) // 2 minute timeout
});
```
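The same deadline idea can be applied to any async operation, not only `fetch`, and it helps to raise a distinct error type so retry logic can tell a timeout apart from other failures. A minimal sketch, where `withTimeout` and `DeadlineExceededError` are hypothetical names introduced for illustration:

```typescript
class DeadlineExceededError extends Error {
  constructor(readonly ms: number) {
    super(`Operation exceeded ${ms}ms deadline`);
    this.name = 'DeadlineExceededError';
  }
}

// Run an operation with a deadline. The operation receives an AbortSignal
// it can pass on (for example to fetch) so work is cancelled on timeout.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  ms: number
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      // Reject first so the caller sees DeadlineExceededError,
      // then abort the underlying work.
      reject(new DeadlineExceededError(ms));
      controller.abort();
    }, ms);
  });

  try {
    return await Promise.race([run(controller.signal), deadline]);
  } finally {
    clearTimeout(timer); // avoid a stray timer when the operation wins
  }
}
```

A call such as `withTimeout((signal) => fetch(url, { signal }), 120000)` would then reject with `DeadlineExceededError` if the request runs past two minutes.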
Cost Control
Prevent runaway costs with token budgets:
```typescript
class TokenBudget {
  private spent = 0;

  constructor(
    private readonly maxBudget: number,
    private readonly costPerToken: number
  ) {}

  // The caller supplies the token estimate (prompt plus expected output),
  // since it cannot be derived from the function itself.
  async executeWithBudget<T>(estimatedTokens: number, fn: () => Promise<T>): Promise<T> {
    const estimatedCost = this.estimateCost(estimatedTokens);
    if (this.spent + estimatedCost > this.maxBudget) {
      throw new Error(`Budget exceeded. Spent: ${this.spent}, Max: ${this.maxBudget}`);
    }
    const result = await fn();
    // Recorded only after the call succeeds; this is an estimate, not billed usage.
    this.spent += estimatedCost;
    return result;
  }

  private estimateCost(estimatedTokens: number): number {
    return estimatedTokens * this.costPerToken;
  }

  getSpent(): number {
    return this.spent;
  }
}
```
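A budget check is only as good as its cost estimate. Without the provider's tokenizer, a rough input-size estimate is one option. The sketch below assumes about four characters per token, a common rule of thumb for English text that varies by model and language; `estimateTokens`, `estimateRequestCost`, and `CHARS_PER_TOKEN` are hypothetical names.

```typescript
// Assumption: roughly 4 characters per token for English text.
// Real tokenizers differ, so treat the result as a ballpark figure.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Estimate the cost of one request: prompt tokens plus the
// maximum number of output tokens the request allows.
function estimateRequestCost(
  prompt: string,
  maxOutputTokens: number,
  costPerToken: number
): number {
  return (estimateTokens(prompt) + maxOutputTokens) * costPerToken;
}
```

Where the API response reports actual token usage, recording that figure after each call would be more accurate than the estimate.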
Building Reliable AI Applications
The key insight: AI APIs require defensive programming at a level traditional APIs don't. Layer retry logic, circuit breakers, validation, and cost controls to build systems that degrade gracefully rather than fail catastrophically.
Get started with reliable AI API access: ofox.ai
This article contains affiliate links.
Tags: api,error-handling,programming,developer,reliability
Canonical URL: https://dev.to/zny10289