Tyson Cung
The TypeScript SDK - Making It Developer-Friendly

My AI platform worked perfectly. Lambda tools, ECS agents, streaming responses. But integrating it into frontend applications? Pure agony.

Raw HTTP calls scattered everywhere. Manual JSON parsing breaking on edge cases. Zero type safety, so temperature: "0.5" passed silently instead of temperature: 0.5. Error handling that assumed network calls never fail.

Three developers tried to integrate the platform. All three gave up on streaming because parsing Server-Sent Events manually is a nightmare.

I needed an SDK that followed one rule: If your SDK needs documentation beyond IntelliSense, your SDK is wrong.

Why Build an SDK at All?

Working with AI APIs through raw HTTP is painful for several reasons:

Type Safety: Without types, developers make mistakes. They send temperature: "0.5" instead of temperature: 0.5 and wonder why their responses are weird.

Streaming: Server-Sent Events are a nightmare to implement correctly. I've seen developers give up on streaming entirely rather than deal with parsing SSE chunks.

Error Handling: AI APIs fail in creative ways. Network timeouts, rate limits, model overloads, context length exceeded - each needs different handling strategies.

Authentication: Managing API keys, rotating tokens, handling BYOK (Bring Your Own Key) scenarios.

Discoverability: Without good IntelliSense, developers resort to copy-pasting from docs and hope for the best.

I've used dozens of API SDKs over the years. The good ones feel invisible - you just write code and it works. The bad ones require constant trips to documentation.
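A typed request interface catches the `temperature: "0.5"` class of mistake at compile time. For callers coming from plain JavaScript, the same check can be sketched as a runtime guard (illustrative only; not part of the actual SDK):

```typescript
// Illustrative runtime guard mirroring what the type system enforces:
// temperature must be a number in the API's documented [0, 2] range.
function validateTemperature(t: unknown): number {
  if (typeof t !== 'number' || Number.isNaN(t) || t < 0 || t > 2) {
    throw new TypeError(
      `temperature must be a number in [0, 2], got ${JSON.stringify(t)}`
    );
  }
  return t;
}
```

With TypeScript, this guard is unnecessary: the compiler rejects `temperature: "0.5"` before the request ever leaves the editor.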

Contract-First Development with OpenAPI

The first decision: generate everything from OpenAPI specs instead of hand-writing types. I learned this lesson the hard way when our API evolved and the SDK fell behind.

Here's my OpenAPI spec for the core completion endpoint:

/v1/complete:
  post:
    summary: Generate AI completion
    requestBody:
      required: true
      content:
        application/json:
          schema:
            type: object
            properties:
              messages:
                type: array
                items:
                  type: object
                  properties:
                    role:
                      type: string
                      enum: [system, user, assistant]
                    content:
                      type: string
              provider:
                type: string
                enum: [openai, anthropic, bedrock]
              model:
                type: string
              temperature:
                type: number
                minimum: 0
                maximum: 2
              stream:
                type: boolean
                default: false
    responses:
      200:
        description: Completion response
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CompletionResponse'
          text/event-stream:
            schema:
              type: string

I use @apidevtools/swagger-parser to validate the spec, then openapi-generator with the typescript-fetch generator to produce TypeScript interfaces:

npx @openapitools/openapi-generator-cli generate \
  -i openapi.yaml \
  -g typescript-fetch \
  -o src/generated \
  --additional-properties=typescriptThreePlus=true

This generates perfect TypeScript types that stay in sync with my API automatically.
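For the spec above, the generated request type looks roughly like this (a hand-written approximation; the exact shape depends on the generator version and templates):

```typescript
// Approximation of the generator's output for the /v1/complete request body
export interface GeneratedCompletionRequest {
  messages?: Array<{
    role?: 'system' | 'user' | 'assistant';
    content?: string;
  }>;
  provider?: 'openai' | 'anthropic' | 'bedrock';
  model?: string;
  /** The spec's minimum/maximum (0-2) become documentation, not types */
  temperature?: number;
  /** Defaults to false server-side */
  stream?: boolean;
}

// The compiler now rejects typos and wrong primitive types:
const req: GeneratedCompletionRequest = {
  messages: [{ role: 'user', content: 'Hello' }],
  temperature: 0.5 // temperature: "0.5" would fail to compile
};
```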

SDK Design Philosophy: Options Over Builders

I had to choose between two common patterns:

Builder Pattern:

const response = await client
  .complete()
  .withModel('gpt-4')
  .withTemperature(0.7)
  .withStream(true)
  .execute();

Options Object:

const response = await client.complete({
  model: 'gpt-4',
  temperature: 0.7,
  stream: true
});

I chose options objects for several reasons:

  1. Destructuring support - you can spread configuration objects
  2. Conditional parameters - easier to build options dynamically
  3. Less cognitive overhead - one method call instead of a chain
  4. Better TypeScript inference - the compiler can validate the entire options object at once
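The spread-friendly shape means per-environment defaults and conditional overrides compose naturally. A sketch (the `CompleteOptions` interface here mirrors the request type shown below):

```typescript
interface CompleteOptions {
  messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
  model?: string;
  temperature?: number;
  stream?: boolean;
}

// Shared defaults can live in one object and be spread everywhere
const defaults: Omit<CompleteOptions, 'messages'> = {
  model: 'gpt-4',
  temperature: 0.7
};

// Conditional parameters are just conditional spreads
function buildOptions(
  messages: CompleteOptions['messages'],
  creative: boolean
): CompleteOptions {
  return {
    ...defaults,
    messages,
    ...(creative ? { temperature: 1.2 } : {})
  };
}
```

A builder chain can't be assembled this way without intermediate variables and branching.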

Here's the core client interface:

import createClient from 'openapi-fetch';
import type { paths } from './generated/schema';
import type { CompletionRequest, CompletionResponse, EmbeddingRequest, EmbeddingResponse } from './types';

export interface AIGatewayOptions {
  baseUrl: string;
  apiKey?: string;
  timeout?: number;
}

export class AIGateway {
  private client: ReturnType<typeof createClient<paths>>;
  private baseUrl: string;
  private apiKey?: string;
  private timeout: number;

  constructor(options: AIGatewayOptions) {
    this.baseUrl = options.baseUrl.replace(/\/$/, '');
    this.apiKey = options.apiKey;
    this.timeout = options.timeout || 30000;

    this.client = createClient<paths>({
      baseUrl: this.baseUrl,
      headers: {
        'Content-Type': 'application/json',
        ...(this.apiKey ? { 'X-API-Key': this.apiKey } : {})
      }
    });
  }

  async complete(req: CompletionRequest): Promise<CompletionResponse> {
    const { data, error } = await this.client.POST('/v1/complete', {
      body: { ...req, stream: false },
      signal: AbortSignal.timeout(this.timeout)
    });
    if (error) throw new Error(`Gateway error: ${JSON.stringify(error)}`);
    return data as CompletionResponse;
  }

  async *stream(req: Omit<CompletionRequest, 'stream'>): AsyncGenerator<string> {
    const headers: Record<string, string> = {
      'Content-Type': 'application/json'
    };
    if (this.apiKey) {
      headers['X-API-Key'] = this.apiKey;
    }

    const response = await fetch(`${this.baseUrl}/v1/complete`, {
      method: 'POST',
      headers,
      body: JSON.stringify({ ...req, stream: true })
    });

    if (!response.ok) {
      const err = await response.json().catch(() => ({ error: response.statusText }));
      throw new Error(`Gateway error ${response.status}: ${err.error}`);
    }

    yield* parseSSEStream(response);
  }

  async embed(req: EmbeddingRequest): Promise<EmbeddingResponse> {
    const { data, error } = await this.client.POST('/v1/embed', {
      body: req,
      signal: AbortSignal.timeout(this.timeout)
    });
    if (error) throw new Error(`Gateway error: ${JSON.stringify(error)}`);
    return data as EmbeddingResponse;
  }
}

export interface CompletionRequest {
  messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
  model?: string;
  temperature?: number;
  maxTokens?: number;
  provider?: 'openai' | 'anthropic' | 'bedrock';
  apiKey?: string; // BYOK support
}

export interface CompletionResponse {
  id: string;
  content: string;
  provider: string;
  model: string;
  usage: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
  metadata: {
    latency: number;
    cost?: number;
    region: string;
  };
}

Streaming That Actually Works

Streaming AI responses is hard to get right. I've seen developers give up because they couldn't parse Server-Sent Events correctly. My SDK handles all the complexity:

// streaming.ts - Server-Sent Event parsing
export async function* parseSSEStream(response: Response): AsyncGenerator<string> {
  const reader = response.body?.getReader();
  if (!reader) {
    throw new Error('Response body is not readable');
  }

  const decoder = new TextDecoder();
  let buffer = '';

  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop() || '';

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = line.slice(6).trim();

          if (data === '[DONE]') {
            return;
          }

          try {
            const parsed = JSON.parse(data);
            if (parsed.content) {
              yield parsed.content;
            }
          } catch {
            // Skip malformed JSON chunks
            continue;
          }
        }
      }
    }
  } finally {
    reader.releaseLock();
  }
}

// Usage in AIGateway class
async *stream(req: Omit<CompletionRequest, 'stream'>): AsyncGenerator<string> {
  const headers: Record<string, string> = {
    'Content-Type': 'application/json'
  };
  if (this.apiKey) {
    headers['X-API-Key'] = this.apiKey;
  }

  const response = await fetch(`${this.baseUrl}/v1/complete`, {
    method: 'POST',
    headers,
    body: JSON.stringify({ ...req, stream: true })
  });

  if (!response.ok) {
    const err = await response.json().catch(() => ({ error: response.statusText }));
    throw new Error(`Gateway error ${response.status}: ${err.error}`);
  }

  yield* parseSSEStream(response);
}

Usage becomes trivial:

// Stream a completion
for await (const chunk of client.stream({
  messages: [{ role: 'user', content: 'Write a story about...' }],
  model: 'gpt-4'
})) {
  process.stdout.write(chunk); // stream() yields plain string chunks
}
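One subtlety the buffer handles: a JSON payload can arrive split across two network reads. A self-contained sanity check (assuming Node 18+ globals for `ReadableStream` and `Response`; the parser is a condensed copy of `parseSSEStream` above):

```typescript
// Condensed copy of the SSE parser, for a standalone demo
async function* parseSSE(response: Response): AsyncGenerator<string> {
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() || ''; // keep the incomplete tail for the next read
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = line.slice(6).trim();
      if (data === '[DONE]') return;
      try {
        const parsed = JSON.parse(data);
        if (parsed.content) yield parsed.content;
      } catch { /* skip malformed chunks */ }
    }
  }
}

// Simulate a payload split mid-JSON across two reads
async function demo(): Promise<string[]> {
  const enc = new TextEncoder();
  const body = new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(enc.encode('data: {"cont'));      // partial JSON
      controller.enqueue(enc.encode('ent":"Hello"}\n\n')); // completes it
      controller.enqueue(enc.encode('data: [DONE]\n\n'));
      controller.close();
    }
  });
  const chunks: string[] = [];
  for await (const chunk of parseSSE(new Response(body))) chunks.push(chunk);
  return chunks; // → ['Hello']
}
```

Without the `buffer = lines.pop()` line, the first read's partial `data: {"cont` would be dropped and the chunk lost.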

Error Handling That Makes Sense

AI APIs fail in predictable ways. Instead of generic HTTP errors, I provide specific error types with actionable information:

export abstract class AIGatewayError extends Error {
  abstract readonly code: string;
  abstract readonly retryable: boolean;

  constructor(
    message: string,
    public readonly statusCode?: number,
    public readonly details?: Record<string, unknown>
  ) {
    super(message);
    this.name = this.constructor.name;
  }
}

export class RateLimitError extends AIGatewayError {
  readonly code = 'RATE_LIMIT_EXCEEDED';
  readonly retryable = true;

  constructor(
    public readonly retryAfter: number,
    details?: Record<string, unknown>
  ) {
    super(`Rate limit exceeded. Retry after ${retryAfter} seconds.`, 429, details);
  }
}

export class ContextLengthError extends AIGatewayError {
  readonly code = 'CONTEXT_LENGTH_EXCEEDED';
  readonly retryable = false;

  constructor(
    public readonly maxTokens: number,
    public readonly actualTokens: number
  ) {
    super(`Context length exceeded: ${actualTokens} > ${maxTokens} tokens`, 400);
  }
}

export class ModelUnavailableError extends AIGatewayError {
  readonly code = 'MODEL_UNAVAILABLE';
  readonly retryable = true;

  constructor(public readonly model: string) {
    super(`Model ${model} is currently unavailable`, 503);
  }
}

export class BudgetExceededError extends AIGatewayError {
  readonly code = 'BUDGET_EXCEEDED';
  readonly retryable = false;

  constructor(
    public readonly currentSpend: number,
    public readonly limit: number
  ) {
    super(`Monthly budget exceeded: $${currentSpend} > $${limit}`, 402);
  }
}

The HTTP client automatically converts API errors to typed exceptions:

// Concrete fallback for API errors without a more specific type
export class GatewayError extends AIGatewayError {
  readonly code = 'GATEWAY_ERROR';
  readonly retryable = false;
}

private async handleResponse<T>(response: Response): Promise<T> {
  const body = await response.json().catch(() => ({}));

  if (response.status >= 400) {
    switch (body.code) {
      case 'rate_limit_exceeded':
        throw new RateLimitError(body.retry_after, body);
      case 'context_length_exceeded':
        throw new ContextLengthError(body.max_tokens, body.actual_tokens);
      case 'model_unavailable':
        throw new ModelUnavailableError(body.model);
      default:
        throw new GatewayError(body.message ?? response.statusText, response.status, body);
    }
  }

  return body as T;
}

Retry Logic with Exponential Backoff

Retries are built into the SDK with sensible defaults:

export class RetryHandler {
  constructor(
    private maxRetries: number = 3,
    private baseDelay: number = 1000,
    private maxDelay: number = 30000
  ) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    let lastError: Error = new Error('Retry handler exhausted');

    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      try {
        return await fn();
      } catch (error) {
        lastError = error as Error;

        if (error instanceof AIGatewayError && !error.retryable) {
          throw error; // Don't retry non-retryable errors
        }

        if (attempt === this.maxRetries) {
          throw error; // Last attempt failed
        }

        const delay = Math.min(
          this.baseDelay * Math.pow(2, attempt),
          this.maxDelay
        );

        if (error instanceof RateLimitError) {
          // Respect the API's rate limit guidance
          await this.sleep(error.retryAfter * 1000);
        } else {
          await this.sleep(delay);
        }
      }
    }

    throw lastError;
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
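The delay schedule is worth sanity-checking: with the defaults, attempts back off as 1s, 2s, 4s, 8s, 16s, then hit the 30s cap. The formula in isolation:

```typescript
// Same formula as RetryHandler.execute, extracted for inspection
function backoffDelay(attempt: number, baseDelay = 1000, maxDelay = 30000): number {
  return Math.min(baseDelay * Math.pow(2, attempt), maxDelay);
}

const schedule = [0, 1, 2, 3, 4, 5].map(a => backoffDelay(a));
// → [1000, 2000, 4000, 8000, 16000, 30000] (the cap kicks in at attempt 5)
```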

BYOK: Bring Your Own Key

Many users want to use their own API keys for cost control or compliance. The SDK supports this seamlessly:

// Global API key (uses platform credits)
const client = new AIGateway({
  baseUrl: 'https://gateway.example.com', // illustrative
  apiKey: 'your-platform-key'
});

// Per-request API key (BYOK)
await client.complete({
  messages: [{ role: 'user', content: 'Hello' }],
  provider: 'openai',
  apiKey: 'sk-user-openai-key' // Use user's own OpenAI key
});

// Environment-based keys
const client = new AIGateway({
  baseUrl: 'https://gateway.example.com', // illustrative
  apiKeys: {
    openai: process.env.OPENAI_API_KEY,
    anthropic: process.env.ANTHROPIC_API_KEY
  }
});

The platform routes BYOK requests directly to the provider, so users get:

  • Their own rate limits
  • Direct billing relationship
  • Full control over their keys
  • Same SDK interface
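Server-side, the routing decision reduces to key precedence. A sketch of how a gateway might resolve which credential to use (illustrative; `resolveApiKey` is hypothetical, not the platform's actual code):

```typescript
// Hypothetical credential resolution: a per-request BYOK key wins,
// then a configured per-provider key, then the platform's own credits.
function resolveApiKey(
  requestKey: string | undefined,
  providerKeys: Record<string, string | undefined>,
  provider: string,
  platformKey: string
): { key: string; billing: 'user' | 'platform' } {
  if (requestKey) return { key: requestKey, billing: 'user' };
  const envKey = providerKeys[provider];
  if (envKey) return { key: envKey, billing: 'user' };
  return { key: platformKey, billing: 'platform' };
}
```

The `billing` tag captures the main consequence: BYOK requests bill the user's provider account directly, everything else draws on platform credits.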

Testing with Mock Providers

Testing AI applications is hard because real API calls are slow and expensive. I built a mock provider for unit tests:

export class MockProvider implements Provider {
  private responses: Map<string, CompletionResponse> = new Map();

  setMockResponse(input: string, response: CompletionResponse): void {
    this.responses.set(input, response);
  }

  async complete(options: CompleteOptions): Promise<CompletionResponse> {
    const key = this.hashInput(options.messages);
    const mockResponse = this.responses.get(key);

    if (!mockResponse) {
      throw new Error(`No mock response configured for input: ${key}`);
    }

    // Simulate API latency
    await new Promise(resolve => setTimeout(resolve, 100));

    return {
      ...mockResponse,
      metadata: {
        ...mockResponse.metadata,
        latency: 100
      }
    };
  }

  // Key lookups by the last user message so setMockResponse('Hello', ...)
  // matches messages: [{ role: 'user', content: 'Hello' }]
  private hashInput(messages: Array<{ role: string; content: string }>): string {
    return messages[messages.length - 1]?.content ?? '';
  }
}

// In tests
const mockProvider = new MockProvider();
mockProvider.setMockResponse('Hello', {
  id: 'test-123',
  content: 'Hello! How can I help you today?',
  provider: 'mock',
  model: 'test-model',
  usage: { promptTokens: 1, completionTokens: 8, totalTokens: 9 },
  metadata: { latency: 100, region: 'test' }
});

const client = new AIGateway({ baseUrl: 'http://localhost', provider: mockProvider });

Real-World Usage Examples

Here's how developers actually use the SDK in our applications:

Chat Application:

import { AIGateway } from '@ai-platform/sdk';

const client = new AIGateway({
  baseUrl: process.env.AI_GATEWAY_URL!,
  apiKey: process.env.AI_PLATFORM_KEY
});

export async function getChatResponse(messages: Message[]): Promise<string> {
  try {
    const response = await client.complete({
      messages,
      model: 'gpt-4',
      temperature: 0.7
    });

    return response.content;
  } catch (error) {
    if (error instanceof ContextLengthError) {
      // Truncate conversation history and retry
      const truncated = messages.slice(-10);
      return getChatResponse(truncated);
    }

    throw error;
  }
}

Streaming Chat:

export async function* streamChatResponse(
  messages: Message[]
): AsyncGenerator<string, void, unknown> {
  try {
    for await (const chunk of client.stream({
      messages,
      model: 'gpt-4',
      temperature: 0.7
    })) {
      yield chunk; // stream() yields plain string chunks
    }
  } catch (error) {
    if (error instanceof RateLimitError) {
      yield `Rate limit exceeded. Retrying in ${error.retryAfter} seconds...`;
      await new Promise(resolve => setTimeout(resolve, error.retryAfter * 1000));
      yield* streamChatResponse(messages);
    } else {
      throw error;
    }
  }
}

Agent Workflows:

export async function researchTopic(topic: string): Promise<ResearchResult> {
  const response = await client.agent.run({
    type: 'research',
    input: { topic },
    tools: ['search', 'summarize', 'extract'],
    humanApproval: true
  });

  return {
    summary: response.summary,
    sources: response.sources,
    confidence: response.metadata.confidence
  };
}

Bundle Size and Tree Shaking

Modern applications care about bundle size. The SDK is designed for optimal tree shaking:

// Only import what you need
import { complete } from '@ai-platform/sdk/complete';
import { embed } from '@ai-platform/sdk/embed';

// Full client (if you need everything)
import { AIGateway } from '@ai-platform/sdk';

The package exports are configured for maximum tree shaking:

{
  "exports": {
    ".": {
      "import": "./dist/index.esm.js",
      "require": "./dist/index.cjs.js"
    },
    "./complete": {
      "import": "./dist/complete.esm.js",
      "require": "./dist/complete.cjs.js"
    },
    "./embed": {
      "import": "./dist/embed.esm.js", 
      "require": "./dist/embed.cjs.js"
    }
  }
}

Documentation That Developers Actually Read

I follow a simple documentation principle: Examples first, reference second.

Every method has a practical example in the JSDoc:

/**
 * Generate a text completion using AI models
 *
 * @example
 * ```typescript
 * const response = await client.complete({
 *   messages: [{ role: 'user', content: 'Write a haiku about TypeScript' }],
 *   model: 'gpt-4',
 *   temperature: 0.8
 * });
 *
 * console.log(response.content); // AI-generated haiku
 * ```
 *
 * @example BYOK (Bring Your Own Key)
 * ```typescript
 * const response = await client.complete({
 *   messages: [{ role: 'user', content: 'Hello' }],
 *   provider: 'openai',
 *   apiKey: 'sk-your-openai-key' // Use your own key
 * });
 * ```
 */
async complete(options: CompleteOptions): Promise<CompletionResponse>

The Results

After 6 months with the SDK in production:

Developer Experience Metrics:

  • Integration time: down from 2 hours to 15 minutes
  • Support tickets: 60% reduction
  • Bug reports related to API usage: 85% reduction

Adoption:

  • 15 internal applications using the SDK
  • 3 external partners building on the platform
  • 95% of new integrations use the SDK vs raw HTTP

Performance:

  • Bundle size: 45KB gzipped (with tree shaking)
  • Streaming latency overhead: <5ms
  • Error recovery success rate: 92%

The SDK transformed our AI platform from infrastructure to product. Developers don't think about HTTP calls, error handling, or streaming complexity anymore. They just write business logic.

What's Next

The complete SDK code is available at:

Part 7 covers the production nightmares: cost tracking, authentication, security. Great developer tools mean nothing if they bankrupt your company.


Part 6 of 8 in "Building an AI Platform on AWS from Scratch". Everything I learned building production AI infrastructure - including the expensive mistakes.
