Midas126

The AI Engineer's Toolkit: Building a Production-Ready Mocking Layer

Why Your AI Project Needs a Mocking Strategy

You’ve just integrated a cutting-edge Large Language Model (LLM) into your application. The prototype works magically. But when you try to run your test suite, you hit a wall: latency, rate limits, and unpredictable costs from the AI provider's API. Your development velocity grinds to a halt, and testing in CI/CD becomes a financial and logistical nightmare. This is the reality of modern AI development, and it’s why a robust mocking strategy isn't a luxury—it's a necessity for production-grade systems.

While tools like AIMock offer a fantastic starting point, this guide dives deeper. We'll build a programmable, multi-purpose mocking layer from the ground up. This approach gives you fine-grained control for unit testing, integration testing, and local development, ensuring your AI features are as reliable and testable as any other part of your codebase.

Beyond Simple Stubs: The Anatomy of an AI Mock

A simple HTTP stub that returns a fixed JSON response is insufficient for AI APIs. We need to simulate their unique behaviors:

  1. Structured Outputs: Mimicking JSON mode or function calling responses.
  2. Streaming: Simulating Server-Sent Events (SSE) for token-by-token responses.
  3. Non-Determinism: Injecting controlled randomness for testing edge cases.
  4. Error Simulation: Reproducing specific API errors (rate limits, context overflows).

Let's architect a mock server that can handle these scenarios.

Core Concept: The Mock Router

We'll create a central router that intercepts requests to AI provider endpoints (like api.openai.com/v1/chat/completions) and delegates them to handler functions based on the request path and configured mode.

Here’s a conceptual setup using Node.js and Express, but the pattern applies to any stack:

// mockAIProvider.js
const express = require('express');
const app = express();
app.use(express.json());

const MOCK_MODE = process.env.AI_MOCK_MODE || 'dynamic'; // 'static', 'dynamic', 'error'

app.post('/v1/chat/completions', async (req, res) => {
  const { stream } = req.body;

  // Streamed requests bypass the mode switch entirely
  if (stream) {
    return handleStreamingCompletion(req, res);
  }

  // Route to the appropriate handler
  switch (MOCK_MODE) {
    case 'static':
      return handleStaticCompletion(req, res);
    case 'error':
      return handleErrorResponse(req, res);
    case 'dynamic':
    default:
      return handleDynamicCompletion(req, res);
  }
});

// Handler for static, predictable responses (ideal for unit tests)
function handleStaticCompletion(req, res) {
  const staticResponse = {
    id: 'mock_123',
    object: 'chat.completion',
    created: Math.floor(Date.now() / 1000), // OpenAI timestamps are in seconds
    model: req.body.model || 'gpt-3.5-turbo',
    choices: [{
      index: 0,
      message: { role: 'assistant', content: 'This is a static mock response.' },
      finish_reason: 'stop'
    }],
    usage: { prompt_tokens: 10, completion_tokens: 5, total_tokens: 15 }
  };
  res.json(staticResponse);
}

// Handler for dynamic, context-aware responses (for integration tests)
function handleDynamicCompletion(req, res) {
  const lastMessage = req.body.messages?.slice(-1)[0]?.content || '';
  const dynamicContent = `Mock AI analyzed your request: "${lastMessage.substring(0, 50)}...". This is a dynamic response.`;

  const dynamicResponse = {
    id: `mock_${Date.now()}`,
    object: 'chat.completion',
    created: Math.floor(Date.now() / 1000),
    model: req.body.model || 'gpt-3.5-turbo',
    choices: [{
      index: 0,
      message: { role: 'assistant', content: dynamicContent },
      finish_reason: 'stop'
    }],
  };
  res.json(dynamicResponse);
}

app.listen(3001, () => console.log('AI Mock Server running on port 3001'));
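The router dispatches to a `handleErrorResponse` that we haven't defined yet. Here's a minimal sketch: the body shape follows OpenAI's error envelope, but the `AI_MOCK_ERROR` variable and the exact messages are assumptions for illustration.

```javascript
// Hypothetical env var selecting which failure to simulate
const MOCK_ERROR = process.env.AI_MOCK_ERROR || 'rate_limit';

// Maps an error kind to an HTTP status and an OpenAI-style error body
function buildErrorPayload(kind) {
  const errors = {
    rate_limit: {
      status: 429,
      body: { error: { message: 'Rate limit exceeded.', type: 'rate_limit_error', code: 'rate_limit_exceeded' } }
    },
    context_overflow: {
      status: 400,
      body: { error: { message: 'Maximum context length exceeded.', type: 'invalid_request_error', code: 'context_length_exceeded' } }
    },
    server_error: {
      status: 500,
      body: { error: { message: 'The server had an error.', type: 'server_error', code: null } }
    }
  };
  return errors[kind] || errors.server_error;
}

function handleErrorResponse(req, res) {
  const { status, body } = buildErrorPayload(MOCK_ERROR);
  res.status(status).json(body);
}
```

Because the payload lookup is a pure function, you can assert on it directly in unit tests without spinning up the server.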

Leveling Up: Simulating Streaming Responses

Streaming is critical for UX in AI apps. Mocking it allows you to test your UI's loading states and chunk-rendering logic. We can simulate SSE:

// In your mock server, add a stream handler
function handleStreamingCompletion(req, res) {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  const mockTokens = ['Hello', ',', ' world', '!', ' This', ' streams', '.'];
  let index = 0;

  // Stop streaming if the client disconnects mid-stream
  req.on('close', () => clearInterval(intervalId));

  const intervalId = setInterval(() => {
    if (index < mockTokens.length) {
      const chunk = {
        id: `mock_${Date.now()}`,
        object: 'chat.completion.chunk',
        choices: [{
          delta: { content: mockTokens[index] },
          index: 0,
          finish_reason: null
        }]
      };
      res.write(`data: ${JSON.stringify(chunk)}\n\n`);
      index++;
    } else {
      const doneChunk = {
        choices: [{ delta: {}, index: 0, finish_reason: 'stop' }]
      };
      res.write(`data: ${JSON.stringify(doneChunk)}\n\n`);
      res.write('data: [DONE]\n\n'); // OpenAI streams end with a [DONE] sentinel
      clearInterval(intervalId);
      res.end();
    }
  }, 50); // Simulate ~50ms per token
}
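On the consuming side, your tests need to reassemble those chunks. A small parser like this (a sketch; the `parseSSETokens` name is mine, not a library API) turns raw SSE text into the token list your UI would render:

```javascript
// Parses raw SSE text into an array of content tokens.
// Assumes the OpenAI-style chunk shape emitted by the mock above.
function parseSSETokens(raw) {
  const tokens = [];
  for (const event of raw.split('\n\n')) {
    if (!event.startsWith('data: ')) continue;
    const payload = event.slice('data: '.length);
    if (payload === '[DONE]') break; // OpenAI's end-of-stream sentinel
    const content = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof content === 'string') tokens.push(content);
  }
  return tokens;
}
```

Feeding it the mock stream's output lets you assert on the fully assembled message (`parseSSETokens(raw).join('')`) instead of inspecting chunks by hand.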

Implementing a "Scenario Registry" for Complex Testing

For integration tests, you need to orchestrate specific sequences of AI behavior. A scenario registry allows you to pre-program responses based on the input.

// scenarioRegistry.js
class AIMockScenarioRegistry {
  constructor() {
    this.scenarios = new Map();
  }

  register(scenarioId, handlerFunction) {
    this.scenarios.set(scenarioId, handlerFunction);
  }

  async handleRequest(scenarioId, request) {
    const handler = this.scenarios.get(scenarioId);
    if (handler) {
      const result = await handler(request);
      if (result) return result; // A null result falls through to the default
    }
    // Default fallback behavior
    return {
      choices: [{ message: { content: 'Default mock response.' } }]
    };
  }
}

// Usage in your test suite
const registry = new AIMockScenarioRegistry();

// Define a scenario where the AI rejects a harmful query
registry.register('safety_filter_triggered', (req) => {
  const lastMessage = req.messages?.slice(-1)[0]?.content || '';
  if (lastMessage.includes('harmful instruction')) {
    return {
      choices: [{
        message: {
          role: 'assistant',
          content: 'I cannot comply with this request.'
        }
      }]
    };
  }
  return null; // Falls back to default
});

// In your test
const mockResponse = await registry.handleRequest('safety_filter_triggered', testRequest);
expect(mockResponse.choices[0].message.content).toContain('cannot comply');
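Keyword matching covers single-turn cases, but for retry logic you often want a scenario whose behavior changes across calls, e.g. fail once, then succeed. A stateful handler factory can script that; `makeSequenceHandler` is a hypothetical helper, not part of the registry above:

```javascript
// Returns a handler that yields each canned response in order,
// then keeps repeating the last one. Useful for scripting retry tests.
function makeSequenceHandler(responses) {
  let call = 0;
  return () => responses[Math.min(call++, responses.length - 1)];
}

// First call simulates a rate limit; every call after that succeeds.
const flakyHandler = makeSequenceHandler([
  { error: { type: 'rate_limit_error', message: 'Rate limit exceeded.' } },
  { choices: [{ message: { role: 'assistant', content: 'Recovered on retry.' } }] }
]);
```

You would register the result under a scenario id (e.g. `registry.register('rate_limit_then_success', flakyHandler)`) and assert that your client's retry wrapper eventually surfaces the successful response.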

Integrating the Mock into Your Development Workflow

The real power comes from seamlessly toggling between mock and live APIs.

  1. Environment-Based Configuration: Use environment variables to switch endpoints.

    # .env.local
    OPENAI_BASE_URL=http://localhost:3001/v1
    AI_MOCK_MODE=dynamic
    
    # .env.production
    OPENAI_BASE_URL=https://api.openai.com/v1
    
  2. In Your Application Code:

    // aiClient.js
    import { OpenAI } from 'openai';
    
    const baseURL = process.env.OPENAI_BASE_URL || 'https://api.openai.com/v1';
    
    const client = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
      baseURL: baseURL, // Key: Your mock or real API
    });
    
    export default client;
    
  3. In Your Test Suite (Jest Example):

    // jest.setup.js
    if (process.env.NODE_ENV === 'test') {
      process.env.OPENAI_BASE_URL = 'http://localhost:3001/v1';
      // Start your mock server programmatically before all tests
      global.mockServer = startMockServer();
    }
    

The Payoff: What You Gain

  • Blazing Fast Tests: Unit tests run in milliseconds, not seconds.
  • Deterministic Tests: No flakiness due to API variability.
  • Cost Elimination in CI/CD: Run thousands of tests for free.
  • Offline Development: Code on planes, trains, or anywhere.
  • Error Scenario Testing: Reliably test how your app handles API failures.

Your Action Plan

Start simple. Don't try to build a perfect mock on day one.

  1. This Week: Implement a basic static mock for your most-used AI endpoint (like chat completions). Redirect your local dev environment to use it.
  2. Next Week: Add a dynamic handler that varies responses based on the user's input message. Implement your first scenario for a critical integration test.
  3. Next Month: Integrate streaming support and wrap your mock server in a Docker container for easy sharing across your team.
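
For the Docker step, a minimal container might look like the sketch below. The file names match the snippets in this post, and a `package.json` listing Express is assumed to exist:

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY mockAIProvider.js scenarioRegistry.js ./
ENV AI_MOCK_MODE=dynamic
EXPOSE 3001
CMD ["node", "mockAIProvider.js"]
```

Teammates can then run `docker run -p 3001:3001 ai-mock` and point `OPENAI_BASE_URL` at it without installing anything locally.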

By investing in this mocking layer, you're not just avoiding API costs—you're building a foundation for robust, reliable, and rapid AI development. Your future self, and your teammates, will thank you when deployment day comes and everything works as tested.

What's the first AI interaction in your stack that you'll mock? Share your approach or your biggest mocking challenge in the comments below.
