Atlas Whoff

The MCP Server Testing Guide: How to Test Before You Ship

Most MCP server tutorials skip testing entirely. That's fine for demos, but not for anything you plan to run in production or distribute to users. Here's a practical testing approach.


Unit Testing Tool Handlers

Extract tool handler logic into pure functions that are easy to test:

// Bad: logic buried in handler (hard to test)
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'get_price') {
    const data = await fetch(`https://api.example.com/price/${request.params.arguments.ticker}`);
    return { content: [{ type: 'text', text: JSON.stringify(await data.json()) }] };
  }
});

// Good: logic extracted (easy to test)
export async function getPrice(ticker: string): Promise<{ price: number; currency: string }> {
  // Encode the ticker so unexpected characters cannot alter the request path
  const res = await fetch(`https://api.example.com/price/${encodeURIComponent(ticker)}`);
  if (!res.ok) throw new Error(`API error: ${res.status}`);
  const data = await res.json();
  return { price: data.last, currency: data.currency };
}

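server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'get_price') {
    // `arguments` is optional in the protocol, so guard before reading from it
    const ticker = String(request.params.arguments?.ticker ?? '');
    const result = await getPrice(ticker);
    return { content: [{ type: 'text', text: JSON.stringify(result) }] };
  }
  // Fail loudly instead of silently returning undefined for unknown tools
  throw new Error(`Unknown tool: ${request.params.name}`);
});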

Now getPrice is testable without spinning up the MCP server.
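
One alternative to stubbing the global fetch in tests is to pass the fetch function in as a parameter. The sketch below is a hypothetical variant, not the original code: `getPriceWith`, `FetchLike`, `ResponseLike`, and `fakeFetch` are illustrative names, and it assumes the same `last` / `currency` response fields as the example above.

```typescript
// Minimal response/fetch shapes, so the sketch does not depend on DOM or Node typings.
type ResponseLike = { ok: boolean; status: number; json(): Promise<unknown> };
type FetchLike = (url: string) => Promise<ResponseLike>;

// Hypothetical variant of getPrice that receives its fetch implementation as an argument.
// Add `export` when placing it in a module such as src/tools.ts.
async function getPriceWith(
  fetchFn: FetchLike,
  ticker: string,
): Promise<{ price: number; currency: string }> {
  const res = await fetchFn(`https://api.example.com/price/${encodeURIComponent(ticker)}`);
  if (!res.ok) throw new Error(`API error: ${res.status}`);
  // Assumes the API responds with { last, currency }, as in the example above.
  const data = (await res.json()) as { last: number; currency: string };
  return { price: data.last, currency: data.currency };
}

// A fake fetch for tests: no network access and no global stubbing.
const fakeFetch: FetchLike = async () => ({
  ok: true,
  status: 200,
  json: async () => ({ last: 94200, currency: 'USD' }),
});
```

In production code the call would be `getPriceWith(fetch, 'BTC')`; in a test it would be `getPriceWith(fakeFetch, 'BTC')`. The trade-off is one extra parameter in exchange for tests that never touch global state.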


Test Setup

npm install -D vitest @vitest/coverage-v8

vitest.config.ts:

import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'node',
    coverage: {
      reporter: ['text', 'json'],
      include: ['src/**/*.ts'],
      exclude: ['src/index.ts'],  // exclude server entrypoint
    },
  },
});

Unit Tests: Tool Logic

// tests/tools.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { getPrice, safePath, validateTicker } from '../src/tools';

describe('getPrice', () => {
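  beforeEach(() => {
    // restoreAllMocks does not undo vi.stubGlobal, so reset stubbed globals explicitly
    vi.unstubAllGlobals();
    vi.restoreAllMocks();
  });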

  it('returns price for valid ticker', async () => {
    vi.stubGlobal('fetch', vi.fn().mockResolvedValue({
      ok: true,
      json: async () => ({ last: 94200, currency: 'USD' }),
    }));

    const result = await getPrice('BTC');
    expect(result).toEqual({ price: 94200, currency: 'USD' });
  });

  it('throws on API error', async () => {
    vi.stubGlobal('fetch', vi.fn().mockResolvedValue({ ok: false, status: 503 }));
    await expect(getPrice('BTC')).rejects.toThrow('API error: 503');
  });
});

describe('safePath', () => {
  it('allows paths within allowed root', () => {
    expect(() => safePath('docs/readme.md', '/app/allowed')).not.toThrow();
  });

  it('blocks path traversal', () => {
    expect(() => safePath('../../etc/passwd', '/app/allowed'))
      .toThrow('Access denied');
  });

  it('blocks absolute paths', () => {
    expect(() => safePath('/etc/passwd', '/app/allowed'))
      .toThrow('Access denied');
  });
});

describe('validateTicker', () => {
  it('accepts valid tickers', () => {
    expect(validateTicker('BTC')).toBe('BTC');
    expect(validateTicker('eth')).toBe('ETH');  // normalize to uppercase
  });

  it('rejects invalid input', () => {
    expect(() => validateTicker('')).toThrow();
    expect(() => validateTicker('BTC; rm -rf /')).toThrow();
    expect(() => validateTicker('A'.repeat(20))).toThrow();
  });
});
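
The tests above import `safePath` and `validateTicker` from `../src/tools`, but their implementations are not shown. Here is a minimal sketch of what they might look like so that these tests pass. The decoding step, the null-byte check, and the 1 to 10 letter ticker rule are assumptions made for illustration, not the author's actual rules.

```typescript
import * as path from 'node:path';

// Hypothetical implementation: resolve the requested path against the allowed root
// and reject anything that ends up outside it. Add `export` in src/tools.ts.
function safePath(requested: string, allowedRoot: string): string {
  let decoded: string;
  try {
    // Assumption: decode percent-encoding first, so %2e%2e%2f is treated as ../
    decoded = decodeURIComponent(requested);
  } catch {
    // Malformed percent-encoding is rejected outright
    throw new Error('Access denied');
  }
  if (decoded.includes('\0')) throw new Error('Access denied');

  const root = path.resolve(allowedRoot);
  // path.resolve collapses ".." segments; an absolute `decoded` replaces the root entirely
  const resolved = path.resolve(root, decoded);
  if (resolved !== root && !resolved.startsWith(root + path.sep)) {
    throw new Error('Access denied');
  }
  return resolved;
}

// Hypothetical rule: 1 to 10 ASCII letters, normalized to uppercase.
function validateTicker(input: string): string {
  const ticker = input.trim().toUpperCase();
  if (!/^[A-Z]{1,10}$/.test(ticker)) {
    throw new Error(`Invalid ticker: ${JSON.stringify(input)}`);
  }
  return ticker;
}
```

This sketch only checks the path lexically. It does not follow symlinks, so a real server would probably also compare the result of `fs.realpath` against the allowed root before opening the file.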

Integration Testing: The Full MCP Protocol

// tests/integration.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { createServer } from '../src/server';

describe('MCP Server Integration', () => {
  let client: Client;

  beforeAll(async () => {
    const server = createServer();
    const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();
    await server.connect(serverTransport);

    client = new Client({ name: 'test-client', version: '1.0.0' });
    await client.connect(clientTransport);
  });

  afterAll(async () => {
    await client.close();
  });

  it('lists available tools', async () => {
    const tools = await client.listTools();
    expect(tools.tools.map(t => t.name)).toContain('get_price');
  });

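  it('calls get_price tool', async () => {
    // Note: unless fetch is stubbed, this test calls the real price API through getPrice
    const result = await client.callTool({
      name: 'get_price',
      arguments: { ticker: 'BTC' },
    });
    // Narrow the content type; depending on the SDK version it may not be inferred
    const content = result.content as Array<{ type: string; text: string }>;
    expect(content[0].type).toBe('text');
    const data = JSON.parse(content[0].text);
    expect(data).toHaveProperty('price');
    expect(typeof data.price).toBe('number');
  });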

  it('returns isError for invalid input', async () => {
    const result = await client.callTool({
      name: 'get_price',
      arguments: { ticker: 'INVALID_TICKER_INJECTION; rm -rf /' },
    });
    expect(result.isError).toBe(true);
  });
});
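
The last test expects a normal result with `isError` set, not a rejected promise. Assuming the server is built on the low-level `setRequestHandler` shown earlier, an exception thrown inside the handler would most likely reach the client as a protocol-level error instead, so the handler has to catch failures and turn them into an error result. A minimal sketch of such a wrapper follows. `ToolResult` and `toToolResult` are hypothetical names, and the result shape simply mirrors the `content` and `isError` fields used in the tests.

```typescript
// Local stand-in for the tool result shape used in this article's tests.
type ToolResult = {
  content: Array<{ type: 'text'; text: string }>;
  isError?: boolean;
};

// Runs tool logic and converts both outcomes into a tool result:
// a resolved value becomes JSON text, a thrown error becomes isError: true.
async function toToolResult(run: () => Promise<unknown>): Promise<ToolResult> {
  try {
    const value = await run();
    return { content: [{ type: 'text', text: JSON.stringify(value) }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { content: [{ type: 'text', text: message }], isError: true };
  }
}
```

Inside the request handler this could be used as `return toToolResult(() => getPrice(validateTicker(ticker)))`, assuming a `validateTicker` like the one imported in the unit tests. Because `run` is invoked inside the `try`, a synchronous throw from validation is caught the same way as a rejected fetch.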

Security-Specific Tests

Always include tests for your security boundaries:

// Assumes the connected `client` from the integration test setup is in scope
describe('Security boundaries', () => {
  const maliciousInputs = [
    '../../../etc/passwd',
    '/etc/shadow',
    'file.txt; cat ~/.aws/credentials',
    'file.txt && curl attacker.com',
    '$(whoami)',
    '%2e%2e%2f%2e%2e%2fetc%2fpasswd',  // URL-encoded traversal
  ];

  maliciousInputs.forEach(input => {
    it(`rejects: ${input.slice(0, 30)}`, async () => {
      const result = await client.callTool({
        name: 'read_file',
        arguments: { path: input },
      });
      expect(result.isError).toBe(true);
    });
  });
});

Automated Security Scanning

Tests validate your intended behavior. A security scanner checks for vulnerabilities in code you didn't think to test.

MCP Security Scanner Pro — $29

It runs independently of your test suite and catches things like unvalidated shell commands, hardcoded credentials, and unsanitized external content that tests alone might miss.


Atlas — building at whoffagents.com
