Testing comprehensive Node.js applications with complex layered architectures presents a unique challenge: maintaining consistent patterns and architectural boundaries across hundreds of test files while ensuring each layer is tested appropriately. After architecting and implementing test suites across multiple production systems, generating over 200 test files with rigorous patterns, I've developed an approach that transforms testing from an ad-hoc craft into a systematic engineering discipline: AI-powered test generation through custom Claude commands that encode architectural testing knowledge.
This isn't about replacing human expertise with AI generation. It's about codifying years of testing architecture decisions into reusable, consistent patterns that scale beyond individual developer capacity while maintaining the rigor demanded by production systems. When done correctly, this approach ensures every test follows your exact architectural boundaries, error handling patterns, and quality standards, regardless of who writes the code or when it's written.
The Scaling Challenge: Consistency Across Architectural Layers
Modern Node.js applications following Domain-Driven Design principles typically implement a sophisticated layered architecture:
Routes      →   Controllers   →   Services      →   Functions       →   Database/External APIs
   ↓                ↓                 ↓                  ↓
Configuration      HTTP          Orchestration     Business Logic
Each layer serves a distinct architectural purpose and requires fundamentally different testing strategies:
- Routes: Configuration and dependency wiring validation
- Controllers: HTTP protocol translation and service delegation
- Services: Business transaction orchestration and error boundary management
- Functions: Pure domain logic and data persistence operations
- Validation Chains: Input sanitization and business rule enforcement
The challenge isn't writing individual tests; it's maintaining architectural consistency across hundreds of files while ensuring each test appropriately validates its layer's concerns without bleeding into adjacent layers' responsibilities.
The Consistency Problem at Scale
As teams grow and codebases expand, several challenges emerge:
Inconsistent Abstraction Boundaries: Different developers may inadvertently test business logic in controller tests or HTTP concerns in service tests, creating brittle, overlapping coverage.
Mock Strategy Fragmentation: Teams often lack clear, consistent guidelines for what to mock at each layer, leading to test suites where some files mock everything and others mock nothing, with no clear architectural rationale.
Error Handling Coverage Variations: Manual testing often under-tests error scenarios or implements different error testing patterns across files, particularly the complex error propagation patterns required for proper Express.js middleware chains.
Knowledge Transfer Gaps: Senior developers' testing wisdom doesn't automatically transfer to junior team members, leading to inconsistent quality and architectural understanding.
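To make the first of these failure modes concrete, consider a hypothetical example of boundary bleed (all names here are illustrative): a controller test asserting a business rule instead of delegation.
// ANTI-PATTERN: controller test asserting business logic
it('should calculate the order total', async () => {
  mockReq.body = { items: [{ price: 10, qty: 2 }] };
  await createOrderController(mockReq, mockRes);
  // Total calculation belongs in a function-layer test, not here
  expect(mockRes.json).toHaveBeenCalledWith(expect.objectContaining({ total: 20 }));
});
// CORRECT: controller test asserting delegation only
it('should delegate order creation to the service layer', async () => {
  mockReq.body = { items: [{ price: 10, qty: 2 }] };
  await createOrderController(mockReq, mockRes);
  expect(mockOrderService).toHaveBeenCalledWith({ items: [{ price: 10, qty: 2 }] });
});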
The Architecture-First Solution: Codifying Testing Knowledge
Instead of leaving testing patterns to individual developer judgment, I developed a system of 10 specialized Claude commands that encode deep architectural knowledge about testing layered Node.js applications. Each command represents years of production experience distilled into precise, reusable instructions that ensure every generated test follows exact architectural boundaries.
Let's examine how each command works and the specific instructions that make them effective.
Backend Unit Test Command Architecture
Controller Layer Testing: HTTP Protocol Boundaries
Controllers in a properly architected Express application serve as HTTP protocol adapters—they extract data from requests, delegate to services, and format responses. They should contain zero business logic.
Our controller testing command (create-backend-controller-test.md) includes these critical instructions:
# WHAT THIS TESTS:
1. **Happy Path** - Successful request/response flow
2. **Request Data Extraction** - Input extraction and validation
3. **Service Layer Integration** - Service method calls with correct parameters
4. **Response Formatting** - Correct status codes and JSON structure
5. **Error Handling** - Service error propagation (MANDATORY)
6. **Edge Cases** - Missing data, null/undefined inputs
# WHAT NOT TO TEST:
- Service implementation (that's service layer testing)
- Database operations (mock the service layer)
- Validation logic (that's validation chain testing)
- Middleware execution (that's middleware testing)
# CRITICAL REQUIREMENTS:
1. **ALWAYS use vi.hoisted() pattern** for module-level mocks
2. **Use vi.mock() with hoisted factories** for external dependencies
3. **Use vi.fn(), vi.spyOn() inside tests** for local mocks only
4. **Never reference unhoisted functions** in mock factories
5. ALWAYS mock the service layer completely
6. Never make real service/database calls
The command enforces these architectural boundaries by generating tests that focus on data extraction verification:
// Generated pattern emphasizes data extraction verification
describe('Data Extraction', () => {
it('should extract only required fields from request body', async () => {
mockReq.body = {
validField: 'data',
extraField: 'should-be-ignored',
maliciousField: '<script>alert("xss")</script>',
_id: 'should-be-ignored'
};
await createController(mockReq, mockRes);
expect(mockService).toHaveBeenCalledWith({
validField: 'data'
// Verification that only valid fields are passed
}, userId, accountId);
});
});
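The snippet above assumes the scaffolding the command also generates. A minimal sketch of that setup, with illustrative module paths, looks like this:
import { describe, it, expect, beforeEach, vi } from 'vitest';
// Hoisted so the mock exists when the vi.mock factory runs
const { mockService } = vi.hoisted(() => ({
  mockService: vi.fn()
}));
vi.mock('../resource.service.js', () => ({
  createResourceService: mockService
}));
import { createController } from '../resource.controller.js';
let mockReq;
let mockRes;
const userId = 'user123';
const accountId = 'account456';
beforeEach(() => {
  mockReq = { body: {}, params: {}, query: {}, user: { id: userId, accountId } };
  mockRes = {
    status: vi.fn().mockReturnThis(),
    json: vi.fn().mockReturnThis()
  };
  vi.clearAllMocks();
});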
Critical Pattern: Error Propagation, Not Handling
The command includes specific instructions for testing error propagation—a crucial architectural distinction:
## Error Test Pattern:
await expect(controllerFunction(mockReq, mockRes)).rejects.toThrow('Error message');
expect(mockRes.status).not.toHaveBeenCalled();
This generates tests that verify controllers propagate service errors without modification:
it('should propagate service layer errors without modification', async () => {
const serviceError = new Error('Database connection failed');
serviceError.statusCode = 500;
serviceError.errorCode = 'DB_CONNECTION_ERROR';
mockService.mockRejectedValue(serviceError);
// Controller should NOT catch this error
await expect(createController(mockReq, mockRes))
.rejects.toThrow('Database connection failed');
// Controller should NOT respond when errors occur
expect(mockRes.status).not.toHaveBeenCalled();
expect(mockRes.json).not.toHaveBeenCalled();
});
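This propagation pattern works because controllers are wrapped in an asyncHandler that forwards rejections to Express's centralized error middleware. A minimal sketch of that pair, assuming a conventional Express setup:
// Wrapper: forwards async rejections to next() so the error middleware runs
const asyncHandler = (fn) => (req, res, next) =>
  Promise.resolve(fn(req, res, next)).catch(next);
// Centralized error middleware: the single place that turns errors into responses
const errorHandler = (err, req, res, next) => {
  res.status(err.statusCode || 500).json({
    success: false,
    errorCode: err.errorCode || 'INTERNAL_ERROR',
    message: err.message
  });
};
This is also why the controller commands mock asyncHandler as vi.fn((fn) => fn): the tests invoke the unwrapped controller directly and assert on the rejected promise.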
Service Layer Testing: Orchestration and Transaction Boundaries
Services coordinate business operations and manage transaction boundaries. Our service testing command (create-backend-services-test.md) focuses on orchestration logic rather than business rule implementation:
# WHAT TO TEST:
1. **Happy Path**
2. **Function Orchestration**
3. **Transaction Logic**
4. **Error Handling**
5. **Business Flow Control**
6. **Edge Cases**
# WHAT NOT TO TEST:
- Function implementation (test functions separately)
- Database operations (mock the functions that call DB)
- External API calls (mock the functions that make calls)
- Controller logic (services are called by controllers)
The command generates service tests that emphasize call ordering and transaction management:
// Generated service tests emphasize call ordering and transaction management
describe('Business Logic Flow', () => {
it('should execute steps in correct order', async () => {
const callOrder = [];
mockCheckPermissions.mockImplementation(async () => {
callOrder.push('permissions');
});
mockWithTransaction.mockImplementation(async (callback) => {
callOrder.push('transactionStart');
const result = await callback(mockSession);
callOrder.push('transactionEnd');
return result;
});
// mockCreateResource and mockCreateVersion (set up the same way, not shown) push 'createResource' and 'createVersion'
await createService(data, userId, accountId);
expect(callOrder).toEqual([
'permissions',
'transactionStart',
'createResource',
'createVersion',
'transactionEnd'
]);
});
});
Mandatory Error Handling Patterns
The service command includes explicit instructions for three critical error handling patterns:
## Mandatory Error Tests (all 3)
// 1. Property Preservation
it('should preserve error properties', async () => {
const appError = new Error('Failed'); appError.statusCode = 404;
mockFunction.mockRejectedValue(appError);
try { await serviceFunction(); expect.unreachable('Expected error'); }
catch (error) { expect(error.statusCode).toBe(404); }
});
// 2. Error Wrapping
it('should wrap generic errors', async () => {
mockFunction.mockRejectedValue(new Error('Timeout'));
const error = await serviceFunction().catch(err => err);
expect(error.statusCode).toBe(500);
});
// 3. Error Logging
it('should log errors with context', async () => {
mockFunction.mockRejectedValue(new Error('Failed'));
await expect(serviceFunction()).rejects.toThrow();
expect(mockLogger.error).toHaveBeenCalled();
});
This ensures every service test includes comprehensive error handling verification:
// 1. Application Error Preservation
it('should preserve error properties when re-throwing application errors', async () => {
const appError = new ForbiddenError('Insufficient permissions', 'PERMISSION_DENIED');
appError.context = { userId: 'user123', requiredRole: 'admin' };
mockCheckPermissions.mockRejectedValue(appError);
try {
await createService(data, userId, accountId);
expect.unreachable('Expected error to be thrown');
} catch (error) {
expect(error).toBeInstanceOf(ForbiddenError);
expect(error.message).toBe('Insufficient permissions');
expect(error.statusCode).toBe(403);
expect(error.errorCode).toBe('PERMISSION_DENIED');
expect(error.context).toEqual({ userId: 'user123', requiredRole: 'admin' });
}
});
// 2. Generic Error Wrapping
it('should wrap non-application errors in InternalServerError', async () => {
const networkError = new Error('Network timeout');
mockCheckPermissions.mockRejectedValue(networkError);
const error = await createService(data, userId, accountId).catch(err => err);
expect(error).toBeInstanceOf(InternalServerError);
expect(error.message).toBe('Failed to create resource');
expect(error.statusCode).toBe(500);
expect(error.errorCode).toBe('RESOURCE_CREATION_ERROR');
});
// 3. Comprehensive Error Logging
it('should log all errors with sufficient context for debugging', async () => {
const serviceError = new Error('Transaction rollback failed');
mockWithTransaction.mockRejectedValue(serviceError);
await expect(createService(data, userId, accountId)).rejects.toThrow();
expect(mockLogger.error).toHaveBeenCalledWith(
'Failed to create resource',
expect.objectContaining({
error: expect.objectContaining({
message: 'Transaction rollback failed',
stack: expect.any(String)
}),
context: expect.objectContaining({
userId,
accountId,
operation: 'createResource'
})
})
);
});
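These examples assume application error classes such as ForbiddenError and InternalServerError that carry HTTP metadata. A minimal sketch of that shape (illustrative, not the exact production implementation):
class AppError extends Error {
  constructor(message, statusCode, errorCode) {
    super(message);
    this.statusCode = statusCode;  // HTTP status for the error middleware
    this.errorCode = errorCode;    // machine-readable code for clients and logs
  }
}
class ForbiddenError extends AppError {
  constructor(message, errorCode) {
    super(message, 403, errorCode);
  }
}
class InternalServerError extends AppError {
  constructor(message, errorCode) {
    super(message, 500, errorCode);
  }
}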
Function Layer Testing: Business Logic Isolation
Functions contain pure business logic and database operations. The function testing command (create-backend-functions-test.md) emphasizes complete isolation through comprehensive mocking:
# CRITICAL REQUIREMENTS:
1. **ALWAYS use vi.hoisted() pattern** for module-level mocks
2. **Use vi.mock() with hoisted factories** for external dependencies
3. **Use vi.fn(), vi.spyOn() inside tests** for local mocks only
4. **Never reference unhoisted functions**
5. Mock ALL external dependencies (functions, models, utilities) & never make real service/database calls
The command includes specific mock setup patterns:
## Database Model Mock Setup:
const { mockUserModel, mockAccountModel } = vi.hoisted(() => ({
mockUserModel: { findOne: vi.fn(), create: vi.fn(), findById: vi.fn() },
mockAccountModel: { findOne: vi.fn(), create: vi.fn() }
}));
vi.mock('../../../models/User.Model.js', () => ({ default: mockUserModel }));
vi.mock('../../../models/Account.Model.js', () => ({ default: mockAccountModel }));
This generates function tests that mock ALL external dependencies:
// Generated function tests mock ALL external dependencies
const {
mockUserModel,
mockResourceModel,
mockExternalService,
mockLogger
} = vi.hoisted(() => ({
mockUserModel: {
findById: vi.fn(),
create: vi.fn(),
updateOne: vi.fn()
},
mockResourceModel: {
create: vi.fn(),
findOne: vi.fn()
},
mockExternalService: {
validateData: vi.fn(),
sendNotification: vi.fn()
},
mockLogger: {
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
debug: vi.fn()
}
}));
// All external systems mocked
vi.mock('../../../../models/User.Model.js', () => ({ default: mockUserModel }));
vi.mock('../../../../models/Resource.Model.js', () => ({ default: mockResourceModel }));
vi.mock('../../../../external/validation.service.js', () => ({ default: mockExternalService }));
vi.mock('../../../../middleware/logger.js', () => ({ getLogger: () => mockLogger }));
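With the mocks in place, a generated function test exercises the business logic in complete isolation. A representative sketch, with an illustrative createResource function:
import { createResource } from '../createResource.function.js';
it('should persist a resource and return the created document', async () => {
  const input = { name: 'Test Resource' };
  const createdDoc = { _id: 'resource123', ...input };
  mockExternalService.validateData.mockResolvedValue(true);
  mockResourceModel.create.mockResolvedValue(createdDoc);
  const result = await createResource(input, 'user123', 'account456');
  // The business logic validated data before persisting it
  expect(mockExternalService.validateData).toHaveBeenCalledWith(input);
  expect(mockResourceModel.create).toHaveBeenCalledWith(
    expect.objectContaining({ name: 'Test Resource', account: 'account456' })
  );
  expect(result).toEqual(createdDoc);
});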
Validation Chain Testing: Real Integration Over Mocking
The validation chain command (create-backend-validation-chains-test.md) uses a unique approach, testing real express-validator behavior instead of mocking it:
# CRITICAL REQUIREMENTS:
1. **Integration Testing Approach (DO NOT MOCK EXPRESS-VALIDATOR)**
2. **Test actual validation behavior** - Run real validation against mock request objects
3. **Import validationResult from express-validator** - Use real validation result checking
4. **Create runValidation helper** - Execute validation chains against test data
5. Mock ALL OTHER external dependencies (functions, models, utilities) & never make real database calls
The command includes the exact helper pattern to use:
### Standard Validation Test Setup:
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { validationResult } from 'express-validator';
import { validationChainName } from '../fileName.validationChains.js';
describe('fileName.validationChains', () => {
let mockReq;
let mockRes;
let mockNext;
beforeEach(() => {
mockReq = { body: {}, params: {}, query: {} };
mockRes = {
status: vi.fn().mockReturnThis(),
json: vi.fn().mockReturnThis()
};
mockNext = vi.fn();
vi.clearAllMocks();
});
// Helper function to run validation chains
const runValidation = async (body, params = {}, query = {}) => {
mockReq.body = body;
mockReq.params = params;
mockReq.query = query;
// Run all validation chains
for (const validation of validationChainName) {
await validation.run(mockReq);
}
return validationResult(mockReq);
};
// ...individual validation tests go here
});
This generates tests that execute real validation chains:
import { validationResult } from 'express-validator';
import { createResourceValidation } from '../validation.chains.js';
// Helper function executes real validation chains
const runValidation = async (body, params = {}, query = {}) => {
mockReq.body = body;
mockReq.params = params;
mockReq.query = query;
// Execute ALL validation chains against mock request
for (const validation of createResourceValidation) {
await validation.run(mockReq);
}
return validationResult(mockReq);
};
// Test real validation behavior
it('should fail when email format is invalid', async () => {
const result = await runValidation({
email: 'invalid-email-format',
name: 'Valid Name'
});
expect(result.isEmpty()).toBe(false);
const errors = result.array();
expect(errors.some(error =>
error.path === 'email' &&
error.msg === 'Please enter a valid email address'
)).toBe(true);
});
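The same helper covers passing cases just as directly; a sketch (field names assumed from the example above):
it('should pass when all fields are valid', async () => {
  const result = await runValidation({
    email: 'user@example.com',
    name: 'Valid Name'
  });
  expect(result.isEmpty()).toBe(true);
});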
Why No Mocking? The command explicitly explains this architectural decision:
Why no mocking? Express-validator's internal complexity makes mocking brittle and unreliable. Testing against real validation chains ensures our tests catch actual validation behavior changes.
Route Structure Testing: Configuration Over Logic
Routes in well-architected applications are pure configuration—they wire together middleware, validation, and controllers. Our route testing command (create-backend-route-test.md) focuses on structural integrity:
# CRITICAL REQUIREMENTS:
1. DO NOT USE MOCKS - Routes files require structure testing only.
2. Test configuration, not business logic
3. Focus on module imports and wiring
# WHAT THIS TESTS:
- Module loads successfully without syntax errors
- Router is properly exported
- All required dependencies are available
- Route configuration executes without errors
# WHAT NOT TO TEST:
- Route handlers (test in controller tests)
- Middleware execution (test in middleware tests)
- Validation logic (test in validation tests)
This generates tests focused on structural integrity:
describe('Route Configuration Details', () => {
test('should configure middleware chains without errors', async () => {
const routesModule = await import('../routes.js');
const router = routesModule.default;
const layer = router.stack[0];
expect(layer.route).toBeDefined();
expect(layer.route.stack).toBeDefined();
expect(Array.isArray(layer.route.stack)).toBe(true);
// Verify middleware chain includes authentication, validation, controller
expect(layer.route.stack.length).toBeGreaterThan(1);
// Verify all handlers are functions
layer.route.stack.forEach(routeLayer => {
expect(typeof routeLayer.handle).toBe('function');
});
});
});
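Alongside the middleware-chain assertions, the command also generates the basic structural checks listed above; a minimal sketch:
describe('Route Module Structure', () => {
  test('should load the module and export a router', async () => {
    const routesModule = await import('../routes.js');
    expect(routesModule.default).toBeDefined();
    // An Express router exposes its registered layers on .stack
    expect(Array.isArray(routesModule.default.stack)).toBe(true);
    expect(routesModule.default.stack.length).toBeGreaterThan(0);
  });
});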
Advanced Patterns: Solving ES Module and Timing Challenges
The vi.hoisted() Pattern: Solving Module Resolution Timing
One of the most critical patterns encoded in our commands addresses ES module timing issues that plague modern Node.js testing. Every command includes this pattern:
# COPY THESE PATTERNS EXACTLY:
## AsyncHandler Mock:
vi.mock('../../../../middleware/asyncHandler.js', () => ({
asyncHandler: vi.fn((fn) => fn)
}));
The commands explain why this pattern is essential:
# CRITICAL REQUIREMENTS:
1. **ALWAYS use vi.hoisted() pattern** for module-level mocks
2. **Use vi.mock() with hoisted factories** for external dependencies
3. **Use vi.fn(), vi.spyOn() inside tests** for local mocks only
4. **Never reference unhoisted functions** in mock factories
This pattern eliminates the "mock function not found" errors that create inconsistent test behavior:
// WRONG: Mock functions not available during module loading
vi.mock('../service.js', () => ({
serviceMethod: mockService // ReferenceError: Cannot access before initialization
}));
const mockService = vi.fn();
// CORRECT: Hoisted pattern ensures availability during module resolution
const { mockService } = vi.hoisted(() => ({
mockService: vi.fn()
}));
vi.mock('../service.js', () => ({
serviceMethod: mockService // ✓ Available during hoisting phase
}));
import { controllerMethod } from '../controller.js';
Integration Testing: Database Lifecycle Management
Our integration testing command (create-backend-integration-test.md) implements sophisticated database management for full workflow testing:
## Requirements Analysis
First, analyze the codebase to understand:
1. **Feature Scope**: Examine the feature's API endpoints, data models, and business logic
2. **Model Dependencies**: Identify the primary model and related models
3. **Authentication Requirements**: Determine required user roles and permissions
4. **Business Logic**: Understand validation rules, subscription limits, and constraints
## Database Management
- Use isolated database worker for each test file
- Implement beforeEach cleanup for all relevant collections
- Create helper function for account creation and authentication token generation
This generates integration tests with proper database lifecycle management:
describe('Battlecard Creation Integration', () => {
let testDatabase;
let testAccount;
let authToken;
beforeAll(async () => {
// Initialize isolated database worker
testDatabase = new StandardWorkerSetup();
await testDatabase.setup();
// Create test account with full subscription setup
const accountData = EnhancedAccountFactory.build({
subscription: {
tier: 'premium',
limits: { battlecards: 50 },
status: 'active'
}
});
testAccount = await createTestAccount(accountData);
authToken = await generateAuthToken(testAccount.adminUser);
});
beforeEach(async () => {
// Clean database state between tests
await testDatabase.clearCollections([
'battlecards',
'battlecardVersions',
'notifications',
'auditLogs'
]);
});
afterAll(async () => {
await testDatabase.cleanup();
});
// ...workflow tests (like the example that follows) go inside this describe block
});
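The createTestAccount and generateAuthToken helpers referenced above are project-specific. A hedged sketch of what they might look like, assuming Mongoose AccountModel and UserModel plus a jsonwebtoken-based auth middleware:
import jwt from 'jsonwebtoken';
// Hypothetical helper: persists an account plus its admin user for the test run
const createTestAccount = async (accountData) => {
  const account = await AccountModel.create(accountData);
  const adminUser = await UserModel.create({
    email: `admin-${Date.now()}@example.com`,
    accountId: account._id,
    role: 'admin'
  });
  return { ...account.toObject(), id: account._id.toString(), adminUser };
};
// Hypothetical helper: signs a short-lived token the API's auth middleware accepts
const generateAuthToken = (user) =>
  jwt.sign(
    { userId: user._id, accountId: user.accountId },
    process.env.JWT_SECRET,
    { expiresIn: '1h' }
  );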
Comprehensive Verification Patterns
The integration command includes specific instructions for verification:
### Verification Patterns
- Test both HTTP responses AND database state changes
- Verify relationship integrity (user.accountId matches account._id)
- Validate business logic enforcement (limits, email verification, etc.)
- Check response format consistency and error message clarity
This generates tests that verify both API responses AND database state changes:
it('should create battlecard with proper relationship integrity', async () => {
const battlecardData = TestDataUtils.generateBattlecardData();
// Send the authenticated request through the full middleware stack
const response = await request(app)
.post('/api/v1/battlecards')
.set('Authorization', `Bearer ${authToken}`)
.send(battlecardData)
.expect(201);
// Verify API response structure
expect(response.body).toMatchObject({
success: true,
message: 'Battlecard created successfully',
data: {
id: expect.any(String),
competitorName: battlecardData.competitorName,
version: 1
}
});
// Verify database state
const createdBattlecard = await BattlecardModel.findById(response.body.data.id);
expect(createdBattlecard).toBeTruthy();
expect(createdBattlecard.account.toString()).toBe(testAccount.id);
expect(createdBattlecard.createdBy.toString()).toBe(testAccount.adminUser.id);
// Verify version relationship
const battlecardVersion = await BattlecardVersionModel.findOne({
battlecardId: createdBattlecard._id
});
expect(battlecardVersion).toBeTruthy();
expect(battlecardVersion.versionNumber).toBe(1);
// Verify subscription counter update
const updatedAccount = await AccountModel.findById(testAccount.id);
expect(updatedAccount.usage.battlecardsUsed).toBe(1);
});
Production Impact and Metrics
After implementing this command system across our production codebase, we achieved measurable improvements:
Scale Metrics:
- 200+ test files generated with consistent architectural patterns
- 95%+ code coverage across all layers with appropriate boundary testing
- Zero mock strategy inconsistencies between files
Velocity Improvements:
- 5-10x faster test generation compared to manual writing
- 50% reduction in test maintenance overhead due to consistent patterns
- 90% reduction in architectural testing discussions during code reviews
Quality Metrics:
- 40% increase in edge case coverage due to systematic pattern application
- 60% reduction in test-related production issues
- 100% consistency in error handling test coverage across all service layers
Implementation Methodology
Command Evolution Process
- Pattern Identification: Analyze existing high-quality manual tests to identify recurring patterns and architectural decisions
- Knowledge Crystallization: Encode patterns into Claude commands with detailed explanations of architectural rationale
- Validation and Refinement: Apply commands to new code, identify edge cases, refine patterns based on real-world usage
- Team Adoption: Establish commands as standard practice with clear guidelines for when to use each command type
- Continuous Evolution: Regular retrospectives to identify new patterns and update commands based on architectural changes
Quality Assurance Framework
Each generated test undergoes validation:
# Automated validation pipeline
npm run test -- ${generatedTestFile} # Verify tests pass
npm run coverage -- ${generatedTestFile} # Verify coverage targets
npm run lint:test -- ${generatedTestFile} # Verify style consistency
The commands include verification checklists to ensure quality:
# BEFORE COMPLETING TASK - VERIFY:
- [ ] All service methods mocked with vi.hoisted()
- [ ] No real service/database calls
- [ ] Request/response objects properly mocked
- [ ] Tests cover happy path, errors, edge cases
- [ ] Error handling tests are included for EVERY controller function
- [ ] File saved as {functionName}.controller.unit.test.js
- [ ] All tests pass successfully
Architectural Principles Encoded
Separation of Concerns
Each command enforces testing exactly one architectural layer's concerns:
- Controllers test HTTP protocol translation, not business logic
- Services test orchestration, not business rules
- Functions test business logic, not HTTP concerns
Error Boundary Management
Commands enforce proper error handling patterns:
- Controllers propagate without handling
- Services wrap and log systematically
- Functions validate inputs and handle business errors
Mock Strategy Consistency
Commands apply consistent mocking philosophy:
- Mock external dependencies completely
- Use real implementations for complex internal systems (like express-validator)
- Isolate units of testing appropriately for each architectural layer
Test Data Management
Commands generate realistic, interconnected test data:
- Unique identifiers prevent test interference
- Complete object relationships mirror production complexity
- Edge cases and boundary conditions systematically covered
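A minimal sketch of the kind of factory these commands lean on, using Node's built-in crypto module for unique identifiers (names are illustrative):
import { randomUUID } from 'node:crypto';
// Hypothetical factory: every call yields unique, relationally consistent data
const buildBattlecardData = (overrides = {}) => {
  const suffix = randomUUID().slice(0, 8);
  return {
    competitorName: `Competitor-${suffix}`,
    competitorWebsite: `https://competitor-${suffix}.example.com`,
    tags: ['pricing', 'positioning'],
    ...overrides
  };
};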
Conclusion: Systematic Testing Excellence
The challenge of maintaining consistent, high-quality testing patterns across large Node.js applications has a solution that goes beyond individual developer discipline or team processes. By encoding architectural testing knowledge into AI-powered commands, we create a systematic approach that ensures every test follows exact architectural boundaries, error handling patterns, and quality standards.
Custom Claude commands represent more than automation: they are the crystallization of architectural testing knowledge into scalable, consistent patterns. By encoding years of production experience into AI-powered commands, we've created a testing architecture that evolves with our systems while maintaining unwavering quality standards.
For engineering teams serious about scaling testing practices systematically, custom Claude commands offer a path forward that combines the precision of expert architectural knowledge with the consistency and scale that only systematic approaches can provide. The result is not just faster test writing but better tests: comprehensive, consistent, and maintainable at enterprise scale.