DEV Community

Pau Dang

Stop Writing Flaky Tests: The Ultimate Node.js Testing Strategy (Unit + E2E)

Hi DEV community,

If you are a Backend Developer working with Node.js, you have likely experienced the dreaded scenario: "It passes on my machine, but randomly fails on the CI/CD pipeline."

This phenomenon is known as flaky tests. It usually stems from End-to-End (E2E) tests that share database state across test files, or from infrastructure services (Redis, Kafka) that are not fully initialized when the test begins.

Today, I’m going to share the complete testing architecture and the lessons I learned while building the automation framework nodejs-quickstart-structure. We will solve the core problem: how to build blazing-fast unit tests and fully deterministic E2E tests.


1. The Crisis in 90% of Projects

Many teams implement testing "half-heartedly":

  • Unit Tests: Dependencies like the Database or Redis aren't mocked, causing the tests to drag on because they wait for network I/O.
  • E2E Tests: Developers use their local dev database to run E2E suites. Test A creates a User, Test B checks the total number of Users. If they run in a different order, the test suite explodes! Some even use conditionals in tests: if (statusCode == 404) expect(404) else expect(201).

This is a massive anti-pattern! A test must produce exactly one predictable result for a given set of inputs and a known starting state.


2. The "Big Tech" Strategy: Draw a Hard Line

To fix this, you must strictly delineate your boundaries:

Unit Tests (Fast & Isolated)

  • Goal: Verify Business Logic (Use cases, Services, Domain).
  • Rule: MOCK EVERYTHING. No real database connections, no external APIs, no touching Redis or Kafka.
  • Speed: Thousands of test cases should execute in less than 2 seconds.

E2E Tests (Black-box & Automated Infra)

  • Goal: Verify the entire request flow (Route -> Controller -> Usecase -> Repo -> DB -> Response).
  • Rule: Use the REAL Database, Redis, and Kafka (spun up via isolated Docker containers or Testcontainers).
  • Characteristics: Data must be truncated or torn down before or after each test suite to guarantee a "clean room" environment. It runs slower, but every run starts from the same known state, so the results are reproducible.
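
One way to get that clean room is a small reset helper called from a Jest beforeAll or beforeEach hook. The sketch below assumes PostgreSQL (TRUNCATE ... RESTART IDENTITY CASCADE is PostgreSQL syntax) and a client object exposing an async query(sql) method. The table names and helper names are placeholders, not part of the project.

```javascript
// Hypothetical helper: `db` is any client exposing an async query(sql) method
// (for example a pg Pool). The table names below are placeholders.
const TABLES = ['users', 'orders'];

function buildTruncateSql(tables) {
  // Quote identifiers so reserved words or mixed-case names do not break the statement.
  const quoted = tables.map((t) => `"${t.replace(/"/g, '""')}"`).join(', ');
  // PostgreSQL-specific: also resets sequences and follows foreign keys.
  return `TRUNCATE TABLE ${quoted} RESTART IDENTITY CASCADE`;
}

async function resetDatabase(db, tables = TABLES) {
  await db.query(buildTruncateSql(tables));
}

module.exports = { buildTruncateSql, resetDatabase };
```

In an E2E file this would typically be wired up as beforeEach(() => resetDatabase(db)), so no test can observe rows left behind by another.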

3. The Recipe for Perfect E2E Tests

To prevent E2E tests from interfering with the developer's local development environment, follow these steps (the source demo is nodejs-service-redis-kafka):

Step 1: Fully isolate jest.config.js

Do not run E2E tests with your unit test configuration. Create a dedicated jest.e2e.config.js that inherits the shared settings but overrides testMatch and raises testTimeout (e.g., 30 seconds to allow databases to boot).

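// jest.e2e.config.js
module.exports = {
  ...require('./jest.config'), // inherit shared settings from the unit config
  testMatch: ['<rootDir>/tests/e2e/**/*.test.ts'], // only pick up E2E files
  testPathIgnorePatterns: ['/node_modules/'],
  testTimeout: 30000, // 30 s per test instead of Jest's 5 s default
  clearMocks: true
};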

Step 2: Use Node.js Scripts to Manage Docker Lifecycle

Instead of forcing developers to manually type docker-compose up before running tests, write an automated orchestration script:

  1. Assign a dedicated port (PORT=3001 instead of 3000) to avoid Dev Server collisions.
  2. Start the dependencies in the background: execSync('docker-compose up -d db redis kafka').
  3. Use the wait-on npm package to poll the healthcheck until dependencies are fully green.
  4. Run npm run test:e2e:run.
  5. Clean up gracefully: execSync('docker-compose down').
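
const { execSync } = require('child_process');
const execute = (cmd) => execSync(cmd, { stdio: 'inherit' }); // assumed helper: stream child output
const TEST_PORT = 3001; // dedicated E2E port
// Wait for the health check to pass to prevent flaky connections
execute(`npx wait-on http-get://127.0.0.1:${TEST_PORT}/health -t 120000`);
execute('npx jest --config ./jest.e2e.config.js');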
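
One detail worth getting right is step 5: if the test run throws, the containers should still be torn down. Here is a hedged sketch of the whole sequence with try/finally. The runE2E name and the injectable execute parameter are my own additions for illustration, and starting the application server itself is not shown.

```javascript
const { execSync } = require('node:child_process');

const TEST_PORT = process.env.PORT || '3001';

// Default runner: stream child output and throw on a non-zero exit code.
const shell = (cmd) =>
  execSync(cmd, { stdio: 'inherit', env: { ...process.env, PORT: TEST_PORT } });

// `execute` is injectable so the sequence can be checked without Docker.
function runE2E(execute = shell) {
  try {
    execute('docker-compose up -d db redis kafka');
    execute(`npx wait-on http-get://127.0.0.1:${TEST_PORT}/health -t 120000`);
    execute('npx jest --config ./jest.e2e.config.js');
    return 0;
  } catch (err) {
    console.error(`E2E run failed: ${err.message}`);
    return 1;
  } finally {
    // Runs on success and on failure, so containers do not leak between runs.
    try {
      execute('docker-compose down');
    } catch (err) {
      console.error(`Cleanup failed: ${err.message}`);
    }
  }
}

module.exports = { runE2E };
```

A script entry point would then set process.exitCode = runE2E(), so CI still sees a failing exit code after the cleanup has run.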

Step 3: Fix Kafka's "read ECONNRESET" Locally

The most common issue when testing Kafka in a local E2E run is that the tests run on the host network, while Kafka advertises an address (kafka:9092) that only resolves inside the Docker bridge network.

The Fix: Explicitly map the PLAINTEXT_HOST listener to a dedicated port (e.g., 9093):

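# docker-compose.yml (fragment)
kafka:
  ports:
    - "9093:9093" # Host mapping for the PLAINTEXT_HOST listener
  environment:
    # Containers reach the broker at kafka:9092; tests on the host use localhost:9093.
    # PLAINTEXT_HOST is a custom listener name, so it also needs an entry in the
    # listener security protocol map (not shown here).
    - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:9093
    - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,PLAINTEXT_HOST://:9093,CONTROLLER://:9094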

Then, in your .env.test, set KAFKA_BROKER=localhost:9093.
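
Even with the listener mapped, the broker may not accept connections the instant the container starts. A minimal sketch of a readiness check using only Node's net module follows. waitForPort is a hypothetical helper, and an accepted TCP connection only shows that the port is open, not that Kafka has finished starting up.

```javascript
const net = require('node:net');

// Resolve once host:port accepts a TCP connection; reject after timeoutMs.
function waitForPort(host, port, timeoutMs = 30000, intervalMs = 500) {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve, reject) => {
    const attempt = () => {
      const socket = net.connect({ host, port });
      socket.once('connect', () => {
        socket.end();
        resolve();
      });
      socket.once('error', () => {
        socket.destroy();
        if (Date.now() >= deadline) {
          reject(new Error(`Timed out waiting for ${host}:${port}`));
        } else {
          setTimeout(attempt, intervalMs); // retry until the deadline
        }
      });
    };
    attempt();
  });
}

module.exports = { waitForPort };
```

Calling await waitForPort('localhost', 9093) before the Kafka tests start (for example from a Jest globalSetup module) is one option. The wait-on package can do a similar check with a tcp: resource.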

Step 4: Write "Iron-clad" Assertions

Eliminate loose assertions. Because your database is wiped clean (or each test uses unique inputs, such as a Date.now()-based email), the expected status is known in advance and must be asserted exactly!

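// Assumption: `request` is supertest and SERVER_URL points at the E2E server
import request from 'supertest';
const SERVER_URL = process.env.SERVER_URL ?? 'http://localhost:3001'; // hypothetical variable

it('should create a user successfully via REST', async () => {
    const uniqueEmail = `test_${Date.now()}@example.com`;
    const response = await request(SERVER_URL)
        .post('/api/users')
        .send({ name: 'Test User', email: uniqueEmail });

    // Strictly expect a 201 Created
    expect(response.statusCode).toBe(201);
});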
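
One caveat with Date.now(): two tests that run within the same millisecond, or in parallel Jest workers, can end up with the same email. If that ever bites you, a UUID-based helper is a small change. The uniqueEmail helper below is a hypothetical sketch, not part of the project.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical helper: a fresh UUID per call avoids collisions between tests
// that run in the same millisecond or in parallel workers.
function uniqueEmail(prefix = 'test') {
  return `${prefix}_${randomUUID()}@example.com`;
}

module.exports = { uniqueEmail };
```

The test body stays the same. Only the line that builds the email changes to const email = uniqueEmail().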

4. Conclusion

Shifting to an isolated, automated Docker test strategy will cost you 1-2 days of initial infrastructure setup. But in return, it brings real peace of mind to the team as the codebase scales.

If you find this setup process too tedious, you can simply grab the exact folder structures, Docker automation scripts, and Jest configurations that I’ve already pre-configured in my open-source CLI generator:
👉 nodejs-quickstart-structure on GitHub

Drop a star (⭐) if you find it helpful! Happy coding, and here's to never seeing Test Failed randomly in GitHub Actions again!
