Run tests in isolated Docker Compose stacks using a preview image of your app and a slim database copy seeded with only the entities you actually need. This drastically reduces flakiness, shortens run times, and restores trust in your test suite.
Introduction
End-to-end (E2E) and integration tests are essential for maintaining confidence in your product. They validate critical user flows and system interactions. But when tests are flaky—intermittently failing for non-code reasons—they block releases, waste debugging hours, and erode trust.
This article shows a practical approach to stabilizing tests by running them inside Docker against a dedicated preview image of your application, connected to a slimmed-down copy of your database. The result is deterministic, fast, and repeatable test runs.
The Problem with Unstable Tests
Common sources of flakiness:
- Shared environments: Tests hit a master/shared database that is large and constantly changing.
- Uncontrolled data dependencies: Required entities go missing or drift over time.
- Heavy databases: Full clones are slow to snapshot, restore, and migrate, especially in CI.
- Environment drift: Library versions, services, or feature flags differ between dev/CI/prod.
The Solution: Isolated Docker Environment + Slim Database Copy
We improve stability by isolating everything per test run:
- Preview Docker image of the app (built from the current branch/PR).
- Dedicated database container (fresh per run) with a small, known-good dataset.
- Deterministic seeding: A repeatable seed that contains only the minimum required entities and relationships.
Benefits: stable data, predictable behavior, faster setup, and fewer false negatives.
Reference Setup (Docker Compose)
Below is a minimal Compose file for a web app and a Postgres database. Adapt the service names, ports, and env variables to your stack.
version: "3.9"
services:
  app:
    image: ghcr.io/your-org/your-app:${GIT_SHA:-preview}
    depends_on:
      db:
        condition: service_healthy  # start the app only after the DB healthcheck passes
    environment:
      NODE_ENV: test
      DATABASE_URL: postgres://testuser:testpass@db:5432/testdb
    ports:
      - "8080:8080"
    healthcheck:
      # node:alpine images ship busybox wget, not curl
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 5s
      timeout: 2s
      retries: 20
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: testuser
      POSTGRES_PASSWORD: testpass
      POSTGRES_DB: testdb
    ports:
      - "5433:5432"
    volumes:
      - db_data_test:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U testuser -d testdb"]
      interval: 5s
      timeout: 2s
      retries: 20
volumes:
  db_data_test:
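The healthcheck above polls GET /health, so the app must expose such an endpoint. A minimal sketch, assuming an Express app (the route and port are illustrative, not part of the setup above):

// health.ts — hypothetical health endpoint for the compose healthcheck
import express from 'express';

const app = express();

// Respond 200 once the process is up; extend with a DB ping if you want
// the healthcheck to also cover database connectivity.
app.get('/health', (_req, res) => {
  res.status(200).json({ status: 'ok' });
});

app.listen(8080);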
Building a Preview Image
Build the preview image in CI from the current branch and tag it with the commit SHA:
# Example Node app
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci
FROM node:20-alpine AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app .
EXPOSE 8080
CMD ["node", "dist/server.js"]
Deterministic Seeding: a Slim DB Copy
A slim copy is not a full clone of master/prod. It is a curated dataset containing only the tables/rows needed for your test scenarios—users, roles, feature flags, minimal catalog data, etc.
Example (Postgres) seed
-- Minimal roles
INSERT INTO roles (id, name) VALUES
('00000000-0000-0000-0000-000000000001', 'admin'),
('00000000-0000-0000-0000-000000000002', 'user')
ON CONFLICT (id) DO NOTHING;
-- Test user
INSERT INTO users (id, email, role_id, created_at)
VALUES ('10000000-0000-0000-0000-000000000001', 'testuser@example.com',
'00000000-0000-0000-0000-000000000002', NOW())
ON CONFLICT (id) DO NOTHING;
-- Feature flags (only those required by flows under test)
INSERT INTO feature_flags (key, enabled)
VALUES ('checkout_v2', true)
ON CONFLICT (key) DO UPDATE SET enabled = EXCLUDED.enabled;
-- Domain entities essential for E2E flows
INSERT INTO products (id, name, price_cents)
VALUES (1, 'Sample Product', 1999)
ON CONFLICT (id) DO NOTHING;
Applying migrations and seeding in CI
Use a simple Makefile to orchestrate lifecycle steps:
.PHONY: up migrate seed test down

up:
	docker compose -f docker-compose.test.yml up -d --wait

migrate:
	# Replace with your migration tool, e.g., Prisma, Knex, Flyway, Liquibase
	docker compose -f docker-compose.test.yml exec -T app npm run migrate:up

seed:
	docker compose -f docker-compose.test.yml exec -T db \
		psql -U testuser -d testdb < seed/slim-seed.sql

# Example: Playwright/Cypress/WDIO/etc.
test:
	docker compose -f docker-compose.test.yml exec -T app npm run test:e2e

down:
	docker compose -f docker-compose.test.yml down -v
Then in CI (GitHub Actions example):
name: e2e
on: [pull_request]
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build preview image
        run: |
          docker build -t ghcr.io/your-org/your-app:${{ github.sha }} .
      - name: Start stack
        run: |
          GIT_SHA=${{ github.sha }} make up
      - name: Migrate & seed
        run: |
          make migrate
          make seed
      - name: Run tests
        run: |
          make test
      - name: Teardown
        if: always()
        run: |
          make down
Generating a Slim Copy from a Large DB (Optional)
If you must derive your slim dataset from a large source, don’t clone everything. Extract only the minimal rows and maintain referential integrity.
Strategy A: Hand-crafted seed (recommended)
- Write idempotent INSERT statements for the handful of entities you need.
- Keep it in version control and review changes with tests.
Strategy B: Selective dump (Postgres example)
Use pg_dump with include lists and filters:
# Dump schema only (fast migrations)
pg_dump \
--schema-only \
--no-owner --no-privileges \
"$DATABASE_URL" > schema.sql
# Dump minimal data for specific tables
pg_dump \
--data-only \
--table=roles \
--table=feature_flags \
--table=products \
--column-inserts \
"$DATABASE_URL" > minimal-data.sql
cat schema.sql minimal-data.sql > seed/slim-seed.sql
Note: if you apply migrations separately (as in the Makefile above), seed only minimal-data.sql and skip schema.sql, or its CREATE statements will collide with the already-migrated schema.
Strategy C: Programmatic export
- Write a small script (Node/TS) that queries only the rows/columns needed and emits SQL/JSON fixtures (see the sketch after this list).
- Enforce stable IDs (UUIDs) for deterministic references across runs.
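A minimal sketch of such a script, assuming node-postgres (pg); the table, columns, and SOURCE_DATABASE_URL variable are illustrative:

// export-slim-seed.ts — hypothetical export: query only the rows the tests
// need from the source DB and emit idempotent INSERT statements.
import { writeFileSync } from 'node:fs';
import { Client } from 'pg';

async function main() {
  const client = new Client({ connectionString: process.env.SOURCE_DATABASE_URL });
  await client.connect();

  // Only the columns/rows the test flows require (illustrative query)
  const { rows } = await client.query(
    "SELECT id, name FROM roles WHERE name IN ('admin', 'user') ORDER BY id"
  );

  // NOTE: naive quoting for brevity — escape values properly in a real script
  const statements = rows.map(
    (r) =>
      `INSERT INTO roles (id, name) VALUES ('${r.id}', '${r.name}') ` +
      `ON CONFLICT (id) DO NOTHING;`
  );

  writeFileSync('seed/slim-seed.sql', statements.join('\n') + '\n');
  await client.end();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});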
Tip: Keep seeds small (a few hundred rows at most). If a test needs new data, expand the seed incrementally and add an assertion to prove it (see the guard spec below).
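One way to add that assertion is a tiny guard spec that runs before the real flows; the /api/products/1 endpoint below is assumed, not part of the setup above:

// seed-guard.spec.ts — hypothetical spec asserting the slim seed was applied
import { test, expect } from '@playwright/test';

test('slim seed contains the sample product', async ({ request }) => {
  // Relative URL resolves against baseURL from the Playwright config
  const res = await request.get('/api/products/1');
  expect(res.ok()).toBeTruthy();
  const product = await res.json();
  expect(product.name).toBe('Sample Product');
});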
Running Tests Inside the Stack
Your test runner (e.g., Playwright, Cypress, WebdriverIO, Jest-integration) should:
- Wait for app and DB health checks (a minimal readiness helper is sketched below).
- Use a single, known BASE_URL (e.g., http://app:8080).
- Create transient test data per spec when necessary, or reuse the seeded fixtures.
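A minimal readiness helper, assuming Node 18+ for the built-in fetch; call it from your runner's global setup before any spec executes:

// wait-for-app.ts — hypothetical readiness poll for the app healthcheck
export async function waitForApp(baseUrl: string, timeoutMs = 60_000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(`${baseUrl}/health`);
      if (res.ok) return; // app (and its DB connection) is ready
    } catch {
      // app not accepting connections yet — keep polling
    }
    await new Promise((r) => setTimeout(r, 1_000));
  }
  throw new Error(`App at ${baseUrl} not healthy after ${timeoutMs} ms`);
}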
Example Playwright config snippet:
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:8080',
    trace: 'retain-on-failure',
    video: 'retain-on-failure',
  },
});
Run command:
BASE_URL=http://localhost:8080 npx playwright test
Troubleshooting Flakiness: A Short Checklist
- Time: Replace arbitrary waits with explicit waits for network/DOM state.
- Data: Ensure seeds are applied and idempotent; reset the DB between specs if needed.
- Isolation: No tests should depend on state from previous tests.
- Clocks: Mock time if flows are time-dependent (tokens, expirations, cron).
- External APIs: Stub/mirror third-party calls; don't let them fail your CI (see the sketch after this list).
- Retries: Use test-level retries sparingly and only for known flaky endpoints.
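For the external-API item, a sketch of request stubbing with Playwright's page.route; the payment-provider URL and response payload are invented for illustration:

// Stub a third-party call so its availability can't fail your CI
import { test, expect } from '@playwright/test';

test('checkout succeeds with stubbed payment provider', async ({ page }) => {
  // Intercept any request to the (hypothetical) payment API
  await page.route('**/api.payments.example.com/**', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ status: 'approved' }),
    })
  );

  await page.goto('/checkout');
  // ...drive the flow; it now sees a deterministic payment response
});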
Conclusion
Stable tests are the foundation of reliable delivery. By running E2E/integration tests in Docker against a preview app image and a slim database copy, you eliminate data drift and environment coupling. Your suites become faster, more deterministic, and far easier to trust.
In practice, teams that adopt this approach see a sharp drop in test failures caused by data or environment issues, along with significantly faster test runs. Faster feedback cycles and fewer false alarms translate directly into developer productivity and higher confidence in releases.
If your team is battling flaky tests, try this approach. Start with a tiny, hand-crafted seed; automate migrations; and keep everything in Compose. The upfront effort pays back quickly in saved engineering time and safer releases.
Real-World Impact (Metrics Example)
Here’s a simplified before/after snapshot many teams observe when switching to Docker + slim DB copies:
| Metric | Before (shared DB) | After (slim DB in Docker) |
|---|---|---|
| Average E2E run time | ~45 min | ~18 min |
| Flaky test failures | 20–30% of runs | <5% of runs |
| Developer confidence | Low | High |
Even approximate metrics like these make a strong case in articles and internal presentations.
Appendix: MongoDB Variant (Bonus)
For MongoDB, swap the Postgres services/commands for Mongo images and mongorestore:
version: "3.9"
services:
  app:
    image: ghcr.io/your-org/your-app:${GIT_SHA:-preview}
    environment:
      MONGODB_URI: mongodb://testuser:testpass@mongo:27017/testdb?authSource=admin
    depends_on:
      - mongo
  mongo:
    image: mongo:7
    ports:
      - "27018:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: testuser
      MONGO_INITDB_ROOT_PASSWORD: testpass
Seed example:
# JSON or bsondump fixtures kept small and versioned
mongorestore --uri "mongodb://testuser:testpass@localhost:27018/testdb?authSource=admin" \
--drop ./seed/mongo/slim
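If you prefer fixtures in code over binary dumps, a small seed script with the official MongoDB driver works too; the collection name and document below are illustrative:

// seed-mongo.ts — hypothetical idempotent seed using the official driver
import { MongoClient } from 'mongodb';

async function main() {
  const uri =
    process.env.MONGODB_URI ??
    'mongodb://testuser:testpass@localhost:27018/testdb?authSource=admin';
  const client = new MongoClient(uri);
  await client.connect();
  const db = client.db('testdb');

  // Upserts keep the seed idempotent across runs
  await db.collection('roles').updateOne(
    { name: 'user' },
    { $set: { name: 'user' } },
    { upsert: true }
  );

  await client.close();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});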