Run tests in isolated Docker Compose stacks using a preview image of your app and a slim database copy seeded with only the entities you actually need. This drastically reduces flakiness, shortens run times, and restores trust in your test suite.
Introduction
End-to-end (E2E) and integration tests are essential for maintaining confidence in your product. They validate critical user flows and system interactions. But when tests are flaky—intermittently failing for non-code reasons—they block releases, waste debugging hours, and erode trust.
This article shows a practical approach to stabilizing tests by running them inside Docker against a dedicated preview image of your application, connected to a slimmed-down copy of your database. The result is deterministic, fast, and repeatable test runs.
The Problem with Unstable Tests
Common sources of flakiness:
- Shared environments: Tests hit a master/shared database that is large and constantly changing.
- Uncontrolled data dependencies: Required entities go missing or drift over time.
- Heavy databases: Full clones are slow to snapshot, restore, and migrate, especially in CI.
- Environment drift: Library versions, services, or feature flags differ between dev/CI/prod.
The Solution: Isolated Docker Environment + Slim Database Copy
We improve stability by isolating everything per test run:
- Preview Docker image of the app (built from the current branch/PR).
- Dedicated database container (fresh per run) with a small, known-good dataset.
- Deterministic seeding: A repeatable seed that contains only the minimum required entities and relationships.
Benefits: stable data, predictable behavior, faster setup, and fewer false negatives.
Reference Setup (Docker Compose)
Below is a minimal Compose file for a web app and a Postgres database. Adapt the service names, ports, and env variables to your stack.
version: "3.9"
services:
  app:
    image: ghcr.io/your-org/your-app:${GIT_SHA:-preview}
    depends_on:
      db:
        condition: service_healthy  # start the app only after the DB healthcheck passes
    environment:
      NODE_ENV: test
      DATABASE_URL: postgres://testuser:testpass@db:5432/testdb
    ports:
      - "8080:8080"
    healthcheck:
      # node:alpine images ship busybox wget, not curl
      test: ["CMD", "wget", "-qO-", "http://localhost:8080/health"]
      interval: 5s
      timeout: 2s
      retries: 20
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: testuser
      POSTGRES_PASSWORD: testpass
      POSTGRES_DB: testdb
    ports:
      - "5433:5432"
    volumes:
      - db_data_test:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U testuser -d testdb"]
      interval: 5s
      timeout: 2s
      retries: 20
volumes:
  db_data_test:
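The healthcheck above polls GET /health, so the app must expose such an endpoint. A minimal sketch, assuming an Express app (the route and port are illustrative, not part of the setup above):

// health.ts — hypothetical health endpoint for the compose healthcheck
import express from 'express';

const app = express();

// Respond 200 once the process is up; extend with a DB ping if you want
// the healthcheck to also cover database connectivity.
app.get('/health', (_req, res) => {
  res.status(200).json({ status: 'ok' });
});

app.listen(8080);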
Building a Preview Image
Build the preview image in CI from the current branch and tag it with the commit SHA:
# Example Node app
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci
FROM node:20-alpine AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app .
EXPOSE 8080
CMD ["node", "dist/server.js"]
Deterministic Seeding: a Slim DB Copy
A slim copy is not a full clone of master/prod. It is a curated dataset containing only the tables/rows needed for your test scenarios—users, roles, feature flags, minimal catalog data, etc.
Example (Postgres) seed
-- Minimal roles
INSERT INTO roles (id, name) VALUES
('00000000-0000-0000-0000-000000000001', 'admin'),
('00000000-0000-0000-0000-000000000002', 'user')
ON CONFLICT (id) DO NOTHING;
-- Test user
INSERT INTO users (id, email, role_id, created_at)
VALUES ('10000000-0000-0000-0000-000000000001', 'testuser@example.com',
'00000000-0000-0000-0000-000000000002', NOW())
ON CONFLICT (id) DO NOTHING;
-- Feature flags (only those required by flows under test)
INSERT INTO feature_flags (key, enabled)
VALUES ('checkout_v2', true)
ON CONFLICT (key) DO UPDATE SET enabled = EXCLUDED.enabled;
-- Domain entities essential for E2E flows
INSERT INTO products (id, name, price_cents)
VALUES (1, 'Sample Product', 1999)
ON CONFLICT (id) DO NOTHING;
Applying migrations and seeding in CI
Use a simple Makefile to orchestrate lifecycle steps:
.PHONY: up migrate seed test down

up:
	docker compose -f docker-compose.test.yml up -d --wait

migrate:
	# Replace with your migration tool, e.g., Prisma, Knex, Flyway, Liquibase
	docker compose -f docker-compose.test.yml exec -T app npm run migrate:up

seed:
	docker compose -f docker-compose.test.yml exec -T db \
		psql -U testuser -d testdb < seed/slim-seed.sql

# Example: Playwright/Cypress/WDIO/etc.
test:
	docker compose -f docker-compose.test.yml exec -T app npm run test:e2e

down:
	docker compose -f docker-compose.test.yml down -v
Then in CI (GitHub Actions example):
name: e2e
on: [pull_request]
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build preview image
        run: |
          docker build -t ghcr.io/your-org/your-app:${{ github.sha }} .
      - name: Start stack
        run: |
          GIT_SHA=${{ github.sha }} make up
      - name: Migrate & seed
        run: |
          make migrate
          make seed
      - name: Run tests
        run: |
          make test
      - name: Teardown
        if: always()
        run: |
          make down
Generating a Slim Copy from a Large DB (Optional)
If you must derive your slim dataset from a large source, don’t clone everything. Extract only the minimal rows and maintain referential integrity.
Strategy A: Hand-crafted seed (recommended)
- Write idempotent INSERT statements for the handful of entities you need.
- Keep it in version control and review changes with tests.
Strategy B: Selective dump (Postgres example)
Use pg_dump with include lists and filters:
# Dump schema only (fast migrations)
pg_dump \
--schema-only \
--no-owner --no-privileges \
"$DATABASE_URL" > schema.sql
# Dump minimal data for specific tables
pg_dump \
--data-only \
--table=roles \
--table=feature_flags \
--table=products \
--column-inserts \
"$DATABASE_URL" > minimal-data.sql
cat schema.sql minimal-data.sql > seed/slim-seed.sql
Note: if you apply migrations separately (as in the Makefile above), seed only minimal-data.sql and skip schema.sql, or its CREATE statements will collide with the already-migrated schema.
Strategy C: Programmatic export
- Write a small script (Node/TS) that queries only the rows/columns needed and emits SQL/JSON fixtures (see the sketch after this list).
- Enforce stable IDs (UUIDs) for deterministic references across runs.
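A minimal sketch of such a script, assuming node-postgres (pg); the table, columns, and SOURCE_DATABASE_URL variable are illustrative:

// export-slim-seed.ts — hypothetical export: query only the rows the tests
// need from the source DB and emit idempotent INSERT statements.
import { writeFileSync } from 'node:fs';
import { Client } from 'pg';

async function main() {
  const client = new Client({ connectionString: process.env.SOURCE_DATABASE_URL });
  await client.connect();

  // Only the columns/rows the test flows require (illustrative query)
  const { rows } = await client.query(
    "SELECT id, name FROM roles WHERE name IN ('admin', 'user') ORDER BY id"
  );

  // NOTE: naive quoting for brevity — escape values properly in a real script
  const statements = rows.map(
    (r) =>
      `INSERT INTO roles (id, name) VALUES ('${r.id}', '${r.name}') ` +
      `ON CONFLICT (id) DO NOTHING;`
  );

  writeFileSync('seed/slim-seed.sql', statements.join('\n') + '\n');
  await client.end();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});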
Tip: Keep seeds small (a few hundred rows at most). If a test needs new data, expand the seed incrementally and add an assertion to prove it (see the guard spec below).
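One way to add that assertion is a tiny guard spec that runs before the real flows; the /api/products/1 endpoint below is assumed, not part of the setup above:

// seed-guard.spec.ts — hypothetical spec asserting the slim seed was applied
import { test, expect } from '@playwright/test';

test('slim seed contains the sample product', async ({ request }) => {
  // Relative URL resolves against baseURL from the Playwright config
  const res = await request.get('/api/products/1');
  expect(res.ok()).toBeTruthy();
  const product = await res.json();
  expect(product.name).toBe('Sample Product');
});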
Running Tests Inside the Stack
Your test runner (e.g., Playwright, Cypress, WebdriverIO, Jest-integration) should:
- Wait for app and DB health checks (a minimal readiness helper is sketched below).
- Use a single, known BASE_URL (e.g., http://app:8080).
- Create transient test data per spec when necessary, or reuse the seeded fixtures.
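A minimal readiness helper, assuming Node 18+ for the built-in fetch; call it from your runner's global setup before any spec executes:

// wait-for-app.ts — hypothetical readiness poll for the app healthcheck
export async function waitForApp(baseUrl: string, timeoutMs = 60_000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(`${baseUrl}/health`);
      if (res.ok) return; // app (and its DB connection) is ready
    } catch {
      // app not accepting connections yet — keep polling
    }
    await new Promise((r) => setTimeout(r, 1_000));
  }
  throw new Error(`App at ${baseUrl} not healthy after ${timeoutMs} ms`);
}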
Example Playwright config snippet:
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:8080',
    trace: 'retain-on-failure',
    video: 'retain-on-failure',
  },
});
Run command:
BASE_URL=http://localhost:8080 npx playwright test
Troubleshooting Flakiness: A Short Checklist
- Time: Replace arbitrary waits with explicit waits for network/DOM state.
- Data: Ensure seeds are applied and idempotent; reset the DB between specs if needed.
- Isolation: No tests should depend on state from previous tests.
- Clocks: Mock time if flows are time-dependent (tokens, expirations, cron).
- External APIs: Stub/mirror third-party calls; don't let them fail your CI (see the sketch after this list).
- Retries: Use test-level retries sparingly and only for known flaky endpoints.
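For the external-API item, a sketch of request stubbing with Playwright's page.route; the payment-provider URL and response payload are invented for illustration:

// Stub a third-party call so its availability can't fail your CI
import { test, expect } from '@playwright/test';

test('checkout succeeds with stubbed payment provider', async ({ page }) => {
  // Intercept any request to the (hypothetical) payment API
  await page.route('**/api.payments.example.com/**', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ status: 'approved' }),
    })
  );

  await page.goto('/checkout');
  // ...drive the flow; it now sees a deterministic payment response
});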
Conclusion
Stable tests are the foundation of reliable delivery. By running E2E/integration tests in Docker against a preview app image and a slim database copy, you eliminate data drift and environment coupling. Your suites become faster, more deterministic, and far easier to trust.
In practice, teams that adopt this approach see a sharp drop in test failures caused by data or environment issues, along with significantly faster test runs. Faster feedback cycles and fewer false alarms translate directly into developer productivity and higher confidence in releases.
If your team is battling flaky tests, try this approach. Start with a tiny, hand-crafted seed; automate migrations; and keep everything in Compose. The upfront effort pays back quickly in saved engineering time and safer releases.
Real-World Impact (Metrics Example)
Here’s a simplified before/after snapshot many teams observe when switching to Docker + slim DB copies:
| Metric | Before (shared DB) | After (slim DB in Docker) |
|---|---|---|
| Average E2E run time | ~45 min | ~18 min |
| Flaky test failures | 20–30% of runs | <5% of runs |
| Developer confidence | Low | High |
Even approximate metrics like these make a strong case in articles and internal presentations.
Appendix: MongoDB Variant (Bonus)
For MongoDB, swap the Postgres services/commands for Mongo images and mongorestore:
version: "3.9"
services:
  app:
    image: ghcr.io/your-org/your-app:${GIT_SHA:-preview}
    environment:
      MONGODB_URI: mongodb://testuser:testpass@mongo:27017/testdb?authSource=admin
    depends_on:
      - mongo
  mongo:
    image: mongo:7
    ports:
      - "27018:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: testuser
      MONGO_INITDB_ROOT_PASSWORD: testpass
Seed example:
# JSON or bsondump fixtures kept small and versioned
mongorestore --uri "mongodb://testuser:testpass@localhost:27018/testdb?authSource=admin" \
--drop ./seed/mongo/slim
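If you prefer fixtures in code over binary dumps, a small seed script with the official MongoDB driver works too; the collection name and document below are illustrative:

// seed-mongo.ts — hypothetical idempotent seed using the official driver
import { MongoClient } from 'mongodb';

async function main() {
  const uri =
    process.env.MONGODB_URI ??
    'mongodb://testuser:testpass@localhost:27018/testdb?authSource=admin';
  const client = new MongoClient(uri);
  await client.connect();
  const db = client.db('testdb');

  // Upserts keep the seed idempotent across runs
  await db.collection('roles').updateOne(
    { name: 'user' },
    { $set: { name: 'user' } },
    { upsert: true }
  );

  await client.close();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});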