Two Faces of End-to-End Testing: Horizontal and Vertical

Picture this: You've just deployed your latest feature to production. Your unit tests passed with flying colors, integration tests confirmed that your APIs work perfectly, and your code review was flawless. Yet, within hours, users are reporting that the checkout process fails intermittently, the dashboard displays incorrect data, and certain workflows simply don't work as expected. Sound familiar? This scenario plays out in development teams worldwide, and it highlights a critical gap in traditional testing strategies. Enter end-to-end testing—the comprehensive approach that validates your entire application from the user's perspective, catching the issues that slip through every other testing layer.

Understanding the Two Dimensions of End-to-End Testing

When we talk about end-to-end testing, we're actually discussing two distinct but complementary approaches: horizontal and vertical testing. Understanding this duality is crucial for building a robust testing strategy that catches bugs at every level.

E2E testing encompasses both horizontal testing, which validates data flow across different system components and layers, and vertical testing, which follows complete user journeys from start to finish. Think of horizontal testing as examining how data travels through your architecture—from the frontend through your API layer, business logic, database, and back. Vertical testing, on the other hand, mimics real user behavior by executing complete workflows like signing up for an account, making a purchase, or generating a report.

The distinction matters because each approach catches different types of bugs. Horizontal tests excel at identifying integration issues, API contract violations, and data transformation problems. They ensure that when your React frontend sends a request, your Node.js backend processes it correctly, your database stores the right information, and the response makes it back intact. Vertical tests catch workflow problems, user experience issues, and business logic failures that only manifest when multiple features interact in realistic scenarios.

The Evolution of End-to-End Testing in Modern Development

The software landscape has changed dramatically over the past decade. Monolithic applications have given way to microservices architectures, server-side rendering has evolved into complex single-page applications, and deployment cycles have accelerated from quarterly releases to multiple deployments per day. These changes have made end-to-end testing both more challenging and more essential.

In the era of monolithic applications, testing was relatively straightforward. You had one codebase, one deployment, and one database. End-to-end tests could directly interact with the application, and failures were usually easy to trace. Today's distributed systems present a different reality. A single user action might trigger calls to a dozen microservices, update multiple databases, publish events to message queues, and interact with third-party APIs. This complexity creates countless points of potential failure that can only be caught through comprehensive end-to-end testing.

Modern frontend frameworks like React, Vue, and Angular have introduced their own complexity. Components render dynamically, state management spans multiple layers, and asynchronous operations happen everywhere. Traditional testing approaches that worked for jQuery-based applications fall short when dealing with virtual DOM updates, websocket connections, and client-side routing. End-to-end testing frameworks have evolved to handle these challenges, providing sophisticated mechanisms for waiting, interacting with dynamic content, and validating complex UI states.

Building Your Horizontal Testing Strategy

Horizontal end-to-end testing validates that data flows correctly through your system's layers. This approach is particularly valuable in microservices architectures where services communicate through APIs, message queues, or event streams. The goal is to ensure that data transformations, format conversions, and integrations work seamlessly across boundaries.

Start by mapping your system's data flows. For an e-commerce application, this might include the product catalog flowing from your inventory service through your API gateway to the frontend, or order data moving from the checkout service through payment processing, inventory updates, and confirmation emails. Each of these flows represents a horizontal test scenario.

Your horizontal tests should cover the complete data journey. When a user adds a product to their cart, your test should verify that the cart service receives the correct product ID, the inventory service decrements stock appropriately, the pricing service applies the right discounts, and the frontend displays accurate information. This isn't just testing that each service works independently—it's validating that they work together correctly.
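
A minimal sketch of such a check, assuming Playwright and hypothetical cart and inventory endpoints (the URLs, selectors, and product ID are illustrative, not from a real system):

const { test, expect } = require('@playwright/test');

test('add-to-cart propagates across services', async ({ page }) => {
  // Capture stock before the UI action (hypothetical inventory endpoint)
  let response = await page.request.get('https://shop.example.com/api/inventory/prod-001');
  const before = await response.json();

  // Frontend layer: user adds the product to the cart
  await page.goto('https://shop.example.com/products/prod-001');
  await page.click('#add-to-cart');
  await page.waitForSelector('.cart-notification');

  // Cart service: the correct product ID was recorded (page.request shares the page's cookies)
  response = await page.request.get('https://shop.example.com/api/cart');
  const cart = await response.json();
  expect(cart.items.map(item => item.productId)).toContain('prod-001');

  // Inventory service: stock decremented by exactly one
  response = await page.request.get('https://shop.example.com/api/inventory/prod-001');
  const after = await response.json();
  expect(after.stock).toBe(before.stock - 1);
});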

Consider authentication flows as a prime example of horizontal testing. A user logs in through your frontend, which sends credentials to your API gateway. The gateway forwards the request to your authentication service, which validates against your user database, generates a JWT token, and returns it through the layers. Your horizontal test should verify this entire chain, ensuring that the token contains the correct claims, has the right expiration, and successfully authenticates subsequent requests.

describe('Horizontal Authentication Flow', () => {
  it('should authenticate user through all system layers', async () => {
    // Frontend layer: Submit login form
    await page.goto('https://app.example.com/login');
    await page.fill('#email', 'testuser@example.com');
    await page.fill('#password', 'SecurePass123!');
    await page.click('#login-button');

    // API Gateway layer: Verify token in cookies
    const cookies = await page.context().cookies();
    const authToken = cookies.find(c => c.name === 'auth_token');
    expect(authToken).toBeDefined();

    // Decode and verify JWT claims
    const tokenPayload = JSON.parse(atob(authToken.value.split('.')[1]));
    expect(tokenPayload.userId).toBe('user-12345');
    expect(tokenPayload.exp).toBeGreaterThan(Date.now() / 1000);

    // Database layer: Verify session created
    const session = await database.query(
      'SELECT * FROM sessions WHERE user_id = ?',
      [tokenPayload.userId]
    );
    expect(session).toHaveLength(1);
    expect(session[0].active).toBe(true);

    // Verify authenticated request works
    await page.goto('https://app.example.com/dashboard');
    await page.waitForSelector('#user-profile');
    const userName = await page.textContent('#user-name');
    expect(userName).toBe('Test User');
  });
});

Data consistency is another critical aspect of horizontal testing. In distributed systems, data often exists in multiple places—cached in Redis, stored in PostgreSQL, indexed in Elasticsearch, and replicated across regions. Your horizontal tests should verify that updates propagate correctly and that eventual consistency actually happens eventually. Test scenarios where a user updates their profile and verify that the change appears in all relevant services within acceptable timeframes.
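
One way to express "eventually" without hard-coded sleeps is a polling assertion. This sketch assumes Playwright's expect.poll and a hypothetical user-search endpoint; the URLs and field names are illustrative:

const { test, expect } = require('@playwright/test');

test('profile update reaches the search index', async ({ page }) => {
  await page.goto('https://app.example.com/settings/profile');
  await page.fill('#display-name', 'New Display Name');
  await page.click('#save-profile');

  // Poll the (hypothetical) search API until the change propagates,
  // but fail if it exceeds the agreed consistency window
  await expect.poll(async () => {
    const response = await page.request.get(
      'https://app.example.com/api/search/users?q=New%20Display%20Name'
    );
    const body = await response.json();
    return body.results.length;
  }, { timeout: 10000 }).toBeGreaterThan(0);
});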

Performance considerations matter in horizontal testing. While you're not conducting full load tests, you should validate that your data flows complete within reasonable timeframes. Set timeout expectations that reflect real-world performance requirements. If your API should respond within 500ms, your horizontal tests should fail if responses consistently take longer.
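
A rough way to encode that expectation is to time the call inside the test itself. This sketch assumes a hypothetical orders endpoint and treats the 500ms budget as an example to tune to your own requirements:

const { test, expect } = require('@playwright/test');

test('orders API stays within its latency budget', async ({ request }) => {
  const started = Date.now();
  const response = await request.get('https://app.example.com/api/orders?limit=20');
  const elapsed = Date.now() - started;

  expect(response.ok()).toBeTruthy();
  expect(elapsed).toBeLessThan(500); // budget is an assumption; align with your SLO
});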

Crafting Effective Vertical Testing Scenarios

Vertical end-to-end testing follows complete user journeys through your application. These tests mimic real user behavior, clicking buttons, filling forms, navigating pages, and validating outcomes. Vertical tests are your best defense against workflow bugs, user experience problems, and business logic failures.

The key to effective vertical testing is selecting the right scenarios. You can't test every possible user path—that would be prohibitively expensive and time-consuming. Instead, focus on critical business workflows that represent your application's core value. These are the paths that most users follow and the ones where failure has the highest impact.

For a SaaS application, critical vertical test scenarios might include the complete onboarding flow, the primary feature workflows, subscription management, and critical integrations. For an e-commerce platform, you'd prioritize product discovery, cart management, checkout completion, and order tracking. These scenarios should represent end-to-end journeys that cross multiple features and system boundaries.

Let's examine a comprehensive vertical test for an e-commerce purchase flow:

describe('Complete Purchase Journey', () => {
  it('should allow user to discover, purchase, and track an order', async () => {
    // Journey starts: User lands on homepage
    await page.goto('https://shop.example.com');

    // Discovery phase: Search and browse
    await page.fill('#search-input', 'wireless headphones');
    await page.click('#search-button');
    await page.waitForSelector('.product-grid');

    const productCount = await page.locator('.product-card').count();
    expect(productCount).toBeGreaterThan(0);

    // Selection phase: View product details
    await page.click('.product-card:first-child');
    await page.waitForSelector('#product-details');

    const productName = await page.textContent('#product-name');
    const productPrice = await page.textContent('#product-price');

    // Add to cart phase
    await page.selectOption('#color-select', 'black');
    await page.click('#add-to-cart');
    await page.waitForSelector('.cart-notification');

    // Verify cart updated
    const cartCount = await page.textContent('#cart-count');
    expect(cartCount).toBe('1');

    // Checkout phase: Navigate to cart
    await page.click('#cart-icon');
    await page.waitForSelector('#cart-summary');

    const cartItemName = await page.textContent('.cart-item-name');
    expect(cartItemName).toContain('wireless headphones');

    await page.click('#proceed-to-checkout');

    // Authentication: Login or guest checkout
    await page.click('#guest-checkout');

    // Shipping information
    await page.fill('#shipping-email', 'customer@example.com');
    await page.fill('#shipping-name', 'John Doe');
    await page.fill('#shipping-address', '123 Main Street');
    await page.fill('#shipping-city', 'San Francisco');
    await page.selectOption('#shipping-state', 'CA');
    await page.fill('#shipping-zip', '94102');
    await page.click('#continue-to-payment');

    // Payment phase
    await page.waitForSelector('#payment-form');

    // Use test card for payment
    await page.fill('#card-number', '4242424242424242');
    await page.fill('#card-expiry', '12/25');
    await page.fill('#card-cvc', '123');
    await page.fill('#card-name', 'John Doe');

    // Submit order
    await page.click('#place-order');

    // Confirmation phase
    await page.waitForSelector('#order-confirmation');

    const orderNumber = await page.textContent('#order-number');
    expect(orderNumber).toMatch(/^ORD-\d{8}$/);

    const confirmationEmail = await page.textContent('#confirmation-email');
    expect(confirmationEmail).toBe('customer@example.com');

    // Verify order total matches cart
    const orderTotal = await page.textContent('#order-total');
    expect(orderTotal).toContain(productPrice);

    // Post-purchase: Verify order appears in system
    await page.goto(`https://shop.example.com/orders/${orderNumber}`);
    await page.waitForSelector('#order-status');

    const orderStatus = await page.textContent('#order-status');
    expect(orderStatus).toBe('Processing');

    // Verify tracking information exists
    const trackingSection = await page.locator('#tracking-info');
    await expect(trackingSection).toBeVisible();
  });
});

This vertical test covers the entire customer journey from discovery through post-purchase tracking. It validates not just that individual features work, but that they work together to provide a seamless user experience. The test catches issues like incorrect cart calculations, broken checkout flows, payment processing failures, and order tracking problems.

Vertical tests should include realistic user behavior patterns. Real users don't just click through forms at maximum speed—they pause, read content, and interact with the interface naturally. Incorporate these patterns into your tests. Wait for animations to complete, verify that loading states appear and disappear appropriately, and check that the interface provides adequate feedback for user actions.
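
For example, a sketch that asserts the interface's feedback instead of racing past it (the selectors are hypothetical):

await page.click('#generate-report');

// The loading indicator should appear first...
await expect(page.locator('#report-spinner')).toBeVisible();

// ...then clear once the content is ready
await expect(page.locator('#report-spinner')).toBeHidden({ timeout: 15000 });
await expect(page.locator('#report-table')).toBeVisible();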

Error handling deserves special attention in vertical tests. Users make mistakes—they enter invalid data, click the wrong buttons, and trigger edge cases you never anticipated. Your vertical tests should cover common error scenarios. What happens when a user enters an invalid credit card? How does the application handle network failures during checkout? Does the system recover gracefully when a payment processor timeout occurs?
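
A sketch of one such negative path, assuming Stripe-style test cards and hypothetical selectors (4000 0000 0000 0002 is Stripe's standard "card declined" test number):

// Simulate a declined payment and assert the user gets actionable feedback
await page.fill('#card-number', '4000000000000002');
await page.fill('#card-expiry', '12/30');
await page.fill('#card-cvc', '123');
await page.click('#place-order');

// The user should stay on the payment step and see a clear decline message
await expect(page.locator('#payment-error')).toContainText(/declined/i);
await expect(page).toHaveURL(/checkout/);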

Selecting the Right Testing Tools and Frameworks

The end-to-end testing ecosystem offers numerous tools, each with distinct advantages and trade-offs. Your choice should align with your technical stack, team expertise, and specific testing requirements. Let's examine the major players and their strengths.

Selenium remains the grandfather of browser automation. Its longevity has resulted in extensive language support, mature ecosystems, and broad adoption. Selenium excels when you need cross-browser testing across many browsers, including older versions and niche platforms. Its WebDriver protocol is an industry standard, and many testing tools build on top of it. However, Selenium's age shows in its API design, which can feel verbose and cumbersome compared to modern alternatives.

Playwright has emerged as a powerful modern alternative, developed at Microsoft by the team that originally built Puppeteer at Google. Its standout features include excellent cross-browser support (Chromium, Firefox, and WebKit), automatic waiting that eliminates most flakiness, parallel execution capabilities, and comprehensive debugging tools. Playwright's API feels natural to JavaScript developers, and its browser context isolation enables truly independent test runs without complex cleanup logic.

// Playwright example with advanced features
const { test, expect } = require('@playwright/test');

test.describe('Advanced E2E Testing', () => {
  test('handles complex user interactions', async ({ page, context }) => {
    // Network interception
    await page.route('**/api/products', route => {
      route.fulfill({
        status: 200,
        body: JSON.stringify([
          { id: 1, name: 'Test Product', price: 29.99 }
        ])
      });
    });

    // Browser context for isolated testing
    await context.grantPermissions(['geolocation']);
    await context.setGeolocation({ latitude: 37.7749, longitude: -122.4194 });

    await page.goto('https://example.com');

    // Automatic waiting for network idle
    await page.waitForLoadState('networkidle');

    // Advanced selector with chaining
    const productPrice = await page
      .locator('[data-testid="product-card"]')
      .first()
      .locator('.price')
      .textContent();

    expect(productPrice).toBe('$29.99');
  });
});

Cypress revolutionized end-to-end testing with its developer-friendly approach and exceptional debugging experience. Running tests directly in the browser enables time-travel debugging, where you can hover over test commands to see exactly what happened at each step. Cypress automatically records videos and screenshots of test runs, making failure investigation straightforward. Its real-time reloading speeds up test development significantly. The main limitation is that Cypress tests run in the browser, which constrains certain testing scenarios and makes true cross-browser testing more challenging than with Playwright or Selenium.

TestCafe offers a unique approach with no external dependencies—it doesn't require WebDriver or browser plugins. Tests run in standard browsers through JavaScript injection, making setup remarkably simple. TestCafe's automatic waiting mechanisms and smart assertions reduce test flakiness. Its syntax is clean and expressive, though its ecosystem is smaller than Playwright or Cypress.

For teams working with Puppeteer for other browser automation tasks, extending it for end-to-end testing makes sense. Puppeteer provides low-level control over Chrome and Chromium, enabling sophisticated testing scenarios. Its API directly exposes Chrome DevTools Protocol, allowing you to intercept network requests, measure performance, and access browser internals. However, Puppeteer requires more boilerplate than higher-level frameworks and focuses exclusively on Chromium browsers.

Your framework choice should consider several factors beyond just technical capabilities. Team expertise matters—adopting a tool that aligns with your team's existing skills reduces the learning curve. If your developers are proficient in Python, Selenium with pytest might be more productive than forcing them to learn JavaScript for Playwright. Integration with your existing toolchain is crucial. Your testing framework should work seamlessly with your CI/CD pipeline, reporting tools, and deployment processes.

Conquering Test Flakiness and Reliability Issues

Flaky tests are the bane of end-to-end testing. A flaky test passes sometimes and fails other times without any code changes, eroding team confidence and wasting valuable time investigating false positives. Understanding and eliminating flakiness is essential for maintaining a useful test suite.

Timing issues cause the majority of flaky tests. Asynchronous operations, network requests, animations, and dynamic content loading all introduce timing variability. The naive solution—hard-coded sleeps—doesn't solve the problem reliably and makes tests unnecessarily slow. Modern testing frameworks provide sophisticated waiting mechanisms that you should use exclusively.

// Anti-pattern: Hard-coded waits
await page.waitForTimeout(3000); // Bad: May be too short or too long

// Good: Conditional waiting
await page.waitForSelector('#submit-button', {
  state: 'visible',
  timeout: 10000
});

// Better: Wait for multiple conditions
await Promise.all([
  page.waitForLoadState('networkidle'),
  page.waitForSelector('#content-loaded'),
  page.waitForFunction(() => window.appReady === true)
]);

// Best: Wait for specific application state
await page.waitForFunction(() => {
  const state = window.APP_STATE;
  return state && state.initialized && !state.loading;
});

Race conditions emerge when tests make assumptions about execution order that aren't guaranteed. Your test might click a button before it's fully interactive, or submit a form before validation completes. Always wait for the specific condition you need rather than assuming previous steps completed successfully. Verify that buttons are not just visible but also enabled and clickable. Check that forms are not only rendered but also ready to receive input.
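
In Playwright terms, that might look like this (the selector is illustrative):

const submitButton = page.locator('#submit-button');

// Rendered is not enough: the control must also be enabled before interacting
await expect(submitButton).toBeVisible();
await expect(submitButton).toBeEnabled();

// click() also auto-waits for actionability, so prefer it over manual sleeps
await submitButton.click();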

Environment inconsistencies contribute significantly to flakiness. Tests that pass locally but fail in CI, or vice versa, usually indicate environment differences. Browser versions might differ, network speeds vary, system resources change, and timing behavior shifts. Address these by containerizing your test environment with Docker, explicitly controlling browser versions, and configuring CI environments to match local development closely.

Test interdependence creates fragile test suites where one test's failure cascades to others. Each test should be completely independent, capable of running successfully in any order. Achieve this through proper setup and teardown, database seeding, and avoiding shared state. Use unique test data for each test run, either by generating random identifiers or by cleaning up thoroughly after each test.

// Independent test with proper isolation
describe('User Management', () => {
  let testUser;

  beforeEach(async () => {
    // Create fresh test data for each test
    testUser = {
      email: `test-${Date.now()}@example.com`,
      username: `user-${Math.random().toString(36).substring(7)}`,
      password: 'TestPass123!'
    };

    // Ensure clean state
    await database.query('DELETE FROM sessions WHERE test_user = true');
  });

  afterEach(async () => {
    // Clean up test data
    if (testUser.id) {
      await database.query('DELETE FROM users WHERE id = ?', [testUser.id]);
    }
  });

  it('should create new user account', async () => {
    // Test uses unique data and cleans up
    await page.goto('https://example.com/signup');
    await page.fill('#email', testUser.email);
    await page.fill('#username', testUser.username);
    await page.fill('#password', testUser.password);
    await page.click('#signup-button');

    await page.waitForSelector('#welcome-message');
    // Store ID for cleanup
    testUser.id = await page.evaluate(() => window.currentUser.id);
  });
});

Network instability affects tests that interact with external services, third-party APIs, or remote resources. Production dependencies shouldn't dictate test reliability. Mock external services using tools like MSW (Mock Service Worker) or WireMock, intercept network requests at the test framework level, or use contract testing to validate integrations without hitting actual services.
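
At the framework level, Playwright's route interception can stand in for an external dependency. The provider URL pattern and response shape here are assumptions:

// Stub a third-party payment provider so the test never depends on the real service
await page.route('**/api.payments.example.com/**', route =>
  route.fulfill({
    status: 200,
    contentType: 'application/json',
    body: JSON.stringify({ status: 'succeeded', id: 'pay_test_001' })
  })
);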

Browser and driver versioning causes unexpected failures when updates introduce behavioral changes. Pin your browser versions and driver versions in CI environments to prevent surprise breakage. Periodically update these versions deliberately, testing thoroughly to catch any breaking changes. Use Docker images with fixed browser versions for maximum reproducibility.
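
Because Playwright ships specific browser builds with each package release, pinning the package version in package.json effectively pins the browsers too (the version number below is only an example):

{
  "devDependencies": {
    "@playwright/test": "1.47.2"
  }
}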

Managing Test Data and State

Test data management makes or breaks end-to-end testing reliability. Your tests need realistic data to validate behavior accurately, but managing this data across test runs and environments requires careful strategy.

The seeding approach involves populating your database with a known dataset before test runs. This provides consistent, realistic data that tests can rely on. Create seed files that represent various scenarios—active users, expired subscriptions, pending orders, and edge cases. Your tests reference this known data, making assertions predictable.

// Database seeding strategy
class TestDataSeeder {
  async seedUserData() {
    return await database.transaction(async (trx) => {
      // Create test users with known IDs
      const users = await trx('users').insert([
        {
          id: 'test-user-001',
          email: 'active@example.com',
          status: 'active',
          created_at: new Date('2024-01-01')
        },
        {
          id: 'test-user-002',
          email: 'premium@example.com',
          status: 'active',
          subscription_tier: 'premium',
          created_at: new Date('2024-01-01')
        },
        {
          id: 'test-user-003',
          email: 'suspended@example.com',
          status: 'suspended',
          created_at: new Date('2024-01-01')
        }
      ]);

      return users;
    });
  }

  async seedProductCatalog() {
    return await database('products').insert([
      {
        id: 'prod-001',
        name: 'Test Product Alpha',
        price: 29.99,
        inventory: 100,
        active: true
      },
      {
        id: 'prod-002',
        name: 'Test Product Beta',
        price: 49.99,
        inventory: 0,
        active: true
      }
    ]);
  }

  async cleanupTestData() {
    await database('orders').where('user_id', 'like', 'test-user-%').del();
    await database('users').where('id', 'like', 'test-user-%').del();
    await database('products').where('id', 'like', 'prod-%').del();
  }
}

Factory patterns generate fresh test data dynamically for each test run. This approach eliminates conflicts from reused data and provides flexibility to create exactly the data each test needs. Libraries like Faker.js help generate realistic but random data.

// Factory-based test data generation
const { faker } = require('@faker-js/faker');
const { v4: uuidv4 } = require('uuid');

class UserFactory {
  static create(overrides = {}) {
    return {
      id: `user-${uuidv4()}`,
      email: `test-${Date.now()}-${Math.random()}@example.com`,
      username: faker.internet.userName(),
      firstName: faker.person.firstName(),
      lastName: faker.person.lastName(),
      createdAt: new Date(),
      status: 'active',
      ...overrides
    };
  }

  static createPremium() {
    return this.create({
      subscriptionTier: 'premium',
      subscriptionExpiry: new Date(Date.now() + 365 * 24 * 60 * 60 * 1000)
    });
  }

  static createSuspended() {
    return this.create({
      status: 'suspended',
      suspensionReason: 'Payment failed'
    });
  }
}

// Usage in tests
it('should display premium features for premium users', async () => {
  const premiumUser = UserFactory.createPremium();
  await database('users').insert(premiumUser);

  await loginAs(premiumUser.email);
  await page.goto('https://example.com/dashboard');

  await expect(page.locator('#premium-badge')).toBeVisible();
});

API-based data management uses your application's API to create test data rather than directly manipulating the database. This approach has the advantage of exercising your APIs during test setup and ensuring that your tests work with data created through legitimate channels.

// API-based test data setup
class TestDataAPI {
  constructor(baseURL, authToken) {
    this.baseURL = baseURL;
    this.authToken = authToken;
  }

  async createUser(userData) {
    const response = await fetch(`${this.baseURL}/api/users`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${this.authToken}`
      },
      body: JSON.stringify(userData)
    });

    return await response.json();
  }

  async createOrder(userId, orderData) {
    const response = await fetch(`${this.baseURL}/api/orders`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${this.authToken}`,
        'X-User-ID': userId
      },
      body: JSON.stringify(orderData)
    });

    return await response.json();
  }
}

State management between tests requires careful consideration. The isolation approach runs each test in complete isolation with fresh data and clean state. This maximizes reliability but can be slow when setup is expensive. The cleanup approach allows tests to share some state but carefully cleans up after each test. This trades some isolation for improved performance.

For applications with complex state machines, consider snapshot-based approaches where you restore the application to predefined states between tests. Database transactions can help here—start a transaction before each test and roll it back afterward, ensuring complete state reset without slow cleanup operations.
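
A sketch of that pattern using the same Knex-style database handle as the seeding example above; note it only covers writes made through the test's own connection, not writes the application makes through its own pool:

let trx;

beforeEach(async () => {
  // Open a transaction that every test-side write will go through
  trx = await database.transaction();
});

afterEach(async () => {
  // Roll back everything the test wrote, restoring the previous state
  await trx.rollback();
});

it('creates an order without leaking state', async () => {
  await trx('orders').insert({ id: 'order-tmp-001', user_id: 'test-user-001', total: 29.99 });
  const rows = await trx('orders').where({ id: 'order-tmp-001' });
  expect(rows).toHaveLength(1);
});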

Integrating E2E Tests into CI/CD Pipelines

End-to-end tests deliver maximum value when integrated into your continuous integration and deployment pipeline. However, their longer execution times and potential flakiness require thoughtful orchestration to avoid becoming bottlenecks.

Implement tiered test execution strategies. Not all tests need to run on every commit. Create a smoke suite containing critical path tests that run on every pull request—these might take 5-10 minutes and catch the most important regressions. Schedule comprehensive test suites to run nightly or before releases, where 30-60 minute execution times are acceptable.

# GitHub Actions example with tiered testing
name: E2E Testing Pipeline

on:
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * *' # Nightly at 2 AM

jobs:
  smoke-tests:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install dependencies
        run: npm ci
      - name: Run smoke tests
        run: npm run test:e2e:smoke

  full-test-suite:
    if: github.event_name == 'schedule'
    runs-on: ubuntu-latest
    strategy:
      matrix:
        browser: [chromium, firefox, webkit]
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v3
      - name: Install dependencies
        run: npm ci
      - name: Run E2E tests
        run: npm run test:e2e -- --browser=${{ matrix.browser }} --shard=${{ matrix.shard }}/4

Parallel execution dramatically reduces test runtime. Most modern testing frameworks support parallel test execution out of the box. Playwright and Cypress can run tests in parallel across multiple workers, cutting execution time proportionally. Configure parallelism based on available resources—local machines might use 2-4 workers, while CI servers with more CPU cores can handle 8-16 workers.

Sharding distributes tests across multiple CI machines, enabling even greater parallelization. Instead of running all tests on one machine with multiple workers, you split tests into shards that run on separate machines simultaneously. A 30-minute test suite split across 6 shards completes in about 5 minutes.

// Playwright sharding configuration
// playwright.config.js
module.exports = {
  workers: process.env.CI ? 4 : 2,
  shard: process.env.CI ? {
    current: parseInt(process.env.SHARD_INDEX),
    total: parseInt(process.env.TOTAL_SHARDS)
  } : null,

  use: {
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
    trace: 'retain-on-failure'
  }
};

Test result reporting must provide actionable information when failures occur. Generic "test failed" messages waste time forcing developers to dig through logs. Comprehensive reports should include failure screenshots, video recordings, console logs, network activity, and stack traces. Tools like Allure, ReportPortal, or Playwright's built-in HTML reporter provide detailed, browsable test results.

Implement retry logic judiciously. Retrying flaky tests can mask underlying issues, but sometimes transient failures occur despite best efforts. Configure your test runner to retry failed tests once or twice, but track retry rates closely. If tests consistently require retries, investigate and fix the root cause rather than relying on retries indefinitely.

// Playwright retry configuration with tracking
module.exports = {
  retries: process.env.CI ? 2 : 0,

  reporter: [
    ['html'],
    ['json', { outputFile: 'test-results.json' }],
    ['./custom-reporter.js'] // Track flaky tests
  ]
};

// custom-reporter.js
class FlakyTestTracker {
  onTestEnd(test, result) {
    if (result.retry > 0 && result.status === 'passed') {
      // Log flaky test for investigation
      console.warn(`Flaky test detected: ${test.title} (passed after ${result.retry} retries)`);

      // Could send to monitoring service
      this.reportFlakyTest({
        test: test.title,
        retries: result.retry,
        duration: result.duration
      });
    }
  }
}

Environment management ensures tests run consistently across local development and CI. Containerize your test environment using Docker to eliminate "works on my machine" problems. Include your application, database, and any dependent services in the container setup.

# docker-compose.yml for test environment
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=test
      - DATABASE_URL=postgresql://test:test@db:5432/testdb
    depends_on:
      - db
      - redis

  db:
    image: postgres:15
    environment:
      - POSTGRES_USER=test
      - POSTGRES_PASSWORD=test
      - POSTGRES_DB=testdb

  redis:
    image: redis:7-alpine

  e2e-tests:
    build:
      context: .
      dockerfile: Dockerfile.e2e
    depends_on:
      - app
    volumes:
      - ./test-results:/app/test-results
    environment:
      - BASE_URL=http://app:3000
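
The compose file references a Dockerfile.e2e; a minimal sketch might build on Microsoft's versioned Playwright image, which also keeps browser versions pinned (the tag is an example):

# Dockerfile.e2e (illustrative sketch)
FROM mcr.microsoft.com/playwright:v1.47.2-jammy

WORKDIR /app
COPY package*.json ./
RUN npm ci

COPY . .
CMD ["npx", "playwright", "test", "--reporter=html"]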

Monitoring and Maintaining Your Test Suite

End-to-end tests require ongoing maintenance as your application evolves. Treating test code as first-class code with the same quality standards as production code pays dividends in long-term maintainability.

Apply the Page Object Model pattern to encapsulate UI interactions and reduce duplication. When UI changes occur, you update the page object rather than hunting through dozens of test files.

// Page Object Model example
class LoginPage {
  constructor(page) {
    this.page = page;
    this.emailInput = '#email';
    this.passwordInput = '#password';
    this.submitButton = '#login-button';
    this.errorMessage = '.error-message';
  }

  async navigate() {
    await this.page.goto('https://example.com/login');
    await this.page.waitForLoadState('networkidle');
  }

  async login(email, password) {
    await this.page.fill(this.emailInput, email);
    await this.page.fill(this.passwordInput, password);
    await this.page.click(this.submitButton);
  }

  async getErrorMessage() {
    return await this.page.textContent(this.errorMessage);
  }

  async waitForRedirect() {
    await this.page.waitForURL('**/dashboard');
  }
}

// Usage in tests
it('should login successfully', async ({ page }) => {
  const loginPage = new LoginPage(page);
  await loginPage.navigate();
  await loginPage.login('user@example.com', 'password123');
  await loginPage.waitForRedirect();

  expect(page.url()).toContain('/dashboard');
});

End-to-end testing isn’t a “set it and forget it” activity — it’s a living system that evolves with your codebase. As new features roll out, existing ones change, and dependencies update, your test suite must adapt accordingly. Maintenance isn’t a cost; it’s an investment in long-term software quality.

Establish clear ownership for test maintenance within your engineering teams. Assign a Testing Champion or a QA Guild responsible for reviewing failing tests, managing flaky ones, and ensuring test coverage aligns with critical workflows. This shared responsibility prevents neglect and promotes a culture where testing is everyone’s job.

Use metrics to measure test suite health:

✅ Test pass rate — The percentage of tests passing consistently across runs.

⚙️ Flake rate — How many tests pass only after retries.

🕒 Execution time — The total time to run all tests; aim to keep it within reasonable limits.

📊 Coverage — Map tests to features or services, not just lines of code.

Integrate these metrics into your CI dashboards for visibility. Tools like Allure, Testmo, ReportPortal, or GitHub Actions summary reports can visualize these insights, helping teams identify unstable areas early.
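
As a starting point, a small script can derive pass and flake rates from the JSON report configured earlier (the exact report shape can vary between Playwright versions, so treat this as a sketch):

const fs = require('fs');

const report = JSON.parse(fs.readFileSync('test-results.json', 'utf8'));
let total = 0, passed = 0, flaky = 0;

function walk(suite) {
  for (const child of suite.suites || []) walk(child);
  for (const spec of suite.specs || []) {
    for (const test of spec.tests || []) {
      total += 1;
      const finalResult = test.results[test.results.length - 1];
      if (finalResult && finalResult.status === 'passed') {
        passed += 1;
        if (test.results.length > 1) flaky += 1; // only passed after retries
      }
    }
  }
}

for (const suite of report.suites || []) walk(suite);

console.log(`Pass rate:  ${((passed / total) * 100).toFixed(1)}%`);
console.log(`Flake rate: ${((flaky / total) * 100).toFixed(1)}%`);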

Version control your test data and environment configurations alongside your application code. This ensures that when you roll back to a previous commit, your corresponding test environment matches exactly, preserving reproducibility. For example, if your docker-compose.test.yml or Playwright config changes, it should evolve with your app, not independently.

Finally, regularly prune obsolete tests. Features get deprecated, flows evolve, and maintaining outdated tests wastes time. Schedule a quarterly “test hygiene” sprint to remove, refactor, and modernize your test suite. This small habit keeps your E2E tests lean and relevant.

Bringing It All Together

The twin dimensions of horizontal and vertical end-to-end testing form the backbone of resilient modern software systems.

Horizontal tests ensure that your architecture’s plumbing works — data flows correctly through APIs, databases, and services.

Vertical tests confirm that from a user’s point of view, everything just works — the journeys, the workflows, and the experience.

When used together, they close the gaps left by unit and integration testing, delivering confidence that your application performs reliably in the real world.

Modern tools like Playwright, Cypress, and TestCafe have made it easier than ever to simulate real user interactions while integrating tightly into CI/CD workflows. However, success with E2E testing isn’t just about tooling — it’s about strategy, discipline, and continuous improvement.

As software complexity grows, so does the cost of escaping bugs. But by combining horizontal system validation and vertical user validation, you can catch failures early, reduce regression risks, and ensure that every deployment feels like a safe one.

Final Thoughts

The most successful engineering teams treat E2E testing not as an afterthought, but as an essential pillar of delivery confidence. Whether you’re building microservices or full-stack monoliths, investing in horizontal and vertical testing pays off in fewer production incidents, smoother releases, and happier users.

After all, the best test suite isn’t the one that just passes — it’s the one that makes your team fearless to ship. 🚀

If you’re looking to simplify and supercharge your testing workflow — from API to E2E — try Keploy.
It automatically generates test cases and mocks from real user traffic, helping teams achieve higher coverage and zero-flake testing without writing a single line of test code.
