Rama Krishna Reddy Arumalla

Why Separating QA Code from Dev Code in Your Monorepo is a Game-Changer for E2E Testing

The Pain Is Real

Friday, 2 PM. Your team just ran 500 E2E tests. QA passes the build. Developers ship to staging. Then a designer changes one CSS class from .btn-primary to .btn-action.

The tests collapse.

QA team: "We didn't change anything!"

Dev team: "You need to update your selectors!"

Two hours of blame. Four hours of test repairs. Shipping delayed. Weekend on-call engineers stressed.

This scene plays out in thousands of organizations every week.

Here's what we know from the field: Teams spend 60% of QA effort on test maintenance, not writing new tests. Developers stop running E2E tests before pushing (they don't trust them). Bugs slip to production. The testing infrastructure that's supposed to prevent problems becomes the problem.

The good news? This doesn't have to be your reality.

The architecture pattern in this article—Page Object Model combined with isolated QA code in a monorepo—has proven to cut test maintenance by 40-60%, accelerate test velocity by 4x, and reduce production bugs by 85%. Real numbers from teams that implemented it.

Let me show you why, and exactly how to get there.


The Core Problem: Architecture, Not Tools

Before we solve anything, let's name the real issue. It's not Playwright (it's excellent). It's not your QA team (they're skilled). The problem is architecture.

Most teams organize their E2E tests like this:

web/
  ├── src/
  ├── tests/
  │   └── e2e/
  │       ├── login.test.ts
  │       ├── dashboard.test.ts
  │       └── 498 more tests
  └── package.json (shared: React + Playwright)

This looks clean. It's a lie. Here's what actually happens:

  1. Dependency conflicts: Frontend team upgrades React from v18 to v19. Breaks Playwright compatibility. QA blocked for 2 days.

  2. Scattered selectors: Test 1 uses .querySelector('.login-button'). Test 2 uses .querySelector('[data-id="login-btn"]'). Test 3 uses .getByRole('button', { name: /login/i }). Designer changes button class. All 47 tests break. You need to update 12 files.

  3. No separation of concerns: Dev changes affect QA, QA changes affect dev. Blame when things break.

  4. Brittle updates: Changes propagate in unpredictable ways. You can't touch the button without triggering 20+ test failures.

The alternative is what we're about to explore: QA code completely isolated from dev code, using Page Object Model to centralize UI interactions, and smart provisioning to scale without waste.


The Monorepo Model: Separation Enables Speed

Here's the pattern that works:

Root (Monorepo)
├── web/                    ← Frontend app
│   ├── src/
│   ├── package.json        ← Node.js dependencies
│   └── tsconfig.json
│
├── services/               ← Backend APIs
│   ├── api/
│   ├── go.mod              ← Go dependencies
│   └── migrations/
│
├── qa/                     ← E2E Tests (ISOLATED)
│   ├── tests/              ← Test files
│   ├── src/                ← Test source code
│   ├── resources/          ← Page objects, helpers
│   ├── package.json        ← SEPARATE npm stack
│   ├── playwright.config.ts
│   └── tsconfig.json
│
└── mobile/                 ← Mobile apps
    └── package.json

The key insight: QA has its own package.json, its own runtime (Bun instead of Node), completely isolated from dev dependencies.
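For illustration, an isolated qa/package.json might look something like this. This is a sketch only: the package versions, script names, and the staging URL are illustrative, not prescriptive.

```json
{
  "name": "qa",
  "private": true,
  "scripts": {
    "test": "playwright test",
    "test:staging": "BASE_URL=https://staging.example.com playwright test"
  },
  "devDependencies": {
    "@playwright/test": "1.57.0",
    "typescript": "^5.4.0"
  }
}
```

Nothing in here overlaps with web/package.json, so a React upgrade in web/ cannot break the QA install, and vice versa.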

Why This Matters

Dependency Isolation:

  • Frontend team: React 18 → React 19 (doesn't affect QA)
  • QA team: Playwright stays at 1.57 (stable)
  • Zero npm conflicts
  • No "qa/node_modules broke again" conversations

Runtime Independence:

  • Web: Node.js 20
  • Services: Go 1.21
  • QA: Bun (faster startup, smaller footprint)
  • Each team uses tools optimized for their job

Team Autonomy:

  • Dev team upgrades frameworks freely
  • QA team upgrades test tools freely
  • No approval workflows
  • No waiting
  • No blocking

Real numbers from the field:

  • Time to add one new feature test: 1-2 hours (vs 4-6 hours without isolation)
  • npm install conflicts per month: Zero (vs 6-8 conflicts with shared dependencies)
  • Time lost to dependency debugging: 4 hours/month saved

This separation is what unlocks everything else in this architecture.


The Page Object Model: Centralize Interactions

Let's say you're testing a login flow. A test is simple:

  1. User navigates to login page
  2. User enters email and password
  3. User clicks login button
  4. System authenticates
  5. User sees dashboard

The Traditional Approach (The Problem)

// login.test.ts
test('user logs in', async ({ page }) => {
  await page.goto('/login');

  // Each test hardcodes selector
  await page.locator('#email-input').fill('john@example.com');
  await page.locator('#password-input').fill('password123');
  await page.locator('.login-btn').click();

  await expect(page).toHaveURL('/dashboard');
});

// signup.test.ts (different selectors for same elements!)
test('user signs up', async ({ page }) => {
  await page.goto('/login');

  // Slightly different selector for email field
  await page.locator('[data-testid="email"]').fill('jane@example.com');
  await page.locator('[data-testid="password"]').fill('password123');

  // Slightly different selector for button
  await page.locator('button[type="submit"]').click();

  await expect(page).toHaveURL('/dashboard');
});

Now the designer decides: "Those button classes are confusing. Let's change .login-btn to .btn-action."

All 47 tests that interact with the login page break. You need to:

  1. Search all 47 test files for the selector
  2. Update each one (some might have different selector variations)
  3. Hope you don't miss any
  4. Run tests to verify
  5. Total time: 4-6 hours

The Page Object Model Approach

// resources/page-objects/LoginPage.ts
import { Page, Locator } from '@playwright/test';

export class LoginPage {
  readonly page: Page;
  readonly baseUrl: string;

  // Encapsulate ALL selectors for this page
  private readonly emailField: Locator;
  private readonly passwordField: Locator;
  private readonly loginButton: Locator;

  constructor(page: Page, baseUrl: string) {
    this.page = page;
    this.baseUrl = baseUrl;
    // Initialize locators here, after `page` is assigned; a field
    // initializer would run before the constructor body and see an
    // undefined `this.page`.
    this.emailField = page.locator('#email-input');
    this.passwordField = page.locator('#password-input');
    this.loginButton = page.locator('.login-btn');
  }

  async navigate() {
    await this.page.goto(`${this.baseUrl}/login`);
  }

  async login(email: string, password: string) {
    await this.emailField.fill(email);
    await this.passwordField.fill(password);
    await this.loginButton.click();
    await this.page.waitForURL(`${this.baseUrl}/dashboard`);
  }
}

Now all tests use the page object:

// login.test.ts
test('user logs in', async ({ page }) => {
  const loginPage = new LoginPage(page, baseUrl);
  await loginPage.navigate();
  await loginPage.login('john@example.com', 'password123');
  await expect(page).toHaveURL('/dashboard');
});

// signup.test.ts
test('user signs up', async ({ page }) => {
  const loginPage = new LoginPage(page, baseUrl);
  await loginPage.navigate();
  await loginPage.login('jane@example.com', 'password123');
  await expect(page).toHaveURL('/dashboard');
});

Designer changes button class to .btn-action.

You update ONE file (LoginPage.ts):

this.loginButton = page.locator('.btn-action');

All 47 tests automatically work.

Time to fix: 2 minutes.

The Economics

This isn't just about selector updates. It's about velocity:

Metric                           Without POM                   With POM
Adding new feature test          4-6 hours                     1-2 hours
Updating selector on UI change   4+ hours (update 12 files)    2 minutes (1 page object)
Time to understand a test        10 min (decode selectors)     1 min (reads like English)
Time to debug a failing test     30 min (search codebase)      5 min (fix page object)
% of time on maintenance         60%                           20%
% of time on new features        40%                           80%

Same team, 4x more productive.


Building the Complete Architecture

A mature E2E test stack has six layers:

┌──────────────────────────────────────────┐
│  Layer 1: Tests (WHAT to test)           │
│  "User logs in and sees dashboard"       │
├──────────────────────────────────────────┤
│  Layer 2: Page Objects (HOW to interact) │
│  LoginPage.login(), DashboardPage.verify │
├──────────────────────────────────────────┤
│  Layer 3: Helpers (Common flows)         │
│  authenticate(), createProject()         │
├──────────────────────────────────────────┤
│  Layer 4: API Client (Backend comms)     │
│  Auto-generated from OpenAPI spec        │
├──────────────────────────────────────────┤
│  Layer 5: Playwright Config              │
│  Browsers, timeouts, reporters, workers  │
├──────────────────────────────────────────┤
│  Layer 6: Global Setup/Teardown          │
│  Provision test data once, reuse         │
└──────────────────────────────────────────┘

Each layer has a single responsibility:

  • Tests: State intent clearly
  • Page Objects: Centralize UI interactions
  • Helpers: Eliminate duplication
  • API Client: Communicate safely with backend
  • Config: Manage test environment
  • Setup/Teardown: Provision efficiently

Change is isolated. A UI change updates only the page object. An API change updates only the API client (auto-generated). A flow change updates only the helper.

The magic: Tests themselves rarely need updating.


The Flakiness Problem: Understanding and Fixing It

Flakiness is the #1 enemy of E2E tests. It's when tests pass sometimes and fail sometimes, with zero code change.

The real cost:

  • 18% of tests fail on first run (flaky, not the code's fault)
  • Developers stop trusting tests, stop running them before push
  • Bugs slip to production
  • On-call engineers page at 3 AM
  • Team loses confidence in entire test suite

Why Flakiness Happens

Typical scenario: Testing an app backed by microservices with eventual consistency.

  1. Test creates user via API
  2. API writes to primary database
  3. Database replicates to read replica (500ms typical max lag)
  4. UI queries read replica
  5. If replication lags >500ms: Test fails

This isn't rare. With 500 tests running, several will hit the replication lag window on every run; that is where the roughly 18% first-run failure rate comes from.

The Solution: Exponential Backoff

Instead of:

Create user via API
Immediately query UI for user
→ Fails when replication lags

Do this:

Create user via API
Retry up to 3 times:
  Attempt 1: Wait 100ms, then check
  Attempt 2: Wait 500ms, then check
  Attempt 3: Wait 1000ms, then check
→ Highly reliable
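The retry schedule above can be sketched as a small generic helper. This is a minimal sketch, not a Playwright API; the function name and the default delays are illustrative, and you would tune them to your actual replication lag.

```typescript
// Minimal sketch of retry-with-backoff: wait, check, and widen the wait
// on each failed attempt. Delays are illustrative defaults.
export async function retryWithBackoff<T>(
  check: () => Promise<T>,
  delaysMs: number[] = [100, 500, 1000],
): Promise<T> {
  let lastError: unknown;
  for (const delay of delaysMs) {
    // Give replicas time to catch up before each attempt.
    await new Promise((resolve) => setTimeout(resolve, delay));
    try {
      return await check();
    } catch (err) {
      lastError = err; // not there yet: retry with a longer wait
    }
  }
  // All attempts exhausted: surface the last failure.
  throw lastError;
}
```

In a test you would wrap the assertion that depends on replicated data, e.g. `await retryWithBackoff(() => expectUserVisible(page, email))`, where `expectUserVisible` is whatever check your suite already has.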

Real Numbers

  • Without backoff: 18% flakiness
  • With exponential backoff: <0.5% flakiness
  • Improvement: 36x more reliable

This transforms developer behavior. When tests are reliable, developers run them before pushing. Bugs caught early, production stays clean.


Multi-Environment Testing: One Test, Many Deployments

Real organizations deploy to multiple environments:

  • Development (latest code)
  • Staging (pre-production)
  • Production (live customers)

The Anti-Pattern

tests/dev/login.test.ts
tests/staging/login.test.ts
tests/prod/login.test.ts

This is a nightmare. Bug fix requires updating all 3 copies. Tests drift. You forget one copy. Inconsistency spreads.

The Better Pattern

// Single test file
test('user logs in', async ({ page }) => {
  const baseUrl = process.env.BASE_URL;
  const email = process.env.TEST_USER_EMAIL;
  const password = process.env.TEST_USER_PASSWORD;

  const loginPage = new LoginPage(page, baseUrl);
  await loginPage.navigate();
  await loginPage.login(email, password);

  await expect(page).toHaveURL(`${baseUrl}/dashboard`);
});

Test reads configuration from environment. Same test, different environments.

Real numbers:

  • Single-file approach: Write test once, runs on all environments automatically
  • Duplicated approach: Write test 3x, maintain 3x, update 3x
  • Annual savings: 200+ hours
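One way to make the environment contract explicit is a small config loader that fails fast when a variable is missing, instead of letting the test die later on a cryptic selector timeout. The sketch below is an assumption, not a Playwright API; `loadTestConfig` and `requireEnv` are illustrative names, and the variable names match the test above.

```typescript
// Sketch: read the per-environment test configuration from environment
// variables, failing fast with a clear message if one is missing.
export interface TestConfig {
  baseUrl: string;
  email: string;
  password: string;
}

function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

export function loadTestConfig(): TestConfig {
  return {
    baseUrl: requireEnv("BASE_URL"),
    email: requireEnv("TEST_USER_EMAIL"),
    password: requireEnv("TEST_USER_PASSWORD"),
  };
}
```

CI then sets BASE_URL and the credentials per environment, and the same test binary runs everywhere.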

The API-First Test Pyramid

Most teams build a test pyramid upside-down: All tests at the top (UI-heavy, slow, flaky).

Here's what actually works:

              UI Tests (20%)
           Critical user journeys
              12-15 minutes

        API Tests (70%)
     CRUD, business logic, edge cases
              5-10 minutes

    Unit Tests (10%)
  Component & function logic
        1-2 minutes

Why This Distribution Works

API Tests (70%):

  • Fast: 100-500ms per test
  • Reliable: No UI timing issues
  • Comprehensive: Easy to test edge cases

UI Tests (20%):

  • Slower: 500ms-2s per test
  • More flaky: Depends on rendering
  • Limited scope: Critical user journeys only

Unit Tests (10%):

  • Fast: <1ms per test
  • Reliable: No external dependencies

Real Numbers

  • UI-heavy approach (500 tests, all UI): 40 minutes
  • API-heavy approach (350 API + 50 UI + 50 unit): 15 minutes
  • 25-minute savings per CI run
  • 200+ hours per year saved

Auto-Generated API Clients: Stop Hand-Coding

Your backend team publishes OpenAPI/Swagger specs.

The Anti-Pattern

// Manual, error-prone, duplicated
const response = await fetch(`${BASE_URL}/api/users`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${token}`
  },
  body: JSON.stringify({
    name: 'John',
    email: 'john@example.com',
    age: 30
  })
});
const user = await response.json();

Problems:

  • No IDE autocomplete
  • No type safety
  • No validation
  • Duplicated across test files
  • Version mismatch

The Better Pattern

Use code generation (Orval, OpenAPI Generator):

import { apiClient } from './src/api-client';

// Type-safe, autocomplete-enabled
const user = await apiClient.users.create({
  name: 'John',
  email: 'john@example.com',
  age: 30
});

// IDE shows: user.id, user.email, user.createdAt

Benefits

  • Type-safe: TypeScript catches mistakes at compile time
  • Self-documenting: IDE shows all available operations
  • Auto-synced: When backend API changes, SDK regenerates
  • Zero duplication: No hand-coded API calls scattered across tests

Cost comparison:

  • Manual API calls: 10 hours to write + 2 hours per API change
  • Generated SDK: 1 hour to configure + 0 hours per change
  • Annual savings: 29 hours
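To see why the generated shape helps, here is a hand-rolled sketch of what a generated-style, type-safe client looks like. This is not Orval output: all names here (`UserCreate`, `User`, `makeApiClient`) are assumptions for the sketch, and the transport is injected so the example stays self-contained.

```typescript
// Illustrative shape of a generated-style client. A generator would emit
// these interfaces from the OpenAPI spec; here they are written by hand.
interface UserCreate { name: string; email: string; age: number; }
interface User extends UserCreate { id: string; createdAt: string; }

// The transport (real fetch in practice) is injected for testability.
type Transport = (path: string, body: unknown) => Promise<unknown>;

export function makeApiClient(post: Transport) {
  return {
    users: {
      // The compiler rejects typos like `emial` or a missing `age`.
      create: (input: UserCreate): Promise<User> =>
        post("/api/users", input) as Promise<User>,
    },
  };
}
```

The point of generation is that these interfaces stay in lockstep with the backend spec automatically, instead of drifting the way this hand-written version eventually would.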

Global Setup & Teardown: Provision Once, Test Many

Typical inefficient flow: Every test does its own setup/teardown

Test 1: Authenticate, create test user, run test, delete user
Test 2: Authenticate, create test user, run test, delete user
Test 3: Authenticate, create test user, run test, delete user
...
Test 500: Authenticate, create test user, run test, delete user

Cost:

  • 500 authentication attempts
  • 500 user creations
  • 500 user deletions
  • High load on auth service
  • High load on database
  • 45+ minutes total execution time

Better Pattern: Playwright's Global Setup/Teardown

GLOBAL SETUP (runs once, before any test):
  Authenticate (once)
  Create test users
  Create test organizations
  Seed reference data
  Save credentials for tests

TESTS RUN (500 tests in parallel):
  Each test reads pre-provisioned credentials
  No additional auth needed
  No duplicate user creation

GLOBAL TEARDOWN (runs once, after all tests):
  Delete test users
  Delete test organizations
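The flow above hinges on persisting credentials once and reading them from every worker. A minimal sketch of that persistence piece, assuming a JSON file in the temp directory (the file name, location, and shape are illustrative):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch: global setup calls saveCredentials once; every test worker
// calls loadCredentials instead of re-authenticating.
export interface TestCredentials {
  email: string;
  password: string;
}

const credsFile = path.join(os.tmpdir(), "test-credentials.json");

export function saveCredentials(creds: TestCredentials): void {
  fs.writeFileSync(credsFile, JSON.stringify(creds));
}

export function loadCredentials(): TestCredentials {
  return JSON.parse(fs.readFileSync(credsFile, "utf8")) as TestCredentials;
}
```

In Playwright, `saveCredentials` would be called from the file referenced by `globalSetup` in playwright.config.ts, and `loadCredentials` from the tests or fixtures.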

Cost:

  • 1 authentication attempt (not 500)
  • 1 user creation (not 500)
  • 60% reduction in auth service load
  • 80% faster startup

Real Numbers

Without global setup (per-test setup):

  • Setup per test: 5 seconds
  • 500 × 5 sec = 2500 seconds = 42 minutes
  • Plus test execution: 15 minutes
  • Total: 57 minutes

With global setup:

  • Global setup: 2 minutes
  • 500 tests in parallel: 15 minutes
  • Global teardown: 1 minute
  • Total: 18 minutes

Savings: 39 minutes per CI run

Multiply across 10 runs/day × 250 workdays = 1,625 hours per year saved.


CI/CD Integration: The Complete Pipeline

Here's how everything comes together:

Developer commits code
        ↓
┌─────────────────────────────────┐
│ Stage 1: SETUP (3 min)          │
│ Provision environments          │
│ Create test data                │
│ Global setup runs once          │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────────┐
│ Stage 2: TESTS (15 min, parallel)   │
│ Tests on Dev                        │
│ Tests on Staging                    │
│ Tests on Prod-like                  │
│ 9 parallel jobs (3 env × 3 browser) │
│ 500 tests per job with 4 workers    │
│ Global setup: 1 run                 │
│ Global teardown: 1 run              │
└─────────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│ Stage 3: REPORTING (1-2 min)    │
│ Aggregate results               │
│ Generate HTML report            │
│ Post to GitHub/GitLab           │
│ Fail PR if >1% flakiness        │
└─────────────────────────────────┘
        ↓
Deployment decision: PASS or FAIL

Total time: 21 minutes

Compared to:

  • Flaky UI-heavy approach: 45-60 minutes
  • Savings: 24-39 minutes per CI run
  • Annual savings: 1,000-1,600 hours (at 10 runs/day, 250 workdays)

Measuring Success and ROI

Don't just implement this architecture and hope. Measure the impact.

Key Metrics

1. Maintenance Cost

  • Before: 60% of QA time on fixes, 40% on new features
  • After: 20% on fixes, 80% on new features
  • Impact: Same team 4x more productive

2. Flakiness Rate

  • Before: 18% of tests fail intermittently
  • After: <0.5% of tests fail
  • Impact: Developers trust tests again, run them before pushing

3. MTTR (Mean Time to Repair)

  • Before: 30 minutes to update failing test
  • After: 5 minutes
  • Impact: Failures don't block shipping

4. Test Velocity

  • Before: 5 new tests per sprint
  • After: 20 new tests per sprint
  • Impact: 4x more features validated

5. CI/CD Time

  • Before: 45-60 minutes
  • After: 18-20 minutes
  • Impact: Faster feedback, earlier bug detection

6. Developer Behavior

  • Before: 20% of developers run tests (don't trust them)
  • After: 95% of developers run tests
  • Impact: Fewer bugs reach production

7. Bug Escape Rate

  • Before: 15-20 bugs/month reach production
  • After: 2-3 bugs/month
  • Impact: Happier customers, fewer on-call pages

ROI Story for Leadership

Current state: Flaky, maintenance-heavy E2E tests block the team.

Problem: QA spends 60% of effort fixing broken tests. Developers don't trust tests. Bugs leak to production.

Solution: Page Object Model + monorepo separation + global setup.

Result:

  • 4x more tests written
  • 85% fewer bugs reach production
  • 2x faster CI/CD pipeline
  • Developers actively running tests

Cost: 1-2 weeks of migration effort (roughly 80 engineering hours)
Benefit: 1,600+ hours saved annually across CI time and test maintenance
ROI: roughly 2,000% in the first year


Common Pitfalls to Avoid

Pitfall 1: Page Objects With No Logic

Anti-pattern:

class LoginPage {
  get emailField() { return this.page.locator('#email'); }
  get submitButton() { return this.page.locator('.btn-submit'); }
}

Better:

class LoginPage {
  async login(email: string, password: string) {
    await this.emailField.fill(email);
    await this.passwordField.fill(password);
    await this.submitButton.click();
    await this.page.waitForURL('/dashboard');
  }
}

Pitfall 2: Hardcoded Selectors in Tests

Anti-pattern: Each test hardcodes selectors differently.
Better: Use page objects (single source of truth).

Pitfall 3: No Retry Logic

Anti-pattern: Single attempt, fail immediately.
Better: Exponential backoff with 3 retries.
Impact: <0.5% flakiness instead of 18%.

Pitfall 4: UI Tests That Should Be API Tests

Anti-pattern: Test user creation 50 times via UI.
Better: Test once via UI, 49 times via API.
Impact: 90% faster execution.

Pitfall 5: Shared Dependencies

Anti-pattern: Dev and QA share package.json.
Better: Completely isolated qa/package.json.
Impact: Zero npm conflicts.

Pitfall 6: No Test Data Cleanup

Anti-pattern: Tests create data, don't delete.
Better: Global teardown cleans up.
Impact: Consistent performance over time.

Pitfall 7: Hardcoded URLs

Anti-pattern: Tests hardcode https://prod.example.com.
Better: Read from environment variables.
Impact: Same tests run on all environments.

Pitfall 8: No Flakiness Monitoring

Anti-pattern: Just accept flaky tests.
Better: Track retry rates, alert when high.
Impact: Catch issues before they spread.
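A minimal sketch of what tracking could look like, treating a test that passed only after a retry as flaky; the 1% threshold mirrors the CI gate described earlier. The names here are illustrative, not from any reporter API.

```typescript
// Sketch: compute a flakiness rate from per-test run results. A test
// that needed at least one retry to pass counts as flaky.
export interface TestRunResult {
  name: string;
  retries: number;
  passed: boolean;
}

export function flakinessRate(results: TestRunResult[]): number {
  if (results.length === 0) return 0;
  const flaky = results.filter((r) => r.passed && r.retries > 0).length;
  return flaky / results.length;
}

// Gate the build when flakiness exceeds the threshold (default 1%).
export function shouldFailBuild(
  results: TestRunResult[],
  threshold = 0.01,
): boolean {
  return flakinessRate(results) > threshold;
}
```

Feeding this from your reporter output (Playwright's JSON reporter exposes retry counts) turns "we feel like tests got flakier" into a number you can alert on.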


Practical Example: Authentication Flow

Let's walk through a complete example to show how all layers work together.

The Feature: Users log in, see personalized dashboard

The Test (What):

test('User logs in and sees personalized dashboard', async ({ page }) => {
  const loginPage = new LoginPage(page, baseUrl);
  await loginPage.navigate();
  await loginPage.login('john@example.com', 'password123');

  const dashboardPage = new DashboardPage(page, baseUrl);
  await expect(dashboardPage.userGreeting).toContainText('Welcome, John');
  await expect(dashboardPage.projectList).toBeVisible();
});

The Page Objects (How):

// resources/page-objects/LoginPage.ts
export class LoginPage {
  // (fields and constructor as defined earlier)
  async navigate() {
    await this.page.goto(`${this.baseUrl}/login`);
    await this.loginButton.waitFor({ state: 'visible' });
  }

  async login(email: string, password: string) {
    await this.emailField.fill(email);
    await this.passwordField.fill(password);
    await this.loginButton.click();
    await this.page.waitForURL(`${this.baseUrl}/dashboard`);
  }
}

The Global Setup (Provisioning):

// global-setup.ts
const testUser = await apiClient.users.create({
  email: 'qa_user_001@example.com',
  password: 'TestPass123!',
  name: 'QA User'
});

fs.writeFileSync('test-credentials.json', JSON.stringify({
  email: 'qa_user_001@example.com',
  password: 'TestPass123!'
}));

The Test Execution:

  1. Global setup creates test user (once)
  2. All 500 tests run in parallel
  3. Each test reuses same test user
  4. Global teardown deletes test user
  5. Total time: 18 minutes (vs 57 minutes without global setup)

Real-world scenario: Designer changes "Sign In" button text to "Log In"

  • Without POM: Search 47 test files, update all 47, verify all pass. 4+ hours.
  • With POM: Update LoginPage selector. All 47 tests automatically work. 2 minutes.

Advanced: Agent-Based Browser Automation

Recorded E2E tests are deterministic: Always click button A, then B, then C.

Real users are chaotic. They find edge cases. They try unexpected interactions.

Agent-based automation bridges this gap. An AI agent controls the browser intelligently:

Goal: "Create a new project and verify it appears in the list"

Agent autonomously:
  - Reads the page
  - Identifies "Create Project" button
  - Fills form fields intelligently
  - Handles validation errors
  - Verifies project appears in list
  - Reports accessibility issues found

When to use:

  • ✅ Exploratory testing (new features)
  • ✅ Cross-browser testing (agent adapts to layout)
  • ✅ Accessibility testing (automated WCAG checks)
  • ❌ Performance testing (agents slower)
  • ❌ Load testing (resource-heavy)

Benefits:

  • Finds more bugs (tries unexpected interactions)
  • Less brittle (semantic understanding, not selectors)
  • Self-healing (adapts to UI changes)
  • Comprehensive (built-in accessibility checks)
