A developer pushes a button color change. CI triggers. Your full suite runs, including the complete checkout flow, destructive state tests, and edge cases that have nothing to do with the change. 45 minutes later, green check. Merge. Nobody questioned it.
The damage isn't just time. It's trust. Developers start ignoring CI results when feedback takes 45 minutes on a two-line change. Flaky failures get dismissed. Real failures get missed.
Test tagging fixes this.
BEFORE: No tags. Everything runs on every trigger.
test('add to cart', async ({ page }) => { /* ... */ });
test('complete purchase', async ({ page }) => { /* ... */ });
test('request refund', async ({ page }) => { /* ... */ });
AFTER: Tagged. Filtered by when and where they should run.
import { test } from '@playwright/test';
// The `tag` option in the details object requires Playwright 1.42 or later.
test('add to cart', { tag: '@smoke' }, async ({ page }) => { /* ... */ });
test('complete purchase', { tag: '@regression' }, async ({ page }) => { /* ... */ });
test('request refund', { tag: '@destructive' }, async ({ page }) => { /* ... */ });
// PR pipeline: npx playwright test --grep @smoke → 4 min
// Nightly: npx playwright test --grep @regression → 45 min
// Isolated: npx playwright test --grep @destructive → runs alone
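@smoke: 20-30 critical tests. Every PR. Under 5 minutes.
@regression: The broad suite. Nightly. Deep coverage without blocking anyone. (As tagged above, --grep @regression skips the @smoke tests; use --grep-invert @destructive if the nightly run should cover both.)
@destructive: Modifies shared state. Isolated run. Doesn't pollute other results.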
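One way to keep the tier-to-pipeline mapping in a single place is a small helper that builds the CLI command for each pipeline. This is a minimal sketch, not a prescribed setup: the pipeline names (`pr`, `nightly`, `destructive`) and the helper `playwrightCommand` are hypothetical, and reading "runs alone" as a separate job with `--workers=1` is an assumption. The `--grep` and `--workers` flags themselves are standard Playwright CLI options.

```typescript
// Hypothetical pipeline names; the tags are the ones used in this post.
type Pipeline = 'pr' | 'nightly' | 'destructive';

// Extra arguments passed to `npx playwright test` for each pipeline.
const PIPELINE_ARGS: Record<Pipeline, string[]> = {
  pr: ['--grep', '@smoke'],
  nightly: ['--grep', '@regression'],
  // Assumption: "isolated" means its own job, one test at a time.
  destructive: ['--grep', '@destructive', '--workers=1'],
};

// Build the full command line a CI job would execute.
function playwrightCommand(pipeline: Pipeline): string {
  return ['npx', 'playwright', 'test', ...PIPELINE_ARGS[pipeline]].join(' ');
}

// playwrightCommand('pr') returns "npx playwright test --grep @smoke"
```

Keeping the mapping in one table means a new tier or pipeline is a one-line change, and the CI config only has to pass a pipeline name.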
45 minutes down to 4. No infrastructure changes. Just tags.
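Tags only pay off if every test carries one. A minimal sketch of a check that could back that up, assuming the three tiers are meant to be mutually exclusive (the post doesn't say so explicitly). The helper `checkTierTag` is hypothetical. In a real suite the tag list might come from `testInfo.tags` in recent Playwright versions, but treat that wiring as an assumption too.

```typescript
// The three tiers from this post. Treating them as exclusive is an assumption.
const TIERS = ['@smoke', '@regression', '@destructive'] as const;

// Returns an error message, or null when the test has exactly one tier tag.
// Tags outside TIERS (for example a feature tag) are ignored.
function checkTierTag(title: string, tags: string[]): string | null {
  const tiers = tags.filter((t) => TIERS.some((tier) => tier === t));
  if (tiers.length === 1) return null;
  if (tiers.length === 0) {
    return `"${title}" has no tier tag (expected one of ${TIERS.join(', ')})`;
  }
  return `"${title}" has conflicting tier tags: ${tiers.join(', ')}`;
}
```

Run over the whole suite, a non-null result could fail the build, so an untagged test never silently drops out of every pipeline.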
Do you enforce a tagging strategy on your team, or is it a free-for-all? Drop it in the comments.