This article is a summary of the presentation delivered at FEConf2024. The presentation is published in two parts. Part 1 explores E2E testing, the tools that assist with it, and methods for building efficient test code by reducing maintenance costs. Part 2 focuses on test code reuse, modularization, and improvements made to Playwright. All images included in this article are sourced from the presentation materials of the same title and are not cited individually.
Dreaming of Easy and Convenient E2E Test Automation
By Buseok Baek, CTO at Stibee
- Dreaming of Easy and Convenient E2E Test Automation - Part 1
- Dreaming of Easy and Convenient E2E Test Automation - Part 2
In the previous article, we explored E2E testing and the tools that facilitate it. We also examined how to write efficient test code while navigating authentication workflows. In this second part, we'll cover the final stages of E2E testing: modularization, reusability, and enhancements to Playwright.
Modularization and Reusability
What does modularization mean in test automation?
For instance, login sequences are frequently reused across test cases. Ideally, we could reuse them as-is, but in practice, developers must often refactor and abstract such code into reusable modules. While common, this approach falls short of complete automation.
A Custom Test Automation Tool
To achieve complete automation, I evaluated various tools—but none satisfied my requirements. Therefore, I built a custom solution using Electron.
In the tool, test cases are listed in the left panel, each linked to Playwright. Clicking the record button captures user actions and automatically generates the corresponding test code.
This code is then processed to extract variables—user inputs such as email or password—which are displayed on the right panel.
These inputs are parameterized, eliminating the need for additional developer work when reusing the module elsewhere.
This process leverages AST (Abstract Syntax Tree), which represents code as a hierarchical tree structure. In this tool, AST manipulation is employed to identify and extract values for modularization.
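The presentation describes doing this extraction via AST manipulation. As a simplified illustration of the idea (using string matching rather than a real AST pass, and a made-up `parameterizeFills` helper), recorded `page.fill()` literals can be lifted into named parameters:

```typescript
// Simplified sketch: pull user-entered literals out of generated
// Playwright code and replace them with named parameters. A real
// implementation would walk a proper AST (e.g. with the TypeScript
// compiler API); this regex version only illustrates the concept.

interface ExtractedModule {
  code: string;                    // parameterized test code
  params: Record<string, string>;  // default values from the recording
}

function parameterizeFills(source: string): ExtractedModule {
  const params: Record<string, string> = {};
  let index = 0;
  // Match calls like: page.fill('#email', 'user@example.com')
  const code = source.replace(
    /page\.fill\((['"][^'"]+['"]),\s*['"]([^'"]*)['"]\)/g,
    (_match, selector: string, value: string) => {
      const name = `param${index++}`;
      params[name] = value;
      return `page.fill(${selector}, params.${name})`;
    },
  );
  return { code, params };
}

// A recorded login snippet becomes a reusable, parameterized module.
const recorded = `await page.fill('#email', 'user@example.com');
await page.fill('#password', 'secret');`;

const mod = parameterizeFills(recorded);
console.log(mod.params); // { param0: 'user@example.com', param1: 'secret' }
```

The extracted `params` object is what the tool's right-hand panel would display for editing before reuse.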
To use a modularized login flow, you can simply select it from a list and prepend it to any new test case. The entire login sequence becomes reusable with minimal effort.
By visually chaining modules together and recording interactions, developers can generate comprehensive test scenarios without code duplication or rewriting.
Sustainable Test Code
As mentioned in Part 1, creating a sustainable test environment is crucial. Static input values can cause failures in subsequent test executions. For example, if a test consistently creates an address book named "FEConf 2024 Address Book," it will pass initially but fail thereafter due to duplicate key constraints.
To address this issue, I implemented Faker to randomize input values—ensuring each test execution uses unique titles or identifiers, thereby preventing conflicts and reducing maintenance overhead.
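The article uses the Faker library for this; a minimal stand-in (a hypothetical `uniqueName` helper combining a timestamp and a random suffix) shows the same idea of making repeated runs collision-free:

```typescript
// Minimal stand-in for Faker-style randomization: append a
// timestamp + random suffix so repeated test runs never create
// a record with the same name twice.
function uniqueName(base: string): string {
  const suffix = `${Date.now().toString(36)}-${Math.random()
    .toString(36)
    .slice(2, 8)}`;
  return `${base} ${suffix}`;
}

const first = uniqueName("FEConf 2024 Address Book");
const second = uniqueName("FEConf 2024 Address Book");
console.log(first !== second); // true — no duplicate-key failures
```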
Additionally, traditional test runners discourage using external variables for the sake of test independence. But I re-evaluated this constraint: as long as test outcomes remain stable, why not reuse result data?
For example, after creating an address book, its unique ID can be stored locally and used in subsequent tests. This enables seamless test chaining—passing data from Test A to Test B without friction.
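One way to sketch this chaining (the storage format and helper names here are assumptions, not the tool's actual implementation) is a small JSON file that one test writes and the next reads:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical key-value store for chaining tests: Test A saves the
// ID of the address book it created; Test B reads it back instead of
// re-creating the resource.
const storeFile = path.join(os.tmpdir(), "e2e-chain-store.json");

function saveResult(key: string, value: string): void {
  const data: Record<string, string> = fs.existsSync(storeFile)
    ? JSON.parse(fs.readFileSync(storeFile, "utf8"))
    : {};
  data[key] = value;
  fs.writeFileSync(storeFile, JSON.stringify(data));
}

function loadResult(key: string): string | undefined {
  if (!fs.existsSync(storeFile)) return undefined;
  return JSON.parse(fs.readFileSync(storeFile, "utf8"))[key];
}

// Test A stores the ID it received after creating the address book...
saveResult("addressBookId", "ab_12345");
// ...and Test B picks it up later.
console.log(loadResult("addressBookId")); // "ab_12345"
```

This trades strict test independence for speed, which is safe only as long as the stored data remains valid between runs.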
Improving Playwright
Even with this modular tool, about 10% of the test code still required manual edits. Having modified Selenium and Puppeteer in the past, I decided to enhance Playwright as well.
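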
Improved Result Reports
Conventional test reports are typically stored locally, making them difficult to share. I enhanced this by capturing comprehensive metadata about test failures—including which user action triggered the issue—and making reports accessible through shareable URLs.
Custom Selector Strategy
Playwright’s selectors are powerful, but I customized the logic that determines selector priority. For instance, if an input field includes a name attribute, that should take precedence, making tests more stable against text or placeholder changes.
By analyzing Playwright's internal architecture, I modified the selector engine to prioritize the name attribute when available, thereby enhancing resilience against UI copy modifications.
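The priority rule can be sketched as a small ranking function. This is an assumed simplification, not Playwright's actual internal selector engine:

```typescript
// Sketch of a selector-priority rule: given an element's attributes,
// pick the most stable selector, preferring `name` over placeholder
// or visible text (which change whenever UI copy changes).
interface ElementInfo {
  tag: string;
  name?: string;
  placeholder?: string;
  text?: string;
}

function preferredSelector(el: ElementInfo): string {
  if (el.name) return `${el.tag}[name="${el.name}"]`; // survives copy edits
  if (el.placeholder) return `${el.tag}[placeholder="${el.placeholder}"]`;
  if (el.text) return `${el.tag}:has-text("${el.text}")`;
  return el.tag;
}

console.log(
  preferredSelector({ tag: "input", name: "email", placeholder: "Enter email" }),
); // input[name="email"]
```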
Dictionary-Based UI Copy
In our service, UI copy is stored in Google Sheets as key-value pairs. Developers bind keys to the frontend; planners and designers can update text values directly in the spreadsheet.
This eliminates the need for designers to revise Figma files or for developers to redeploy just to change a label. It also ensures test selectors based on keys remain consistent across UI changes.
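As a rough sketch of the pattern (the dictionary here is a plain object rather than a Google Sheet, and the `data-copy-key` attribute is an assumed convention, not the service's actual one):

```typescript
// Key/value copy dictionary: developers bind keys in the frontend,
// planners edit the values. Test selectors target the key, so copy
// edits never break tests.
const copy: Record<string, string> = {
  "login.submit": "Sign in",
  "addressbook.create": "Create address book",
};

function label(key: string): string {
  return copy[key] ?? key; // fall back to the key if no value exists
}

function selectorFor(key: string): string {
  return `[data-copy-key="${key}"]`; // hypothetical binding attribute
}

console.log(label("login.submit"));       // "Sign in"
console.log(selectorFor("login.submit")); // [data-copy-key="login.submit"]
```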
Checkpoints for Test Recovery
If a test fails midway, restarting from the beginning is tedious. Using Playwright’s Session Storage API, I implemented checkpoints that allow the test to resume from the last stable state.
When a failure occurs, the tool automatically loads the saved session and resumes execution—saving time and debugging effort.
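The browser-state capture itself would use Playwright's session handling; the resume bookkeeping alone can be sketched as pure logic (the `Checkpoint` shape and `runWithCheckpoint` helper are assumptions for illustration):

```typescript
// Sketch of checkpoint resume logic: track the last step that
// completed, and on the next run skip straight past it. Restoring the
// actual browser session is out of scope here.
interface Checkpoint {
  lastCompletedStep: number;
}

type Step = () => void;

function runWithCheckpoint(steps: Step[], checkpoint: Checkpoint): Checkpoint {
  for (let i = checkpoint.lastCompletedStep + 1; i < steps.length; i++) {
    try {
      steps[i]();
      checkpoint.lastCompletedStep = i;
    } catch {
      return checkpoint; // caller persists session state and retries later
    }
  }
  return checkpoint;
}

const log: string[] = [];
let failOnce = true;
const steps: Step[] = [
  () => log.push("login"),
  () => log.push("open address book"),
  () => {
    if (failOnce) { failOnce = false; throw new Error("flaky step"); }
    log.push("create entry");
  },
];

let cp: Checkpoint = { lastCompletedStep: -1 };
cp = runWithCheckpoint(steps, cp); // fails at the third step
cp = runWithCheckpoint(steps, cp); // resumes there, not from the start
console.log(log.length); // all three steps completed across two runs
```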
Screenshot Comparison Enhancements
Animations and dynamic content can interfere with screenshot-based visual testing. For example, background effects or dynamic ads cause screenshots to differ on every run.
To address this, I added logic to pause CSS animations and hide unstable elements before capturing screenshots—reducing false positives in visual comparisons.
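One plausible shape for this (the selectors are made-up examples, and this is not the tool's actual code) is generating a stabilization stylesheet to inject before capture:

```typescript
// Sketch: CSS injected before a capture to freeze animations and hide
// unstable regions such as ad containers or background effects.
function stabilizationCss(hideSelectors: string[]): string {
  const freeze = `*, *::before, *::after {
  animation-play-state: paused !important;
  transition: none !important;
  caret-color: transparent !important;
}`;
  const hide = hideSelectors.length
    ? `${hideSelectors.join(", ")} { visibility: hidden !important; }`
    : "";
  return [freeze, hide].filter(Boolean).join("\n");
}

const css = stabilizationCss([".ad-banner", ".bg-effect"]);
console.log(css.includes("animation-play-state: paused")); // true
```

In Playwright this stylesheet would be applied with `page.addStyleTag({ content: css })` before `page.screenshot()`; recent Playwright versions also offer an `animations: 'disabled'` screenshot option and `mask` for hiding elements in `toHaveScreenshot`.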
Auto-Generated User Flows
Since the test reports already include step-by-step screenshots and interaction logs, I added a feature to auto-generate user flow diagrams. These show how users navigate through the app—e.g., from login to dashboard—based on recorded test activity.
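A minimal version of that generation step (assuming Mermaid as the output format, which the presentation does not specify) can turn a recorded step log into a diagram definition:

```typescript
// Sketch: convert an ordered list of recorded screens/steps into a
// Mermaid flowchart definition; in the real tool each node would be
// annotated with the report's step screenshot.
function toMermaid(steps: string[]): string {
  const lines = ["flowchart TD"];
  for (let i = 0; i < steps.length - 1; i++) {
    lines.push(`  s${i}["${steps[i]}"] --> s${i + 1}["${steps[i + 1]}"]`);
  }
  return lines.join("\n");
}

console.log(toMermaid(["Login", "Dashboard", "Address book"]));
```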
Regrets and Improvements
Looking back, I wish I had built the tool as a VS Code extension rather than a standalone app—it would have made integration much smoother.
There’s also more potential in the test reports. For example, using Kubernetes-based browser-automation tools like Moon could allow test cases to run remotely at scale.
In the future, I plan to enhance this further by linking frontend screens to backend APIs—understanding which APIs are called on each screen, and how changes affect the system.
Final Thoughts
The tool I created is still far from perfect. I aimed for usability, but it remains difficult for others to adopt without guidance. That said, I’m continuing to refine it based on lessons learned.
Even if you don't adopt this specific tool, I hope the concepts shared here help you enhance your own E2E testing workflows—making them more efficient, maintainable, and developer-centric.