chlorine

Posted on Nov 10, 2024

Execute E2E Test Cases Using Natural Language with Intelli-Browser

#ai #playwright #tooling #programming

Introduction

End-to-end (E2E) testing ensures that your application works correctly from start to finish. Writing and maintaining E2E test cases can be complex and time-consuming, especially when dealing with intricate user interactions.

What if you could write E2E tests using simple natural language instructions? This would not only make the tests more readable but also easier to maintain.

Intelli-Browser is a project that lets you do exactly that. This article walks through how to use it to execute E2E test cases written in natural language.

Inspired by Claude-3.5-Sonnet Computer Use

Intelli-Browser draws inspiration from Claude-3.5-Sonnet, an advanced language model known for its new ability to use computers the way people do. By interacting with the model, Intelli-Browser accepts actions, simulates browser usage, and applies the result to E2E testing scenarios.

Video Demo

Here is an example user task:

Click search and input "Web API", press "arrow down" once to select the second result. Then press "ENTER" to search it. Find "Keyboard API" nearby title "K" and click it

A video of this task being executed is available here: MDN Demo Video

How It Works

User Prompt and Page Information

When you write a natural language test case, the user prompt and the current page information are sent to a large language model (LLM) integrated within Intelli-Browser. The LLM analyzes the page content and interactive elements, understanding the context and the required actions.
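To make this step concrete, here is a minimal sketch of how a task and a summary of the page could be combined into one message for the model. The `PageInfo` and `InteractiveElement` shapes and the `buildModelMessage` helper are hypothetical names used only for illustration; Intelli-Browser's actual request format is not described in this article and may differ.

```typescript
// Hypothetical shapes for illustration only, not Intelli-Browser's real format.
interface InteractiveElement {
  tag: string;   // e.g. "button" or "input"
  label: string; // visible text or accessible name
  x: number;     // horizontal position in viewport pixels
  y: number;     // vertical position in viewport pixels
}

interface PageInfo {
  url: string;
  title: string;
  elements: InteractiveElement[];
}

// Combine the user's task and the page summary into a single text message.
function buildModelMessage(task: string, info: PageInfo): string {
  const elementLines = info.elements.map(
    (el, i) => `${i + 1}. <${el.tag}> "${el.label}" at (${el.x}, ${el.y})`,
  );
  return [
    `Task: ${task}`,
    `Page: ${info.title} (${info.url})`,
    'Interactive elements:',
    ...elementLines,
  ].join('\n');
}
```

Listing elements with their positions is one plausible way to give the model enough context to pick a target; a screenshot-based approach would be another.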

Action Planning

Based on the analysis, the LLM plans a sequence of actions to achieve the goal specified in the natural language test case. These actions might include clicking buttons, filling out forms, navigating through pages, and verifying certain elements on the webpage.
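One way to picture a planned action is as a small typed value that can later be turned into Playwright statements. The sketch below assumes three action kinds (`click`, `type`, `press`); the `PlannedAction` type and `toPlaywrightStatements` function are illustrative names, not part of Intelli-Browser's API. The emitted statements use Playwright's `page.mouse` and `page.keyboard` methods.

```typescript
// Hypothetical action model; the kinds Intelli-Browser really plans may differ.
type PlannedAction =
  | { kind: 'click'; x: number; y: number }
  | { kind: 'type'; text: string }
  | { kind: 'press'; key: string };

// Translate one planned action into Playwright statements (as source strings).
function toPlaywrightStatements(action: PlannedAction): string[] {
  switch (action.kind) {
    case 'click':
      // A click is modelled as move, press, release.
      return [
        `await page.mouse.move(${action.x}, ${action.y})`,
        'await page.mouse.down()',
        'await page.mouse.up()',
      ];
    case 'type':
      // JSON.stringify yields a safely quoted string literal.
      return [`await page.keyboard.type(${JSON.stringify(action.text)})`];
    case 'press':
      return [`await page.keyboard.press(${JSON.stringify(action.key)})`];
  }
}
```

Using a discriminated union means the compiler checks that every action kind is handled when a new one is added.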

Execution and Feedback Loop

Intelli-Browser executes the planned actions within a real or simulated browser environment and provides feedback to the LLM on the success or failure of each action. This feedback loop helps in adjusting the subsequent actions based on the real-time status of the webpage.

Termination Conditions

The process continues until one of two things happens: no more actions are required because the task's goal has been achieved, or it becomes evident that the goal cannot be achieved because of an error or an unexpected page state. Handling both cases keeps test execution robust and adaptable.
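The execution loop and its termination conditions can be sketched roughly as follows. This is not Intelli-Browser's implementation: `runTask`, `planNext`, `execute`, and the `Step`, `Feedback`, and `Outcome` types are hypothetical, and the `maxSteps` cap is an extra safety assumption that the article does not mention.

```typescript
// What the planner (the LLM, in practice) can decide at each turn.
type Step =
  | { kind: 'act'; action: string }
  | { kind: 'done' }                    // goal achieved, nothing left to do
  | { kind: 'giveUp'; reason: string }; // goal judged unreachable

// What the executor reports back after each action.
interface Feedback {
  action: string;
  ok: boolean;
  detail: string;
}

type Outcome =
  | { status: 'completed'; history: Feedback[] }
  | { status: 'failed'; history: Feedback[]; reason: string };

async function runTask(
  planNext: (history: Feedback[]) => Promise<Step>,
  execute: (action: string) => Promise<Feedback>,
  maxSteps = 20, // assumed safety cap, not from the article
): Promise<Outcome> {
  const history: Feedback[] = [];
  for (let i = 0; i < maxSteps; i++) {
    // The planner sees all earlier feedback, including failed actions.
    const step = await planNext(history);
    if (step.kind === 'done') return { status: 'completed', history };
    if (step.kind === 'giveUp') {
      return { status: 'failed', history, reason: step.reason };
    }
    history.push(await execute(step.action));
  }
  return { status: 'failed', history, reason: `no result after ${maxSteps} steps` };
}
```

Passing the full history to the planner is what lets a failed action influence the next one instead of ending the run immediately.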

Getting Started

Installation

You can install Intelli-Browser via npm, yarn, or pnpm. Here are the commands for each package manager:

# use npm
npm install @intelli-browser/core

# use yarn
yarn add @intelli-browser/core

# use pnpm
pnpm add @intelli-browser/core

API Reference

To start using Intelli-Browser, you need to import it and create a client instance. Here is how you can do it:

import { IntelliBrowser } from '@intelli-browser/core';

const client = new IntelliBrowser({
  apiKey: '',  // add apiKey or provide ANTHROPIC_API_KEY in .env file
});

// Example of executing natural language instructions
await client.run({
  page,  // playwright Page instance
  message: 'Click search and input "Web API", press "arrow down" to select the second result. then press "ENTER" to search it',  // user prompt
});

Writing Your First Natural Language E2E Test

Let’s create a basic test case that verifies the search functionality of a web application:

import { IntelliBrowser } from '@intelli-browser/core';
import { chromium } from 'playwright';

(async () => {
  // Launch a browser instance
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Navigate to the homepage
  await page.goto('https://example.com');

  const client = new IntelliBrowser({
    apiKey: '',  // add apiKey or provide ANTHROPIC_API_KEY in .env file
  });

  // Execute the user prompt
  await client.run({
    page,
    message: 'Click search and input "Web API", press "arrow down" to select the second result. then press "ENTER" to search it',
  });

  // Close the browser
  await browser.close();
})();

Generating Traditional E2E Test Cases

If you want to generate traditional E2E test cases after executing the natural language prompt, use the value returned by the client.run method. As the example output below shows, it is a list of Playwright statements.

import { IntelliBrowser } from '@intelli-browser/core';
import { chromium } from 'playwright';

(async () => {
  // Launch a browser instance
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Navigate to the homepage
  await page.goto('https://example.com');

  const client = new IntelliBrowser({
    apiKey: '',  // add apiKey or provide ANTHROPIC_API_KEY in .env file
  });

  // Execute the user prompt and generate E2E cases
  const e2eCases = await client.run({
    page,
    message: 'Click search and input "Web API", press "arrow down" to select the second result. then press "ENTER" to search it',
  });

  console.log(e2eCases);
  // Example output:
  // [
  //   'await page.mouse.move(1241.61, 430.2)',
  //   'await page.waitForTimeout(2266)',
  //   'await page.mouse.down()',
  //   'await page.mouse.up()',
  //   'await page.waitForTimeout(3210)',
  //   "await page.mouse.type('Web API')",
  //   'await page.waitForTimeout(3064)',
  //   "await page.keyboard.press('ArrowDown')",
  //   'await page.waitForTimeout(2917)',
  //   "await page.keyboard.press('Enter')",
  //   'await page.waitForTimeout(6471)',
  //   "await page.keyboard.press('PageDown')",
  //   'await page.waitForTimeout(7021)',
  //   'await page.mouse.move(687.39, 923.4)',
  //   'await page.waitForTimeout(4501)',
  //   'await page.mouse.down()',
  //   'await page.mouse.up()'
  // ]

  // Close the browser
  await browser.close();
})();
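The returned statements are plain strings, so one possible next step is to assemble them into a test file. The sketch below assumes you want a Playwright Test (`@playwright/test`) spec; `toPlaywrightTest` is a hypothetical helper written for this article, not something Intelli-Browser provides.

```typescript
// Hypothetical helper: wrap generated statements in a Playwright Test spec.
function toPlaywrightTest(
  title: string,
  startUrl: string,
  statements: string[],
): string {
  const body = statements.map((statement) => {
    const trimmed = statement.trim();
    // Make sure each statement ends with exactly one semicolon.
    return `  ${trimmed.endsWith(';') ? trimmed : trimmed + ';'}`;
  });
  return [
    "import { test } from '@playwright/test';",
    '',
    `test(${JSON.stringify(title)}, async ({ page }) => {`,
    `  await page.goto(${JSON.stringify(startUrl)});`,
    ...body,
    '});',
    '',
  ].join('\n');
}
```

Calling `toPlaywrightTest('search works', 'https://example.com', e2eCases)` would give you source text to write to a `.spec.ts` file. Keep in mind that the recorded pixel coordinates and fixed waits depend on the viewport and timing of the original run, so you may want to review the generated steps before committing them.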

Conclusion

The Intelli-Browser project simplifies writing and executing E2E test cases by combining natural language instructions with the capabilities of an advanced language model. This approach makes tests easier to understand and maintain, especially for teams whose members are not primarily programmers.

For more information and advanced configurations, check out the GitHub repository. Feel free to open an issue or submit a pull request.

Happy testing!

DEV Community