
Automated Testing using MCP and AI Agents

Speaker: Mariana Chow @ AWS Community Day Hong Kong 2025

Summary by Amazon Nova

https://www.youtube.com/watch?v=cgoFdt8ybwY



Preparation and Planning

Foundation: Testing Data

  • Often overlooked but critical

  • Task management system (e.g., Jira): use webhooks to store ticket updates in Amazon S3 (see the sketch after this list)

  • Swagger documentation: Provides API specifications and parameters

  • Historical test cases: Allows verification and retesting of previous cases
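To make the webhook idea concrete, here is a minimal sketch of a Lambda handler (behind an API Gateway proxy integration) that stores each Jira update in S3. The bucket name, key layout, and environment variable are assumptions for illustration, not details from the talk.

```python
import json
import os
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

# Assumed bucket name; in practice, set via the Lambda environment.
BUCKET = os.environ.get("TEST_DATA_BUCKET", "my-testing-data-bucket")


def handler(event, context):
    """Receive a Jira webhook and persist the ticket update to S3."""
    payload = json.loads(event["body"])  # API Gateway proxy integration
    issue_key = payload.get("issue", {}).get("key", "unknown")
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    # One object per update keeps a full history for later graph ingestion.
    s3.put_object(
        Bucket=BUCKET,
        Key=f"jira-updates/{issue_key}/{timestamp}.json",
        Body=json.dumps(payload),
        ContentType="application/json",
    )
    return {"statusCode": 200, "body": "stored"}
```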

Connecting Data Sources

  • Traditional methods have limited capability to find relationships

  • Introduction of Large Language Models (LLMs) to bridge connections

  • Example:

  • Mariana (an AWS career in cloud computing and AI)

  • Dario (co-founder of Anthropic, an AI company)

  • Anthropic developed a series of LLMs, available through Amazon Bedrock

Execution

AI-Driven Testing Workflow

  • Leveraging AI to automate and enhance testing processes

  • Deep dive into how AI integrates with existing testing frameworks

Reporting

Comprehensive Reporting

  • Generating insightful reports from automated tests

  • Utilizing AI to provide actionable insights and recommendations

  • Emphasis on transforming traditional testing workflows with AI

  • Invitation for feedback and discussion on the presented ideas



Understanding Relationships with Knowledge Graphs

  • Knowledge graphs capture complex relationships that simple text searches cannot

  • Example: Relationship between Mariana (AWS career) and Anthropic (the AI company co-founded by Dario)

  • Knowledge graph reveals hidden connections not found by simple searches

Importance of Knowledge Graphs in Automated Testing

  • Crucial for automated task and test case generation

  • Extracts entities and connects related items from data sources

  • Example:

  • Requirement tickets

  • API specifications

  • Historical test cases

Creating a Knowledge Graph

  • Traditional relational databases (e.g., PostgreSQL) make relationship queries join-heavy and offer low visibility into the connections

  • NoSQL solutions (e.g., DynamoDB) are inefficient for traversing relationships

  • Graph databases (e.g., AWS Neptune, Neo4j) are better but require learning new concepts (nodes, edges, properties)

Solution: Amazon Neptune Analytics

  • Combines graph database capabilities with foundation models through Amazon Bedrock

  • Allows insertion and retrieval of information without learning complex query syntax (see the sketch after this list)

  • Provides a beautiful graph view for data visualization
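As a minimal sketch of what programmatic insertion and retrieval can look like, the snippet below runs openCypher statements against a Neptune Analytics graph through the boto3 `neptune-graph` client. The graph identifier, node labels, and properties are assumptions for illustration.

```python
import json

import boto3

client = boto3.client("neptune-graph", region_name="us-east-1")
GRAPH_ID = "g-1234567890"  # placeholder for an existing Neptune Analytics graph

# Link a requirement ticket to the API endpoint it exercises.
insert_query = """
MERGE (t:Ticket {key: $ticket})
MERGE (a:ApiEndpoint {path: $path})
MERGE (t)-[:CALLS]->(a)
"""
client.execute_query(
    graphIdentifier=GRAPH_ID,
    queryString=insert_query,
    language="OPEN_CYPHER",
    parameters={"ticket": "SUB-42", "path": "/v1/subscriptions"},
)

# Retrieve every endpoint one hop away from the ticket.
result = client.execute_query(
    graphIdentifier=GRAPH_ID,
    queryString="MATCH (t:Ticket {key: $ticket})-[:CALLS]->(a) RETURN a.path",
    language="OPEN_CYPHER",
    parameters={"ticket": "SUB-42"},
)
print(json.loads(result["payload"].read()))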

Graph Retrieval-Augmented Generation (Graph RAG)

  • Retrieval-Augmented Generation (RAG): a technique where the AI retrieves reference documents and uses them to ground its generated responses

  • Graph RAG uses a knowledge graph for retrieval instead of a simple text search (see the sketch below)
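A minimal Graph RAG sketch, assuming the Bedrock `converse` API and a Claude model ID (verify against your region's model list); the retrieval step would be a graph query like the one above, and the facts shown are hypothetical.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def graph_rag_answer(question: str, graph_facts: list[str]) -> str:
    """Prepend facts retrieved from the knowledge graph to the prompt
    so the model grounds its answer in them."""
    context = "\n".join(f"- {fact}" for fact in graph_facts)
    prompt = (
        "Answer using only the knowledge-graph facts below.\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )
    response = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]


# The facts would come from a graph query like the earlier sketch.
print(graph_rag_answer(
    "Which API does ticket SUB-42 exercise?",
    ["Ticket SUB-42 CALLS ApiEndpoint /v1/subscriptions"],
))
```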

AI-Driven Test Pipeline

  • Built on top of the knowledge graph foundation

  • Uses Behavior-Driven Development (BDD) with Gherkin format for test cases

  • Gherkin scenarios use keywords like Given, When, and Then to specify the initial state, action, and expected outcome of a feature

  • Example test case: checking homepage titles, with prerequisites and scenarios (see the Gherkin sketch after this list)

  • Knowledge graphs are essential for effective automated testing

  • Amazon Neptune Analytics simplifies graph database management and visualization

  • Graph RAG enhances AI-driven test case generation and execution
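As a sketch of the homepage-title example above in Gherkin, with the application name and prerequisite as placeholders:

```gherkin
Feature: Homepage title
  Verify the homepage shows the expected title

  Background:
    Given the web application is running

  Scenario: Homepage shows the product name
    Given I am on the homepage
    When the page has finished loading
    Then the page title should contain "My Store"
```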



AI-Driven Test Case Generation

Goal

  • Improve test case coverage and consistency using AI agents

AI Agent Capabilities

  • Chooses the best actions to perform to achieve testing goals

  • Analyzes business flows and reads requirements from the knowledge graph

Example: Subscription Management Feature

Business Flow Analysis

  • Knowledge graph identifies:

  • Validation of payment method in the UI

  • Calling payment APIs

Conflict Detection

  • AI agent detects requirement conflicts:

  • Old rule: User must verify email before accessing premium features

  • New rule: Trial users can access premium features for seven days without email verification

  • Updates related test cases accordingly

API Details Discovery

  • Extracts endpoints, required/optional parameters, and error responses from the graph database (see the query sketch after this list)

  • Identifies API dependencies through recorded data (e.g., successful subscription creation, payment failure)
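A hypothetical openCypher query for this discovery step; the `HAS_PARAM` and `RETURNS` relationship names are an assumed schema, not one confirmed by the talk.

```python
import boto3

client = boto3.client("neptune-graph", region_name="us-east-1")
GRAPH_ID = "g-1234567890"  # placeholder, as in the earlier sketch

# Assumed schema: (ApiEndpoint)-[:HAS_PARAM]->(Parameter),
#                 (ApiEndpoint)-[:RETURNS]->(ErrorResponse)
discovery_query = """
MATCH (a:ApiEndpoint {path: $path})-[:HAS_PARAM]->(p:Parameter)
OPTIONAL MATCH (a)-[:RETURNS]->(e:ErrorResponse)
RETURN a.path, p.name, p.required, collect(e.code) AS error_codes
"""
result = client.execute_query(
    graphIdentifier=GRAPH_ID,
    queryString=discovery_query,
    language="OPEN_CYPHER",
    parameters={"path": "/v1/subscriptions"},
)
```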

Test Data Generation

  • Creates test data covering the following (see the sketch after this list):

  • Happy flow (successful scenarios)

  • Edge cases (boundary conditions)

  • Error conditions (failure scenarios)
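A hypothetical test-data matrix for the subscription feature, illustrating the three categories; the field names and values are invented for the example.

```python
# Hypothetical generated test data for the subscription feature.
TEST_DATA = {
    "happy_flow": [
        {"email": "user@example.com", "plan": "premium", "card": "4242424242424242"},
    ],
    "edge_cases": [
        # Boundary conditions: shortest valid email; 7-day trial with no payment method
        {"email": "a@b.co", "plan": "premium", "card": "4242424242424242"},
        {"email": "user@example.com", "plan": "trial", "card": None},
    ],
    "error_conditions": [
        # Failure scenarios: malformed email; card declined by the payment API
        {"email": "not-an-email", "plan": "premium", "card": "4242424242424242"},
        {"email": "user@example.com", "plan": "premium", "card": "4000000000000002"},
    ],
}
```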

Conclusion

  • AI agents enhance test case generation by:

  • Analyzing business flows

  • Detecting conflicts and updating test cases

  • Discovering API details and dependencies

  • Generating comprehensive test data



Refining and Executing Test Cases

Refining Test Cases with Business Rules

  • Use information from business flow analysis and identified rules to enhance scenarios

  • Convert refined scenarios into scenario-based test cases

Human-in-the-Loop Verification

  • AI can generate comprehensive test cases, but human experts are needed for validation

  • Human verification ensures edge cases and business contexts are captured

  • Jira ticket system, API documentation, and historical test cases may not cover all scenarios

Execution with Playwright

  • Playwright: fast and reliable end-to-end testing

  • Supports all modern rendering engines (Chromium, WebKit, and Firefox) and runs cross-platform on Windows, Linux, and macOS, locally or on CI

  • Used here to execute the generated test cases (see the sketch after this list)

  • Well-known, open-source automated testing framework

  • Communicates directly with the browser for efficient testing

  • Supports multiple languages (JavaScript, TypeScript, Python, Java, and C#/.NET)

  • Strong community support with 78.9K stars on GitHub

  • Native support for MCP (Model Context Protocol), so tests can be driven by natural-language instructions

  • Built-in features:

  • Parallel execution for efficiency

  • Comprehensive tracing and debugging capabilities

  • Automated screenshot and video recording

  • Detailed logging for documenting the testing process and enabling later verification
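To ground this, here is a minimal Playwright (Python) test automating the homepage-title scenario from the earlier Gherkin sketch; the URL and expected title are placeholders.

```python
# pip install playwright && playwright install chromium
from playwright.sync_api import sync_playwright


def test_homepage_title():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com")           # placeholder app URL
        assert "Example Domain" in page.title()    # expected title
        page.screenshot(path="homepage.png")       # built-in screenshot capture
        browser.close()
```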

Conclusion

  • AI enhances test case generation, but human verification is crucial

  • Playwright offers efficient, versatile, and feature-rich test execution



Executing Test Cases with AI Agents

Task Executor Agent

  • AI agent for handling the execution phase

  • Extracts test parameters from generated tasks

  • Creates and stores Playwright scripts in the database for future reference

  • Automatically executes tests

Examples of AI-Driven Testing with Playwright

  • Front-end and back-end testing

  • Experiment using the Playwright MCP server with the Amazon Q Developer CLI (see the configuration sketch after this list)

  • Three test cases for a simple e-commerce website:

  • Purchasing a product with shipment information and completing the order process

  • Logging in and entering the product page without further action

  • Adding a product to the shopping cart without further action
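A sketch of wiring the Playwright MCP server into the Q Developer CLI. As far as I know, the CLI reads MCP servers from `~/.aws/amazonq/mcp.json` and the server is published as `@playwright/mcp`; treat both as assumptions to verify against current docs.

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

With that in place, a prompt such as `q chat "Add the first product to the cart and stop"` would let the agent drive the browser through the MCP tools.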

Post-Execution Verification

  • Test cases stored in the database

  • Utilization of features like video recording for later verification

  • Testers can review recorded videos to ensure alignment with expected behavior

Recap of Complete Test Execution Flow

  • Three AI agents work together:

  • Test case generator

  • Test case executor

  • Report generator (captures results and recordings)

  • Modularized approach provides flexibility and scalability

Leveraging Multiple MCPs

  • Use various MCPs for different testing needs (see the configuration sketch after this list):

  • MySQL MCP for generating realistic data

  • Redis MCP for storing recently used test cases

  • Simplifies testing processes
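Extending the same `mcp.json`, additional servers slot in alongside Playwright; the MySQL and Redis package names below are placeholders for whichever MCP servers you actually use.

```json
{
  "mcpServers": {
    "playwright": { "command": "npx", "args": ["@playwright/mcp@latest"] },
    "mysql": { "command": "npx", "args": ["<your-mysql-mcp-server>"] },
    "redis": { "command": "npx", "args": ["<your-redis-mcp-server>"] }
  }
}
```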

Key Message: Shift-Left Testing

  • Integrate automated testing early in the development cycle

  • Benefits:

  • Better quality

  • Improved efficiency

  • Lower costs

  • Reduced technical debt

Conclusion

  • AI-driven automated testing saves time and cost

  • Testers focus on verification, not test case generation



Team:

AWS FSI Customer Acceleration Hong Kong

AWS Amarathon Fan Club

AWS Community Builder Hong Kong
