
Automated Testing using MCP and AI Agents

Speaker: Mariana Chow @ AWS Community Day Hong Kong 2025

Summary by Amazon Nova

https://www.youtube.com/watch?v=cgoFdt8ybwY



Preparation and Planning

Foundation: Testing Data

  • Often overlooked but critical

  • Task management system (e.g., Jira): use webhooks to store ticket updates in Amazon S3 (see the sketch after this list)

  • Swagger documentation: Provides API specifications and parameters

  • Historical test cases: Allows verification and retesting of previous cases
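To make the webhook idea concrete, here is a minimal sketch of a Lambda handler (behind an API Gateway proxy integration) that stores each Jira update in S3. The bucket name, key layout, and environment variable are assumptions for illustration, not details from the talk.

```python
import json
import os
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

# Assumed bucket name; in practice, set via the Lambda environment.
BUCKET = os.environ.get("TEST_DATA_BUCKET", "my-testing-data-bucket")


def handler(event, context):
    """Receive a Jira webhook and persist the ticket update to S3."""
    payload = json.loads(event["body"])  # API Gateway proxy integration
    issue_key = payload.get("issue", {}).get("key", "unknown")
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    # One object per update keeps a full history for later graph ingestion.
    s3.put_object(
        Bucket=BUCKET,
        Key=f"jira-updates/{issue_key}/{timestamp}.json",
        Body=json.dumps(payload),
        ContentType="application/json",
    )
    return {"statusCode": 200, "body": "stored"}
```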

Connecting Data Sources

  • Traditional methods have limited capability to find relationships

  • Introduction of Large Language Models (LLMs) to bridge connections

  • Example:

  • Mariana (an AWS career in cloud computing and AI)

  • Dario (co-founder of Anthropic, an AI company)

  • Anthropic developed a series of LLMs, available through Amazon Bedrock

Execution

AI-Driven Testing Workflow

  • Leveraging AI to automate and enhance testing processes

  • Deep dive into how AI integrates with existing testing frameworks

Reporting

Comprehensive Reporting

  • Generating insightful reports from automated tests

  • Utilizing AI to provide actionable insights and recommendations

  • Emphasis on transforming traditional testing workflows with AI

  • Invitation for feedback and discussion on the presented ideas



Understanding Relationships with Knowledge Graphs

  • Knowledge graphs capture complex relationships that simple text searches cannot

  • Example: Relationship between Mariana (AWS career) and Anthropic (the AI company co-founded by Dario)

  • Knowledge graph reveals hidden connections not found by simple searches

Importance of Knowledge Graphs in Automated Testing

  • Crucial for automated task and test case generation

  • Extracts entities and connects related items from data sources

  • Example:

  • Requirement tickets

  • API specifications

  • Historical test cases

Creating a Knowledge Graph

  • Traditional relational databases (e.g., PostgreSQL) make relationship queries join-heavy and offer low visibility into the connections

  • NoSQL solutions (e.g., DynamoDB) are inefficient for traversing relationships

  • Graph databases (e.g., AWS Neptune, Neo4j) are better but require learning new concepts (nodes, edges, properties)

Solution: Amazon Neptune Analytics

  • Combines graph database capabilities with foundation models through Amazon Bedrock

  • Allows insertion and retrieval of information without learning complex query syntax (see the sketch after this list)

  • Provides a beautiful graph view for data visualization
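As a minimal sketch of what programmatic insertion and retrieval can look like, the snippet below runs openCypher statements against a Neptune Analytics graph through the boto3 `neptune-graph` client. The graph identifier, node labels, and properties are assumptions for illustration.

```python
import json

import boto3

client = boto3.client("neptune-graph", region_name="us-east-1")
GRAPH_ID = "g-1234567890"  # placeholder for an existing Neptune Analytics graph

# Link a requirement ticket to the API endpoint it exercises.
insert_query = """
MERGE (t:Ticket {key: $ticket})
MERGE (a:ApiEndpoint {path: $path})
MERGE (t)-[:CALLS]->(a)
"""
client.execute_query(
    graphIdentifier=GRAPH_ID,
    queryString=insert_query,
    language="OPEN_CYPHER",
    parameters={"ticket": "SUB-42", "path": "/v1/subscriptions"},
)

# Retrieve every endpoint one hop away from the ticket.
result = client.execute_query(
    graphIdentifier=GRAPH_ID,
    queryString="MATCH (t:Ticket {key: $ticket})-[:CALLS]->(a) RETURN a.path",
    language="OPEN_CYPHER",
    parameters={"ticket": "SUB-42"},
)
print(json.loads(result["payload"].read()))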

Graph Retrieval-Augmented Generation (Graph RAG)

  • Retrieval-Augmented Generation (RAG): a technique where the AI retrieves reference documents and uses them to ground its generated responses

  • Graph RAG uses a knowledge graph for retrieval instead of a simple text search (see the sketch below)
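A minimal Graph RAG sketch, assuming the Bedrock `converse` API and a Claude model ID (verify against your region's model list); the retrieval step would be a graph query like the one above, and the facts shown are hypothetical.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")


def graph_rag_answer(question: str, graph_facts: list[str]) -> str:
    """Prepend facts retrieved from the knowledge graph to the prompt
    so the model grounds its answer in them."""
    context = "\n".join(f"- {fact}" for fact in graph_facts)
    prompt = (
        "Answer using only the knowledge-graph facts below.\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )
    response = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model ID
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]


# The facts would come from a graph query like the earlier sketch.
print(graph_rag_answer(
    "Which API does ticket SUB-42 exercise?",
    ["Ticket SUB-42 CALLS ApiEndpoint /v1/subscriptions"],
))
```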

AI-Driven Test Pipeline

  • Built on top of the knowledge graph foundation

  • Uses Behavior-Driven Development (BDD) with Gherkin format for test cases

  • Gherkin scenarios use keywords like Given, When, and Then to specify the initial state, action, and expected outcome of a feature

  • Example test case: checking homepage titles, with prerequisites and scenarios (see the Gherkin sketch after this list)

  • Knowledge graphs are essential for effective automated testing

  • Amazon Neptune Analytics simplifies graph database management and visualization

  • Graph RAG enhances AI-driven test case generation and execution
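As a sketch of the homepage-title example above in Gherkin, with the application name and prerequisite as placeholders:

```gherkin
Feature: Homepage title
  Verify the homepage shows the expected title

  Background:
    Given the web application is running

  Scenario: Homepage shows the product name
    Given I am on the homepage
    When the page has finished loading
    Then the page title should contain "My Store"
```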



AI-Driven Test Case Generation

Goal

  • Improve test case coverage and consistency using AI agents

AI Agent Capabilities

  • Chooses the best actions to perform to achieve testing goals

  • Analyzes business flows and reads requirements from the knowledge graph

Example: Subscription Management Feature

Business Flow Analysis

  • Knowledge graph identifies:

  • Validation of payment method in the UI

  • Calling payment APIs

Conflict Detection

  • AI agent detects requirement conflicts:

  • Old rule: User must verify email before accessing premium features

  • New rule: Trial users can access premium features for seven days without email verification

  • Updates related test cases accordingly

API Details Discovery

  • Extracts endpoints, required/optional parameters, and error responses from the graph database (see the query sketch after this list)

  • Identifies API dependencies through recorded data (e.g., successful subscription creation, payment failure)
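A hypothetical openCypher query for this discovery step; the `HAS_PARAM` and `RETURNS` relationship names are an assumed schema, not one confirmed by the talk.

```python
import boto3

client = boto3.client("neptune-graph", region_name="us-east-1")
GRAPH_ID = "g-1234567890"  # placeholder, as in the earlier sketch

# Assumed schema: (ApiEndpoint)-[:HAS_PARAM]->(Parameter),
#                 (ApiEndpoint)-[:RETURNS]->(ErrorResponse)
discovery_query = """
MATCH (a:ApiEndpoint {path: $path})-[:HAS_PARAM]->(p:Parameter)
OPTIONAL MATCH (a)-[:RETURNS]->(e:ErrorResponse)
RETURN a.path, p.name, p.required, collect(e.code) AS error_codes
"""
result = client.execute_query(
    graphIdentifier=GRAPH_ID,
    queryString=discovery_query,
    language="OPEN_CYPHER",
    parameters={"path": "/v1/subscriptions"},
)
```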

Test Data Generation

  • Creates test data covering the following (see the sketch after this list):

  • Happy flow (successful scenarios)

  • Edge cases (boundary conditions)

  • Error conditions (failure scenarios)
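A hypothetical test-data matrix for the subscription feature, illustrating the three categories; the field names and values are invented for the example.

```python
# Hypothetical generated test data for the subscription feature.
TEST_DATA = {
    "happy_flow": [
        {"email": "user@example.com", "plan": "premium", "card": "4242424242424242"},
    ],
    "edge_cases": [
        # Boundary conditions: shortest valid email; 7-day trial with no payment method
        {"email": "a@b.co", "plan": "premium", "card": "4242424242424242"},
        {"email": "user@example.com", "plan": "trial", "card": None},
    ],
    "error_conditions": [
        # Failure scenarios: malformed email; card declined by the payment API
        {"email": "not-an-email", "plan": "premium", "card": "4242424242424242"},
        {"email": "user@example.com", "plan": "premium", "card": "4000000000000002"},
    ],
}
```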

Conclusion

  • AI agents enhance test case generation by:

  • Analyzing business flows

  • Detecting conflicts and updating test cases

  • Discovering API details and dependencies

  • Generating comprehensive test data



Refining and Executing Test Cases

Refining Test Cases with Business Rules

  • Use information from business flow analysis and identified rules to enhance scenarios

  • Convert refined scenarios into scenario-based test cases

Human-in-the-Loop Verification

  • AI can generate comprehensive test cases, but human experts are needed for validation

  • Human verification ensures edge cases and business contexts are captured

  • Jira ticket system, API documentation, and historical test cases may not cover all scenarios

Execution with Playwright

  • Playwright: fast and reliable end-to-end testing

  • Supports all modern rendering engines (Chromium, WebKit, and Firefox) and runs cross-platform on Windows, Linux, and macOS, locally or on CI

  • Used here to execute the generated test cases (see the sketch after this list)

  • Well-known, open-source automated testing framework

  • Communicates directly with the browser for efficient testing

  • Supports multiple languages (JavaScript, TypeScript, Python, Java, and C#/.NET)

  • Strong community support with 78.9K stars on GitHub

  • Native support for MCP (Model Context Protocol), so tests can be driven by natural-language instructions

  • Built-in features:

  • Parallel execution for efficiency

  • Comprehensive tracing and debugging capabilities

  • Automated screenshot and video recording

  • Detailed logging for documenting the testing process and enabling later verification
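To ground this, here is a minimal Playwright (Python) test automating the homepage-title scenario from the earlier Gherkin sketch; the URL and expected title are placeholders.

```python
# pip install playwright && playwright install chromium
from playwright.sync_api import sync_playwright


def test_homepage_title():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com")           # placeholder app URL
        assert "Example Domain" in page.title()    # expected title
        page.screenshot(path="homepage.png")       # built-in screenshot capture
        browser.close()
```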

Conclusion

  • AI enhances test case generation, but human verification is crucial

  • Playwright offers efficient, versatile, and feature-rich test execution



Executing Test Cases with AI Agents

Task Executor Agent

  • AI agent for handling the execution phase

  • Extracts test parameters from generated tasks

  • Creates and stores Playwright scripts in the database for future reference

  • Automatically executes tests

Examples of AI-Driven Testing with Playwright

  • Front-end and back-end testing

  • Experiment using the Playwright MCP server with the Amazon Q Developer CLI (see the configuration sketch after this list)

  • Three test cases for a simple e-commerce website:

  • Purchasing a product with shipment information and completing the order process

  • Logging in and entering the product page without further action

  • Adding a product to the shopping cart without further action
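A sketch of wiring the Playwright MCP server into the Q Developer CLI. As far as I know, the CLI reads MCP servers from `~/.aws/amazonq/mcp.json` and the server is published as `@playwright/mcp`; treat both as assumptions to verify against current docs.

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

With that in place, a prompt such as `q chat "Add the first product to the cart and stop"` would let the agent drive the browser through the MCP tools.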

Post-Execution Verification

  • Test cases stored in the database

  • Utilization of features like video recording for later verification

  • Testers can review recorded videos to ensure alignment with expected behavior

Recap of Complete Test Execution Flow

  • Three AI agents work together:

  • Test case generator

  • Test case executor

  • Report generator (captures results and recordings)

  • Modularized approach provides flexibility and scalability

Leveraging Multiple MCPs

  • Use various MCPs for different testing needs (see the configuration sketch after this list):

  • MySQL MCP for generating realistic data

  • Redis MCP for storing recently used test cases

  • Simplifies testing processes
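Extending the same `mcp.json`, additional servers slot in alongside Playwright; the MySQL and Redis package names below are placeholders for whichever MCP servers you actually use.

```json
{
  "mcpServers": {
    "playwright": { "command": "npx", "args": ["@playwright/mcp@latest"] },
    "mysql": { "command": "npx", "args": ["<your-mysql-mcp-server>"] },
    "redis": { "command": "npx", "args": ["<your-redis-mcp-server>"] }
  }
}
```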

Key Message: Shift-Left Testing

  • Integrate automated testing early in the development cycle

  • Benefits:

  • Better quality

  • Improved efficiency

  • Lower costs

  • Reduced technical debt

Conclusion

  • AI-driven automated testing saves time and cost

  • Testers focus on verification, not test case generation



Team:

AWS FSI Customer Acceleration Hong Kong

AWS Amarathon Fan Club

AWS Community Builder Hong Kong
