DEV Community

AI-Powered Test Case Review with MagicPod MCP Server and Claude

Test Automation has Advanced. Is Review Also Part of the Loop?

The number of teams automating E2E tests with no-code tools like MagicPod is increasing. The hurdle for creating test cases has certainly lowered. However, who ensures the quality of the created test cases, and how?

For product source code, code management and review live in one integrated place, such as GitHub Pull Requests. No-code test automation tools, on the other hand, often lack an equivalent environment, leaving the review process to each team's will and ingenuity. Often, the creator of a test case remains the only person who understands it deeply. It is worth pausing to ask whether this state is creating individual dependency (siloing).

In this article, I will introduce a mechanism for AI review of test cases by combining the official MCP server provided by MagicPod with Claude. I have summarized it in a reproducible form, from setup to actual review.

Target Audience of This Article

  • Teams operating test cases in MagicPod but without a functioning review process.
  • QA engineers, developers, and PdMs who feel there are challenges with test quality.
  • Those interested in MCP servers and AI agents but lacking a specific image of how to use them.

To ensure that even those unfamiliar with terminal operations can reproduce this, I have also included GUI-based procedures.

Overview: What to Do and How

The overall structure of the mechanism is simple:

  1. Connect the MagicPod MCP server to Claude Desktop.
  2. Prepare a Skill file that defines the review criteria.
  3. Simply instruct "Review this," and the AI will fetch, analyze, and output a report of the test cases.

The MagicPod MCP Server is an official module for operating various MagicPod functions from AI agents (Claude, Cursor, Cline, etc.). It is published on GitHub as MIT-licensed OSS, and no additional costs are incurred on the MagicPod side.

As of March 2026, there are mainly four things you can do via the MCP server:

  • Execute tests via Web API.
  • Retrieve test execution information (statistical information, etc.).
  • Refer to help pages to suggest usage or troubleshooting.
  • Create and edit test cases in natural language via Autopilot.

In the AI review described in this article, we use the "retrieval of test case information" among these. It only reads test cases and does not make changes to existing tests. Note that test creation, editing, and execution are only supported in cloud environments, and not in local PC environments—a constraint to keep in mind. All you need is a MagicPod contract and a Claude subscription.
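To make "read-only retrieval" concrete at the HTTP level, here is a minimal sketch of the kind of GET request the MCP server issues on your behalf. The `/test-cases/` path segment and the `Token` authorization scheme are illustrative assumptions here, not quotes from the documented API; check MagicPod's Web API reference for the real endpoints.

```python
from urllib.request import Request

BASE = "https://app.magicpod.com/api/v1.0"

def build_test_case_request(org: str, project: str, token: str) -> Request:
    """Build a read-only (GET) request for a project's test cases.

    NOTE: the "/test-cases/" path segment is a hypothetical example;
    consult MagicPod's Web API reference for the actual endpoint.
    """
    url = f"{BASE}/{org}/{project}/test-cases/"
    return Request(url, headers={"Authorization": f"Token {token}"}, method="GET")

req = build_test_case_request("MyOrg", "MyProject", "abc123")
print(req.full_url)      # the URL the request targets
print(req.get_method())  # GET: reads, never modifies
```

The point is simply that review only needs GET-style access; nothing in the review flow writes back to your tests.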

Furthermore, the official help introduces a way to identify unstable locators using the MCP server. That check is included in the review criteria of this article as well, which suggests MagicPod itself envisions improving test case quality via MCP.

Important Note: The official help explicitly states that user information is not used for machine learning via the MCP server, and MagicPod does not retain the entered prompt information. The Web API token is written in the MCP server's config file but is designed not to be passed to the AI agent.

Why Explain with Claude Desktop?

In this article, I explain the procedure using Claude Desktop because it is an environment that is easy to introduce even for those not used to CLI or terminals. Terminal operation is limited to just one edit of a configuration file; everything else is completed within the chat UI. It can be set up via GUI, and the MCP connection status can be checked on the screen, making it suitable as a first step for those encountering MCP servers for the first time.

If You Use Other Tools

Since the MagicPod MCP server complies with MCP (Model Context Protocol), it can be used from any AI tool that supports MCP. The review criteria and prompts in this article are designed to be general-purpose and tool-independent.

If you use Cursor

Cursor has a mechanism called Project Rules, and you can place the Skill file from this article directly as a Rule. An article by Hacobu ("Trying to create an AI review mechanism with Cursor × MagicPod MCP Server") is a helpful preceding case for Cursor users. Refer to Cursor's official documentation for how to set up the MCP server.

If you use Claude Code

Claude Code also supports MCP servers. You can achieve equivalent operation by writing the contents of the Skill file in CLAUDE.md or specifying the MCP server with the --mcp-config option. This might be easier for those comfortable with the terminal.

If you use ChatGPT

Since September 2025, ChatGPT has supported MCP servers in Developer Mode. However, it only supports remote servers (SSE / streaming HTTP) and does not support local execution (stdio). Since the MagicPod MCP server is a stdio method launched via npx locally, you need to use something like ngrok to create a tunnel to connect directly from ChatGPT. In terms of ease, Claude Desktop or Cursor are simpler to set up.

Other MCP-compatible tools (Cline, Windsurf, etc.)

If you add the MagicPod MCP server to the MCP server configuration file (equivalent to claude_desktop_config.json), it will work with any tool. The review criteria prompts are plain text, so there is no need to convert them to tool-specific formats.
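If you already run another MCP server, the MagicPod entry is added as a sibling key inside the same `mcpServers` object rather than replacing the file. The `filesystem` server below is only a placeholder for whatever you already have configured:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    },
    "magicpod-mcp-server": {
      "command": "npx",
      "args": ["-y", "magicpod-mcp-server", "--api-token=PASTE_YOUR_TOKEN_HERE"]
    }
  }
}
```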

Note: Although the steps in this article assume Claude Desktop, the design of the review criteria and the content of the Skill file are the essence of this mechanism. Choose the tool based on your preference.

Setup (5 minutes)

The following steps are for the first time only. Once the setup is complete, the AI will autonomously execute reviews according to the Skill file.

Prerequisites

  • Have a MagicPod account and access to the project to be reviewed (Free trial is also possible).
  • Claude Desktop must be installed (Mac / Windows).
  • Must be subscribed to Claude Pro or higher.

Step 1: Get MagicPod API Token

Access https://app.magicpod.com/accounts/api-token/, issue a token, and copy it.

Step 2: Install Node.js (if not already installed)

npx is required to run the MCP server. Please install the LTS version from https://nodejs.org/.

Step 3: Edit Claude Desktop Configuration File

  1. Open Claude Desktop.
  2. Select Claude → Settings... from the menu bar.
  3. Click Developer on the left menu → Edit Config.
  4. The folder containing the configuration file will open in Finder (or Explorer on Windows). Open claude_desktop_config.json with a text editor.
  5. Replace the content with the following and save:
```json
{
  "mcpServers": {
    "magicpod-mcp-server": {
      "command": "npx",
      "args": ["-y", "magicpod-mcp-server", "--api-token=PASTE_YOUR_TOKEN_FROM_STEP_1_HERE"]
    }
  }
}
```

Replace PASTE_YOUR_TOKEN_FROM_STEP_1_HERE with the API token string you copied in Step 1. The token should directly follow --api-token= (e.g., --api-token=abc123def456).

Step 4: Restart and Confirm Connection

Completely quit Claude Desktop and restart it.

Note: After restarting, check the MCP connection under +Connectors below the chat input field. If magicpod-mcp-server is listed, the connection is working. If it does not appear in an already open chat, try opening a new chat.

As a functional check, type the following into the chat:

```
Are you connected to MagicPod? Please tell me the list of available organizations and projects.
```

If the organization name and project list are returned, the setup is complete.

Implementing the Skill File

Once setup is finished, the next thing to do is place the Skill file.

What is a Skill File?

A Skill file is a Markdown file that teaches an AI agent how to perform a specific task. Based on the Agent Skills Open Standard, the same SKILL.md format can be used across multiple tools like Claude Code, Cursor, Gemini CLI, and Codex CLI.

The Skill file provided in this article combines the following into a single file:

  • Target organization and project names for review.
  • Review criteria (what to check, importance levels).
  • Output format (findings for each test case + summary).
  • Report generation (sharing formats for Slack/Confluence).

Placement Method

| Tool | Placement Location |
|------|--------------------|
| Claude Desktop / claude.ai | Create a project and add the file as Knowledge |
| Claude Code | Place at `.claude/skills/magicpod-review/SKILL.md` |
| Cursor | Place in `.cursor/rules` (loaded as Project Rules) |
| Other MCP tools | Add to each tool's rule/knowledge location |

  1. Copy the full text of the Skill file at the end of this article.
  2. Save it with the filename magicpod-review.md.
  3. Rewrite {Organization Name} and {Project Name} in the file.
  4. Add it to the placement location for your tool according to the table above.

If you have multiple projects, rewrite the "Basic Settings" section of the Skill file as follows:

```markdown
## Basic Settings

- Target Organization: MyOrganization
- Target Projects (Review all projects in order if none specified):
  - ProjectA (Browser)
  - ProjectB (Mobile App)
  - ProjectC (Browser)
```

For Claude Desktop:

  1. Create a new project from "Projects" (e.g., "MagicPod Review").
  2. Select "Files" and upload magicpod-review.md.
  3. Open a new chat within that project.

Contents of the Skill File: Designing Review Perspectives

The core of a Skill file is the definition of review perspectives. Here, based on the 10 perspectives from the MagicPod official blog post, 10 Ideas for Test Automation Review Perspectives, we have categorized them into those that are easy for AI to detect and those that require human judgment.

This classification is a crucial point that determines the accuracy of the Skill file. While an AI will function even if you simply ask it to check everything, the results will be a mix of irrelevant comments and useful suggestions, making it time-consuming to filter through them. By deciding in advance what to leave to the AI and what should be reviewed by a human, the reliability of the output increases, bringing the quality closer to a level where review results can be shared with the team as they are.

AI Detectable Criteria

| Criterion | What to detect | Importance |
|-----------|----------------|------------|
| Missing Assertions | Test cases with zero verification commands | High |
| Fixed Wait Times | Use of "Wait X seconds" type commands | Medium |
| Unstable Locators | XPath index dependency (e.g., `div[2]/button[1]`) | Low |
| Naming Issues | Empty test case names or description fields | Medium |
| Mechanical Names | Auto-generated names like "Button Element (1)" | Medium |
| Abandoned Steps | Steps left as "Temporarily Disabled" | Low |
| Long Tests | Over 200 steps | High |
| Unused Shared Steps | Repeated patterns not using shared steps | Medium |
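To make the "AI detectable" idea concrete, the following is a minimal sketch of the pattern matching involved, run against plain-text steps like those returned by `human_readable_steps`. The step wording, regexes, and threshold are illustrative assumptions; in the actual mechanism the AI performs these checks guided by the Skill file, not this code.

```python
import re

def review_steps(steps: list[str], max_steps: int = 200) -> list[tuple[str, str]]:
    """Return (criterion, severity) findings for one test case's steps."""
    findings = []
    # Missing assertions: no verify/assert/check-style command anywhere (High)
    if not any(re.search(r"\b(verify|assert|check)\b", s, re.I) for s in steps):
        findings.append(("Missing Assertions", "High"))
    # Fixed wait times: "Wait N seconds" style commands (Medium)
    if any(re.search(r"\bwait\s+\d+\s+seconds?\b", s, re.I) for s in steps):
        findings.append(("Fixed Wait Times", "Medium"))
    # Unstable locators: XPath index dependency like div[2]/button[1] (Low)
    if any(re.search(r"\w+\[\d+\]/\w+\[\d+\]", s) for s in steps):
        findings.append(("Unstable Locators", "Low"))
    # Long tests: more steps than the threshold (High)
    if len(steps) > max_steps:
        findings.append(("Long Tests", "High"))
    return findings

print(review_steps(["Open the login page", "Wait 3 seconds", "Click Login"]))
# -> [('Missing Assertions', 'High'), ('Fixed Wait Times', 'Medium')]
```

Structural checks of this kind are exactly what the AI detects reliably; the value of the Skill file is in scoping it to them.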

Criteria requiring human judgment (AI only provides information)

| Criterion | Information provided by AI |
|-----------|----------------------------|
| Validity of Assertions | List of assertions in use (team decides if appropriate) |
| Dependencies | List of test cases not using session restart |
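For this second group, the AI only collects information and a human makes the call. A sketch of the assertion-listing half (again assuming plain-text steps; the keywords are illustrative):

```python
import re

# Hypothetical hint for assertion-like steps; real step wording may differ
ASSERTION_HINT = re.compile(r"\b(verify|assert|check)\b", re.I)

def list_assertions(steps: list[str]) -> list[str]:
    """Collect assertion-like steps so a human can judge their validity."""
    return [s for s in steps if ASSERTION_HINT.search(s)]

steps = [
    "Open https://example.com/login",
    "Click the Login button",
    "Verify that the page title is 'Dashboard'",
    "Check that the welcome message is displayed",
]
print(list_assertions(steps))
```

The output is just a list; whether those assertions actually prove what the test intends to prove stays with the team.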

Note: The MagicPod Web API's `human_readable_steps` does not include step line numbers and cannot distinguish whether a step is disabled. Identifying the specific location of a finding must be based on step content.

Are These 10 Perspectives Sufficient?

In short, they work well as a primary screening for AI reviews, but they are not a silver bullet. Based on the results of reviewing over 60 test cases across multiple projects, here are the strengths and limitations of these 10 perspectives.

Fields well-covered: Structural issues

Among the 10 perspectives, issues such as missing assertions (Perspective 1), naming deficiencies (Perspective 4), overly long tests (Perspective 7), and underutilized shared steps (Perspective 8) can be identified by looking at the structure of the test case. AI excels at this type of pattern matching and was able to detect these with high accuracy during actual reviews. In particular, empty description fields were found in many test cases, and it can be said that this single perspective alone made the review worthwhile.

Blind spots: Validity of test design

On the other hand, as the official blog notes among its "other perspectives," consistency with test design documents and the validity of test data are not covered by these 10 perspectives. They are difficult to judge within a MagicPod review alone, as they require cross-referencing with test design documents or product specifications, so for now they remain out of reach for AI reviews via MCP.

Findings outside the 10 perspectives

In this review, points were raised that were not included in the original 10 perspectives. These were all observations regarding test cases created to verify MagicPod's behavior, but it is noteworthy that the AI picked up on such remnants.

  • Broken steps (`UI element: not configured`) that were left incomplete.
  • Test data containing passwords in plain text.
  • A mixture of tests for checking MagicPod behavior and production regression tests within the same project.

These are examples where the AI picked up additional issues from the context without being bound by the list of perspectives in the prompt. In other words, while specifying the 10 perspectives in the prompt, there is room for the AI to expand upon them autonomously. I feel that it is important for the practical operation of AI reviews to not restrict the perspectives too strictly and to leave some margin for the AI.

In summary, the 10 perspectives are well-balanced for covering test case implementation quality and can be recommended as a baseline for AI reviews. Reviews that require test design validity or product-specific domain knowledge remain the role of humans.

How to Use

If you have already finished placing the Skill file, the usage is simple.

Basics: Running a Review

Just type one of the following into the chat:

```
Review this
```

```
Check the test cases
```

Since the organization name, project name, viewpoints, and output format are all defined in the Skill file, the AI will autonomously perform the following steps with just that one phrase: "Retrieve test case list -> Analyze each step -> Check according to viewpoints -> Output according to the format."

Note: Please execute this from the chat within the project where the Skill file is located. If you type "Review this" in a regular chat (outside of the project), the AI will ask "What would you like me to review?" because the Skill file will not be loaded.

Sharing the Report

Once the review results are available, you can also generate reports for sharing.

```
Summarize for Slack
```

```
Create a report in a format that can be pasted into Confluence
```

The Skill file includes templates for both Slack mrkdwn and standard Markdown formats, ensuring the output matches your intended destination.
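The two formats differ mainly in emphasis syntax, which is why the Skill file keeps separate templates: Slack mrkdwn uses single asterisks for bold where standard Markdown uses double. As a rough illustration of the difference (a deliberately naive conversion that handles bold only, not a full converter):

```python
import re

def md_bold_to_mrkdwn(text: str) -> str:
    """Convert Markdown **bold** to Slack mrkdwn *bold* (naive, bold only)."""
    return re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)

print(md_bold_to_mrkdwn("**Top 3 Priority Improvement Actions**"))
# -> *Top 3 Priority Improvement Actions*
```

In practice the AI emits each format directly from its own template, so no conversion step is needed; this only shows why one template cannot serve both destinations.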

In environments where the Slack MCP connector is connected, entering "Summarize for Slack" will present you with options such as:

  • Post directly to a channel (posts immediately if a channel name is specified)
  • Save as a draft (for when you want to review the content before posting)
  • Output only the mrkdwn text (for manual copying and pasting)

If the Confluence or Jira MCP connectors are connected, you can also create Confluence pages or Jira tickets directly from the chat. Even in environments without active MCP connectors, text output for copying and pasting is always available.

The following is an example of a report output for Slack (project names and other details have been replaced with placeholders).

```
🔍 MagicPod Test Case AI Review Results
Date: 2026/03/28 | Target: SampleOrg (4 projects / Over 60 cases)

🔴 High: 3 cases 🟡 Medium: Many 🟢 Low: 0 cases
━━━━━━━━━━━━━━━━━━━━━━━
🚨 Top 3 Priority Improvement Actions

1. Description Field Entries - Medium
The description field is empty in many of the target test cases.

2. Organization of Behavior Verification Tests - Medium
Tests used to verify behavior during initial setup still remain in the old folder.

3. Replacement of Fixed Wait Times - Medium
Wait 3 seconds and Wait 5 seconds are used extensively throughout ProjectD.
━━━━━━━━━━━━━━━━━━━━━━━
1. ProjectA_Web (22 cases)
- Description field is empty: Almost all cases
- Mechanical UI element names: Area(8), Area(119), etc.
- Regression tests have substantial assertions and appropriate comment separation.

2. ProjectB_Web (23 cases)
- Description field is empty: All cases
- Shared steps not utilized: Screen verification patterns are identical in 10 or more cases.
- Naming conventions are well-consistent across all test cases.

(Details for the other 2 projects are in the thread below 👇)
```

The report features a three-tier structure: Priority Improvement Actions, followed by Project Summaries, and then Details in Threads. This format allows the team to understand exactly what to do first at a glance. By also including positive points (💡), I have ensured the report provides encouragement rather than just pointing out issues.

Customization Examples

```
Review while excluding the old folder.
```

```
Review only ProjectB_Web.
```

```
Show the differences from the last time.
```

The Skill file defines rules to handle these types of instructions, so it works simply by adding conditions in natural language.

Results of the Actual Implementation

We executed AI reviews on more than 60 test cases across 4 projects. The following are the results.

Most Common Finding: Numerous Empty Description Fields

A large number of the target test cases had empty description fields. This is likely a common situation across many teams. When only the test case name is provided, the intent is not conveyed effectively, which leads to lower accuracy in reviews, handovers, and AI utilization.

While MagicPod features an AI summarization function, it summarizes the actions of the steps. The underlying purpose, such as what specifically needs to be verified in the test, must be written by a human.

Typical Findings in Learning Test Cases

Test cases created at the beginning of projects to verify MagicPod's behavior were still present. Although they were stored in an old folder separate from production regression tests, the AI review accurately detected issues with them.

  • A test named Test 1. The purpose was unclear, and the description field was empty.
  • Missing assertions. The test only performed login operations without a step to verify that the login was successful. It was just running through operations without guaranteeing anything.
  • UI element names like Area(119). Mechanical names automatically generated by MagicPod remained, making it impossible for a third party to identify what the element was.
  • Neglected unconfigured steps. In one test case, a broken step labeled UI element: not configured was left behind.

All of these were remnants from when MagicPod's behavior was being tested. In the production regression tests, naming conventions were unified and assertions were properly implemented. However, the AI's ability to point out that these traces of experimentation remaining in the project could become noise during bulk execution was very useful.

These are the types of issues that anyone would notice if they performed a review, but if no review is conducted, they remain neglected forever. The value of an AI review lies in mechanically picking up these problems that are obvious upon inspection but lack someone to look at them.

Discoveries Outside the Defined Perspectives

Furthermore, some findings emerged that were not included in the ten initial perspectives.

  • Frequent use of fixed wait times. In one project, wait 3 seconds and wait 5 seconds were scattered across multiple test cases.
  • Visualization of quality differences between projects. By reviewing across four projects, it became immediately clear that the rate of description completion varied significantly by project.

While following the list of perspectives, the AI sometimes picks up additional issues from the context. This behavior is similar to that of a human reviewer and can be seen as an advantage of not strictly limiting the prompts.

Detecting Positive Aspects

AI reviews were also effective for detecting good practices, not just pointing out flaws.

  • Test cases where sections were appropriately divided using comments (//).
  • Configurations where shared steps were utilized, keeping the login process DRY (Don't Repeat Yourself).
  • Projects where test case naming conventions were unified across all test cases.

Reports that only contain criticisms can lower team motivation. Including notes on what is well-done increases the overall acceptability of the report.

Limitations and Proper Usage

AI review is not a silver bullet. It is necessary to recognize the following limitations in advance.

What AI is good at:

  • Structural issues that can be detected via pattern matching (naming conventions, presence of assertions, number of steps)
  • Bulk analysis and comparison across projects
  • Consistency in maintaining the same check standards no matter how many times it is executed

What AI is not good at:

  • Semantic-level judgment, such as whether this assertion is appropriate for the purpose of this test
  • Checking consistency with test design documents (when those documents are outside of MagicPod)
  • Team-specific context (e.g., there is an unavoidable reason for this fixed-time wait)

In practice, a realistic approach is a two-tier structure: use AI review for self-review or primary screening, and then have a human review only the points that require human judgment. A workflow is becoming widespread where an AI review runs first on a code Pull Request, and humans focus on making decisions after reviewing the summary. The same structure can be applied to test case reviews.
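The two-tier split can be expressed as a simple triage: findings from automatic-detection perspectives go straight into the report, while findings from human-judgment perspectives are queued for a reviewer. The category names below mirror the tables earlier in this article; the data shape is an assumption for illustration.

```python
# Perspectives the AI reports directly (from the "AI Detectable Criteria" table)
AUTO = {"Missing Assertions", "Fixed Wait Times", "Unstable Locators",
        "Naming Issues", "Mechanical Names", "Abandoned Steps",
        "Long Tests", "Unused Shared Steps"}

def triage(findings: list[tuple[str, str]]) -> tuple[list, list]:
    """Split (criterion, test_case) findings into (auto-report, needs-human)."""
    auto = [f for f in findings if f[0] in AUTO]
    human = [f for f in findings if f[0] not in AUTO]
    return auto, human

auto, human = triage([("Missing Assertions", "case_01"),
                      ("Validity of Assertions", "case_02")])
print(len(auto), len(human))  # -> 1 1
```

The human queue stays short because the AI has already absorbed the mechanical checks; that is the whole point of the primary screening.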

Summary

The reason test case reviews become dependent on specific individuals often stems from the lack of a formal review system. When review criteria are vague and reviewers lack sufficient time, the reviews themselves eventually stop happening.

The MagicPod MCP server combined with AI offers a solution to this problem: not a perfect review, but a consistent primary screening. In this experiment, the empty description fields found across so many test cases are something any human reviewer could also point out; the AI's contribution was simply that it actually looked.

If you are already using MagicPod, setup takes only five minutes, and it takes about ten minutes even including the placement of the Skill file. The only additional cost is the fee for Claude. Why not start by typing review into one of your projects and seeing what kind of feedback you get?


Appendix: Full Text of the Skill File

The review perspectives, output formats, and report sharing features explained in this article are all included in this single file. Please replace {Organization Name} and {Project Name} with your own environment and add the file to the location mentioned earlier.

# MagicPod Test Case AI Review Skill

## About This File

Placing this file in Claude Desktop's Project Knowledge or CLAUDE.md enables automated review of test cases via the MagicPod MCP server.

## Prerequisites

- The MagicPod MCP server must be connected.
- You must know the target Organization Name and Project Name.

## Basic Settings

- Target Organization: {Organization Name}
- Target Project: {Project Name}
- Review Output Language: English

## How to Execute a Review

Execute the review when the user gives any of the following instructions:
- "Review this"
- "Check the test cases"
- "Please do a MagicPod review"

### Execution Steps

1. Retrieve the list of test cases for the target project via the MagicPod MCP server.
2. Retrieve step details for each test case.
3. Check them according to the review perspectives below.
4. Output the results according to the specified format.

## Review Perspectives

Reference: https://magicpod.com/blog/testcase-review-idea/

The following perspectives are categorized into "Automatic Detection" and "Human Judgment Support." 
For automatic detection, always point out any relevant findings. 
For human judgment support, present the information and leave the final decision to the user.

### Automatic Detection (Point out if applicable)

#### Perspective 1: Missing Assertions
- Detect test cases that do not contain any verification commands (assert-type or "verify"-type).
- Tests consisting only of operations do not guarantee anything.
- Severity: High

#### Perspective 2: Use of Fixed-Time Waits
- Detect locations using "Wait X seconds" or "Fixed-time wait" commands.
- Recommend replacing them with condition-based waits (e.g., "Wait until element is displayed").
- If their use is unavoidable, a reason should be left in the comments.
- Severity: Medium

#### Perspective 3: Unstable Locators
- Detect index-dependent locators in XPath (e.g., //div[@id='root']/div[2]/button[1]).
- Only individually created locators are subject to check (AI self-healing targets have lower priority).
- Severity: Low (Since these are naturally discovered if run daily).

#### Perspective 4: Deficient Test Case Names or Descriptions
- Test case name is empty or does not follow naming conventions.
- Description field is empty (the purpose of the test is not described in text).
- Severity: Medium

#### Perspective 5: Remaining Mechanical UI Element Names
- Automatically generated names like "Button element (1)" or "Text input (2)" remain as-is.
- Meaningful names like "Email address input field" or "Login button" are ideal.
- Severity: Medium

#### Perspective 6: Neglected Disabled Steps
- Detect steps that remain in a "Temporarily disabled" state.
- Unnecessary steps hinder the understanding of the test's intent.
- Severity: Low

#### Perspective 7: Overly Long Tests
- Detect test cases exceeding 200 steps.
- Negatively affects readability, stability, and maintainability.
- Severity: High

#### Perspective 8: Underutilization of Shared Steps
- Detect suspected cases where the same operation patterns are hard-coded across multiple test cases.
- Typical examples include login processes or test data initialization.
- Severity: Medium

### Human Judgment Support (Information presentation only)

#### Perspective 9: Validity of Assertions
- List the assertions (verification commands) used in each test case.
- Leave it to the user to judge whether they are appropriate for the objective.
- Identify whether URL verification, image diff, element value verification, visibility check, or title check is used.

#### Perspective 10: Dependencies Between Test Cases
- List test cases that do not use session restart.
- If dependencies exist, check if that fact is noted in the description field.

### Constraints on API Specifications

The `human_readable_steps` in the MagicPod Web API has the following constraints:
- Step line numbers are not returned: When pointing out issues, identify them by step content rather than "Step X."
- It is unclear if a step is disabled: Add a note that the detection accuracy for Perspective 6 is limited.

## Output Format

### Findings per Test Case

```markdown
## Test Case: {Test Case Name}

| # | Perspective | Finding | Severity |
|---|-------------|---------|----------|
| 1 | Missing Assertions | No verification commands exist | High |
| 4 | Naming Deficiency | Description field is empty | Medium |


```

### Summary (Always output at the end of the review)

```markdown
## Review Summary

- Total Test Cases: X
- Cases with Findings: Y
- Cases without Findings: Z

### Tally by Perspective
| Perspective | Count |
|-------------|-------|
| Missing Assertions | X |
| Fixed-Time Wait | X |
| ... | ... |

### Top 3 Priority Improvement Actions
1. ...
2. ...
3. ...


```

## Setup Guide (No terminal required)

### Step 1: Obtain MagicPod API Token
1. Access https://app.magicpod.com/accounts/api-token/
2. Copy the token string.

### Step 2: Install Node.js (Only if not already installed)
1. Access https://nodejs.org/
2. Download and install the LTS version.
3. This enables `npx`, which is required to run the MCP server.

### Step 3: Edit Claude Desktop Configuration File
1. Open Claude Desktop.
2. Select Claude > Settings... from the menu bar.
3. Select "Developer" from the left menu.
4. Click the "Edit Config" button. The folder containing the config file will open in Finder/Explorer.
5. Replace the content with the following and save:

```json
{
  "mcpServers": {
    "magicpod-mcp-server": {
      "command": "npx",
      "args": ["-y", "magicpod-mcp-server", "--api-token=PASTE_TOKEN_HERE"]
    }
  }
}


```

Note: If other MCP servers are already configured, add this entry inside the `mcpServers` object.

### Step 4: Restart Claude Desktop
1. Completely quit Claude Desktop (Cmd+Q).
2. Start it again.
3. Connection is successful if the MCP tool icon appears below the chat input field.

### Step 5: Verify Operation
Enter the following in the chat:
```plaintext
Are you connected to MagicPod? Please tell me the list of available organizations and projects.


```

### About Costs
- MagicPod MCP server itself: Free (MIT licensed OSS)
- Required: MagicPod main subscription + Claude Pro subscription
- For review purposes (read-only), no additional MagicPod costs are incurred.


## Report Generation and Sharing

After the review is complete, generate a report for sharing if the user gives any of the following instructions:
- "Make a report" / "Summarize this"
- "I want to share this on Slack" / "I want to post this to Confluence"
- "Summarize for sharing"

### Format Based on Destination

#### For Slack Posting

Output using Slack mrkdwn syntax. Assuming long texts will be split into threads, the main body should contain only key points.

```markdown
*🔍 MagicPod Test Case AI Review Results*
Execution Date: YYYY-MM-DD

*Target*
• Project: {Project Name}
• Test Cases: X

*Summary*
• Findings: Y / No findings: Z
• 🔴 High: X  🟡 Medium: X  🟢 Low: X

*Top 3 Priority Improvement Actions*
1. ...
2. ...
3. ...

Details in thread 👇


```

Post detailed findings for each test case in the thread.

#### For Confluence / Documentation

Output a full report in Markdown format. This can be used as-is with Confluence's Markdown paste feature.

```markdown
# MagicPod Test Case AI Review Report

## Basic Information
- Execution Date: YYYY-MM-DD
- Target Organization: {Organization Name}
- Target Project: {Project Name}
- Total Test Cases: X
- Review Perspectives: Based on MagicPod Official Blog Top 10 + Additional Perspectives

## Summary

| Metric | Value |
|--------|-------|
| Test Cases with Findings | Y (XX%) |
| Test Cases without Findings | Z (XX%) |
| High Severity Findings | X |
| Medium Severity Findings | X |
| Low Severity Findings | X |


```

```markdown
## Tally by Perspective

| # | Perspective | Count | Severity |
|---|-------------|-------|----------|
| 1 | Missing Assertions | X | High |
| 2 | Fixed-Time Wait | X | Medium |
| 3 | Unstable Locators | X | Low |
| 4 | Naming Deficiency (Name/Desc) | X | Medium |
| 5 | Mechanical UI Element Names | X | Medium |
| 6 | Neglected Disabled Steps | X | Low |
| 7 | Long Tests (Over 200) | X | High |
| 8 | Underutilization of Shared Steps | X | Medium |
| + | Security (Plaintext passwords, etc.)| X | High |
| + | Broken Steps (not configured) | X | High |

## Top 3 Priority Improvement Actions

1. **[Action Name]** — [Reason and Expected Effect]
2. **[Action Name]** — [Reason and Expected Effect]
3. **[Action Name]** — [Reason and Expected Effect]

## Good Practices

(Include good practices detected by AI. Provide praise as well as improvements.)

## Details by Test Case

(Expand the list of findings for each test case here.)

## Supplement: Basis for Review Perspectives

This review is based on the perspectives from the MagicPod official blog "Reviewing Automated Tests? 10 Review Perspective Ideas" 
(https://magicpod.com/blog/testcase-review-idea/), reconstructed in a format suitable for AI detection.


```

### Instructions for Sharing Execution

Handling cases where the user specifies a concrete sharing destination:

#### "Post to Slack"
1. If Slack MCP server is connected: Post directly to the channel using `slack_send_message`.
2. If not connected: Output text in Slack mrkdwn format and prompt the user to copy-paste.

#### "Post to Confluence"
1. If Atlassian MCP server is connected: Create a page using `createConfluencePage`.
2. If not connected: Output in Markdown format and prompt the user to use Confluence's Markdown paste.

#### "Make a Jira ticket"
1. Propose creating Jira tickets for findings with High severity.
2. After user approval, create them using `createJiraIssue`.

### Report Customization

Adjust the report if the user specifies the following:
- "Exclude the 'old' folder" -> Generate report only for production tests.
- "Summarize across projects" -> Generate a cross-project summary.
- "Show only the good points" -> Generate a "praise" report rather than an improvement report.
- "Show the difference from last time" -> Compare with the previous report (saved in Knowledge).
