<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: jan strelec</title>
    <description>The latest articles on DEV Community by jan strelec (@jan_strelec_f0260c269988b).</description>
    <link>https://dev.to/jan_strelec_f0260c269988b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3847575%2Fd6a10081-2854-48d0-9a69-d36e76b06f82.jpeg</url>
      <title>DEV Community: jan strelec</title>
      <link>https://dev.to/jan_strelec_f0260c269988b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jan_strelec_f0260c269988b"/>
    <language>en</language>
    <item>
      <title>How I Use a SKILL.md File to Make Claude Code Run a Full QA Workflow Automatically</title>
      <dc:creator>jan strelec</dc:creator>
      <pubDate>Sat, 28 Mar 2026 11:55:50 +0000</pubDate>
      <link>https://dev.to/jan_strelec_f0260c269988b/how-i-use-a-skillmd-file-to-make-claude-code-run-a-full-qa-workflow-automatically-5h8g</link>
      <guid>https://dev.to/jan_strelec_f0260c269988b/how-i-use-a-skillmd-file-to-make-claude-code-run-a-full-qa-workflow-automatically-5h8g</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;SKILL.md gives Claude Code a persistent QA methodology: 5 workflow phases, folder structure, locator rules, and a capped fix loop&lt;/li&gt;
&lt;li&gt;Drop it in your project root, give Claude Code your codebase or test cases, it writes and saves an organized Playwright suite automatically&lt;/li&gt;
&lt;li&gt;Tests persist on disk and run free forever via Playwright's native runner&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Writing tests takes as long as writing the feature. Most devs either skip them or write shallow ones that break on the next refactor. The issue isn't Playwright — it's that there's no system. Every test session starts from scratch, and coverage is whatever you had time for.&lt;/p&gt;

&lt;p&gt;Asking Claude Code to "write some tests" helps, but without instructions it's inconsistent. It writes differently every time and has no idea how you want things organized.&lt;/p&gt;

&lt;p&gt;The fix is giving it a methodology to follow.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's a SKILL.md?
&lt;/h2&gt;

&lt;p&gt;A markdown file in your project root. Claude Code reads it at the start of every session as its operating instructions for that project.&lt;/p&gt;

&lt;p&gt;The difference from prompting: prompts are forgotten when the conversation ends. A SKILL.md persists. You configure the methodology once; Claude Code follows it on every session, in every project you drop it into.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the SKILL.md Defines
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5 Workflow Phases
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Assess&lt;/td&gt;
&lt;td&gt;Reads your project, proposes test scenarios, waits for your approval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Author&lt;/td&gt;
&lt;td&gt;Writes organized test files by concern, saves them with meaningful names&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Execute&lt;/td&gt;
&lt;td&gt;Runs the suite via Playwright's native runner&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fix&lt;/td&gt;
&lt;td&gt;Categorizes each failure, fixes and reruns — max 3 attempts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Report&lt;/td&gt;
&lt;td&gt;Results summary, bugs found, coverage gaps, flaky test flags&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Folder Structure
&lt;/h3&gt;

&lt;p&gt;Tests go into tests/e2e/ organized by concern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;happy-path — core flows that must always work&lt;/li&gt;
&lt;li&gt;validation — form errors, required fields, bad input handling&lt;/li&gt;
&lt;li&gt;edge-cases — empty inputs, special characters, boundary values&lt;/li&gt;
&lt;li&gt;accessibility — keyboard nav, focus order, aria attributes&lt;/li&gt;
&lt;/ul&gt;
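
&lt;p&gt;The convention boils down to a deterministic path per scenario. Here's a minimal sketch of that rule as a helper function (hypothetical — the kebab-case naming is my assumption based on the structure above, not something the SKILL.md spells out):&lt;/p&gt;

```typescript
// Sketch of the folder convention: tests/e2e/CONCERN/NAME.spec.ts
// (hypothetical helper, not part of the SKILL.md itself)

const concerns = ["happy-path", "validation", "edge-cases", "accessibility"];

function testFilePath(concern: string, scenario: string): string {
  if (!concerns.includes(concern)) {
    throw new Error(`Unknown concern: ${concern}`);
  }
  // kebab-case the scenario description into a meaningful file name
  const slug = scenario.trim().toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return `tests/e2e/${concern}/${slug}.spec.ts`;
}

console.log(testFilePath("validation", "Login rejects empty password"));
// tests/e2e/validation/login-rejects-empty-password.spec.ts
```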

&lt;h3&gt;
  
  
  Locator Priority
&lt;/h3&gt;

&lt;p&gt;Claude Code follows this order on every test it writes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;getByRole — survives refactors, matches user intent&lt;/li&gt;
&lt;li&gt;getByLabel — for form fields&lt;/li&gt;
&lt;li&gt;getByText — for buttons and visible content&lt;/li&gt;
&lt;li&gt;data-testid — when semantic locators aren't enough&lt;/li&gt;
&lt;li&gt;CSS selectors — last resort only&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;CSS selectors break every time someone touches a class name. Enforcing this in the SKILL.md means Claude Code never takes the lazy route.&lt;/p&gt;
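
&lt;p&gt;The priority list reads naturally as a decision function: take the first strategy the element can support, fall through to the next otherwise. A minimal sketch (hypothetical helper illustrating the ordering, not actual Playwright API):&lt;/p&gt;

```typescript
// What the target element offers; each field maps to one locator strategy.
interface ElementInfo {
  role?: string;
  label?: string;
  text?: string;
  testId?: string;
}

// First match in priority order wins; CSS only when nothing else applies.
function pickLocatorStrategy(el: ElementInfo): string {
  if (el.role) return "getByRole";     // survives refactors, matches user intent
  if (el.label) return "getByLabel";   // form fields
  if (el.text) return "getByText";     // buttons, visible content
  if (el.testId) return "data-testid"; // when semantics aren't enough
  return "css";                        // last resort only
}

console.log(pickLocatorStrategy({ role: "button", text: "Save" })); // getByRole
console.log(pickLocatorStrategy({ testId: "chart-root" }));         // data-testid
```

&lt;p&gt;In the generated tests these correspond to the real Playwright calls — page.getByRole('button', { name: 'Save' }) and so on.&lt;/p&gt;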

&lt;h3&gt;
  
  
  Fix Loop
&lt;/h3&gt;

&lt;p&gt;When a test fails, Claude Code decides what kind of failure it is before touching anything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test bug (wrong selector, race condition) → fix the test&lt;/li&gt;
&lt;li&gt;Real app bug → fix the app, report what broke&lt;/li&gt;
&lt;li&gt;Flaky (intermittent) → add a wait on that specific action&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Max 3 attempts. After that it stops and explains rather than looping forever.&lt;/p&gt;
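
&lt;p&gt;The control flow is simple: categorize, apply the matching fix, rerun, and bail out after the cap. A minimal sketch of that loop (hypothetical code — the real work happens inside Claude Code, the fix steps here are just comments):&lt;/p&gt;

```typescript
// One suite run either passes or fails in one of three categorized ways.
type Outcome = "pass" | "test-bug" | "app-bug" | "flaky";

function fixLoop(runSuite: () => Outcome, maxAttempts = 3): string {
  let attempt = 0;
  while (attempt !== maxAttempts) {
    attempt = attempt + 1;
    const outcome = runSuite();
    if (outcome === "pass") return `passed on attempt ${attempt}`;
    // Each failure category gets a different fix before the rerun:
    if (outcome === "test-bug") { /* fix the test (wrong selector, race) */ }
    if (outcome === "app-bug")  { /* fix the app, note what broke */ }
    if (outcome === "flaky")    { /* add a wait on that specific action */ }
  }
  return `stopped after ${maxAttempts} attempts; explaining instead of looping`;
}

// A suite that only goes green on the second run:
let runs = 0;
const flakySuite = (): Outcome => {
  runs = runs + 1;
  return runs >= 2 ? "pass" : "flaky";
};
console.log(fixLoop(flakySuite)); // passed on attempt 2
```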

&lt;h3&gt;
  
  
  Language Support
&lt;/h3&gt;

&lt;p&gt;The SKILL.md works with TypeScript, JavaScript, Python, Java, and C#. Claude Code detects your language from your project files and generates the right file extensions, run commands, and test syntax automatically.&lt;/p&gt;
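
&lt;p&gt;Conceptually, the detection step is a lookup from a marker file in the project root to a language and its runner. A minimal sketch under that assumption (the marker-to-runner table is my illustration — the csproj name in particular is hypothetical, since those files are named after the project):&lt;/p&gt;

```typescript
// Marker files imply the language and the matching test command.
const markers = [
  { file: "package.json",     language: "TypeScript/JavaScript", run: "npx playwright test" },
  { file: "requirements.txt", language: "Python",                run: "pytest" },
  { file: "pom.xml",          language: "Java",                  run: "mvn test" },
  { file: "app.csproj",       language: "C#",                    run: "dotnet test" },
];

function detect(projectFiles: string[]) {
  for (const m of markers) {
    if (projectFiles.includes(m.file)) return m;
  }
  return null; // no marker found; ask the user
}

const found = detect(["README.md", "requirements.txt"]);
console.log(found ? `${found.language}: ${found.run}` : "unknown");
// Python: pytest
```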




&lt;h2&gt;
  
  
  The Workflow — 4 Prompts
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here's my codebase / feature spec. What should I be testing?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude Code reads everything, identifies scenarios grouped by concern, and gives you a numbered list to approve. Nothing gets written until you confirm. This is your only required input.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Generate the full test suite based on those scenarios and save the files."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It writes the tests, picks the folder structure, names files meaningfully, and saves to tests/e2e/. Locator rules and test hygiene from the SKILL.md apply automatically.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Run the tests and fix any failures."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The fix loop runs per the SKILL.md — categorize, fix, rerun, 3 attempts max. You get a clean report either way.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here's a new user story. Add tests for it to the existing suite."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Weeks later, a new feature lands. Claude Code reads your existing files, avoids duplicates, and extends the suite cleanly. The tests grow with the product.&lt;/p&gt;




&lt;h2&gt;
  
  
  Swarm Mode
&lt;/h2&gt;

&lt;p&gt;For full pre-release coverage, Claude Code spawns 3 sub-agents in parallel instead of running sequentially:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent 1 — happy paths and success flows&lt;/li&gt;
&lt;li&gt;Agent 2 — validation, edge cases, error states&lt;/li&gt;
&lt;li&gt;Agent 3 — accessibility and UX behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All three write simultaneously. Playwright runs the combined suite once they're done. AI tokens are spent once across 3 parallel agents; execution is free regardless of how many tests were written.&lt;/p&gt;

&lt;p&gt;Use single agent for targeted checks. Use swarm mode when you need comprehensive coverage before a release.&lt;/p&gt;
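
&lt;p&gt;The split itself is just a routing rule from a scenario's concern to one of the three sub-agents. A minimal sketch of that partition (hypothetical — the exact bucketing is inferred from the agent descriptions above):&lt;/p&gt;

```typescript
// Route each scenario to the sub-agent responsible for its concern.
function agentFor(concern: string): number {
  if (concern === "happy-path") return 1; // happy paths, success flows
  if (concern === "validation") return 2; // validation, error states
  if (concern === "edge-cases") return 2;
  return 3;                               // accessibility, UX behavior
}

const scenarios = [
  { name: "login succeeds",        concern: "happy-path" },
  { name: "empty email rejected",  concern: "validation" },
  { name: "tab order on checkout", concern: "accessibility" },
];

for (const s of scenarios) {
  console.log(`Agent ${agentFor(s.concern)}: ${s.name}`);
}
```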




&lt;h2&gt;
  
  
  Get the SKILL.md
&lt;/h2&gt;

&lt;p&gt;Drop it in your project root as SKILL.md. Open Claude Code. Start with Prompt 1.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/strelec00/b76230c45523a54597b6d115f78b80f7" rel="noopener noreferrer"&gt;https://gist.github.com/strelec00/b76230c45523a54597b6d115f78b80f7&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  5 Prompts Worth Saving
&lt;/h2&gt;

&lt;p&gt;Gap analysis on an existing suite:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Read all test files in this project. What's covered, what's missing, and what looks redundant?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Failure mode thinking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"What are the 5 most likely ways this feature could break that aren't covered by happy-path tests?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Regression from a bug report:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"A bug was reported where [describe it]. Write a test that would have caught this before it shipped."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Suite update after UI changes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The UI for this flow changed. Here's the new spec. Update the existing tests to match without removing coverage."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Auto-generated documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Write a short guide explaining what this test suite covers, how to run it, and what to do when a test fails."&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;Review generated locators. If Claude Code only has a plain-English description to work from, it guesses at button labels and input names. Give it component code or describe your actual UI for accurate selectors.&lt;/p&gt;

&lt;p&gt;Quality of input determines quality of output. A vague spec produces shallow tests. Specific acceptance criteria produce real coverage.&lt;/p&gt;

&lt;p&gt;Swarm mode takes time on large codebases. For a quick pre-commit check, single agent is faster. Swarm is for depth runs where thoroughness matters more than speed.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>playwright</category>
      <category>claudecode</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
