DEV Community

brian austin
brian austin

Posted on

Claude Code testing: write tests, run them, fix failures automatically

Claude Code testing: write tests, run them, fix failures automatically

One of the most underused Claude Code workflows is automated test-fix loops. Instead of manually running tests and copying error output back to Claude, you can set up a pattern where Claude writes the test, runs it, reads the failure, and fixes the code — all without leaving your terminal.

This article shows you exactly how.

The basic test loop

Start with a simple prompt:

> Write a test for the `parseDate` function, run it, and fix any failures
Enter fullscreen mode Exit fullscreen mode

Claude will:

  1. Inspect parseDate to understand its interface
  2. Write a test file with edge cases
  3. Run the test with your test runner
  4. Read the output
  5. Fix failures until tests pass

The key is step 5. Claude doesn't stop at "here's a test" — it runs and iterates.

Setting up hooks for automatic test runs

You can make this continuous using Claude Code hooks:

// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit",
        "hooks": [
          {
            "type": "command",
            "command": "npm test -- --testPathPattern=${input.path} --passWithNoTests 2>&1 | tail -20"
          }
        ]
      }
    ]
  }
}
Enter fullscreen mode Exit fullscreen mode

Now every file edit automatically runs the relevant tests. Claude sees the output in its next turn and can self-correct.

The TDD pattern with Claude Code

Test-driven development works especially well with Claude because you can write the test spec in plain English:

> I want a function called `validateEmail` that:
> - Returns true for valid emails
> - Returns false for missing @ symbol
> - Returns false for missing TLD
> - Handles edge cases like multiple @ symbols
>
> Write the tests first, then implement the function to make them pass
Enter fullscreen mode Exit fullscreen mode

Claude will:

  1. Write 8-12 test cases based on your spec
  2. Run them (all fail — function doesn't exist yet)
  3. Implement validateEmail
  4. Run tests again
  5. Fix edge cases until all pass

This takes about 90 seconds end-to-end.

Running your entire test suite with retry

For larger codebases:

> Run `npm test`. For any failures, read the error, locate the broken code, 
> fix it, and run again. Repeat until the suite is green. 
> If the same test fails 3 times, stop and explain why.
Enter fullscreen mode Exit fullscreen mode

The "stop after 3 failures" guard is important — it prevents Claude from spinning in a loop on a genuinely hard bug.

Testing across multiple files

For integration tests that span modules:

> The payment flow has three files: cart.js, checkout.js, payment.js.
> Write integration tests that test the full flow:
> add to cart → checkout → process payment → confirm order.
> Use mocks for the Stripe API. Run and fix until green.
Enter fullscreen mode Exit fullscreen mode

Claude will map the dependency chain and write tests that exercise the boundaries between modules.

The coverage report pattern

> Run `npm test -- --coverage`. Find the 5 functions with lowest coverage.
> Write tests for them. Run until coverage improves by at least 10%.
Enter fullscreen mode Exit fullscreen mode

This turns coverage reports into an actionable task queue.

Handling flaky tests

> Run the test suite 3 times. If a test passes sometimes and fails others,
> identify it as flaky. Look at the test code and find race conditions,
> timing dependencies, or shared state. Fix the flakiness.
Enter fullscreen mode Exit fullscreen mode

Claude is surprisingly good at spotting setTimeout hacks and uncleared test state.

The rate limit problem during long test runs

If you're running hundreds of test-fix iterations, you'll eventually hit Claude Code's rate limits mid-session. The loop breaks, you lose context, and you have to start over.

The fix developers use is setting ANTHROPIC_BASE_URL to a proxy that removes the rate limit cap:

export ANTHROPIC_BASE_URL=https://simplylouie.com
export ANTHROPIC_API_KEY=your-key
claude
Enter fullscreen mode Exit fullscreen mode

SimplyLouie runs at ✌️$2/month and removes the per-session limit. Worth it for long test-fix loops.

Summary

The Claude Code test-fix pattern:

  1. Basic loop: Write test → run → fix → repeat
  2. Hooks: Auto-run tests on every file save
  3. TDD: Spec in English → tests first → implementation
  4. Full suite: Run all tests, fix failures with retry guard
  5. Coverage: Use coverage reports as a task queue
  6. Flakiness: 3-run detection pattern for non-deterministic failures

The pattern that unlocks this: stop telling Claude what to fix, start telling Claude to run the tests and fix whatever breaks.


Testing without rate limit interruptions: simplylouie.com — ✌️$2/month, 7-day free trial

Top comments (0)