brian austin

Posted on Apr 4

Claude Code testing: write tests, run them, fix failures automatically

#claudecode #testing #productivity #ai

One of the most underused Claude Code workflows is the automated test-fix loop. Instead of manually running tests and copying error output back to Claude, you can set up a pattern where Claude writes the test, runs it, reads the failure, and fixes the code, all without leaving your terminal.

This article shows you exactly how.

The basic test loop

Start with a simple prompt:

> Write a test for the `parseDate` function, run it, and fix any failures

Claude will:

1. Inspect `parseDate` to understand its interface
2. Write a test file with edge cases
3. Run the test with your test runner
4. Read the output
5. Fix failures until tests pass

The key is step 5. Claude doesn't stop at "here's a test" — it runs and iterates.

Setting up hooks for automatic test runs

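You can make this continuous using Claude Code hooks. Add the following to .claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "f=$(jq -r '.tool_input.file_path'); out=$(npm test -- --findRelatedTests \"$f\" --passWithNoTests 2>&1) || { echo \"$out\" | tail -20 >&2; exit 2; }"
          }
        ]
      }
    ]
  }
}

Now every file edit Claude makes runs the related tests. The hook receives the tool call as JSON on stdin, so the command reads the edited path from tool_input.file_path. When tests fail, the command writes the last 20 lines of output to stderr and exits with code 2, which feeds that output back to Claude so it can self-correct. This assumes jq is installed and that `npm test` runs Jest.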
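If the one-line command gets unwieldy, the hook can call a small Node script instead. The following is a sketch, not an official recipe. It assumes the hook receives the tool call as JSON on stdin with the edited path at `tool_input.file_path`, that the project uses Jest, and that a hook exiting with code 2 has its stderr passed back to Claude. The `--hook` flag and the helper names are made up for this example.

```javascript
const fs = require("node:fs");
const { spawnSync } = require("node:child_process");

// Pull the edited file path out of the JSON payload the hook receives on stdin.
function editedPath(rawPayload) {
  try {
    const payload = JSON.parse(rawPayload);
    const filePath = payload?.tool_input?.file_path;
    return typeof filePath === "string" && filePath.length > 0 ? filePath : null;
  } catch {
    return null; // not JSON: nothing to test
  }
}

// Keep only the last `count` lines so the feedback stays short.
function lastLines(text, count = 20) {
  return text.trimEnd().split("\n").slice(-count).join("\n");
}

function main() {
  const file = editedPath(fs.readFileSync(0, "utf8")); // fd 0 is stdin
  if (!file) return 0;
  const run = spawnSync(
    "npx",
    ["jest", "--findRelatedTests", file, "--passWithNoTests"],
    { encoding: "utf8" }
  );
  if (run.status === 0) return 0;
  process.stderr.write(lastLines(`${run.stdout ?? ""}${run.stderr ?? ""}`) + "\n");
  return 2; // assumed convention: exit code 2 sends stderr back to Claude
}

// Hypothetical flag: only touch stdin when invoked as a hook.
if (process.argv.includes("--hook")) {
  process.exitCode = main();
}
```

With the script saved somewhere like `.claude/hooks/run-related-tests.js` (a hypothetical path), the hook's `command` would become `node .claude/hooks/run-related-tests.js --hook`.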

The TDD pattern with Claude Code

Test-driven development works especially well with Claude because you can write the test spec in plain English:

> I want a function called `validateEmail` that:
> - Returns true for valid emails
> - Returns false for missing @ symbol
> - Returns false for missing TLD
> - Handles edge cases like multiple @ symbols
>
> Write the tests first, then implement the function to make them pass

Claude will:

1. Write 8-12 test cases based on your spec
2. Run them (all fail, since the function doesn't exist yet)
3. Implement `validateEmail`
4. Run tests again
5. Fix edge cases until all pass

For a function this small, the whole loop takes about 90 seconds end-to-end.

Running your entire test suite with retry

For larger codebases:

> Run `npm test`. For any failures, read the error, locate the broken code, 
> fix it, and run again. Repeat until the suite is green. 
> If the same test fails 3 times, stop and explain why.

The "stop after 3 failures" guard is important — it prevents Claude from spinning in a loop on a genuinely hard bug.
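The guard is easier to reason about when written out as code. The sketch below models the loop the prompt asks for; it is not how Claude Code works internally. `runSuite` and `attemptFix` are hypothetical callbacks standing in for "run `npm test`" and "edit the code".

```javascript
// runSuite() returns the names of currently failing tests.
// attemptFix(name) tries to fix the code behind one failing test.
function fixUntilGreen(runSuite, attemptFix, maxFailuresPerTest = 3) {
  const failureCounts = new Map();
  for (;;) {
    const failing = runSuite();
    if (failing.length === 0) return { status: "green", stuckOn: null };
    for (const name of failing) {
      const count = (failureCounts.get(name) ?? 0) + 1;
      failureCounts.set(name, count);
      // Same test has now failed maxFailuresPerTest times: stop and report it.
      if (count >= maxFailuresPerTest) return { status: "stopped", stuckOn: name };
    }
    failing.forEach(attemptFix);
  }
}

// Simulated suite: "easy" is fixed on the first attempt, "hard" never is.
const fixed = new Set();
const demo = fixUntilGreen(
  () => ["easy", "hard"].filter((name) => !fixed.has(name)),
  (name) => { if (name === "easy") fixed.add(name); }
);
console.log(demo); // { status: 'stopped', stuckOn: 'hard' }
```

Counting failures per test rather than per run is the important choice: a suite can take many rounds to go green as long as each round makes progress.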

Testing across multiple files

For integration tests that span modules:

> The payment flow has three files: cart.js, checkout.js, payment.js.
> Write integration tests that test the full flow:
> add to cart → checkout → process payment → confirm order.
> Use mocks for the Stripe API. Run and fix until green.

Claude will map the dependency chain and write tests that exercise the boundaries between modules.

The coverage report pattern

> Run `npm test -- --coverage`. Find the 5 functions with lowest coverage.
> Write tests for them. Run until coverage improves by at least 10 percentage points.

This turns coverage reports into an actionable task queue.
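To show what "task queue" means in practice, here is a minimal sketch that ranks entries in a coverage summary. It assumes the Istanbul `json-summary` report shape (a `total` key plus one key per file, each with `lines.pct`), which Jest can emit with `--coverageReporters=json-summary`. That report is per file rather than per function, so the sketch ranks files; the function name and sample data are made up.

```javascript
// Return the `limit` least-covered files from a parsed coverage-summary.json.
function lowestCoverage(summary, limit = 5) {
  return Object.entries(summary)
    .filter(([file]) => file !== "total") // skip the project-wide aggregate
    .map(([file, metrics]) => ({ file, pct: metrics.lines.pct }))
    .sort((a, b) => a.pct - b.pct)
    .slice(0, limit);
}

// Made-up sample in the assumed json-summary shape.
const sampleSummary = {
  total: { lines: { pct: 71.5 } },
  "src/cart.js": { lines: { pct: 92 } },
  "src/checkout.js": { lines: { pct: 40 } },
  "src/payment.js": { lines: { pct: 12.5 } },
};

console.log(lowestCoverage(sampleSummary, 2));
// [ { file: 'src/payment.js', pct: 12.5 }, { file: 'src/checkout.js', pct: 40 } ]
```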

Handling flaky tests

> Run the test suite 3 times. If a test passes sometimes and fails others,
> identify it as flaky. Look at the test code and find race conditions,
> timing dependencies, or shared state. Fix the flakiness.

Claude is surprisingly good at spotting setTimeout hacks and uncleared test state.
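The detection half of that prompt boils down to comparing outcomes across runs. A minimal sketch, assuming each run has already been reduced to a map of test name to pass/fail (the data shape and function name are hypothetical):

```javascript
// A test is flaky if it both passed and failed across the given runs.
function findFlakyTests(runs) {
  const outcomes = new Map();
  for (const run of runs) {
    for (const [name, passed] of Object.entries(run)) {
      if (!outcomes.has(name)) outcomes.set(name, new Set());
      outcomes.get(name).add(passed);
    }
  }
  return [...outcomes]
    .filter(([, seen]) => seen.size > 1) // saw both true and false
    .map(([name]) => name)
    .sort();
}

// Made-up results from three runs of the same suite.
const threeRuns = [
  { "loads user": true, "debounces input": true, "rejects bad token": false },
  { "loads user": true, "debounces input": false, "rejects bad token": false },
  { "loads user": true, "debounces input": true, "rejects bad token": false },
];

console.log(findFlakyTests(threeRuns)); // [ 'debounces input' ]
```

Note that "rejects bad token" fails every time, so it is a plain failure rather than a flaky test. Keeping the two categories apart stops the fix step from chasing timing issues where there are none.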

The rate limit problem during long test runs

If you're running hundreds of test-fix iterations, you'll eventually hit Claude Code's rate limits mid-session. The loop breaks, you lose context, and you have to start over.

A workaround some developers use is pointing ANTHROPIC_BASE_URL at a proxy that removes the rate limit cap:

export ANTHROPIC_BASE_URL=https://simplylouie.com
export ANTHROPIC_API_KEY=your-key
claude

SimplyLouie runs at ✌️$2/month and removes the per-session limit. Worth it for long test-fix loops.

Summary

The Claude Code test-fix pattern:

- Basic loop: Write test → run → fix → repeat
- Hooks: Auto-run related tests after every file edit Claude makes
- TDD: Spec in English → tests first → implementation
- Full suite: Run all tests, fix failures with retry guard
- Coverage: Use coverage reports as a task queue
- Flakiness: 3-run detection pattern for non-deterministic failures

The pattern that unlocks this: stop telling Claude what to fix, start telling Claude to run the tests and fix whatever breaks.

Testing without rate limit interruptions: simplylouie.com — ✌️$2/month, 7-day free trial
