brian austin

Posted on Apr 8

How I use Claude Code to write and run tests automatically — a complete workflow

#claudecode #testing #tdd #programming

Testing is the part of development everyone knows they should do more of but never has time for. Claude Code changed this for me — not by making testing less necessary, but by making it fast enough that I actually do it.

This is my exact workflow.

The basic pattern

The simplest thing Claude Code can do is write tests for existing code:

/test this file

But that's not the most powerful use. The real workflow is:

1. Describe what the function should do
2. Ask Claude to write the test first
3. Then ask Claude to write the implementation that passes the test

This is TDD, but automated. And it works remarkably well.

Step 1: Write the test specification

Start a session by telling Claude what you're building:

I need a function that validates email addresses.
Rules:
- Must have @ symbol
- Must have a domain with a dot
- Must not have spaces
- Must handle edge cases: plus addressing, subdomains, international domains

Write the tests first, then I'll ask you to implement the function.

Claude generates:

describe('validateEmail', () => {
  test('accepts valid standard email', () => {
    expect(validateEmail('user@example.com')).toBe(true);
  });

  test('accepts plus addressing', () => {
    expect(validateEmail('user+tag@example.com')).toBe(true);
  });

  test('accepts subdomain', () => {
    expect(validateEmail('user@mail.example.com')).toBe(true);
  });

  test('rejects missing @', () => {
    expect(validateEmail('userexample.com')).toBe(false);
  });

  test('rejects spaces', () => {
    expect(validateEmail('user @example.com')).toBe(false);
  });

  test('rejects missing domain dot', () => {
    expect(validateEmail('user@example')).toBe(false);
  });

  test('accepts international domain', () => {
    expect(validateEmail('user@münchen.de')).toBe(true);
  });
});

Now run the tests. They all fail because validateEmail doesn't exist yet. That's the expected starting point: a failing test proves the test is actually running.

Step 2: Implement to pass the tests

Now implement validateEmail so all those tests pass.

Claude writes the implementation. Run tests again. Usually 6/7 pass on the first try — the international domain one often needs iteration.

The international domain test is still failing. Fix it.

Claude fixes it. All tests pass.
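For reference, here's a minimal sketch of an implementation that satisfies the seven tests above. It's not what Claude will necessarily produce, and it only covers the rules in my spec, not full RFC compliance. Treating "must have @" as "exactly one @" is my own assumption.

```javascript
// Sketch of a validator for the spec in Step 1 (not RFC 5322 compliant).
function validateEmail(email) {
  if (typeof email !== 'string') return false;

  // Normalize so 'ü' typed as 'u' + combining mark is treated as one letter.
  const normalized = email.normalize('NFC');

  // Rule: no whitespace anywhere.
  if (/\s/.test(normalized)) return false;

  // Rule: must have @. Assumption: exactly one, with text on both sides.
  const parts = normalized.split('@');
  if (parts.length !== 2) return false;
  const [local, domain] = parts;
  if (local.length === 0 || domain.length === 0) return false;

  // Rule: the domain needs at least one dot, so at least two labels.
  const labels = domain.split('.');
  if (labels.length < 2) return false;

  // \p{L} and \p{N} match letters and digits in any script,
  // which is what lets international domains through.
  const validLabel = /^[\p{L}\p{N}]([\p{L}\p{N}-]*[\p{L}\p{N}])?$/u;
  return labels.every((label) => validLabel.test(label));
}
```

Plus addressing and subdomains pass without special handling, because the local part is only checked for being non-empty and the domain may have any number of labels.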

Step 3: The mutation testing trick

Here's something most developers don't do but should:

Deliberately break my validateEmail function in 3 subtle ways,
then check if the tests catch all 3 breaks.

This tells you if your test suite actually works. If Claude's mutations sneak past the tests, your tests have gaps.

The third mutation wasn't caught. Add a test that would catch it.

Now your test suite is actually rigorous.
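To make the idea concrete, here's a rough sketch of the same check done by hand in plain Node. The validator is a simplified stand-in I made up for illustration (makeValidator and its options are hypothetical), and the three mutants are ones I picked, not ones Claude generated. The cases are the seven inputs from the tests in Step 1.

```javascript
// The seven cases from the Step 1 tests, as [input, expected] pairs.
const cases = [
  ['user@example.com', true],
  ['user+tag@example.com', true],
  ['user@mail.example.com', true],
  ['userexample.com', false],
  ['user @example.com', false],
  ['user@example', false],
  ['user@münchen.de', true],
];

// Simplified stand-in validator. Each option switches off one rule,
// so flipping an option produces one mutant.
function makeValidator({ checkSpaces = true, minLabels = 2, requireLocal = true } = {}) {
  return (email) => {
    if (checkSpaces && /\s/.test(email)) return false;
    const parts = email.split('@');
    if (parts.length !== 2) return false;
    const [local, domain] = parts;
    if (requireLocal && local === '') return false;
    const labels = domain.split('.');
    return labels.length >= minLabels && labels.every((label) => label !== '');
  };
}

const mutants = {
  'skips the whitespace check': makeValidator({ checkSpaces: false }),
  'accepts a domain without a dot': makeValidator({ minLabels: 1 }),
  'accepts an empty local part': makeValidator({ requireLocal: false }),
};

// A mutant is "killed" when at least one case disagrees with its output.
const isKilled = (validate, suite) =>
  suite.some(([input, expected]) => validate(input) !== expected);

// Mutants that no case catches point at gaps in the suite.
const survivors = Object.entries(mutants)
  .filter(([, validate]) => !isKilled(validate, cases))
  .map(([name]) => name);

console.log('Surviving mutants:', survivors);
```

In this sketch the empty-local-part mutant survives, because none of the seven cases starts with @. Adding a case like ['@example.com', false] kills it.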

The real-world workflow: testing during refactors

This is where Claude Code saves the most time. When refactoring:

I'm about to refactor the authentication module.
First, write comprehensive tests for the CURRENT behavior so I
have a safety net. Don't change any implementation.

Claude writes tests that document current behavior — including edge cases you might not have thought of.

Now refactor. The tests tell you immediately if you broke something.
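A small sketch of what "tests for the current behavior" means in practice. The normalizeUsername helper is invented for this example; the point is that the expected values are whatever the code returns today, quirks included. In a Jest project the same table would usually feed test.each.

```javascript
// Hypothetical legacy helper from an auth module (invented for this sketch).
function normalizeUsername(raw) {
  if (raw == null) return '';
  return String(raw).trim().toLowerCase().slice(0, 20);
}

// Pinned behavior: expected values are what the code returns TODAY,
// including quirks such as the silent truncation at 20 characters.
const pinned = [
  ['Alice', 'alice'],
  ['  bob  ', 'bob'],
  [null, ''],
  [undefined, ''],
  [42, '42'],
  ['a'.repeat(25), 'a'.repeat(20)],
];

// Returns every case where fn no longer produces the pinned output.
function findRegressions(fn, table) {
  return table
    .map(([input, expected]) => ({ input, expected, actual: fn(input) }))
    .filter(({ expected, actual }) => actual !== expected);
}

// A "cleaner" refactor that accidentally drops the truncation.
const refactored = (raw) => (raw == null ? '' : String(raw).trim().toLowerCase());

console.log(findRegressions(normalizeUsername, pinned).length); // 0
console.log(findRegressions(refactored, pinned).length); // 1
```

Whether the truncation is a bug or a feature is a separate decision. The safety net only guarantees that the refactor doesn't change it silently.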

Session management tip

For a test-heavy workflow, your CLAUDE.md should include:

## Testing conventions
- Always write tests before implementation when asked
- Use Jest for unit tests, Playwright for E2E
- Test files live next to source files as *.test.js
- Run `npm test` to verify after every change
- When a test fails, show me the full error before trying to fix

This means you never have to re-explain the testing setup to Claude. Every session starts with the same conventions.

The rate limit problem with test-heavy sessions

Here's the issue: test-heavy development sessions hit rate limits fast. You're generating:

Test files
Implementation files
Iteration cycles when tests fail
Sometimes entire test suites for large modules

I started using SimplyLouie as my ANTHROPIC_BASE_URL proxy specifically because of this. At $2/month, I can run long test-generation sessions without hitting the Claude Code rate limit wall.

If you do a lot of TDD with Claude Code, the math makes sense:

Official Claude Code: rate-limited after heavy sessions
API direct: expensive at scale
Proxy at $2/month: consistent throughput for less than a coffee

The 7-day free trial means you can test it on your next big test-writing session.

Advanced: generating test data

Claude Code is also good at generating realistic test fixtures:

Generate 20 realistic test users for the auth tests.
Mix of: valid accounts, expired subscriptions, banned accounts,
admin users, and edge cases (very long names, unicode characters,
emails with plus addressing).

This is tedious to write by hand. Claude generates it in seconds.
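The output tends to look something like the sketch below: a small factory with sensible defaults, plus one override per scenario. The user shape and field names here are assumptions for illustration, and I've trimmed it to seven users.

```javascript
// Hypothetical user shape; the field names are assumptions for this sketch.
function makeUser(overrides = {}) {
  return {
    name: 'Test User',
    email: 'user@example.com',
    role: 'member',
    banned: false,
    subscriptionEndsAt: '2099-01-01T00:00:00Z',
    ...overrides, // each fixture changes only what makes it interesting
  };
}

const users = [
  makeUser(), // valid account
  makeUser({ name: 'Expired Erin', email: 'erin@example.com', subscriptionEndsAt: '2020-01-01T00:00:00Z' }),
  makeUser({ name: 'Banned Bo', email: 'bo@example.com', banned: true }),
  makeUser({ name: 'Admin Ada', email: 'ada@example.com', role: 'admin' }),
  makeUser({ name: 'x'.repeat(300), email: 'long@example.com' }), // very long name
  makeUser({ name: '山田 太郎', email: 'taro@example.com' }), // unicode name
  makeUser({ name: 'Plus Pat', email: 'pat+billing@example.com' }), // plus addressing
];
```

Keeping the defaults in one place means a schema change only touches makeUser, not every fixture.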

The complete workflow summary

Spec first: Tell Claude what the function should do, not how
Tests first: Ask Claude to write tests before implementation
Red-green cycle: Run tests (fail), implement, run tests (pass)
Mutation testing: Ask Claude to verify your tests actually catch breaks
Refactor safety net: Before any refactor, write tests for current behavior
CLAUDE.md conventions: Encode your testing setup so every session starts with the same conventions

The biggest shift is treating Claude as a test-writing partner, not just a code-writing partner. Tests are often more valuable than the implementation — they document intent and catch regressions forever.

For developers doing serious TDD, especially with long refactoring sessions, the rate limit issue is real. A $2/month proxy at simplylouie.com is worth trying — 7 days free, no commitment.
