How I use Claude Code to write and run tests automatically — a complete workflow
Testing is the part of development everyone knows they should do more of but never has time for. Claude Code changed this for me — not by making testing less necessary, but by making it fast enough that I actually do it.
This is my exact workflow.
The basic pattern
The simplest thing Claude Code can do is write tests for existing code:
/test this file
But that's not the most powerful use. The real workflow is:
- Describe what the function should do
- Ask Claude to write the test first
- Then ask Claude to write the implementation that passes the test
This is TDD, but automated. And it works remarkably well.
Step 1: Write the test specification
Start a session by telling Claude what you're building:
I need a function that validates email addresses.
Rules:
- Must have @ symbol
- Must have a domain with a dot
- Must not have spaces
- Must handle edge cases: plus addressing, subdomains, international domains
Write the tests first, then I'll ask you to implement the function.
Claude generates:
describe('validateEmail', () => {
test('accepts valid standard email', () => {
expect(validateEmail('user@example.com')).toBe(true);
});
test('accepts plus addressing', () => {
expect(validateEmail('user+tag@example.com')).toBe(true);
});
test('accepts subdomain', () => {
expect(validateEmail('user@mail.example.com')).toBe(true);
});
test('rejects missing @', () => {
expect(validateEmail('userexample.com')).toBe(false);
});
test('rejects spaces', () => {
expect(validateEmail('user @example.com')).toBe(false);
});
test('rejects missing domain dot', () => {
expect(validateEmail('user@example')).toBe(false);
});
test('accepts international domain', () => {
expect(validateEmail('user@münchen.de')).toBe(true);
});
});
Now run the tests. They all fail (function doesn't exist yet). That's correct.
Step 2: Implement to pass the tests
Now implement validateEmail so all those tests pass.
Claude writes the implementation. Run tests again. Usually 6/7 pass on the first try — the international domain one often needs iteration.
The international domain test is still failing. Fix it.
Claude fixes it. All tests pass.
Step 3: The mutation testing trick
Here's something most developers don't do but should:
Deliberately break my validateEmail function in 3 subtle ways,
then check if the tests catch all 3 breaks.
This tells you if your test suite actually works. If Claude's mutations sneak past the tests, your tests have gaps.
The third mutation wasn't caught. Add a test that would catch it.
Now your test suite is actually rigorous.
The real-world workflow: testing during refactors
This is where Claude Code saves the most time. When refactoring:
I'm about to refactor the authentication module.
First, write comprehensive tests for the CURRENT behavior so I
have a safety net. Don't change any implementation.
Claude writes tests that document current behavior — including edge cases you might not have thought of.
Now refactor. The tests tell you immediately if you broke something.
Session management tip
For a test-heavy workflow, your CLAUDE.md should include:
## Testing conventions
- Always write tests before implementation when asked
- Use Jest for unit tests, Playwright for E2E
- Test files live next to source files as *.test.js
- Run `npm test` to verify after every change
- When a test fails, show me the full error before trying to fix
This means you never have to re-explain the testing setup to Claude. Every session starts with the same conventions.
The rate limit problem with test-heavy sessions
Here's the issue: test-heavy development sessions hit rate limits fast. You're generating:
- Test files
- Implementation files
- Iteration cycles when tests fail
- Sometimes entire test suites for large modules
I started using SimplyLouie as my ANTHROPIC_BASE_URL proxy specifically because of this. At ✌️2/month, I can run long test-generation sessions without hitting the Claude Code rate limit wall.
If you do a lot of TDD with Claude Code, the math makes sense:
- Official Claude Code: rate-limited after heavy sessions
- API direct: expensive at scale
- Proxy at $2/month: consistent throughput for less than a coffee
The 7-day free trial means you can test it on your next big test-writing session.
Advanced: generating test data
Claude Code is also good at generating realistic test fixtures:
Generate 20 realistic test users for the auth tests.
Mix of: valid accounts, expired subscriptions, banned accounts,
admin users, and edge cases (very long names, unicode characters,
emails with plus addressing).
This is tedious to write by hand. Claude generates it in seconds.
The complete workflow summary
- Spec first: Tell Claude what the function should do, not how
- Tests first: Ask Claude to write tests before implementation
- Red-green cycle: Run tests (fail), implement, run tests (pass)
- Mutation testing: Ask Claude to verify your tests actually catch breaks
- Refactor safety net: Before any refactor, write tests for current behavior
- CLAUDE.md conventions: Encode your testing setup so every session starts fresh
The biggest shift is treating Claude as a test-writing partner, not just a code-writing partner. Tests are often more valuable than the implementation — they document intent and catch regressions forever.
For developers doing serious TDD, especially with long refactoring sessions, the rate limit issue is real. A $2/month proxy at simplylouie.com is worth trying — 7 days free, no commitment.
Top comments (0)