
Zac

Posted on • Originally published at builtbyzac.com

My testing strategy with Claude Code (what actually catches bugs)

Claude Code will write tests for anything you ask it to. The problem: unconstrained, it writes tests that confirm what the code does rather than tests that catch when the code breaks.

Here's the testing strategy that's worked for me.

Test behavior, not implementation

The prompt matters:

❌ "Write tests for this function"

✅ "Write tests that verify the behavior of this function from a caller's perspective. Don't test internal implementation details."

When Claude tests implementation details, every refactor breaks your tests even when nothing actually broke.
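To make the distinction concrete, here's a minimal sketch using a hypothetical `slugify` function (the function and both tests are illustrative, not from any particular codebase). The first test spies on an internal call; the second asserts only what a caller can observe:

```python
import re
from unittest.mock import patch

def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase, dash-separate, strip punctuation."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# ❌ Implementation-coupled: asserts that the function used re.sub.
# Refactor to a manual character loop and this fails, even though
# every caller still gets the same output.
def test_slugify_calls_re_sub():
    with patch("re.sub", wraps=re.sub) as spy:
        slugify("Hello World")
    spy.assert_called_once()

# ✅ Behavioral: asserts input/output only. Survives any refactor
# that preserves the contract.
def test_slugify_behavior():
    assert slugify("Hello, World!") == "hello-world"
```

The behavioral test is the one that still means something six months from now.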

Lead with edge cases

Claude's default is happy path first. Flip this:

```
Write tests for [function]. Start with edge cases:
- Empty/null inputs
- Boundary values (0, -1, max int)
- Invalid types
- Concurrent calls (if applicable)

Then add the happy path.
```

Edge cases are where real bugs live. If Claude writes them first, they get proper attention.
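The output of that prompt tends to look like a table-driven test with the edge cases up front. Here's a sketch against a hypothetical `percent_of` function (the function, cases, and expected values are illustrative):

```python
def percent_of(part: float, whole: float) -> float:
    """Hypothetical function under test: part as a percentage of whole."""
    if whole == 0:
        return 0.0  # defined boundary behavior, not a crash
    return round(part / whole * 100, 1)

def test_percent_of_edge_cases_then_happy_path():
    cases = [
        (0, 10, 0.0),     # zero numerator
        (5, 0, 0.0),      # zero denominator -- the boundary case
        (-3, 10, -30.0),  # negative input
        (10, 10, 100.0),  # boundary: part == whole
        (1, 3, 33.3),     # happy path, last
    ]
    for part, whole, expected in cases:
        assert percent_of(part, whole) == expected, (part, whole)
```

Note the order mirrors the prompt: the zero-denominator case is written before anyone gets to celebrate `1/3 == 33.3`.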

One assertion per test

Claude packs multiple assertions into single tests unless you stop it. Add to your prompt: "One assertion per test. If you need to verify multiple things about the same scenario, split them into separate tests."
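Here's what that splitting looks like in practice, with a hypothetical `create_user` factory (names and fields are illustrative). The bundled version hides failures; the split version names each behavior:

```python
def create_user(name: str) -> dict:
    """Hypothetical function under test."""
    return {"name": name, "active": True}

# ❌ Bundled: if the first assert fails, the test run tells you nothing
# about whether the other two behaviors still hold.
def test_create_user():
    user = create_user("Ada")
    assert user["name"] == "Ada"
    assert user["active"]

# ✅ Split: each test fails independently and its name doubles as
# documentation of one behavior.
def test_create_user_sets_name():
    assert create_user("Ada")["name"] == "Ada"

def test_create_user_is_active_by_default():
    assert create_user("Ada")["active"]
```

When a split test fails, the test name alone usually tells you what broke.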

What not to test

Tell Claude explicitly:

  • Don't test library code
  • Don't test constants and config values
  • Don't mock everything — if you're mocking more than one thing per test, your function probably does too much
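The two-mocks smell, sketched with a hypothetical `send_welcome` function (all names are illustrative). The fix is usually to pull the pure logic out so it needs no mocks at all:

```python
from unittest.mock import Mock

def send_welcome(db, mailer, user_id):
    """Hypothetical function that needs two mocks to test -- the warning sign."""
    user = db.get_user(user_id)
    mailer.send(user["email"], "Welcome!")

# Two mocks in one test: a hint the function is doing too much.
def test_send_welcome_needs_two_mocks():
    db, mailer = Mock(), Mock()
    db.get_user.return_value = {"email": "ada@example.com"}
    send_welcome(db, mailer, 1)
    mailer.send.assert_called_once_with("ada@example.com", "Welcome!")

# After extracting the decision ("what message goes to whom"),
# the core logic tests with zero mocks.
def welcome_message(user: dict) -> tuple:
    return (user["email"], "Welcome!")

def test_welcome_message():
    assert welcome_message({"email": "ada@example.com"}) == ("ada@example.com", "Welcome!")
```

The thin `send_welcome` wrapper still exists, but it's now so simple it barely needs a test.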

The test review prompt

After Claude writes tests, run this:

```
Review these tests. For each test, tell me:
1. What scenario is it testing?
2. Would it catch a bug where [describe likely failure mode]?
3. Is there anything it claims to test but doesn't actually verify?
```

This catches the "test that always passes" problem.
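The most common always-passes pattern is the tautological test: the expected value is computed with the same formula as the code, so a bug changes both sides at once. A sketch with a hypothetical `apply_discount` function (illustrative, not from any real codebase):

```python
def apply_discount(price: float, pct: float) -> float:
    """Hypothetical function under test."""
    return price * (1 - pct / 100)

# ❌ Tautology: the expected value mirrors the production formula.
# Break the formula and both sides break together -- this can never fail.
def test_discount_tautology():
    price, pct = 100, 20
    assert apply_discount(price, pct) == price * (1 - pct / 100)

# ✅ Falsifiable: the expected value is a hand-computed literal.
# If someone flips the sign or drops the /100, this fails.
def test_discount_literal():
    assert apply_discount(100, 20) == 80.0
```

Question 3 of the review prompt is what surfaces the tautological version.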

The coverage trap

High test coverage from Claude-generated tests is not necessarily good. 100% coverage with tests that don't catch real bugs is worse than 60% coverage with tests that do.

When I ask Claude to increase coverage: "Only add tests that would catch a real bug. If a test can't fail due to a realistic code change, don't write it."


More patterns like this are in the Agent Prompt Playbook — full section on testing prompts including TDD, mutation testing, and flaky test diagnosis.
