brian austin

How I use Claude Code to write tests for untested legacy code

Every codebase has that section. The module nobody touches. The service that "just works" — until it doesn't. No tests. No docs. Original author long gone.

I've been using Claude Code to write characterization tests for these parts of our codebase, and the workflow has become one of my most-used patterns.

Here's exactly how I do it.

The problem with untested legacy code

You can't refactor what you can't break safely. And you can't write tests for code you don't understand.

This creates a deadlock:

  • Code is fragile → you need tests before changing it
  • Code is complex → you need to understand it before writing tests
  • Understanding takes time → so tests never get written
  • Code stays fragile → repeat

Claude Code breaks this deadlock.

Step 1: Characterization tests first

Before writing any "proper" unit tests, I start with characterization tests — tests that capture what the code currently does, not what it should do.

> Read this module and write characterization tests that capture its 
> current behavior. Don't test for correctness — test for behavior.
> I want to know what this code does today so I can safely refactor it.

This is the key insight: you're not testing for correctness, you're testing for behavior.

If the legacy code has a bug that's been there for 5 years, the characterization test should capture that bug. Because downstream systems may depend on that broken behavior.

Step 2: Map the blast radius

> Which other modules import from this file? Show me the call sites 
> and what they depend on from this module.

Claude Code traces the import graph and shows you the blast radius. This tells you:

  • What will break if you change this module
  • Which tests you need before refactoring
  • What the module's public API actually is (not what the docs say)

Step 3: Gap analysis on existing tests

If the codebase has some tests but not enough:

> Compare the test file to the implementation. What code paths have 
> no test coverage? Focus on: error handling, edge cases, and the 
> main happy path.

This gives you a prioritized list of what to write next. Start with error handling — that's almost always the first thing that breaks in production.
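A coverage tool (e.g. `jest --coverage`) is the rigorous way to find gaps, but even a crude heuristic helps you double-check Claude Code's analysis. This sketch just lists exported names that never appear in the test file — it can't see untested branches, only untested entry points:

```javascript
// Crude gap check: which CommonJS exports from the implementation
// are never even mentioned in the test file? Not a substitute for
// real branch coverage — just a first-pass sanity check.
function untestedExports(implSource, testSource) {
  const exported = [...implSource.matchAll(/exports\.(\w+)\s*=/g)]
    .map((m) => m[1]);
  return exported.filter((name) => !testSource.includes(name));
}
```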

Step 4: Write tests iteratively

> Write a test for the error case where [database connection fails / 
> input is malformed / external API times out]. Show the test and 
> explain what you're testing.

I go one test at a time, with an explanation for each. This keeps the context focused and makes review easier.

The rate limit problem

Here's where this workflow gets painful: legacy modules are big.

A single file can be 2,000-5,000 lines. Add the test file, the import graph, the related modules Claude Code needs to read — you're looking at 20,000+ tokens in context before you've written a single test.

This is when Claude Code starts hitting rate limits mid-session:

Claude AI Usage Limit Reached
Your Claude.ai Pro account has hit its usage limit.
Usage resets in 3h 47m.

Three hours into a legacy testing session. With half the tests written.

How I handle rate limits

I use SimplyLouie — a $2/month API proxy that gives me continued access when Claude Code's built-in limits cut out.

When I hit a rate limit, I:

  1. Take the exact prompt I was about to send
  2. Note the context (which file, which test, what we'd established)
  3. Continue via the SimplyLouie API

The session context is gone, but the characterization tests I've already written are in the repo. I can pick up from where the tests are and continue.

Developer API details: simplylouie.com/developers

The complete testing workflow

# 1. Start with characterization tests
> Read [legacy_module.js] and write characterization tests

# 2. Map dependencies  
> What imports this module? Show call sites.

# 3. Gap analysis
> What's not covered? Focus on error handling.

# 4. Fill gaps iteratively
> Write a test for [specific error case]

# 5. If rate limit hits — continue via SimplyLouie API
curl https://simplylouie.com/api/chat \
  -H "Authorization: Bearer $LOUIE_KEY" \
  -d '{"model":"claude-opus-4-5","messages":[...]}'

Real results

Using this workflow on a 3,200-line payment processing module:

  • Session 1 (Claude Code): characterization tests + blast radius mapping
  • Hit rate limit after ~2 hours
  • Session 2 (SimplyLouie API): gap analysis + error case tests
  • Result: 47 new tests, safe to refactor

The module had a bug in its error handling that had been there for 4 years. The characterization test caught it. We kept the bug (other systems depend on it) and documented it.

Why characterization tests work

They're honest. They test what the code does, not what you wish it did.

This matters for legacy code because:

  • The original requirements are gone
  • The business logic is in the code, not in docs
  • Other systems have adapted to the code's behavior (including bugs)

Refactor safely. Test what exists, then change it intentionally.


SimplyLouie is a $2/month Claude API proxy. 7-day free trial. No usage limits on the API tier. simplylouie.com
