DEV Community

brian austin
brian austin

Posted on

How I use Claude Code to refactor legacy code — without breaking everything

How I use Claude Code to refactor legacy code — without breaking everything

Legacy code is inevitable. Every team has that module nobody wants to touch — the one with 800-line functions, zero tests, and comments like // TODO: fix this properly (2019). I've found Claude Code dramatically changes how I approach these refactors.

Here's the workflow I use.

Step 1: Build a complete picture before touching anything

The worst refactoring mistakes come from changing code you don't fully understand. Before writing a single line, I use Claude Code to map the codebase:

# Start a session focused on understanding, not changing
claude "Read through UserService.js and its dependencies. 
Don't suggest changes yet — just tell me:
1. What does this module actually do?
2. What are the hidden dependencies?
3. What would break if I touched it?
4. What's the riskiest part to change?"
Enter fullscreen mode Exit fullscreen mode

This generates a mental model I can actually reason about. The key instruction is "don't suggest changes yet" — Claude Code defaults to fixing things, but you need understanding first.

Step 2: Write characterization tests before refactoring

Characterization tests capture the current behavior — including bugs — so you know if you accidentally changed something:

claude "Write characterization tests for UserService.js.
These tests should document what the code CURRENTLY does,
not what it SHOULD do. Include edge cases you noticed.
Use the existing test framework in package.json."
Enter fullscreen mode Exit fullscreen mode

Once these pass on the current codebase, they become your safety net. Any refactor that breaks them means you changed behavior.

Step 3: Extract the smallest possible piece first

Large refactors fail because they're large. I use Claude Code to identify the smallest extractable unit:

claude "Looking at UserService.js, what's the smallest self-contained
piece I could extract into its own module without breaking anything else?
Show me exactly what the extracted module would look like and
how the existing code would import it."
Enter fullscreen mode Exit fullscreen mode

Typically this surfaces something like a utility function, a data transformation, or a validation block that's already fairly isolated.

Step 4: Refactor in small, tested increments

This is where Claude Code shines. For each increment:

claude "Extract the email validation logic from UserService.js into
a new file validators/email.js. 
- Keep the existing UserService.js API identical
- Update the import in UserService.js
- Run the characterization tests to confirm nothing broke
- Then run the full test suite"
Enter fullscreen mode Exit fullscreen mode

The critical instruction: "keep the existing API identical". Each increment should be a refactor (same behavior, different structure), not a rewrite.

Step 5: Handle the hardest part — state and side effects

Legacy code is often a tangle of state mutations and side effects. I use Claude Code to surface these explicitly:

claude "Map all the places where UserService.js mutates external state
or has side effects (database writes, API calls, event emissions,
global variable changes). List them in order of when they happen."
Enter fullscreen mode Exit fullscreen mode

This generates a dependency graph that makes the refactoring order obvious. Clean up pure logic first, then tackle the stateful parts.

Step 6: Document the new architecture as you go

Refactored code that isn't documented just becomes new legacy code. After each significant extraction:

claude "Update CLAUDE.md to document the new module structure.
Explain why the code is organized this way and what patterns
future developers should follow when adding to this area."
Enter fullscreen mode Exit fullscreen mode

This creates institutional knowledge that survives team changes.

The rate limit problem with large refactors

Legacy code refactors are long sessions. The codebase context is large, the characterization tests add more tokens, and each incremental refactor requires holding the full mental model.

I've had 3-hour refactoring sessions that hit the Claude Code rate limit right in the middle of a half-finished extraction — with tests failing and the code in an intermediate state.

For long refactoring sessions I now use SimplyLouie as my API endpoint. It's $2/month with no rate limits, which means I can run characterization tests, ask "what did I just break?", and keep iterating without interruption.

The ANTHROPIC_BASE_URL swap is one line:

export ANTHROPIC_BASE_URL=https://api.simplylouie.com
Enter fullscreen mode Exit fullscreen mode

Then Claude Code keeps working exactly as before, but the session never gets cut off mid-refactor.

What changes with this workflow

The biggest shift is psychological. When you can ask "what would this break?" 20 times without worrying about rate limits, you actually do it. You probe the codebase instead of assuming. You write more characterization tests. You take smaller steps.

Legacy code stops being the thing nobody touches and becomes just... code that needs a careful, patient process.


What's your worst legacy code horror story? Drop it in the comments — I'm always looking for new edge cases to add to the workflow.

If you're doing long refactoring sessions and hitting rate limits, SimplyLouie is $2/month with no interruptions — 7-day free trial, no commitment.

Top comments (0)