How I use Claude Code to refactor legacy code — without breaking everything
Legacy code is the silent killer of developer productivity. You inherit a codebase with no tests, no docs, functions that span 400 lines, and variable names like temp2, dataFinal, and doTheThing.
Refactoring it manually takes weeks. Getting it wrong breaks production.
Here's the workflow I've built with Claude Code that makes legacy refactoring systematic, safe, and actually fast.
The problem with ad-hoc refactoring
Most legacy refactoring fails because:
- You change one thing and break three others
- You don't have tests to catch regressions
- You can't hold the whole system in your head at once
- You run out of time and ship the half-refactored version
Claude Code changes this because it can keep far more of the codebase in context than you can hold in your head, and it can write the tests you need before you touch anything.
My complete legacy refactoring workflow
Step 1: Audit before touching anything
> Read the entire UserController.js file. List:
> 1. What each function does
> 2. What other files depend on it
> 3. What inputs/outputs each function has
> 4. What you think the original developer intended
> Do NOT make any changes yet.
This produces a map of the code. I can now see what I'm dealing with.
Step 2: Identify the danger zones
> Based on your audit, which functions are:
> 1. Most likely to break if changed
> 2. Most depended on by other modules
> 3. Missing input validation
> 4. Doing multiple things they shouldn't
> Rank them by refactoring risk.
Now I know the order of operations. Start with low-risk, isolated functions. End with the scary ones.
Step 3: Write tests BEFORE refactoring
This is the most important step, and the one most developers skip:
> Before we change anything, write comprehensive tests for getUserById().
> The tests should cover:
> 1. Happy path (valid user ID, user exists)
> 2. User not found (valid ID, no record)
> 3. Invalid input (null, undefined, non-numeric)
> 4. Database connection failure
> Use Jest. Don't change the implementation — only test current behavior.
Now you have a safety net.
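Here is a minimal sketch of what those characterization tests pin down. Everything in it is an assumption for illustration: the stand-in getUserById(), the fakeDb and brokenDb objects, and the quirk of returning null on bad input are hypothetical, since the real function lives in UserController.js. It uses Node's built-in assert so it runs on its own; in a Jest suite each case would become a test.each row with an expect(...).toEqual(...).

```javascript
const { deepStrictEqual, throws } = require('node:assert');

// Hypothetical in-memory stand-in for the real database client.
const fakeDb = {
  find: (id) => (id === 1 ? { id: 1, name: 'Ada' } : undefined),
};

// Hypothetical stand-in for the legacy function under test.
function getUserById(id, db = fakeDb) {
  // Assumed legacy quirk: bad input returns null instead of throwing.
  if (id == null || Number.isNaN(Number(id))) return null;
  return db.find(id) || null;
}

// Each case records what the code does today, not what it should do.
const cases = [
  { name: 'valid ID, user exists', input: 1, expected: { id: 1, name: 'Ada' } },
  { name: 'valid ID, no record', input: 999, expected: null },
  { name: 'null input', input: null, expected: null },
  { name: 'undefined input', input: undefined, expected: null },
  { name: 'non-numeric input', input: 'abc', expected: null },
];

for (const { name, input, expected } of cases) {
  deepStrictEqual(getUserById(input), expected, name);
}

// Database connection failure: the error currently propagates to the caller.
const brokenDb = {
  find() {
    throw new Error('connection refused');
  },
};
throws(() => getUserById(1, brokenDb), /connection refused/);
```

If a case looks wrong (say, null for a non-numeric ID), it still goes in the table. The point is to freeze current behavior, weird parts included.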
Step 4: Refactor one function at a time
> Refactor getUserById() with these constraints:
> 1. All existing tests must still pass
> 2. Same function signature (don't change what callers expect)
> 3. Add JSDoc comments
> 4. Extract the database query into a named variable
> 5. Handle the null case explicitly
> Show me the diff before applying it.
Key constraint: show me the diff first. Never auto-apply to legacy code.
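For a sense of what a diff that meets those five constraints might end up as, here is a rough sketch. The (id, db) signature, the db.query(sql, params) client, and the table and column names are all assumptions for illustration, not the actual code from UserController.js.

```javascript
// Hypothetical query text, pulled out of the function body into a named constant.
const SELECT_USER_BY_ID = 'SELECT id, name FROM users WHERE id = ? LIMIT 1';

/**
 * Look up a single user by numeric ID.
 *
 * @param {number|string|null|undefined} id - User ID as existing callers pass it.
 * @param {{ query: (sql: string, params: unknown[]) => Promise<object[]> }} db
 *   Assumed database client with a promise-based query() method.
 * @returns {Promise<object|null>} The user row, or null when the ID is
 *   missing, not an integer, or matches no record.
 */
async function getUserById(id, db) {
  // Handle the null case explicitly instead of letting it fall through.
  if (id === null || id === undefined || id === '') return null;

  const numericId = Number(id);
  if (!Number.isInteger(numericId)) return null;

  const rows = await db.query(SELECT_USER_BY_ID, [numericId]);
  return rows.length > 0 ? rows[0] : null;
}
```

Callers see the same inputs and outputs as before. Only the inside got easier to read, which is exactly what the tests from Step 3 are there to confirm.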
Step 5: Verify after each change
> Run the tests for getUserById(). If any fail, explain why and fix them.
> Then check if any other tests in the codebase reference this function.
One function at a time. Verify before moving on.
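The "who else references this function" check can also be scripted. This is a small sketch, not a replacement for a real search tool: findReferences() is a hypothetical helper, the list of skipped directories and file extensions is my assumption, and it matches plain substrings, so it will also hit comments and similarly named identifiers.

```javascript
const fs = require('node:fs');
const path = require('node:path');

/**
 * Recursively collect every line under rootDir that mentions an identifier.
 * Skips dependency and VCS directories and only reads JS/TS source files.
 */
function findReferences(rootDir, identifier, skipDirs = ['node_modules', '.git']) {
  const hits = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const fullPath = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      if (!skipDirs.includes(entry.name)) {
        hits.push(...findReferences(fullPath, identifier, skipDirs));
      }
    } else if (entry.isFile() && /\.(js|jsx|ts|tsx)$/.test(entry.name)) {
      const lines = fs.readFileSync(fullPath, 'utf8').split('\n');
      lines.forEach((text, index) => {
        if (text.includes(identifier)) {
          // Line numbers are 1-based, matching what editors and grep -n show.
          hits.push({ file: fullPath, line: index + 1, text: text.trim() });
        }
      });
    }
  }
  return hits;
}
```

Calling findReferences('.', 'getUserById') and filtering the result on file names ending in .test.js gives the list of tests to re-run before moving to the next function.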
Step 6: Extract utilities and shared logic
After you've mapped the codebase and refactored individual functions:
> I've noticed getUserById, getUserByEmail, and getUserByToken all have
> similar database error handling. Extract that into a shared utility
> function called handleDbError(). Show me the extraction plan first.
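As an illustration of where that extraction could land, here is one possible shape for handleDbError(). The wrapping behavior, the DB_ERROR code, the query text, and the db.query() client are assumptions on my part; the real error handling in those three functions may look quite different.

```javascript
/**
 * Hypothetical shared handler: wraps a low-level database error with the
 * name of the calling function so logs show where the failure happened.
 *
 * @param {Error} error - The original error thrown by the database client.
 * @param {string} context - Name of the function that was running.
 * @returns {Error} A new error that keeps the original as its cause.
 */
function handleDbError(error, context) {
  const wrapped = new Error(`Database error in ${context}: ${error.message}`, {
    cause: error,
  });
  wrapped.code = 'DB_ERROR';
  return wrapped;
}

// One of the three callers after the extraction (assumed shape).
async function getUserByEmail(email, db) {
  try {
    const rows = await db.query('SELECT id, email FROM users WHERE email = ?', [email]);
    return rows[0] ?? null;
  } catch (error) {
    throw handleDbError(error, 'getUserByEmail');
  }
}
```

getUserById and getUserByToken would get the same three-line catch block, which makes the duplicated logic a single place to fix later.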
Step 7: Update the docs as you go
> Update the README section on user authentication to reflect the
> refactored getUserById function. Keep it under 10 lines.
Documentation debt compounds. Pay it down as you go.
The session structure that works
Legacy refactoring sessions get long. Here's my structure:
[Start of session]
> We're refactoring UserController.js. Here's the audit from last session: [paste]
> Today's goal: refactor getUserById and getUserByEmail.
> Rules: tests first, diff before apply, one function at a time.
[Middle of session - when you hit rate limits]
> Checkpoint: we've refactored getUserById (tests passing).
> Next: getUserByEmail. Same constraints as before.
[End of session]
> Summarize what we changed today, what's still to do, and any risks
> I should know about before next session.
That last summary is gold. Paste it at the start of your next session.
What actually breaks during legacy refactoring
In my experience, three things go wrong:
1. Implicit dependencies you didn't know about
Solution: run grep -r 'getUserById' . before touching the function. Claude Code can do this for you.
2. Behavior that was a bug, but that other code depends on
Solution: write tests that document current behavior including the weird parts before changing anything.
3. Rate limits cutting a session off mid-refactor
This is the real pain point. When you're 2 hours into a session and Claude Code hits its rate limit, the work stalls and you lose the context you've built up.
I work around this by using a separate API endpoint that doesn't share the same rate limit pool as my main Claude subscription. The sessions cost almost nothing (I'm paying about $2/month total for all my AI usage), and I can run them continuously without watching the clock.
If big refactoring jobs keep stalling on rate limits, the fix that has worked for me is separating long-running sessions from day-to-day usage. Running out of context window is a separate problem; the checkpoint and end-of-session summaries are what carry state across sessions.
The full refactoring checklist
[ ] Audit the file — understand before touching
[ ] Map dependencies — who calls this code?
[ ] Write tests first — document current behavior
[ ] Identify refactoring order — low risk first
[ ] Refactor one function at a time
[ ] Show diff before applying
[ ] Run tests after each change
[ ] Extract shared utilities after individual functions are clean
[ ] Update docs as you go
[ ] End-of-session summary for next time
What used to take weeks now takes days
The last legacy refactor I did this way: a 2,800-line monolithic service file with zero tests. Three focused sessions with Claude Code. Tests written first, refactored incrementally, no regressions.
It's not magic — it's systematic. The audit-first, test-first approach is what makes it safe. Claude Code is what makes it fast.
Running long Claude Code sessions for legacy work? I use SimplyLouie — it's a Claude API proxy for $2/month that keeps my sessions running when my main subscription hits limits. 7-day free trial.