DEV Community

Learn AI Resource

Posted on • Originally published at learnairesource.com

How to Use LLMs to Refactor Legacy Code Without Burning Everything Down

We all have that codebase. The one with functions that are 500 lines long. The one where nobody remembers why that weird loop exists, just that removing it breaks something unpredictable. The one that makes you want to refactor everything at once.

Here's the thing: you probably shouldn't. But an LLM can help you refactor parts of it safely, without ending up in production-incident hell.

The Problem With "Just Rewrite It"

Legacy code is legacy code for a reason—it works. It might be ugly, it might confuse new developers, but it works. The last person who tried to refactor it all at once created a 47-file PR that took three weeks to review and broke two features nobody tested.

LLMs are great at understanding code patterns and suggesting improvements, but they're not great at understanding your business logic or edge cases. So you need a strategy.

The Safe Refactoring Strategy

1. Start With Tests (Yes, Really)

Before you touch anything, write tests for the code you're about to refactor. This is where an LLM actually saves time:

Prompt: "Here's a function that calculates shipping costs. Write unit tests for it that cover edge cases like zero weight, international addresses, and oversized items."

Feed it your function. Let it generate test cases. You review them, adjust for your actual business rules, and commit them. Now you have a safety net.

The LLM won't know your domain-specific edge cases, but it will cover the obvious ones. You'll catch the weird ones based on your domain knowledge.
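As a rough sketch of what that safety net might look like, here is a hypothetical `shipping_cost` function with tests that pin down its current behavior. The function, its rates, and its surcharges are invented for illustration; your real business rules will differ, which is exactly why you review what the LLM generates.

```python
import unittest


def shipping_cost(weight_kg: float, international: bool = False,
                  oversized: bool = False) -> float:
    """Hypothetical legacy function; every rate below is made up."""
    if weight_kg < 0:
        raise ValueError("weight cannot be negative")
    cost = 5.0 + 2.0 * weight_kg  # base fee plus per-kg rate
    if international:
        cost *= 1.5  # international multiplier
    if oversized:
        cost += 20.0  # flat oversize surcharge, added after the multiplier
    return round(cost, 2)


class ShippingCostTests(unittest.TestCase):
    """Pin down what the code does today, before any refactoring."""

    def test_zero_weight_charges_base_fee_only(self):
        self.assertEqual(shipping_cost(0), 5.0)

    def test_international_multiplier(self):
        self.assertEqual(shipping_cost(2, international=True), 13.5)

    def test_oversized_surcharge_is_added_after_multiplier(self):
        self.assertEqual(
            shipping_cost(2, international=True, oversized=True), 33.5
        )

    def test_negative_weight_is_rejected(self):
        with self.assertRaises(ValueError):
            shipping_cost(-1)
```

Saved as a test module, this runs with `python -m unittest`. The third test is the kind you would add yourself: the order of multiplier and surcharge is a business rule the LLM has no way of knowing.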

2. Break It Into Tiny Pieces

Instead of refactoring a 500-line function, ask the LLM to:

  • Extract helper functions
  • Identify loops that can be simplified
  • Find repeated logic that can be consolidated

Do ONE of these. Test it. Deploy it. Then move to the next piece.

Real example: You have a data processing function that's hard to follow.

Prompt: "This function processes user data and applies multiple transformations. Can you identify the distinct logical steps and suggest extracting them into separate functions? Don't rewrite the whole thing, just show me what the extracted functions would look like."

Now you have a roadmap. You implement step 1, merge it, and the code is slightly better. Rinse and repeat.
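To make that concrete, here is a minimal sketch of the kind of answer such a prompt might produce. All names (`process_user_legacy`, `normalize`, `validate`, `enrich`) are hypothetical, and the "legacy" function is deliberately tiny so the shape of the change is easy to see.

```python
def process_user_legacy(raw: dict) -> dict:
    """Hypothetical 'before': three logical steps tangled in one body."""
    name = raw.get("name", "").strip().title()
    email = raw.get("email", "").strip().lower()
    if "@" not in email:
        raise ValueError("invalid email")
    return {"name": name, "email": email, "domain": email.split("@")[1]}


def normalize(raw: dict) -> dict:
    """Step 1: clean up whitespace and casing."""
    return {
        "name": raw.get("name", "").strip().title(),
        "email": raw.get("email", "").strip().lower(),
    }


def validate(user: dict) -> dict:
    """Step 2: reject records the legacy code would have rejected."""
    if "@" not in user["email"]:
        raise ValueError("invalid email")
    return user


def enrich(user: dict) -> dict:
    """Step 3: add derived fields."""
    return {**user, "domain": user["email"].split("@")[1]}


def process_user(raw: dict) -> dict:
    """'After': same behavior, but each step is named and testable alone."""
    return enrich(validate(normalize(raw)))
```

In a real codebase you would land these one at a time: extract `normalize`, run the tests, merge, and only then move on to `validate`.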

3. Use LLMs For Modernization, Not Rewriting

Ask it for specific improvements:

Good prompts:

  • "This code uses var everywhere. Show me how to convert these to const (or let where a variable is reassigned) safely."
  • "How would you replace this callback pyramid with async/await?"
  • "Can you show me a cleaner way to handle this error checking?"

Bad prompts:

  • "Rewrite this entire file to modern standards"
  • "Make this code better"
  • "Refactor this"

The good prompts are specific, so the LLM has little room to wander off and rewrite things you didn't ask about. You get a focused change you can review and test.
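For a sense of scale, here is what a focused answer to the error-checking prompt might look like, sketched in Python rather than JavaScript. The `apply_discount` functions, the `SAVE10` coupon, and the order shape are all hypothetical. The only change is replacing nested conditionals with guard clauses; the behavior is meant to stay identical.

```python
def apply_discount_nested(order):
    """Hypothetical 'before': the happy path is buried three levels deep."""
    if order is not None:
        if order.get("items"):
            if order.get("coupon") == "SAVE10":
                return round(order["total"] * 0.9, 2)
            else:
                return order["total"]
        else:
            raise ValueError("order has no items")
    else:
        raise ValueError("order is missing")


def apply_discount(order):
    """'After': guard clauses reject bad input first, same results otherwise."""
    if order is None:
        raise ValueError("order is missing")
    if not order.get("items"):
        raise ValueError("order has no items")
    if order.get("coupon") == "SAVE10":
        return round(order["total"] * 0.9, 2)
    return order["total"]
```

A change this size fits in one screen of diff, which is what makes it reviewable.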

4. Validate Everything Twice

Check every suggestion two ways before it ships:

  1. Read it. Does the logic make sense? Did it change behavior?
  2. Test it. Run your test suite. Run manual tests. Deploy to staging.

LLMs sound just as confident when they're wrong as when they're right. A function might look right but silently change behavior in an edge case. This is why the tests in step 1 matter.
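One way to catch that kind of silent change is to run the old and new versions side by side over a grid of inputs and list every disagreement. The sketch below is illustrative only: `legacy_total`, `refactored_total`, and the discount rule are hypothetical, and the refactored version contains a deliberate off-by-one so there is something to find.

```python
from itertools import product


def legacy_total(price_cents: int, quantity: int, member: bool) -> int:
    """Hypothetical legacy code: members get 10% off orders of 10 or more."""
    total = price_cents * quantity
    if member:
        if quantity >= 10:
            total -= total // 10
    return total


def refactored_total(price_cents: int, quantity: int, member: bool) -> int:
    """Looks cleaner, but '>' replaced '>=' and the boundary case changed."""
    total = price_cents * quantity
    discount = total // 10 if member and quantity > 10 else 0
    return total - discount


def find_mismatches(old, new, cases):
    """Return every argument tuple where the two versions disagree."""
    return [args for args in cases if old(*args) != new(*args)]


# Include boundary values (0, 9, 10, 11), not only typical ones.
cases = list(product([0, 199, 1000], [0, 1, 9, 10, 11], [False, True]))
mismatches = find_mismatches(legacy_total, refactored_total, cases)
print(mismatches)  # [(199, 10, True), (1000, 10, True)]
```

Both mismatches sit exactly on the `quantity == 10` boundary, the sort of case that is easy to miss when reading a diff.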

Real Example: The Query Builder

We had a 600-line function that constructed database queries. It was a mess of conditionals and repeated patterns, but it worked.

Instead of rewriting it:

  1. We wrote tests for 20 different query combinations
  2. We asked the LLM to "identify places where we're building SQL strings that could be DRY'd up"
  3. It suggested extracting 4 helper functions
  4. We implemented and tested one at a time
  5. Each change reduced duplication and made the code slightly clearer
  6. Three weeks later, the function was down to 250 lines and actually readable

More importantly: we never broke anything in production. Zero incidents.
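The actual helpers aren't shown here, so the following is only a guess at the flavor of one such extraction, not the code from that project. It assumes a hypothetical `users` table and `?`-style placeholders (as in Python's `sqlite3`), and pulls the repeated "append a condition if the filter is set" pattern into a single `where_clause` helper.

```python
from typing import Any


def where_clause(filters: dict[str, Any]) -> tuple[str, list[Any]]:
    """Build a parameterized WHERE clause, skipping filters that are None.

    Column names are assumed to come from trusted code, never user input;
    only the values are passed as query parameters.
    """
    active = {col: val for col, val in filters.items() if val is not None}
    if not active:
        return "", []
    sql = " WHERE " + " AND ".join(f"{col} = ?" for col in active)
    return sql, list(active.values())


def build_user_query(status: Any = None, country: Any = None) -> tuple[str, list[Any]]:
    """Hypothetical caller: the repeated if/append logic now lives in one place."""
    where, params = where_clause({"status": status, "country": country})
    return "SELECT id, name FROM users" + where, params
```

Each query combination from the test list then becomes a one-line assertion on the returned SQL string and parameter list, which makes it cheap to confirm that an extraction changed nothing.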

The Tools That Actually Help

  • Claude or GPT for understanding patterns and suggesting refactorings (yeah, I'm biased toward Claude, but both work)
  • GitHub Copilot for completing extracted functions once you know what they should do
  • Perplexity if you need to understand a library or framework the original coder used
  • Your own test suite (not an LLM, but critical)

Throw your code at these tools, get suggestions, implement them methodically.

What LLMs Are Actually Bad At

  • Understanding your business rules without context
  • Knowing what edge cases matter
  • Recognizing intentional complexity that serves a purpose
  • Spotting security implications of changes

So you, the human, need to stay in the loop. The LLM is a suggestion engine, not the driver.

One More Thing

Skip the "let's refactor everything" meetings. Instead, pick one function, refactor it using this process, demo the improvement, and let others copy the pattern. Change spreads faster when people see it working than when you mandate it.

Legacy code gets better slowly. That's fine. It also doesn't explode in production. That's better.


