DEV Community

pythonassignmenthelp.com

We Tried Using LLMs to Automatically Refactor Legacy JavaScript Code—Here's What Actually Happened

If you’ve ever stared at a gnarly, ten-year-old JavaScript codebase and wished a robot could just clean it up for you, you’re not alone. I’ve been there—shoulders slumped, coffee gone cold, wondering if there’s a shortcut through the spaghetti. When LLMs (Large Language Models) like ChatGPT and GitHub Copilot started making noise about automatic code refactoring, you better believe I was one of the first in line to see if they could really save us from the pain.

Turns out, the story is a bit more complicated—and way more interesting—than just “click a button, get pretty code.” If you’re curious about what happens when you throw actual messy, real-world JavaScript at an LLM, pull up a chair.

What Does “Refactoring with LLMs” Actually Mean?

Before we get into the weeds, let’s get clear: we’re talking about using AI tools to take old, hard-to-read JavaScript and make it better—cleaner, more modern, easier to maintain. We’re not talking about just running Prettier or ESLint autofix. I mean taking code that uses callbacks and var and magic numbers, and asking an LLM to upgrade it to ES6+ syntax, add comments, even break it into smaller functions.

Why bother? Because, as anyone who’s inherited a legacy JS app knows, manual refactoring is slow, risky, and, frankly, a little soul-crushing after the fifth hundred-line function.

How We Set Up the Experiment

Our team picked a handful of classic "bad" JavaScript files from an old project—think jQuery everywhere, no modules, var for days, and business logic mixed with DOM manipulation. We tried a few approaches:

  • Pasting code into ChatGPT and asking for refactoring suggestions or direct rewrites.
  • Using GitHub Copilot inside VSCode to see if it could autocomplete cleaner code as we started rewriting.
  • Feeding bigger code chunks into Claude, since it can handle more context.

We didn’t automate the whole repo—honestly, the tools just aren’t there yet. But we could see how these models handled specific pain points.

Code Example #1: Upgrading Callback Hell to Promises

One of the first things we tried was turning this classic callback mess into something less headache-inducing.

```javascript
// Original legacy code: deeply nested callbacks
function getUserData(userId, callback) {
  db.getUser(userId, function(err, user) {
    if (err) return callback(err);
    db.getOrders(user.id, function(err, orders) {
      if (err) return callback(err);
      db.getRecommendations(user.id, function(err, recs) {
        if (err) return callback(err);
        callback(null, {
          user: user,
          orders: orders,
          recommendations: recs
        });
      });
    });
  });
}
```

We asked ChatGPT:

"Refactor this to use Promises and modern JavaScript syntax."

Here’s what it came back with (after a little back-and-forth to clarify the APIs):

```javascript
// Refactored code: using Promises and async/await
async function getUserData(userId) {
  try {
    const user = await db.getUser(userId);
    const orders = await db.getOrders(user.id);
    const recommendations = await db.getRecommendations(user.id);
    return { user, orders, recommendations };
  } catch (err) {
    throw err;
  }
}
```

Key takeaways:

  • The LLM nailed the basic transformation.
  • It assumed the db.getUser methods returned Promises (which they didn’t). We had to wrap them in Promises ourselves first.
  • The code is way more readable, but the AI skipped over some error-handling nuances—like logging or partial failures.
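Wrapping those callback-style methods was mechanical but easy to get wrong by hand. Here’s a minimal sketch of the kind of adapter we used; `db` and its `(err, result)` callback signature here are hypothetical stand-ins for the legacy API, not the real code:

```javascript
// Generic wrapper: turns a (err, result)-callback function into one
// that returns a Promise.
function promisifyDbMethod(fn) {
  return (...args) =>
    new Promise((resolve, reject) => {
      fn(...args, (err, result) => (err ? reject(err) : resolve(result)));
    });
}

// Simulated legacy callback-style method, for illustration only
const db = {
  getUser(id, cb) {
    cb(null, { id, name: 'Ada' });
  },
};

const getUser = promisifyDbMethod(db.getUser.bind(db));
// Now an async/await caller can do: const user = await getUser(42);
```

(Node’s built-in `util.promisify` does the same thing for standard `(err, result)` callbacks; we rolled our own here to keep the example dependency-free.)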

Code Example #2: Converting var Soup to let/const and Arrow Functions

Legacy JS loves its var. But modern code should use let and const for block scoping and clarity. We tested this simple snippet:

```javascript
// Original code
var numbers = [1,2,3,4,5];
var doubled = [];
for (var i = 0; i < numbers.length; i++) {
  doubled.push(numbers[i] * 2);
}
console.log(doubled);
```

Prompt to the LLM:

"Rewrite this using ES6 features and arrow functions."

It suggested:

```javascript
// Refactored code with ES6 features
const numbers = [1, 2, 3, 4, 5];
const doubled = numbers.map(num => num * 2);
console.log(doubled);
```

What worked:

  • This is a textbook upgrade—more concise and idiomatic, with the bindings now immutable (const).
  • The LLM automatically recognized the chance to use .map, which is nice.

Where it can go wrong:

  • If the original code had side effects or complex logic in the loop, the AI sometimes oversimplified and silently dropped behavior.
  • In bigger functions, the LLM sometimes missed subtle issues with scope changes between var and let.
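The var/let scope difference is worth seeing concretely. This contrived example (not from our codebase) shows the trap: var is function-scoped, so closures created in a loop all share one binding, while let gives each iteration its own:

```javascript
// With var, every closure captures the SAME binding of i
const varFns = [];
for (var i = 0; i < 3; i++) {
  varFns.push(() => i);
}
// After the loop i === 3, so varFns.map(f => f()) is [3, 3, 3]

// With let, each iteration gets a FRESH binding of j
const letFns = [];
for (let j = 0; j < 3; j++) {
  letFns.push(() => j);
}
// letFns.map(f => f()) is [0, 1, 2]
```

If legacy code accidentally relied on the shared-binding behavior, a blanket var-to-let swap changes what the closures see—exactly the kind of subtlety the models glossed over.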

Code Example #3: Splitting Monolithic Functions

Ever run into a function that tries to do everything? We tossed this monster at an LLM:

```javascript
// Original code: way too much responsibility in one function
function handleFormSubmit(event) {
  event.preventDefault();
  var name = document.getElementById('name').value;
  var email = document.getElementById('email').value;
  if (!name || !email) {
    alert('Please fill out all fields');
    return;
  }
  // Save to server
  ajax('/save', { name: name, email: email }, function(response) {
    if (response.success) {
      document.getElementById('success').style.display = 'block';
      document.getElementById('form').reset();
    } else {
      alert('Failed to save');
    }
  });
}
```

Prompt:

"Break this into smaller functions and use modern JS syntax."

Result (with some nudging):

```javascript
// Refactored code: smaller functions, modern syntax
function getFormData() {
  const name = document.getElementById('name').value;
  const email = document.getElementById('email').value;
  return { name, email };
}

function validateForm({ name, email }) {
  return name && email;
}

function showSuccess() {
  document.getElementById('success').style.display = 'block';
  document.getElementById('form').reset();
}

function handleFormSubmit(event) {
  event.preventDefault();
  const data = getFormData();
  if (!validateForm(data)) {
    alert('Please fill out all fields');
    return;
  }
  ajax('/save', data, response => {
    if (response.success) {
      showSuccess();
    } else {
      alert('Failed to save');
    }
  });
}
```

Observations:

  • The AI did a decent job splitting responsibilities.
  • It didn’t touch the old ajax utility; we had to prompt it to use fetch if we wanted that.
  • Naming wasn’t perfect, but the code is way more readable and testable.
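For reference, here’s a hedged sketch of what the ajax call could become with fetch. The '/save' endpoint and the `{ success }` response shape are assumptions carried over from the legacy snippet, not a confirmed API:

```javascript
// Sketch: replacing the old ajax(url, data, cb) utility with fetch
async function saveForm(data) {
  const response = await fetch('/save', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data),
  });
  if (!response.ok) {
    throw new Error(`Save failed with status ${response.status}`);
  }
  return response.json(); // assumed to resolve to { success: true/false }
}
```

Note that fetch doesn’t reject on HTTP error statuses the way many old ajax wrappers did, so the `response.ok` check is load-bearing—an easy thing to lose in an automated rewrite.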

What LLMs Are Good At (and Not So Good At)

After a week of this, some clear patterns emerged.

Strengths:

  • Syntax Upgrades: LLMs excel at swapping out old language patterns for new ones. Turning var into const, callbacks to async/await, and so on.
  • Code Shortening: They spot places to use array methods, destructuring, arrow functions, and often suggest more concise code.
  • Readable Splits: If you ask directly, they can break up huge functions into smaller ones, usually with decent names.
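The "code shortening" strength in practice: the models reliably suggest rewrites like this one, swapping repeated `x.y || fallback` lines for destructuring with defaults (names here are made up for illustration):

```javascript
const options = { name: 'Ada' };

// Legacy style:
// var name = options.name || 'anonymous';
// var tags = options.tags || [];

// Suggested modern style:
const { name = 'anonymous', tags = [] } = options;
```

One caveat worth knowing: destructuring defaults only kick in for `undefined`, while `||` also replaces `''`, `0`, and `false`—so the two forms aren’t always equivalent, and the LLM won’t flag that for you.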

Weaknesses:

  • Assumed APIs: LLMs often assume your functions already return Promises, or that your code is modularized when it isn’t.
  • Surface Level: The AI can miss subtle business logic or side effects, especially when refactoring big chunks.
  • Context Limitations: For large files or interconnected logic, the LLM loses track—especially if you’re pasting bits at a time.

Common Mistakes When Using LLMs for Refactoring

I wish someone had warned me about these.

1. Trusting the LLM Without Testing

It’s tempting to paste in the AI’s “fixed” code, hit save, and call it a day. But, honestly, LLMs can introduce silent bugs—especially with async code or subtle scope changes. Always write or update tests after a big AI-driven refactor.
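The cheapest version of "update tests" is a behavior-preservation check: run the old and new implementations over the same inputs and compare. A minimal sketch, using the var-loop example from earlier as the before/after pair (yours will be bigger):

```javascript
// Legacy version, kept around temporarily just for comparison
function doubleLegacy(numbers) {
  var doubled = [];
  for (var i = 0; i < numbers.length; i++) {
    doubled.push(numbers[i] * 2);
  }
  return doubled;
}

// LLM-refactored version
const doubleModern = (numbers) => numbers.map((n) => n * 2);

// Compare outputs across a spread of inputs, including edge cases
const cases = [[], [1], [1, 2, 3], [-4, 0, 7]];
const allMatch = cases.every(
  (c) => JSON.stringify(doubleLegacy(c)) === JSON.stringify(doubleModern(c))
);
// allMatch === true means behavior held on these inputs
```

It’s not a substitute for real tests, but it catches the silent-divergence bugs that AI refactors are prone to.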

2. Ignoring Old Dependencies

LLMs love using modern APIs, but legacy projects often have old dependencies for a reason. If your codebase still relies on jQuery or ancient Node.js, the AI may suggest code that won’t actually run in your environment.
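A contrived example of the environment mismatch: the models happily suggest optional chaining and nullish coalescing, but that syntax doesn’t even parse on Node versions before 14 or in older browsers without transpiling:

```javascript
var user = { address: { city: 'Oslo' } };

// Modern suggestion (a syntax error in old runtimes):
// const city = user?.address?.city ?? 'unknown';

// Legacy-safe equivalent that runs everywhere:
var city = (user && user.address && user.address.city) || 'unknown';
```

If your deployment target is frozen, tell the LLM up front which language version it’s allowed to use.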

3. Blindly Applying Idioms

Sometimes, the AI rewrites loops as .map or .forEach, assuming pure transformations. If your original code has side effects, this can break things in sneaky ways. Always double-check that the refactor preserves behavior.
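Here’s a contrived loop that both transforms and counts—the exact pattern where a mechanical .map rewrite bites (the names are made up for illustration):

```javascript
const prices = [5, 12, 8];
let expensiveCount = 0; // side effect the loop maintains
const labels = [];
for (let i = 0; i < prices.length; i++) {
  if (prices[i] > 10) expensiveCount++; // this is NOT a pure transformation
  labels.push('$' + prices[i]);
}

// A naive rewrite silently drops the counter:
// const labels = prices.map((p) => '$' + p); // expensiveCount never updates
```

The .map version produces identical labels, so nothing looks wrong at a glance—the bug only surfaces wherever expensiveCount is consumed.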

Key Takeaways

  • LLMs are great for suggesting modern syntax and breaking up code, but not for fully-automatic, safe refactoring—human review is a must.
  • Don’t assume the AI understands your project’s context or environment; clarify APIs and dependencies before trusting the output.
  • Always test refactored code, especially when changing async patterns or introducing new ES6+ features.
  • Use LLMs for small, focused upgrades rather than whole-file or whole-repo rewrites.
  • Treat AI output as a starting point, not an authoritative source—bring your own judgment, and don’t skip code review.

Wrapping Up

AI is an exciting new wrench in the developer’s toolbox, but it’s not magic. If you’re willing to collaborate with it—spotting its blind spots and filling in the details—it can seriously speed up your refactoring grind. Just remember: the real work (and the real learning) still happens between your ears.


If you found this helpful, check out more programming tutorials on our blog. We cover Python, JavaScript, Java, Data Science, and more.
