Among all the challenges I have faced in my long career, this one stood out as uniquely humbling: working with a legacy PHP system, migrating it from an aging database to PostgreSQL, and performing large-scale refactoring, all while no test cases existed and the business logic had to remain untouched.
At the same time, I experimented with GitHub Copilot as an AI assistant. This experience taught me valuable lessons—not only about productivity but also about the risks of AI hallucinations when business-critical systems are at stake.
Why This Was Different
Legacy PHP codebases are messy. They often come with:
- Years of patchwork code layered by multiple teams.
- Sparse or missing test coverage, making every change risky.
- Hard-coded SQL queries tied to outdated databases.
- Business logic that is poorly documented but absolutely critical.
Add to that the mandate to migrate everything to Postgres, clean up code for maintainability, and add at least some testing scaffolding—all without causing downtime. That is the context in which I decided to try Copilot.
Where Copilot Helped 🎯
1. Database Query Migration
Copilot sped up repetitive rewrites of old `mysql_*` calls into PDO-based queries compatible with Postgres.
```php
// Legacy code
$result = mysql_query("SELECT * FROM clients WHERE status = 'active'");

// Copilot-suggested modernization
$stmt = $pdo->prepare("SELECT * FROM clients WHERE status = :status");
$stmt->execute(['status' => 'active']);
$result = $stmt->fetchAll(PDO::FETCH_ASSOC);
```
These weren’t always perfect, but they provided a solid draft to build on.
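To show the full PDO shape end to end, here is a self-contained sketch of the same prepared-statement pattern. It uses an in-memory SQLite database purely so the example runs standalone; in the actual migration the DSN pointed at Postgres (something like `pgsql:host=...;dbname=...`), and the table and data here are invented for illustration.

```php
<?php
// Standalone sketch of the PDO pattern; SQLite in-memory is used only so
// this snippet runs without a database server. Swap the DSN for your
// Postgres connection string in real code.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec("CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, status TEXT)");
$pdo->exec("INSERT INTO clients (name, status) VALUES ('Acme', 'active'), ('Globex', 'inactive')");

// Same prepared-statement shape as the modernized query above.
$stmt = $pdo->prepare("SELECT name FROM clients WHERE status = :status");
$stmt->execute(['status' => 'active']);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

Because PDO abstracts the driver, the prepared-statement code stays identical across SQLite, MySQL, and Postgres; only the DSN changes, which is exactly what made this pattern the safe target for the migration.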
2. Refactoring Repeated Patterns
Copilot was surprisingly good at spotting structural similarities and suggesting standardized functions across files where logic was duplicated inconsistently.
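As a hypothetical illustration of this kind of consolidation (the function name and formatting rules here are invented, not from the real codebase): several files each built client labels with slightly different `trim`/`strtoupper` orderings, and we standardized on one helper and replaced every call site.

```php
<?php
// Hypothetical consolidated helper. Before refactoring, variants like
//   strtoupper(trim($name)) . ' (' . $status . ')'
//   trim(strtoupper($name)) . " ($status)"
// were scattered across files; Copilot helped spot and unify them.
function formatClientLabel(string $name, string $status): string
{
    return strtoupper(trim($name)) . ' (' . $status . ')';
}

echo formatClientLabel('  acme corp ', 'active'); // ACME CORP (active)
```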
3. Scaffolding Tests
Even though we lacked existing tests, Copilot could draft basic PHPUnit test cases for newly refactored functions. These acted as scaffolds we could extend with real business cases later.
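A minimal sketch of what such a scaffold looked like. In the project these were PHPUnit test methods (`assertTrue`, `assertSame`, and so on); the version below uses plain `assert()` calls so it is self-contained, and the function under test is a hypothetical stand-in.

```php
<?php
// Hypothetical refactored function under test.
function isBillableStatus(string $status): bool
{
    return in_array($status, ['active', 'trial'], true);
}

// Copilot-drafted scaffold, shown with plain assertions; in the project
// these lived in a PHPUnit TestCase and were later extended with real
// business cases.
assert(isBillableStatus('active') === true);
assert(isBillableStatus('trial') === true);
assert(isBillableStatus('cancelled') === false);
assert(isBillableStatus('ACTIVE') === false); // case-sensitive on purpose

echo "scaffold passed\n";
```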
The Caveats ⚠️
1. Hallucinations in Business Logic
Copilot sometimes tried to “fill in the blanks.” In one case, it suggested adding a default value during a Postgres migration. That would have silently changed behavior that had compliance implications.
👉 Lesson: AI does not understand why certain rules exist. Never let it infer business logic.
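A hypothetical illustration of the kind of suggestion we rejected (the column and value here are invented): suppose a legacy column is intentionally nullable because a NULL status means "pending manual review" downstream. A migration draft that quietly adds a default changes that contract.

```sql
-- Legacy contract: status may be NULL, which compliance reports
-- treat as "pending manual review".
-- Rejected AI draft: the DEFAULT silently reclassifies every new
-- row that omits status.
ALTER TABLE clients
    ADD COLUMN status VARCHAR(20) DEFAULT 'active';
```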
2. Silent Alterations
Copilot occasionally altered code in ways that looked fine but introduced drift:
- Switching `AND` to `OR` in conditions.
- Using loose `==` instead of strict `===`.
- Suggesting simplified queries that ignored edge cases.
These issues are hard to catch without strong test coverage. In our situation, they would have been catastrophic.
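The `==` versus `===` drift in particular is easy to wave through in review, because the code still "works" on the happy path. A few lines of plain PHP show why it is a behavioral change, not a cosmetic one:

```php
<?php
// Loose comparison coerces types; strict comparison does not.
var_dump('1' == '01');   // true  -- numeric strings compared as numbers
var_dump('1' === '01');  // false -- strict comparison keeps them distinct
var_dump(0 == false);    // true  -- loose comparison collapses types
var_dump(0 === false);   // false
```

If legacy code relied on strict checks to distinguish, say, a stored `'0'` from `false`, an AI suggestion that loosens the operator changes outcomes with no visible diff in test-free code.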
3. Over-generalization
Because Copilot is trained on broad internet data, its suggestions often favored general PHP practices over the enterprise-specific patterns we needed. For example, it omitted security hardening steps and assumed lenient defaults.
4. Concurrency Challenges
The real challenge wasn’t just the code. We were juggling three massive initiatives at once:
- Database migration to Postgres.
- Large-scale refactoring of brittle legacy PHP code.
- Introducing at least a layer of unit and integration tests where none existed.
Copilot’s suggestions were often useful in isolation but didn’t “understand” this concurrency. It could create migration code that conflicted with ongoing refactors, or scaffolding that assumed test coverage we simply didn’t have.
Strategies That Made It Work ✅
Golden Documentation of Business Logic
Since tests were missing, we documented business-critical rules in plain text. Every Copilot suggestion was validated against this manual “source of truth.”
Test Scaffolding First
Before touching legacy code, we had Copilot draft simple tests. While not complete, they gave us guardrails to validate refactoring later.
Use AI for Syntax, Humans for Meaning
Copilot was great for rewriting syntax, PDO migration, and scaffolding boilerplate. But the final say on business rules always stayed human.
Incremental Refactoring
We avoided large, sweeping Copilot refactors. Instead, we worked in small, reversible commits, validated each one, and only then moved on.
Pair Programming Mindset
We treated Copilot as a junior developer—fast and enthusiastic, but needing constant supervision. This mindset reduced errors and kept the team cautious.
The Human Side
The emotional reality of this work shouldn’t be underestimated. Developers already felt anxious working without tests. Copilot reduced some burden by accelerating routine tasks, but it also introduced a new type of anxiety: can we trust its suggestions?
In practice, Copilot acted like a pair programming partner. It helped generate ideas, but humans had to make sure those ideas didn’t drift from compliance or introduce hidden risks. Interestingly, newer team members gained confidence because they could see AI’s suggestions, critique them, and learn from both its strengths and weaknesses.
Final Thoughts
Using Copilot on a greenfield project is one thing. Using it on a legacy PHP system undergoing refactoring, database migration, and test scaffolding simultaneously is something else entirely.
Here’s what I learned:
- Copilot accelerates migration and syntax cleanup.
- Copilot cannot replace missing tests.
- Copilot must never be allowed to invent business logic.
- Copilot suggestions must be reviewed in the context of concurrent workstreams.
If there’s one mantra I’d leave you with, it’s this:
👉 AI can rewrite code. Only humans can protect meaning.
Thanks for reading. If you’ve faced similar challenges with Copilot on legacy systems, I’d love to hear your experiences in the comments.