Mark k

When a refactor tool renames half the project: inconsistent variable naming across files

I was working on a cross-service refactor where an LLM-assisted tool suggested renaming a core identifier across dozens of files. In the multi-turn session the assistant proposed sensible names and incremental diffs, so I used the chat interface to iterate. The changes looked small: rename a handful of functions and objects to match a new domain term. Tests ran green locally and CI passed, so the team shipped the change.

What we missed was that the model had applied the new name inconsistently across modules. Some files used the new name, others kept the old, and a few introduced subtle shadowing (different variables with similar names). The combination produced runtime glitches that only showed up under specific integration loads, and tracking them down was frustrating because the diffs themselves appeared harmless.
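A minimal, hypothetical sketch of the pattern (the identifiers are invented for illustration): the refactor introduced the new name only as a local, leaving the similarly named module-level variable that the rest of the file still reads.

```typescript
// Hypothetical module; identifiers invented for illustration.
let currentUser = loadUser(); // old name, still read at module scope

function refreshSession() {
  // The refactor renamed the local but not the module-level variable it was
  // meant to replace, so two similarly named variables coexist and the
  // "refresh" never reaches the state the rest of the file depends on.
  const activeUser = loadUser();
  return activeUser;
}

function loadUser(): string {
  return `u-${Date.now() % 100}`;
}

refreshSession();
console.log(currentUser); // stale: refreshSession never touched it
```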

How the inconsistency surfaced

The bug surfaced in staging when an endpoint occasionally returned stale data. Initial stack traces were noisy and pointed to several modules; single-file unit tests did not reproduce the problem. Only when we traced a request through service boundaries did the naming mismatch become apparent: a middleware expected property currentUser while an upstream transformer used activeUser, and automatic mapping silently dropped the value.
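Reduced to its essence, the mismatch looked roughly like this (the request shape and function names are hypothetical; only the currentUser/activeUser pair comes from our incident):

```typescript
// Hypothetical request context; the real services are more involved.
interface RequestContext {
  currentUser?: string; // name the downstream middleware expects
  activeUser?: string;  // name the refactor introduced upstream
}

// Upstream transformer after the partial rename: writes activeUser.
function transformRequest(raw: { userId: string }): RequestContext {
  return { activeUser: raw.userId };
}

// Downstream middleware untouched by the rename: reads currentUser.
function authorize(ctx: RequestContext): string {
  // ctx.currentUser is undefined, so we silently fall back to a default
  // instead of failing loudly -- which is why staging only looked wrong
  // on some requests.
  return ctx.currentUser ?? "anonymous";
}

const ctx = transformRequest({ userId: "u-42" });
console.log(authorize(ctx)); // "anonymous", not "u-42"
```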

Because the model produced small, plausible edits per file, reviewers missed the mismatch. The diffs were split across commits and reviewers looked at changes in isolation. The problem only became visible when the execution path traversed both the new and old names, which happened under specific timing and input shapes.

Why this failure was subtle

LLM-driven refactors are probabilistic: the model will propose the most likely token sequences given the local context. During multi-file edits it often lacks a single canonical view of the whole repository, so it reuses different naming patterns from different context windows. That behavior is easy to miss because names are syntactically valid and tests that exercise only parts of the call graph will still pass.

Small model behaviors compound: a bias toward shorter or previously-seen identifiers, a tendency to preserve adjacent comments, and truncation of far-away files in the prompt all make inconsistency more likely. Each local edit looked reasonable, so human attention focused on correctness of the change rather than cross-file consistency.

Practical mitigations we applied

After the incident we introduced a few low-friction checks. First, we enforced a canonical naming mapping with a simple script that runs in CI and flags any identifier not on the approved list. Second, we started supplying larger slices of repository context to the assistant and used multi-turn verification before applying batch edits. For multi-turn coordination we leaned on a verification loop and cross-referenced the rename list with the repository index and the project style guide on crompt.ai.
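The CI check is small. Here is a sketch of it in TypeScript (the file name, mapping contents, and directory layout are assumptions, not our exact script):

```typescript
// check-naming.ts -- hypothetical CI gate: fail the build if any file still
// uses a deprecated identifier from the canonical rename mapping.
import { readFileSync, readdirSync, statSync } from "node:fs";
import { join, extname } from "node:path";

// Canonical mapping: deprecated name -> approved name (entries are examples).
const RENAMES: Record<string, string> = {
  activeUser: "currentUser",
  // ...one entry per identifier on the approved rename list
};

function* walk(dir: string): Generator<string> {
  for (const entry of readdirSync(dir)) {
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) {
      if (entry !== "node_modules" && entry !== ".git") yield* walk(full);
    } else if ([".ts", ".js"].includes(extname(full))) {
      yield full;
    }
  }
}

let failures = 0;
for (const file of walk(process.argv[2] ?? "src")) {
  const source = readFileSync(file, "utf8");
  for (const [oldName, newName] of Object.entries(RENAMES)) {
    // Word-boundary match avoids flagging substrings of longer identifiers.
    if (new RegExp(`\\b${oldName}\\b`).test(source)) {
      console.error(`${file}: uses deprecated "${oldName}" (expected "${newName}")`);
      failures++;
    }
  }
}
process.exit(failures === 0 ? 0 : 1);
```

A production version would probably want to skip generated code and allow per-file exceptions, but even this naive scan would have caught the currentUser/activeUser split before it shipped.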

We also added a post-edit static analysis step and manual sampling of diffs across modules. For research and fact-checking of proposed edits we used a dedicated tool to cross-check occurrences and intent; having a reliable verification pass reduced regressions. If you run LLM-assisted refactors, treat suggestions as drafts, and consider an automated cross-file checker or an explicit review task that validates a single canonical identifier list with a research tool like deep research.
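For the cross-file verification pass, something as small as the following sketch goes a long way (again hypothetical; in practice the file list would come from git diff --name-only):

```typescript
// rename-report.ts -- hypothetical verification pass: for each identifier pair,
// report which files use the old name, the new name, or both, so a reviewer
// can sample the mixed files instead of reading every diff.
import { readFileSync } from "node:fs";

const PAIRS: Array<[string, string]> = [
  ["activeUser", "currentUser"], // example entry from the canonical list
];

// Files to scan are passed on the command line.
const files = process.argv.slice(2);

for (const [oldName, newName] of PAIRS) {
  const usesOld = files.filter((f) =>
    new RegExp(`\\b${oldName}\\b`).test(readFileSync(f, "utf8")));
  const usesNew = files.filter((f) =>
    new RegExp(`\\b${newName}\\b`).test(readFileSync(f, "utf8")));
  const mixed = usesOld.filter((f) => usesNew.includes(f));

  console.log(`${oldName} -> ${newName}`);
  console.log(`  old only: ${usesOld.filter((f) => !usesNew.includes(f)).length} file(s)`);
  console.log(`  new only: ${usesNew.filter((f) => !usesOld.includes(f)).length} file(s)`);
  console.log(`  mixed   : ${mixed.join(", ") || "none"}`);
}
```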
