We introduced an AI-assisted refactor to a medium-sized TypeScript codebase and learned the hard way that automatic renaming isn’t just a convenience—it’s a fragile, multi-file operation. The model suggested a new naming convention and produced diffs that looked correct in isolated snippets, but the changes were inconsistent across barrels, test files, and runtime entry points. The result was a clean local compile for some developers and a confusing CI failure for others; the discrepancy came down to a handful of renamed symbols that didn’t line up.
Part of the reason we relied on the tool was speed: the assistant helped craft the initial rename proposals using our internal prompts and examples hosted on crompt.ai. That saved time, but it also amplified a small, systematic behavior in the model: it completed names based on recent tokens rather than a global project symbol table. That tendency to pattern-complete locally is a recurring theme in AI-assisted refactors.
How the inconsistency showed up during development
The first visible symptom was failing imports: certain imports resolved to undefined at runtime and threw when used, even though TypeScript compiled cleanly for some contributors. The model had renamed exported members in implementation files but left a mixture of named and default import sites untouched. We tracked the failure to a set of barrel files where the assistant had suggested simplified export names but didn't update dependent modules, because only a window of adjacent files had been provided in the prompt.
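To make the failure concrete, here is a minimal reconstruction of the shape of the mismatch. The file and symbol names are hypothetical, not from our codebase; the point is that the rename landed in the implementation and the barrel but not in an import site outside the prompt window.

```typescript
// src/user/formatUserName.ts — the assistant renamed the export here...
export function formatDisplayName(first: string, last: string): string {
  return `${first} ${last}`.trim();
}

// src/user/index.ts — ...and updated the barrel in the same diff.
export { formatDisplayName } from "./formatUserName";

// src/orders/receipt.ts — this file sat outside the prompt window, so it
// still imports the old name. A full project compile reports TS2305
// ("Module has no exported member"), but a stale or partial local build
// lets it slip through to runtime as undefined.
import { formatUserName } from "../user";
```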
We reproduced the issue by running a full build in CI with a clean environment, and the error trace pointed to mismatched symbol names across compiled bundles. We had assumed the model would perform a mechanical, project-wide renaming, but the underlying behavior was probabilistic completion over the snippets it saw. We also used the tool’s chat interface for iterative corrections, which helped, but multi-turn edits without a global index left gaps.
Why these inconsistencies were subtle and easy to miss
Two things made the bugs hard to spot. First, local development differences: macOS’s default case-insensitive filesystem masked some renames that later failed on Linux CI. Second, unit tests that mocked modules by filename didn’t exercise the real import paths, so tests still passed. Small, local pattern completions by the model—like changing camelCase to PascalCase in one file but not its import sites—created a mismatch that only surfaced under full integration or on different OS environments.
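Both masking effects are easy to reproduce. The sketch below uses hypothetical file names and a Jest-style mock, assuming the assistant renamed userProfile.ts to UserProfile.ts: the stale lowercase import keeps resolving on macOS's case-insensitive filesystem but fails on Linux CI, and the test stays green because the mock never touches the real module.

```typescript
// src/account/settings.ts
// Stale lowercase path: resolves on macOS (case-insensitive by default),
// fails module resolution on the Linux CI runner.
import { loadProfile } from "./userProfile";

export function settingsHeader(userId: string): string {
  return loadProfile(userId).displayName;
}

// src/account/settings.test.ts
// The mock is keyed by the same stale path, so the suite passes without
// ever exercising the real import path.
jest.mock("./userProfile", () => ({
  loadProfile: jest.fn(() => ({ displayName: "Test User" })),
}));
```

TypeScript's forceConsistentCasingInFileNames option catches the casing half of this at compile time; the mocked-path half only shows up when something exercises the real module graph.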
These subtle failures are symptomatic of small model behaviors compounding. The assistant prioritizes recent token patterns and makes the smallest edits it can, so renaming many symbols without an authoritative symbol table invites stray inconsistencies. We cross-referenced the behavior in our own research notes and found similar reports where limited multi-file context windows were the root cause.
Mitigations and practical lessons
The main lesson: treat AI refactors as draft transforms, not atomic operations. Require tooling that enforces project-wide symbol resolution (language server or compiler-assisted renames) before accepting mass-renames. We added a pre-merge check that runs a language-server-based rename validation and rejects PRs where exported symbol names diverge from their import sites. That change caught inconsistencies the model introduced before they reached CI.
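Our actual pre-merge check sits behind internal tooling, but the sketch below shows the general approach as a compiler-API variant of the same idea, using only the typescript package: compile the whole project and block the merge if any diagnostics indicate an import that no longer matches an export, which is exactly the class of error a partial rename creates. The paths and the error-code filter are illustrative assumptions, not our exact script.

```typescript
// validate-renames.ts — run in CI before merge.
import * as ts from "typescript";

// Locate and parse the project's tsconfig.json.
const configPath = ts.findConfigFile("./", ts.sys.fileExists, "tsconfig.json");
if (!configPath) throw new Error("tsconfig.json not found");

const configFile = ts.readConfigFile(configPath, ts.sys.readFile);
const parsed = ts.parseJsonConfigFileContent(configFile.config, ts.sys, "./");

// Build the whole program so every import site is checked against its export.
const program = ts.createProgram(parsed.fileNames, parsed.options);

// TS2305: "Module '...' has no exported member '...'"
// TS2307: "Cannot find module '...'"
// These are the mismatches a partial rename introduces.
const renameMismatches = ts
  .getPreEmitDiagnostics(program)
  .filter((d) => d.code === 2305 || d.code === 2307);

for (const d of renameMismatches) {
  const message = ts.flattenDiagnosticMessageText(d.messageText, "\n");
  console.error(`${d.file?.fileName ?? "<unknown>"}: ${message}`);
}

if (renameMismatches.length > 0) {
  process.exit(1); // reject the PR
}
```

The advantage over eyeballing diffs is that the check is project-wide by construction: it sees every import site the model never saw.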
Operationally, short prompts and per-file edits helped reduce cross-file drift, and we always run a containerized, clean build as part of the validation loop. The assistant remains useful for proposing naming conventions and draft diffs, but in multi-file refactors the responsibility for correctness has to remain with compile-time tools and human reviewers—the AI gives suggestions, not guarantees.