How I turned the pain of inheriting legacy systems into an open-source framework that produces structured diagnostics — tested on real production code.
You inherit a system. 265 backend files, 209 frontend files, 80 stored procedures. Spring Boot 2.7 and Angular 14 — both end-of-life since November 2023. Business logic lives in SQL Server procedures that nobody fully understands. There are no tests. There is no architecture document. There is no modernization plan.
Management wants a roadmap by the end of the quarter.
If you've been in this situation, you know what happens next: weeks of reading code, drawing boxes on whiteboards, manually cataloging what's broken, and hoping you don't miss something critical hiding in a catch block.
I've done this exercise enough times to realize most of the work is mechanical — scanning for patterns, mapping dependencies, cataloging business rules, cross-referencing against OWASP. The judgment part (what to prioritize, how to phase the work, where the risk is) is genuinely hard. But the discovery part is tedious, repetitive, and error-prone when done manually.
So I built a tool to automate the discovery and structure the planning.
What Legacy Squad does
Legacy Squad is an open-source CLI that installs into your project and produces a structured modernization diagnostic. One command:
npx legacy-squad install
This scans your repository, runs a deterministic Compliance Engine (OWASP / CWE pattern matching), generates per-module context packs, and installs five AI agents as slash commands in your IDE (Claude Code or Codex CLI).
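To make "deterministic pattern matching" concrete, here is a minimal sketch of the general technique: fixed regex rules, each tagged with a CWE ID, applied line by line so that every hit carries a file:line reference. This is not Legacy Squad's actual rule set or internal API. The names `Rule`, `Finding`, `RULES`, and `scan_text` are hypothetical, and the two patterns are deliberately simplistic.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str          # hypothetical identifier, not a real Legacy Squad rule name
    cwe: str              # CWE reference attached to every match
    pattern: re.Pattern   # fixed regex: same input always yields the same findings
    message: str


@dataclass(frozen=True)
class Finding:
    rule_id: str
    cwe: str
    path: str
    line: int
    snippet: str


# Illustrative rules only (assumption): real scanners use far more careful patterns.
RULES = [
    Rule(
        "hardcoded-secret",
        "CWE-798",
        re.compile(r"(password|secret|api_?key)\s*[:=]\s*[\"'][^\"']+[\"']", re.IGNORECASE),
        "Possible hardcoded credential",
    ),
    Rule(
        "sql-concat",
        "CWE-89",
        re.compile(r"(SELECT|INSERT|UPDATE|DELETE)\b[^\n]*[\"']\s*\+", re.IGNORECASE),
        "SQL statement built by string concatenation",
    ),
]


def scan_text(path: str, text: str, rules=RULES) -> list[Finding]:
    """Apply every rule to every line and record file:line evidence for each hit."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule in rules:
            if rule.pattern.search(line):
                findings.append(
                    Finding(rule.rule_id, rule.cwe, path, line_no, line.strip())
                )
    return findings


if __name__ == "__main__":
    sample = (
        'const dbPassword = "hunter2";\n'
        'const q = "SELECT * FROM users WHERE id = " + req.params.id;\n'
    )
    for f in scan_text("src/db.js", sample):
        print(f"{f.path}:{f.line} [{f.cwe}] {f.rule_id}")
```

Because no model is involved at this stage, rerunning the scan on the same commit produces the same list, which is what makes the findings usable as a stable baseline for the agents.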
The agents aren't free-form chat. Each one follows a methodology-bound prompt that forces structured output — evidence with file:line references, framework citations, impact assessment, and actionable recommendations.
After running the five agents and four generators, you get:
- PRS — Product Refactor Specification (consolidated diagnostic)
- SDD — Software Design Document (current + target architecture with ADRs)
- MMP — Modernization Master Plan (phased roadmap with rollback strategy)
- Execution Specs — atomic YAML files, one per unit of work, individually deployable
The framework never modifies your code. Read-only by design.
What it found on a real production system
I validated the framework against two real production systems: a React Native mobile app (~18k LoC) handling financial transactions, and a Java/Spring Boot backend with an Angular frontend (~550 source files), both on end-of-life framework versions. Both systems are in production with real users.
The numbers from the Spring Boot + Angular system:
| Result | Detail |
| --- | --- |
| 20 findings | 3 critical, 4 high, 8 medium, 5 low |
| 38 business rules extracted | Several implicit, never documented anywhere |
| 22 execution specs generated | Atomic, individually deployable |
| Execution Readiness | 28 → 90/100 across 5 phases |
| Modernization roadmap | 32–44 weeks for phases 0–3 |
The kind of things it surfaces
I can't disclose specifics about the production systems (they're real and active), but the categories of findings illustrate the depth:
- Authentication bypasses — conditional branches that skip credential validation under certain request parameters. CWE-287.
- Non-expiring tokens — JWT signing without an exp claim, combined with tokens accepted via query string (and therefore written to access logs). CWE-613.
- Counter-intuitive architecture recommendations — in one system, stored procedures initially flagged as a code smell turned out to be the safest anchor for incremental modernization, since all business logic was isolated from the application layer. The MMP recommended keeping them untouched during framework upgrades — not the obvious call, but the evidence-driven one.
- Implicit business rules — idempotency constraints, hardcoded calculation percentages in queries, off-by-one period logic in export flows. Things that were never documented but would break silently if refactored without awareness.
Every finding traced to a specific file and line. Every business rule linked to the execution spec that would need to preserve it.
What makes this different from SonarQube / Snyk / Copilot
The honest answer: Legacy Squad doesn't replace any of those. It occupies a different space.
SonarQube does continuous code quality. Snyk does CVE scanning. Copilot does in-editor autocomplete. None of them produce an architecture assessment, extract business rules from code, or generate a phased modernization plan with rollback strategy per phase.
Legacy Squad starts where those tools stop. It pairs deterministic scanning (OWASP/CWE pattern matching — no LLM involved) with methodology-bound AI agents that produce structured engineering documents — not chat history.
The key constraint: the repository is never sent in full to any LLM. The Context Manager builds per-module context packs that are token-efficient, and the AI runs entirely inside your own IDE. Zero API keys required from the framework itself. Zero external servers.
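One way to picture a per-module context pack is as a grouping of files by module, trimmed to a token budget. The sketch below is purely illustrative and makes several assumptions that the Context Manager may not share: a "module" is taken to be the top-level directory, token cost is estimated at roughly four characters per token, and smaller files are packed first. The names `estimate_tokens` and `build_context_packs` are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Cheap token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def build_context_packs(files: dict[str, str], budget: int) -> dict[str, list[str]]:
    """Group files by top-level directory and keep each group within `budget` tokens.

    `files` maps a POSIX-style relative path to that file's contents.
    """
    by_module: dict[str, list[str]] = defaultdict(list)
    for path in sorted(files):
        parts = PurePosixPath(path).parts
        module = parts[0] if len(parts) > 1 else "(root)"
        by_module[module].append(path)

    packs: dict[str, list[str]] = {}
    for module, paths in by_module.items():
        used, kept = 0, []
        # Smallest files first, so as many files as possible fit in the budget.
        for path in sorted(paths, key=lambda p: len(files[p])):
            cost = estimate_tokens(files[path])
            if used + cost > budget:
                break
            kept.append(path)
            used += cost
        packs[module] = kept
    return packs


if __name__ == "__main__":
    repo = {
        "auth/login.py": "x" * 40,
        "auth/token.py": "x" * 400,
        "billing/invoice.py": "x" * 80,
        "README.md": "x" * 8,
    }
    for module, paths in build_context_packs(repo, budget=50).items():
        print(module, paths)
```

The point of the sketch is the shape of the constraint: an agent analyzing one module sees only that module's pack, never the whole repository.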
Try it in 2 minutes
Don't have a legacy project to scan? I built a demo project — a small Node/Express HR API with intentional legacy problems (SQL injection, hardcoded secrets, fat controllers, duplicated logic, dead code):
git clone https://github.com/hrpimenta/legacy-squad-demo.git
cd legacy-squad-demo
npx legacy-squad install
You'll immediately see the Compliance Engine findings in .legacy-squad/memory/findings/. Then open the project in Claude Code or Codex CLI and run:
/legacy-squad:security
The Security Agent will analyze the codebase and write a structured assessment to .legacy-squad/outputs/assessments/security.md — with file:line evidence for every finding.
To see where the project stands in the lifecycle:
npx legacy-squad status
Why I built this
I work with legacy systems at organizations where "just rewrite it" is not an option. The systems are in production, handling real transactions, and the business can't stop while engineering figures out what to modernize first.
The pattern I kept seeing was: the decision to modernize was easy. The plan to modernize was the bottleneck. Teams would spend weeks in discovery, produce a Confluence page that was outdated by the time it was finished, and then argue about priorities without evidence.
Legacy Squad exists to compress that discovery phase from weeks to hours — with every finding traceable to code, every recommendation backed by a framework reference, and every execution spec individually deployable with rollback.
V1 is the open-source Discovery Platform: understand and plan. V2 (in design) will add AI-assisted execution — refactoring, PRs, QA gates — as the paid Enterprise layer.
Links
- GitHub: hrpimenta/legacy-squad
- npm: legacy-squad
- Demo project: hrpimenta/legacy-squad-demo
- Full case study: Case Study — Real Organization
If you work with legacy systems and want to try it, I'd genuinely appreciate the feedback. And if it's useful, a ⭐ on GitHub helps with visibility.