How We Refactored Our API from Express 4 to NestJS 11 and Cut Bugs by 30%

For three years, our core REST API ran on Express 4. It powered 12 downstream services, handled 40k requests per minute, and served 2M+ monthly active users. But as our team grew from 4 to 14 backend engineers, Express’s unopinionated structure started to hurt: inconsistent error handling, scattered middleware, no built-in dependency injection, and a 22% monthly bug rate tied to routing conflicts and untested edge cases.

After 6 months of planning and incremental migration, we fully moved to NestJS 11. The result? A 30% reduction in production bugs, 40% faster onboarding for new engineers, and a 25% drop in time spent on API maintenance. Here’s how we did it.

Why We Outgrew Express 4

Express 4 served us well early on. Its minimalist design let us ship fast, but as the codebase grew to 18k lines of JavaScript, we hit three critical pain points:

Inconsistent architecture: Each engineer structured routes, controllers, and middleware differently. We had 7 different patterns for error handling across 42 endpoints, leading to 14% of bugs from unhandled rejections.
No native dependency injection (DI): We relied on manual module imports and global singletons, which made unit testing a nightmare. 60% of our test suite was integration tests, with slow runtimes (12+ minutes per full run).
Middleware sprawl: 28 custom middleware functions were registered globally, causing unexpected side effects for 19% of new endpoints. We once had an auth middleware conflict that took 3 days to debug.

Why NestJS 11?

We evaluated Fastify, Koa, and NestJS. NestJS 11 stood out for three reasons:

Opinionated, modular structure: NestJS enforces a standard controller-service-module pattern, which eliminated architectural inconsistency. Its built-in DI system also made testing 3x faster.
TypeScript-first design: We’d already migrated 70% of our Express codebase to TypeScript, but NestJS’s decorators and metadata support caught 40% more type errors at build time.
Compatibility with existing tooling: NestJS 11 uses Express as its default HTTP platform, so we could reuse our existing middleware, validation libraries, and OpenAPI specs during incremental migration.
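
To illustrate why constructor-based DI makes unit tests cheaper than global singletons, here is a minimal sketch in plain TypeScript. The names UserRepository, UserService, and InMemoryUserRepository are hypothetical and chosen only for illustration. In NestJS the service would be marked with @Injectable() and the container would supply the repository, but the testing benefit is the same: the dependency arrives through the constructor, so a test can hand in a fake.

```typescript
// Hypothetical types for illustration only.
interface User {
  id: string;
  email: string;
}

interface UserRepository {
  findById(id: string): Promise<User | undefined>;
}

// The service receives its dependency through the constructor instead of
// importing a global singleton, so a test can pass in a fake.
class UserService {
  private readonly users: UserRepository;

  constructor(users: UserRepository) {
    this.users = users;
  }

  async getEmail(id: string): Promise<string> {
    const user = await this.users.findById(id);
    if (!user) {
      throw new Error(`User ${id} not found`);
    }
    return user.email;
  }
}

// In-memory fake for unit tests; no database or HTTP server is required.
class InMemoryUserRepository implements UserRepository {
  private readonly rows: User[];

  constructor(rows: User[]) {
    this.rows = rows;
  }

  async findById(id: string): Promise<User | undefined> {
    return this.rows.find((u) => u.id === id);
  }
}
```

A unit test can then build the service with `new UserService(new InMemoryUserRepository([...]))` and exercise it without starting the application, which is the kind of test that replaced many of our slower integration tests.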

Our Refactoring Process

We avoided a full rewrite (too risky for a production API) and instead used a strangler fig pattern to migrate incrementally over 6 months:

Phase 1: Setup and shared utilities (Month 1): We set up a NestJS monorepo, ported shared utilities (logging, config, error classes) to NestJS modules, and configured the DI container to work with our existing Express middleware.
Phase 2: Migrate low-risk endpoints (Months 2-3): We moved 12 read-only endpoints (health checks, public user profiles) to NestJS first. We used a reverse proxy to route traffic to either Express or NestJS based on path, with no downtime for users.
Phase 3: Migrate core business logic (Months 4-5): We moved 24 write endpoints (user registration, payment processing) to NestJS, rewriting services to use DI and adding unit tests for all critical paths. We kept 100% backward compatibility for downstream services.
Phase 4: Decommission Express (Month 6): We migrated the final 6 legacy endpoints, removed all Express dependencies, and updated our CI/CD pipeline to build and deploy NestJS only.
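
The path-based routing rule behind the strangler fig approach can be sketched as a pure function. This is an illustration rather than our actual proxy configuration: the prefixes in migratedPrefixes and the names pickUpstream and Upstream are hypothetical. The idea is that the list of migrated prefixes grows phase by phase until nothing is routed to Express.

```typescript
type Upstream = "express" | "nest";

// Hypothetical path prefixes that have already been moved to NestJS.
const migratedPrefixes: readonly string[] = ["/health", "/users/public"];

// Decide which backend should serve a request path. A prefix matches only
// on a whole path segment, so "/healthz" is not treated as "/health".
function pickUpstream(
  path: string,
  migrated: readonly string[] = migratedPrefixes,
): Upstream {
  const isMigrated = migrated.some(
    (prefix) => path === prefix || path.startsWith(prefix + "/"),
  );
  return isMigrated ? "nest" : "express";
}
```

Keeping the rule as data (a list of prefixes) means a rollback is a one-line change: remove the prefix and traffic returns to the Express service.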

Challenges We Faced

The migration wasn’t without hurdles:

Middleware compatibility: Some Express middleware relied on mutating the req object in ways that didn’t play nice with NestJS’s request lifecycle. We had to wrap 8 middleware functions in NestJS-compatible decorators.
Testing gaps: Our Express test suite had 45% coverage. We used the migration as an opportunity to boost coverage to 82% with NestJS’s built-in testing utilities, which added 2 weeks to the timeline.
Team training: 6 engineers had never used NestJS before. We ran 3 internal workshops and paired junior engineers with senior NestJS users for the first 2 months of migration.
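
As a rough sketch of the middleware wrapping described above, the adapter below exposes a use(req, res, next) method, which is the shape of NestJS's NestMiddleware interface, and delegates to a legacy Express-style function. The types Req, Res, and Next are local stand-ins so the example runs without dependencies, and attachUser and LegacyMiddlewareAdapter are hypothetical names, not our production code.

```typescript
// Minimal local stand-ins for the Express request/response types.
// All names in this sketch are hypothetical.
interface Req {
  headers: Record<string, string | undefined>;
  user?: { id: string };
}
type Res = Record<string, unknown>;
type Next = (err?: unknown) => void;
type LegacyMiddleware = (req: Req, res: Res, next: Next) => void;

// A legacy-style middleware that mutates req directly.
const attachUser: LegacyMiddleware = (req, _res, next) => {
  const id = req.headers["x-user-id"];
  if (!id) {
    next(new Error("missing x-user-id header"));
    return;
  }
  req.user = { id };
  next();
};

// Adapter with a use(req, res, next) method. Synchronous throws from the
// legacy function are forwarded to next(), and next() is invoked at most once.
class LegacyMiddlewareAdapter {
  private readonly legacy: LegacyMiddleware;

  constructor(legacy: LegacyMiddleware) {
    this.legacy = legacy;
  }

  use(req: Req, res: Res, next: Next): void {
    let settled = false;
    const once: Next = (err) => {
      if (settled) return;
      settled = true;
      next(err);
    };
    try {
      this.legacy(req, res, once);
    } catch (err) {
      once(err);
    }
  }
}
```

In a real NestJS application, a class like this would implement NestMiddleware and be registered through a module's configure(consumer) method for specific routes, rather than globally.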

The Results: 30% Fewer Bugs

We tracked bug rates for 3 months post-migration, comparing to the same 3-month period the year prior:

Total production bugs dropped from 47 to 33 (30% reduction).
Bugs tied to routing/architecture inconsistencies dropped from 19 to 2 (89% reduction).
Time to debug and fix API bugs dropped from 4.2 hours to 1.8 hours on average.
Test suite runtime dropped from 12 minutes to 3.5 minutes, with 82% coverage (up from 45%).
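
For readers who want to check the percentages, the arithmetic is a plain relative reduction, (before - after) / before. The helper name percentReduction is ours for this illustration; the input figures are the ones listed above.

```typescript
// Relative reduction from `before` to `after`, expressed as a percentage.
function percentReduction(before: number, after: number): number {
  if (before <= 0) {
    throw new RangeError("before must be positive");
  }
  return ((before - after) / before) * 100;
}

const results = {
  totalBugs: percentReduction(47, 33), // about 29.8, reported as 30%
  routingBugs: percentReduction(19, 2), // about 89.5, reported as 89%
  debugHours: percentReduction(4.2, 1.8), // about 57.1
  testRuntimeMinutes: percentReduction(12, 3.5), // about 70.8
};

console.log(results);
```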

We also saw secondary benefits: new engineer onboarding time dropped from 3 weeks to 1.8 weeks, and we were able to add 8 new endpoints in the 2 months post-migration, compared to 5 in the same period pre-migration.

Lessons Learned

If you’re planning a similar migration, here’s what worked for us and what we’d do differently:

Don’t skip incremental migration. A full rewrite would have taken 9+ months and risked major downtime.
Prioritize testing early. We should have boosted Express test coverage before starting migration to avoid rework.
Use NestJS’s built-in OpenAPI tools. We manually updated our API docs post-migration, but the @nestjs/swagger module could have generated most of them from our controllers and DTOs.

Conclusion

Moving from Express 4 to NestJS 11 wasn’t just a technical upgrade; it was a cultural shift toward more consistent, testable code. The 30% bug reduction alone paid for the migration effort in 4 months, and we’re now better positioned to scale our API as our user base grows. If your Express codebase is starting to show its age, NestJS 11 is a worthy successor.