Sergiy Yevtushenko

Posted on • Originally published at pragmatica.dev

Fail-Safe Your Legacy Java in One Sprint

The Scaling Wall

Your Java system works. It's been working for years. But lately, it's showing strain. Response times creep up during peak hours. That batch job that used to finish overnight now runs into morning operations. Users notice.

The conventional answer is microservices. Decompose the monolith, deploy to Kubernetes, hire a platform team. Eighteen months, significant budget, and a team with specialized skills you don't have. Meanwhile, every deployment feels like Russian roulette. One bad release, one server failure, and the business stops.

For middle-sized businesses, this isn't a technology problem. It's a survival problem. The system that runs your operations is both essential and fragile. You can't afford to replace it, and you can't afford to lose it.

The 50% Rule

What if half your servers could fail and users wouldn't notice?

Aether, the runtime behind the slice architecture, provides exactly this guarantee. When your code runs across a cluster, the failure of fewer than half the nodes affects only performance, not functionality. Requests automatically route to surviving nodes. No manual intervention, no pager alerts at 3 AM.

This isn't eventual consistency or graceful degradation. It's actual redundancy. The same request, processed by any available node, producing the same result. Your business keeps running while you fix the failed hardware.
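To make the rule concrete, here is the arithmetic as a small sketch. It assumes "fewer than half" means a strict minority of nodes; `FailureBudget` and its methods are hypothetical names used for illustration, not part of Aether.

```java
// Illustrative arithmetic only; FailureBudget is a hypothetical helper, not an Aether API.
final class FailureBudget {
    private FailureBudget() {}

    // Largest number of failed nodes that is still strictly less than half the cluster.
    static int maxTolerableFailures(int clusterSize) {
        if (clusterSize < 1) {
            throw new IllegalArgumentException("clusterSize must be positive");
        }
        return (clusterSize - 1) / 2;
    }

    // True while the failed nodes remain a strict minority of the cluster.
    static boolean staysFunctional(int clusterSize, int failedNodes) {
        return failedNodes >= 0 && 2 * failedNodes < clusterSize;
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5, 7}) {
            System.out.println(n + " nodes tolerate " + maxTolerableFailures(n) + " failure(s)");
        }
    }
}
```

Under this reading, a fourth node tolerates no more failures than three do, which is why odd cluster sizes are the usual choice.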

For a C-level executive, this translates simply: business continuity without enterprise budget. The system that runs your operations becomes the system that survives failures.

The Simplest Migration Path

Here's how to start. Pick a relatively independent part of your system. Something that's already hitting limits. Something with clear boundaries.

Step 1: Extract an interface. This is mechanical. Any Java developer can do it. The interface defines what the component does, not how.

@Slice
public interface ReportGenerator {
    Promise<Report> generateReport(ReportRequest request);
}

Step 2: Make it idempotent. The same request should produce the same result, even if processed multiple times. Pragmatica Lite provides built-in support for this pattern. For most code, it's a small change.
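One common way to get idempotency is to key results by a request identifier, so a retry returns the first result instead of redoing the work. The sketch below is a plain-Java illustration of that idea and does not use Pragmatica Lite's built-in support, whose API is not shown here. `IdempotentExecutor` is a hypothetical name, and `CompletableFuture` stands in for `Promise`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: deduplicates work by request key. Not part of Pragmatica Lite.
final class IdempotentExecutor<K, V> {
    // Unbounded for brevity; a real system would expire or persist entries.
    private final ConcurrentMap<K, CompletableFuture<V>> results = new ConcurrentHashMap<>();
    private final Function<K, V> operation;

    IdempotentExecutor(Function<K, V> operation) {
        this.operation = operation;
    }

    // The first call for a key starts the work; repeated calls receive the same future.
    CompletableFuture<V> execute(K requestKey) {
        return results.computeIfAbsent(
            requestKey,
            key -> CompletableFuture.supplyAsync(() -> operation.apply(key)));
    }
}

final class IdempotencyDemo {
    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        IdempotentExecutor<String, String> executor =
            new IdempotentExecutor<>(id -> "report-" + id + "-run-" + runs.incrementAndGet());

        String first = executor.execute("req-42").join();
        String retry = executor.execute("req-42").join();

        System.out.println("same result: " + first.equals(retry));
        System.out.println("executions: " + runs.get());
    }
}
```

Note that this sketch also caches failed futures, so a real implementation would need a policy for retrying after errors.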

Step 3: Deploy Ember. Ember runs multiple cluster nodes in the same JVM as your existing application. Your legacy code calls the interface exactly as before. No changes to call sites.

That's it. Your first slice is running.

Important caveat: This initial step is not fault-tolerant. You're still running in a single JVM, which means a single point of failure. But here's the key insight: it's no worse than what you have today. You haven't added risk. You've laid the foundation for removing it.

From Foundation to Fault Tolerance

The foundation is in place. Now you can build on it.

Moving from Ember to full Aether deployment is a configuration change, not a code change. Your slices, your interfaces, your business logic—all unchanged. You're just telling the runtime to distribute across multiple machines instead of running in-process.

Now the 50% rule applies. Your report generator runs on three nodes. One dies. The other two handle the load. You fix the failed node when convenient, not when panicked.
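To picture what "the other two handle the load" means for a single request, here is a minimal failover loop. It is a simplified illustration under stated assumptions, not Aether's routing: `FailoverRouter` is a hypothetical name, and each node is modeled as a plain function that throws when the node is down.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical failover loop; Aether's actual routing is not shown in this article.
final class FailoverRouter<Q, R> {
    private final List<Function<Q, R>> nodes;

    FailoverRouter(List<Function<Q, R>> nodes) {
        this.nodes = List.copyOf(nodes);
    }

    // Tries each node in order and returns the first successful answer.
    R route(Q request) {
        RuntimeException lastFailure = null;
        for (Function<Q, R> node : nodes) {
            try {
                return node.apply(request);
            } catch (RuntimeException e) {
                lastFailure = e; // this node failed; fall through to the next one
            }
        }
        throw new IllegalStateException("no node could handle the request", lastFailure);
    }
}

final class FailoverDemo {
    public static void main(String[] args) {
        Function<String, String> dead = request -> {
            throw new IllegalStateException("node down");
        };
        Function<String, String> alive = request -> "report for " + request;

        FailoverRouter<String, String> router = new FailoverRouter<>(List.of(dead, alive, alive));
        System.out.println(router.route("Q3"));
    }
}
```

Retrying on another node is only safe because Step 2 made the operation idempotent: a request that was partially processed by the failed node produces the same result when a survivor handles it again.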

Each step in this path delivers value:

  • Ember in-JVM: Foundation laid, no new risk
  • Ember multi-node: Development and testing environment
  • Aether cluster: Production fault tolerance

You control the pace. Extract another slice when ready. Expand the cluster when needed. The architecture grows with your confidence.

The Peeling Pattern

Once your slice is running, you have a choice: leave the internals as-is, or gradually refactor them. The peeling pattern lets you do the latter without risk.

Phase 1: Wrap everything. Your initial slice wraps the entire legacy method:

private Promise<Report> generateReport(ReportRequest request) {
    // The whole legacy call is wrapped in a single lift(); nothing inside it changes yet.
    return Promise.lift(() -> legacyReportService.generate(request));
}

Phase 2: Peel the outer layer. Refactor into a Sequencer, but keep each step wrapped:

private Promise<Report> generateReport(ReportRequest request) {
    return validateRequest(request)
        .flatMap(valid -> Promise.lift(() -> legacyFetchData(valid)))
        .flatMap(data -> Promise.lift(() -> legacyProcess(data)))
        .flatMap(result -> Promise.lift(() -> legacyFormat(result)));
}

Phase 3: Peel deeper. Take one wrapped step and expand it:

private Promise<RawData> fetchData(ValidRequest request) {
    return Promise.all(
        Promise.lift(() -> legacyFetchFromDb(request)),
        Promise.lift(() -> legacyFetchFromApi(request))
    ).map(this::combineData);
}

Each phase keeps the code working. Tests pass at every step. You can stop anywhere: the system runs fine with a mix of JBCT (Java Backend Coding Technology) code and wrapped legacy code. The lift() calls mark exactly where legacy code remains, making progress visible and the remaining work obvious.
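If you want to experiment with the wrapping idea without the library on the classpath, the following plain-Java sketch mimics it. It is an analogy, not Pragmatica Lite's implementation: `LegacyLift.lift` is a hypothetical stand-in for `Promise.lift`, `CompletableFuture` stands in for `Promise`, and `thenCompose` plays the role of `flatMap`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the lift() wrapper; not Pragmatica Lite's implementation.
final class LegacyLift {
    private LegacyLift() {}

    // Runs a legacy call that may throw and captures the outcome in a future.
    static <T> CompletableFuture<T> lift(Callable<T> legacyCall) {
        CompletableFuture<T> result = new CompletableFuture<>();
        try {
            result.complete(legacyCall.call());
        } catch (Exception e) {
            result.completeExceptionally(e);
        }
        return result;
    }
}

final class PeelingDemo {
    // Placeholder legacy methods, invented for this example.
    static String legacyFetch(String id) {
        return "rows for " + id;
    }

    static String legacyFormat(String data) {
        return "[" + data + "]";
    }

    // Two wrapped legacy steps chained together, in the style of Phase 2.
    static CompletableFuture<String> generate(String id) {
        return LegacyLift.lift(() -> legacyFetch(id))
            .thenCompose(data -> LegacyLift.lift(() -> legacyFormat(data)));
    }

    public static void main(String[] args) {
        System.out.println(generate("Q3").join());
    }
}
```

The useful property is that a thrown exception becomes a failed result that travels down the chain, so each wrapped step can later be replaced without changing how its neighbors handle errors.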

What You Don't Need

Traditional modernization projects require capabilities most middle-sized businesses don't have. Aether is different.

No Kubernetes expertise. Aether manages its own clustering. You don't need to learn pod configurations, service meshes, or container orchestration.

No platform team. The runtime handles deployment, discovery, and failover. Your existing operations team can manage it.

No new infrastructure. Start with Ember in your existing JVM. Add machines only when you're ready for fault tolerance.

No retraining. Same Java. Same IDE. Same debugging. Your developers write slice interfaces exactly like they write any other interface. The patterns are familiar; only the deployment model changes.

The migration path is designed for teams that have a business to run, not a technology transformation to execute.

The Path Forward

Once the foundation is working, possibilities open up.

More slices mean more of your system becomes fault-tolerant. The relatively independent parts you migrated first are now proven. You understand the pattern. The next extraction is faster.

Aether's operational model scales beyond manual management. A three-tier control system handles increasingly complex decisions:

  • Decision trees handle routine scaling—deterministic, predictable, fast
  • TTM (predictive models) detect patterns and scale preemptively
  • LLM agents (planned) handle capacity planning and anomaly investigation
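As a rough picture of the first tier, a deterministic scaling rule can be as small as a pair of threshold checks. The sketch below is purely illustrative: `ScalingRule`, the CPU metric, and the 80% and 30% thresholds are assumptions made for this example, not Aether's configuration or defaults.

```java
// Hypothetical scaling rule; names and thresholds are illustrative, not Aether defaults.
final class ScalingRule {
    enum Action { SCALE_UP, SCALE_DOWN, HOLD }

    private ScalingRule() {}

    // Same inputs always give the same decision, which keeps routine scaling predictable.
    static Action decide(double cpuUtilization, int nodeCount, int minNodes, int maxNodes) {
        if (cpuUtilization > 0.80) {
            return nodeCount < maxNodes ? Action.SCALE_UP : Action.HOLD;
        }
        if (cpuUtilization < 0.30) {
            return nodeCount > minNodes ? Action.SCALE_DOWN : Action.HOLD;
        }
        return Action.HOLD;
    }

    public static void main(String[] args) {
        System.out.println(decide(0.92, 3, 3, 5));
        System.out.println(decide(0.15, 4, 3, 5));
        System.out.println(decide(0.15, 3, 3, 5));
    }
}
```

Keeping a minimum node count in the rule matters for the 50% rule: scaling down below three nodes would remove the fault tolerance the cluster was built for.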

You're not just surviving anymore. You're building toward a system that manages itself.

Conclusion

The legacy Java system running your business doesn't need a complete rewrite. It needs a path forward that doesn't bet the company on an 18-month transformation project.

Start with one slice. Run it in Ember alongside your existing code. Prove it works. Then decide: add fault tolerance, extract another slice, or pause and let the system prove itself in production.

The 50% rule isn't a promise for someday. It's a capability you can reach in weeks, not years. Your business deserves infrastructure that survives failures. Now there's a path to get there.


Part of Java Backend Coding Technology - a methodology for writing predictable, testable backend code.
