Kazuya

Posted on Dec 8, 2025

AWS re:Invent 2025 - Driving modernization using Mphasis’ Agentic AI framework (MAM219)

🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.

Note: A comprehensive list of re:Invent 2025 transcribed articles is available in this Spreadsheet!

Overview


In this video, Anup Nair (CTO) and Bharat (Senior Partner) from Mphasis.ai present their Agentic AI Framework for legacy modernization. They address the common CIO challenge of being unable to innovate due to risky legacy systems built on COBOL, Natural, and Adabas. Their solution uses four autonomous agents: NeoZeta (extracts intelligence from code into knowledge graphs), NeoSaba (generates user stories with INVEST scoring), NeoRena (defines target architecture), and NeoCrux (generates code). The framework centers on Ontosphere, an enterprise knowledge graph with domain ontologies. Live demos show reverse engineering of post-trade processing COBOL programs with 95% accuracy using LLM as a Judge, GPU vs CPU performance comparisons, and BDD generation. Results show 50 million lines of code modernized in 18 months versus traditional 7 years, with intelligence converted to data to eliminate future legacy issues.

This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

The Legacy Modernization Challenge: Why Traditional Approaches Fall Short

My name is Anup Nair, and I'm the CTO for Mphasis.ai. And I'm Bharat, Senior Partner at Mphasis.ai. We are going to talk about legacy modernization and Mphasis' Agentic AI Framework for legacy modernization. We have some exciting demos as part of this presentation as well, so I'm hoping you'll like it. If you have any questions, please feel free to catch us offline. I don't think they allow questions here, but yeah, why not, right?

To start off, let me give you a little background. Let's discuss the problem statement first. In the last 25 years, every CIO I have met has had this problem: "I can't innovate fast enough because it's too risky to touch any legacy platform." Are you guys on the same page when it comes to this? In fact, one of the CIOs actually told me, "I have 49 core systems that are built on COBOL mainframes, and if I touch one of them, I have to touch all the remaining 48 as well." So this is the problem we decided to solve.

Mphasis.ai has built many different Agentic AI solutions to solve these problems, and Mphasis has many years of experience building and modernizing legacy applications. We're going to talk about this, but first I want to double-click on the problem a little more, to explain the genesis of our solution and then describe the solution itself.

The problem with legacy systems is that enterprises are anchored on these systems for years. These systems have code and business logic embedded inside code for years, right? Every time you want to do something, you need engineers. This is in COBOL, this is in Java, or it could be Natural or Adabas, or it could be Assembler for that matter. It could be all of those kinds of systems, but you need specialist engineers to handle this.

These engineers are not easily available. They're not in the market. They're out of the workforce at this point in time. Therefore, you start reinventing the wheel, keep reinventing the wheel on an ongoing basis, and you start creating a lot of technical debt. Now, any change is too costly because you're doing it at ten different places, and you're creating layers and layers and layers to solve the problem because you don't know what's actually going on.

Now, you think about this and you decide to put AI on top of it, Agentic AI, right? You're driving innovation, you need to put agents, right? Whether it is agents around claims or agents around underwriting or whichever agent you want to put on top of this, think how it is going to be. Because all of these legacy systems are so deeply monolithic, it is extremely hard for any agent to give you any productivity. So all your Agentic AI objectives and goals go for a toss just because you have this. This is the reason why modernization, this is the reason why any innovation is so slow, because you have to touch everything out there.

We at Mphasis, we thought about this and we said, okay, you know, it is really not just about technical modernization. It's not just modernizing COBOL to Java or Natural to Java or C or C++ to something new. That is not the point. It's about taking the intelligence out of the legacy systems and converting that into data. That should be the goal, as opposed to just modernizing code from A to B. That is exactly the approach we have taken.

Mphasis' Agentic AI Framework: Intelligence Extraction Through NeoZeta, NeoSaba, NeoRena, and NeoCrux

Let me talk a little more about how we solve this, but first, how do you typically modernize? Are you aligned to this, everyone? First you relearn from your legacy systems, then you reimagine, then you rearchitect, then you recode, and then you start running it. This is how you do it. You take COBOL, you convert it into Java, and you start managing it again. In about three years' time, that becomes legacy again. This is how you typically do it, right?

So what we have done is take a slightly different approach. We've built AI agents that are autonomous and semi-autonomous, with humans in the loop, because anybody who says that everything is 100% autonomous is joking. It's not.

So we took documents and code, and we built an agent called NeoZeta. This agent actually reads everything and converts everything into human understandable knowledge. How do you do that? We use domain knowledge, emphasis on domain knowledge, to really make that happen because we've encoded them. We created a knowledge graph. And then we thought, now we have a knowledge graph, how do you take this and create a new system? What do you do? You generate user stories out of it.

So we created another agent, which we call NeoSaba. This agent takes everything that you've learned, converts it into user stories, and focuses on governance, compliance, and business processes. SABA stands for semi-autonomous business analyst. We take that and we allow business analysts to work on it. We then take whatever we create out of this and put it back into the knowledge graph.

Now, what's the next step? You reimagine how to rearchitect. We created another agent, which we call NeoRena. This agent takes everything you've learned and helps you define the target state architecture. This target state architecture is customizable for you, for the client, and everything you create out of it aligns to enterprise standards. That is the reason it's customizable: you need to make sure it aligns to your enterprise standards.

Once you create that, all the architecture is done. You take that and you rewrite. We created an agent called NeoCrux, which takes everything, allows you to prompt, and uses whatever coding agent is available in your enterprise to start writing code. This continues into ops as well, right? Why not? You've built everything and you've tested it through Crux. Now you're putting it into production, so you're going to run operations as well.

But all through this we've built a connective tissue. We've extracted the intelligence out of the whole thing and created an enterprise knowledge graph. That enterprise knowledge graph is not just a database. Knowledge graph is a database, but this is not just a database. It has meaning. It has enterprise meaning because it is connected to your domain. We call it Ontosphere. And then we used a layer on top of it to orchestrate the whole thing, so that we can get it as autonomous as possible. So this is Mphasis' approach to modernization.
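To make the idea of a knowledge graph tied to domain meaning a little more concrete, here is a minimal sketch. It is not Mphasis' Ontosphere implementation, which the talk does not detail. It assumes a tiny in-memory store of subject-predicate-object triples; the class name `KnowledgeGraph`, the predicates `implements` and `maps_to`, and the sample program, rule, and concept names are all hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory triple store: (subject, predicate, object)."""

    def __init__(self):
        # subject -> set of (predicate, object) pairs
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Return the sorted objects linked to `subject` via `predicate`."""
        pairs = self._by_subject.get(subject, set())
        return sorted(o for p, o in pairs if p == predicate)

    def domain_concepts_for(self, program):
        """Follow program -> business rule -> domain concept links."""
        concepts = set()
        for rule in self.objects(program, "implements"):
            concepts.update(self.objects(rule, "maps_to"))
        return sorted(concepts)


# Hypothetical facts, as an extraction step might record them.
kg = KnowledgeGraph()
kg.add("TRADEPOST.cbl", "implements", "rule:validate-settlement-date")
kg.add("TRADEPOST.cbl", "implements", "rule:compute-net-amount")
kg.add("rule:validate-settlement-date", "maps_to", "domain:SettlementDate")
kg.add("rule:compute-net-amount", "maps_to", "domain:TradeAmount")

print(kg.domain_concepts_for("TRADEPOST.cbl"))
```

The point of the sketch is only that the graph links code-level facts to domain concepts, so a question like "which domain concepts does this program touch?" becomes a graph traversal rather than a code-reading exercise.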

Now, you will see some of them are autonomous, some of them are semi-autonomous. As AI progresses, as context abilities grow, this will become more and more autonomous, and essentially you create a live intelligence of the enterprise. So what you are going to see today as a demo, you're going to see NeoZeta extracting intelligence out of code. You're going to see NeoSaba allowing the business user to create user stories. You're going to see NeoRena, you're going to see NeoCrux, and you're not going to see AI ops because we didn't have enough time, but you're going to see a glimpse of Ontosphere as well. So you're going to see five things: Zeta, Saba, Rena, Crux, and Ontosphere.
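The flow described so far, four agents sharing one knowledge graph with humans in the loop for some steps, can be sketched roughly as below. This is an illustration under assumptions, not the actual orchestration layer: the `Stage` type, the `run_pipeline` function, the `approve` callback, and the choice of which stages require review are all hypothetical.

```python
from typing import Callable, Dict, NamedTuple


class Stage(NamedTuple):
    name: str
    run: Callable[[Dict], Dict]  # reads the shared graph, returns new facts
    needs_review: bool           # True for semi-autonomous stages (assumed)


def run_pipeline(stages, graph, approve):
    """Run stages in order, writing each stage's output back to the graph.

    `approve(stage_name, output)` stands in for a human reviewer. If the
    reviewer rejects a stage's output, the pipeline stops at that stage.
    """
    completed = []
    for stage in stages:
        output = stage.run(graph)
        if stage.needs_review and not approve(stage.name, output):
            break
        graph.update(output)
        completed.append(stage.name)
    return completed


# Placeholder stage bodies; real agents would call models and tools here.
stages = [
    Stage("NeoZeta", lambda g: {"rules": ["rule-1"]}, False),
    Stage("NeoSaba", lambda g: {"stories": [f"story for {r}" for r in g["rules"]]}, True),
    Stage("NeoRena", lambda g: {"architecture": "target-state"}, True),
    Stage("NeoCrux", lambda g: {"code": f"{len(g['stories'])} module(s)"}, False),
]

shared_graph = {}
done = run_pipeline(stages, shared_graph, approve=lambda name, out: True)
print(done)
```

Each stage only reads from and writes to the shared graph, which mirrors the "connective tissue" idea: later agents depend on the extracted knowledge, not directly on the legacy code.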

Live Demonstration: From COBOL Reverse Engineering to Java Flink Architecture Using Ontosphere

I'm going to switch to a demo and let Bharat walk you through the whole suite of things.

So let me start by talking about Ontosphere, because this is where the heart of the information lies, right? All the knowledge in the system, whether it is code or documents, is going to be brought into the Ontosphere as data.

Now we're using a couple of domain ontologies which have been built up for the financial industry, insurance, and other industries. We have also put in a couple of engineering ontologies which work through our agents. This is the basis on which a lot of modeling will be done into the system.

Let me go to the first agent, NeoZeta. As you can see, I've selected a program and I've selected a capability. We took a very complex capability in terms of the post-trade processing because this is where you have a process which stops at 4 o'clock. You may have millions of trades which have to be processed and files have to be sent to the federal authorities. We took this problem and said that the only way we will be able to solve this is to pick up a lot of intelligence from each of these processes which have been coded in legacy COBOL. This is an example of COBOL that we're showing you, and then we reverse engineer it and help to modernize it using our agents onto a FIBO-based architecture. I'll show you a glimpse of how we executed it on a CPU as well as on a GPU and how the performance varies.

Continuing with our demo, I have selected the capability. I can upload my files, which may be documents or other assets. I've uploaded COBOL copybooks and, as you see, I've also uploaded a domain model. These are the programs that you see, and this is the domain model that I've uploaded. Using the Relearn agent, I'm relearning each and every program or an entire capability. As you see, I've selected a program, QUANTILES, along with its copybooks, and I'm reverse engineering it. The agent can generate the data dictionary. Now, we haven't fed it any information, but you see that it has brought out a good amount of information from the system and is starting to map it to the actual functional attributes.

I'm also showing you the document that it produces. The document has a certain format. It produces a summary, then for each and every business rule, it will start providing me a lot of data elements, as you would see. It'll give me the logic, it'll give me the input, it'll give me the data affected, and it'll give me a lot of other processing elements that I would need as part of relearning a certain code.

Now, the most important piece is how do you verify that the system has done it very accurately. We've introduced what is called LLM as a Judge, and it gives me a confidence score for every rule that has been reverse engineered. As you see, it tells me that one of the rules has got an average coverage which is 85 percent. For us, 85 percent is average. Anything above 95 percent is something which we count as a better one. There is a human-in-the-loop interface over here. I can go and make corrections over here. The first 100,000 lines of code typically need such corrections, but then for the next million lines of code, it's an automated way where you get 95 percent accuracy from the Relearn agent.
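A rough sketch of how per-rule confidence scores could drive the human-in-the-loop step is shown below. The two thresholds come from the talk (85 percent is "average", above 95 percent is better). Everything else is an assumption for illustration: the `JudgedRule` type, the `triage` function, the three outcome labels, and in particular the handling of scores below 85 percent, which the talk does not describe.

```python
from dataclasses import dataclass

GOOD_THRESHOLD = 0.95     # "anything above 95 percent" in the talk
AVERAGE_THRESHOLD = 0.85  # "85 percent is average" in the talk


@dataclass
class JudgedRule:
    rule_id: str
    confidence: float  # judge score in the range 0.0 to 1.0


def triage(rule: JudgedRule) -> str:
    """Map a judge score to a next step (labels are hypothetical)."""
    if rule.confidence > GOOD_THRESHOLD:
        return "accept"   # no correction expected
    if rule.confidence >= AVERAGE_THRESHOLD:
        return "review"   # send to an SME for correction
    return "rework"       # assumed: re-run extraction before review


rules = [
    JudgedRule("BR-001", 0.97),
    JudgedRule("BR-002", 0.85),
    JudgedRule("BR-003", 0.60),
]
for rule in rules:
    print(rule.rule_id, triage(rule))
```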

We've produced a data dictionary, we've produced business rules, and you've seen the business rules verification as well. The third part is to convert everything onto the Ontosphere, which is the knowledge graph, and let me demonstrate that to you. It's just showing you some of the attributes from LLM as a Judge. I'll go to the knowledge graph right now. Actually, before I go to the knowledge graph, note that it has also produced information for the entire capability. You can work program by program or on the entire capability. Typically, if you have a big job running with multiple programs and multiple processes in it, you would want such a view to bring out the entire intelligence.

Finally, the third step is converting it to a knowledge graph. If you look at it, I've picked up one of the programs which are reverse engineered and it is showing me which are the domain models that have been associated with it. I can double click on functions, on the attributes, on the information, and this is something that an SME can use for verification purposes. Essentially, everything that you saw earlier was the first agent. This is where the information comes to the second agent.

All the information is put up as part of the business process. This is exactly the graph that you saw earlier. It now comes into the business process as a workflow item, and I can leverage it, look at it, and start reimagining my application, which is in the top section that you see. I can even pick up business rules, combine multiple business rules, and create a new rule. So we've given those capabilities out there.

On the top, what you're seeing is essentially an overall agile remodeling of the application. So I'm completely reimagining it. It's not a lift-and-shift model that we have. I'm helping the BSA define the epics, features, and user stories, and for every user story the quality is assessed through the INVEST score. There is also a prompt mechanism by which you can generate acceptance criteria and a lot more information. So you see the Gherkin output over here. We actually create BDD at this point, so quality essentially starts for us here. This is the prompt where, if a story doesn't meet my INVEST criteria, I can break it down further into many different rules.
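As an illustration of the two ideas in this step, the sketch below scores a user story against the six INVEST criteria (Independent, Negotiable, Valuable, Estimable, Small, Testable) and renders acceptance criteria as a Gherkin scenario. The scoring rule (an unweighted fraction of criteria met), the function names `invest_score` and `to_gherkin`, and the sample story are assumptions; the talk does not say how the agent computes its score.

```python
INVEST_CRITERIA = (
    "independent", "negotiable", "valuable",
    "estimable", "small", "testable",
)


def invest_score(checks: dict) -> float:
    """Fraction of INVEST criteria marked True (equal weights assumed)."""
    met = sum(1 for name in INVEST_CRITERIA if checks.get(name, False))
    return met / len(INVEST_CRITERIA)


def to_gherkin(feature: str, scenario: str, given: str, when: str, then: str) -> str:
    """Render one acceptance criterion as a Gherkin scenario."""
    lines = [
        f"Feature: {feature}",
        f"  Scenario: {scenario}",
        f"    Given {given}",
        f"    When {when}",
        f"    Then {then}",
    ]
    return "\n".join(lines)


# Hypothetical story assessment: "small" is not met, so the story
# would be a candidate for breaking down further.
checks = {
    "independent": True, "negotiable": True, "valuable": True,
    "estimable": True, "small": False, "testable": True,
}
print(round(invest_score(checks), 2))
print(to_gherkin(
    feature="Post-trade settlement validation",
    scenario="Reject a trade with a past settlement date",
    given="a trade whose settlement date is before the trade date",
    when="the trade is submitted for post-trade processing",
    then="the trade is rejected with a validation error",
))
```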

So, in the interest of time, I'm going to the third agent. With the third agent, the important part is to rearchitect or redesign. What is needed here is three things, right? Actually two things. One is a playbook of the steps we have to follow in terms of reverse engineering, and the second is the standards that I will use. We've given both as input, and I'll just show you how we do this. It has started to give me all the information, whether it's a logical model, a physical model, a sequence diagram, or an observability pattern that has to be leveraged. Everything that has been fed in as a standard is brought up by the agent, which is NeoRena, the third agent that Anup spoke about, right?

The job of this particular agent is to build the entire context and make a prompt ready for my next agent from a development perspective. So if you look at it, I'm generating right now. I've fed in the information it needs to generate a design using a Java Flink model. In the interest of time, I'm just showing how fast you can actually do this step. I'm creating a new design, and I'm giving it the playbook as well as the standards which have to be used. I'm using the data engineering architecture, I'm giving it the playbook and the standards for Flink, and I'm generating the entire context. So pretty much all the information gets generated from that perspective.

Now I'm going to my fourth agent, which is the code generation agent. This agent gets the entire context and generates code for a Java Flink architecture. What you essentially see here is an example of how this ran on the GPU as well as on the CPU, and there's a distinct difference in the timing. We also modeled it for multiple sets of trades: for example, if I go from 20,000 trades to 100,000 trades, what will the cost be on a CPU versus a GPU? That helps them make a call on a job-by-job basis. So that was the one I had to show. Let's go back to the deck.
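The job-by-job CPU versus GPU decision can be framed as a simple cost model, sketched below. This is only an illustration of the reasoning, not the model used in the demo. The `Platform` type, its fields, and every number in the example are made-up placeholders, not measurements from the talk.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    trades_per_second: float  # sustained throughput (placeholder value)
    startup_seconds: float    # fixed overhead per job (placeholder value)
    dollars_per_hour: float   # instance price (placeholder value)

    def runtime_seconds(self, trades: int) -> float:
        return self.startup_seconds + trades / self.trades_per_second

    def cost(self, trades: int) -> float:
        return self.runtime_seconds(trades) / 3600.0 * self.dollars_per_hour


def cheaper(a: Platform, b: Platform, trades: int) -> Platform:
    """Pick the platform with the lower modeled cost for this job size."""
    return a if a.cost(trades) <= b.cost(trades) else b


# Placeholder figures chosen only to show the shape of the comparison.
cpu = Platform("cpu", trades_per_second=100.0, startup_seconds=0.0, dollars_per_hour=36.0)
gpu = Platform("gpu", trades_per_second=2000.0, startup_seconds=60.0, dollars_per_hour=72.0)

for volume in (1_000, 20_000, 100_000):
    choice = cheaper(cpu, gpu, volume)
    print(volume, choice.name, round(choice.cost(volume), 2))
```

With a fixed startup overhead on the faster platform, small jobs can favor the slower, cheaper one, which is why the call is made per job rather than once for the whole workload.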

So guys, this is what we have seen so far. From a results perspective, this is what we've achieved. A typical modernization program with about 50 million lines of code takes around seven years. We've achieved this kind of progress. And that infinity at the end, guess what it is? It is the fact that your entire intelligence moves into data and lives with you forever. You never have a legacy again, right? That's the thing, and this is how we are different from anybody else. You'll find a lot of people who move A to B, but very few who will extract the intelligence and give you a state where you'll never have a legacy anymore. So that's what we wanted to cover.

Our agents have actually run on Databricks. So you know they scale up pretty high on that. So we've constantly been measuring it, monitoring it from that perspective. Thank you guys. That was all we had. The time is up. Thank you so much.

This article is entirely auto-generated using Amazon Bedrock.