🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.
Note: A comprehensive list of re:Invent 2025 transcribed articles is available in this Spreadsheet!
Overview
📖 AWS re:Invent 2025 - Beyond Vector Search: Ultra-Resilient GenAI Apps with AWS Bedrock (DAT202)
In this video, YugabyteDB presents how to simplify building Gen AI applications using distributed Postgres. The speaker addresses key challenges: rapidly changing tech stacks requiring open standards like Postgres API, accommodating diverse data types in single applications, and avoiding manual pipeline creation for each data source. YugabyteDB's solution includes managing RAG pipelines as secondary indexes with simple SQL commands, seamless laptop-to-production deployment across multi-zone and multi-cloud environments, and ultra resilience covering cloud outages, security patches, and peak traffic events. The presentation emphasizes building to iterate rather than building to last, with examples from banks implementing MCP servers and Paramount Plus handling unpredictable traffic spikes during major events.
Note: This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
The AI Revolution Demands a New Development Paradigm: Build to Change, Not to Last
Hopefully everyone's not hungry, right? Let's get started. Maybe a lot of you are listening to a lot of pitches about Gen AI, so I'm here to talk about Gen AI as well, but about how to make it simpler to build applications with Gen AI, right? So firstly, a quick intro about YugabyteDB. YugabyteDB is a distributed Postgres database, and we're simplifying building applications, whether they use Gen AI or not. That's what databases should do, right? Fully Postgres for the developer, truly cloud native, and simpler to build on. Let's see how.
So firstly, right, we are not yet at the great explosion that's going to be powered by Gen AI. We are in 2025. The effects of Gen AI are projected to happen over the next three years. We're getting there, right. So whatever you're doing now is not much different from what you were doing last year or the year before, compared to what you will be doing three years from now. You're going to be building and shipping 3 to 10 times more often, right?
So, okay, everyone's investing in AI any which way you look at it. These are our own users and customers on the right showing you charts of where their investments are. On the left, different data points, right? Obviously not going to go through them, but it's real. The investment is real, okay.
The expectations are huge, okay. People are building apparently the entire app in 3 days, in 4 days, in an hour. You know, no-code coding their way out of things, Gen AI solving problems, agents talking to agents, talking to agents, solving problems, so on and so forth, right? That's the era of the expectations we have built for ourselves, okay.
But here's the reality, okay. It's not living up to where it needs to be. Most people are doing AI. They're working for AI, but is AI working for them? Have they unlocked business value? Not so much, right. And you know, as database builders, we try to make building applications and unlocking value simpler. So that's one of the things we look at. Why is this happening?
Okay, this is the fundamental difference. AI makes the software build and deploy loops so fast, it's exponential. And as humans we cannot imagine exponential. We're not good at it. We're not built for it, so we often, you know, I don't know if everybody's like me, but I often think I can get a lot done today and this week, and I kind of have no idea how much I can get done in a year, you know. I usually end up doing more in the year and less today or the week, right? That's just human nature. So how do you build something you cannot imagine? You don't know where you're going, but you know it's big, it's huge. How do you get there?
There's one principle to change. The principle is iterate, iterate every day, right? Don't visualize the end state, just iterate your way into it, right. So I don't know about you guys. I've been in the software industry for more than 20 years, started out as a developer. There was one rule I was told: build to last. If you build it, it's got to be around for 2 to 3 years minimum, maybe more. The longer it stays, the better you built it. Well, guess what? All that's out of the window. We are building to change. You know, 6 months ago, no MCP. 6 months before that, RAG was a buzzword. 6 months before that, I don't know, Gen AI was happening. Who the hell knew? Every 6 months everything's out the window. So how do you build when you have to iterate? How do you build? And obviously the solution is not every 6 months, let's build something from scratch.
Three Core Challenges in Building Stateful AI Applications: Standards, Data Diversity, and Pipeline Simplification
You're not going to go anywhere that way. So these are the challenges that we see from building a stateful application, right. What are the challenges? First one, the tech stack is changing all the time. Open standards super important. Go for something that you know. AI will not magically solve the problem for you. AI will make you faster at something you know. So pick an open standard. In the database world, pick the Postgres API. It's really thriving and happening. Make sure it's well connected to the entire ecosystem. It lets you iterate. For us, for YugabyteDB, what are we doing? We pick up on top of Postgres. We build innovation on top to make your lives easier and to make you go faster on top of something that's already familiar to you. We bring standards across clouds purposefully, right. So different clouds are building different kinds of power-ups for AI. Postgres is an open standard, but you get those power-ups only in those clouds, like the vector search capabilities. We democratize it, okay.
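To make the "power-up" point concrete: with the open-source pgvector extension, vector search is plain SQL on standard Postgres (and therefore on anything speaking the Postgres API). The sketch below reproduces the cosine-distance ranking that pgvector's `<=>` operator performs, in pure Python so it runs anywhere; the table and column names in the comment are illustrative.

```python
import math

# pgvector ranks rows by cosine distance with the `<=>` operator, e.g.:
#   SELECT id FROM docs ORDER BY embedding <=> '[1.0, 0.0]' LIMIT 2;
# (table/column names illustrative). The same ranking in pure Python:

def cosine_distance(a, b):
    """1 - cosine similarity, as pgvector's <=> operator computes it."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def nearest(query, rows, k=3):
    """Return the ids of the k rows closest to the query vector."""
    ranked = sorted(rows, key=lambda r: cosine_distance(query, r["embedding"]))
    return [r["id"] for r in ranked[:k]]

docs = [
    {"id": "faq",     "embedding": [0.9, 0.1]},
    {"id": "pricing", "embedding": [0.1, 0.9]},
    {"id": "howto",   "embedding": [0.8, 0.2]},
]
print(nearest([1.0, 0.0], docs, k=2))  # → ['faq', 'howto']
```

Because this is just a SQL operator on a standard extension, the same query moves with you across clouds instead of being locked to one provider's vector service.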
Second challenge. How do you accommodate data diversity? Today's applications put data together that have no business being together. For example, we're talking to a bank. They want to do an MCP server. What should the user be able to ask the MCP server? How do I open a checking account?
Can you transfer $10 from my account to my friend's account? Here's his name and email and identification. Can you go do this for me? Well, that wasn't covered in the docs. That's something else you got to talk to. Do you even have authorization for that? Well, you're not going to build two chatbots. You're going to build one, right? So everything has to be integrated into a single offering, which means more locations, more formats. Your checking account balance now needs to work with the documentation on how to open a checking account. How do you make that happen and quickly?
So one of the things we're doing here, and here's the beauty of Postgres again, lots of innovation in connecting data of different types and storing transactional data, operational data, data living in S3 buckets like images or something else. I know you guys would be listening to many of the vector store providers, more power to them, coming and saying how can we make it better, make it better. Remember you've got to be able to iterate. Iteration comes from simplicity. Once you determine what you need to make better, go make that better, but make sure you build it to iterate before you build it to last, because building it to last will be all that you will accomplish if we don't think about flexibility. So this really helps bring data that are of disparate types and locations together.
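The "one chatbot" idea above can be sketched as a single answer context that mixes operational rows (a checking-account balance) with retrieved documentation chunks. This is a minimal illustration, not a YugabyteDB API; the function and field names are hypothetical.

```python
# Hypothetical sketch: merge transactional facts and retrieved docs
# into one context for a single assistant, rather than building two
# separate chatbots. All names here are illustrative.

def build_context(account, doc_chunks):
    """Combine live operational data with documentation for one prompt."""
    facts = [f"Account {account['id']} balance: ${account['balance']:.2f}"]
    docs = [f"[doc] {chunk}" for chunk in doc_chunks]
    return "\n".join(facts + docs)

context = build_context(
    {"id": "chk-001", "balance": 125.50},
    ["To open a checking account, visit any branch with valid ID."],
)
print(context)
```

The point is that the balance (operational data) and the how-to text (documentation) have to land in the same place at answer time, which is why disparate data types end up living together.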
Third problem. If you're building a pipeline every time you have a new data source or a new LLM or a new type of data, like oh gee, I added an image, now I need to go figure out how to make images work. Okay, videos, okay, let's go figure that out. How about we throw in some documentation? Okay, let's go figure that out. Each thing is a two to four week sprint. And the sprint exists just to figure out whether the product is viable. So at the end of it you get two things: you fail in proportion to the effort it takes just to test viability, and you end up with a hodgepodge of things that's very difficult to productionize. So you have to throw it all away and start from scratch.
So what is it that we're doing here? We're trying to simplify how you manage a pipeline, a RAG pipeline, as a secondary index in the database. Just tell the database, look, there's the data. That's the way I want you to parse it. It's an image. I want you to parse it using that parser. Go to that LLM and vectorize it and store the results, and it may take you two hours to do. Just give me a progress report every time I check. And when it's ready, just make it ready to serve. If I want to reindex the data, change my index type, well, just create a new index, AB test against indexes. Don't AB test by building RAG pipelines. Make your development iteration speeds faster.
The way we're doing that is a lot of cool things in the slide. The main thing you have to look at is the thing in the center. Insert for the RAG extension into a RAG source table. Just give the extension the source and it will go build the pipeline for you. You can tweak parameters, how to chunk, how to partition, how many chunks are processed, all of that stuff. Simplify it. Just use a single line of SQL and get the job done. Make sure it works before you invest in specialized stuff. Okay, you got it done. You got the MVP done. How do you go to production?
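The slide's exact SQL isn't reproduced here, so the helper below builds a statement in the same spirit: one `INSERT` into a source table that tells the extension where the data is and how to chunk and embed it. The extension, table, column, and model names are all assumptions for illustration, not actual YugabyteDB syntax.

```python
# Hypothetical single-statement RAG setup in the spirit of the slide:
# register a source and let the extension build the pipeline. The
# rag_sources table and its columns are illustrative, not real syntax.

def rag_source_sql(source_uri, parser, model, chunk_size=512, overlap=64):
    """Build one INSERT that registers a source with a RAG extension."""
    return (
        "INSERT INTO rag_sources "
        "(uri, parser, embedding_model, chunk_size, chunk_overlap) VALUES "
        f"('{source_uri}', '{parser}', '{model}', {chunk_size}, {overlap});"
    )

sql = rag_source_sql("s3://docs-bucket/manuals/", "pdf", "titan-embed-v2")
print(sql)
```

Tweaking a parameter like `chunk_size` then means issuing a new statement and A/B testing the resulting index, rather than rebuilding a bespoke pipeline by hand.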
From Prototype to Production: Achieving Ultra Resilience and Compliance in a Multi-Cloud World
It's all running on my laptop, and I need to run in a multi-zone, highly available environment, maybe AWS, maybe GCP, maybe Kubernetes. How do you make this work? Well, that's one of the things we do. YugabyteDB is fully open source. Build it on your laptop and take it to production. And so we help you with the management with an API-based end to end lifecycle management. And the thing about RAG, it will take a lot of management for the data side once it goes into production, because if it takes off, it really takes off. It goes quickly. You have to scale it quickly, scale it back quickly, build a new index, test on the fly, so on and so forth.
The way we do this is you can go from your laptop to any of these deployments, multi-zone, multi-region, multi-cloud, hybrid, and these are no longer buzzwords. These are really happening because as your application gets more and more critical, you have to make it more and more resilient. And there's a price to pay and an effort to invest. You don't want to invest that effort. You want it to be turnkey. You don't want to say to your user, yeah, great, I'm going to come back and scale and make this multi-region. Just give me six to nine months. They're not going to wait. They're going to go to the next service. So you need infrastructure that's not only pay as you go, that's make it more resilient as you become more relevant. That is an important thing.
Okay, there is no free lunch, so the thing you have to think about is how will you traverse this curve. Are you willing to pay more to make your app more resilient, or do you need to pay less and make it less resilient, and ideally without putting effort into it?
Teams of old, when they were dealing with a lot of data... I actually talked to a team. This team had thousands of databases deployed because they were supporting thousands of microservices and applications, and they said that every time they finished one sweep of database upgrades, security patching, and so on, almost six months had passed and it was time to start the second sweep. I can only imagine how much fun that job would be. But let the software do the job for you: just express the level of resilience you need and let somebody else deal with it for you.
So what we do is work with a lot of the largest enterprises in telling them exactly the different stages of resilience and what is the cost and architecture associated with it. We call it ultra resilience because when I say resilience, I'm guessing most of you are thinking a machine fails, let's get it to work somewhere else. That's a part of it, but there's a lot of other things. What happens if you get a total cloud outage? What happens if you have, for example, an upgrade request or a security patch that has to happen? What happens if hardware misbehaves without failing outright? It doesn't fail, but it doesn't work well either. That's a gray failure. What do you do if a developer actually accidentally drops a production table?
I used to work at Meta and way back in the day we had this real incident. An intern came and told us, "Guys, really sorry, I dropped a table. It was 300 terabytes. It's in production. Can you do anything to help?" Well, if you hadn't put ultra resilience, which fortunately we had, it was a ten-minute fix to say restore the old table. If it wasn't that, that would have been a nightmare of a week for everybody. And it wasn't done intentionally. When you're moving fast, mistakes will happen, and it is important to build in that resilience.
And the same thing with peak and freak events. We have one of our users, Paramount Plus, and when they go and release one of these new things, for example, a Super Bowl final or a Champions League or an Oscars or a Grammys, they really don't know how many people are going to sign up. They have no idea the hour leading up to the events. They have no idea what kind of a peak and freak show it's going to be because you just don't know. Somebody slaps somebody on a show, everybody's on that show watching it instantly. A few tweets go out and everyone's logging in. And so we're the authentication and authorization service, the profile service, and you need that ability to scale on a dime. And that is the nature of digital business in the modern days.
Compliance. I mean, I don't know if you have noticed there are a lot of fights that used to happen if a company runs a software that's consumed by a citizen of another country with their data center in a third country. So country one's company serves citizens of country two with data centers in country three. Who does the data belong to? That's a tough one. Well, that's what everybody thought, and they said, "Screw this. Everybody's keeping the data in my country. If it's my citizen, keep it in my country," and that's happening across the board and that's happening with LLMs too. If it's my country, use my LLM. I don't want to deal with something else.
So there's a lot of regulation going on in how you deal with personal data, and you've got to make that simple. It cannot be another nine months to build the app to come back to serve the user. So these are places where we make it transparent and simple, and we supercharge Postgres.
YugabyteDB's Distributed Postgres Architecture: Operational Simplicity, Security by Design, and Agentic Database Operations
So what do we do to go from prototype to production? We give you insane operational simplicity. Distributed architectures and a control plane and a built-in managed services offering. It's built for the cloud. It's built to withstand failures architecturally. Auto sharding, auto rebalancing, so when the scale hits you, you don't have to worry about how to do it. Self-healing, so failures and so on are automatically taken care of. Multi-API, which is a very interesting one. We support not just Postgres but also Apache Cassandra, and now Mongo, because the industry is building other APIs on top of Postgres. There's a trend going on where people are impersonating databases on top of Postgres to simplify your database sprawl.
And so it's consistent Postgres. It's the 100% standard Postgres, which is fully open source, cloud and Kubernetes native, and you can deploy it anywhere, multi-cloud, multi-region, and so on, and with zero downtime. Why do you go to NoSQL? Zero schema changes. You can be online all the time. Well, you can do that with YugabyteDB now by going SQL or NoSQL. And why do you go Postgres? Well, you get relational capabilities, ecosystem attached, and a lot of people, a lot of systems, a lot of software already knows Postgres. Well, you can retain that too.
So this is kind of like a high-level diagram of what it is we do to help.
The top layer consists of emulations, extensions, and the ecosystem happening through the PostgreSQL ecosystem itself. The middle layer is our architecture that handles cloud-native capabilities like replication, fault tolerance, rebalancing, auto-sharding, and the ability to run on any cloud or infrastructure, whether public, private, multi-cloud, or hybrid cloud. We reuse the PostgreSQL query processing engine, and if you think about how PostgreSQL is evolving, we're building on top of PostgreSQL to make sure your AI pipelines get simpler.
So there are five key things here. First, evolving your tech stack and sticking to standards. It's not just about PostgreSQL, although I'm talking about it. Remember, it's much easier to iterate if you have standard shapes and components. If you build something that's custom, it's going to be very tough to iterate. So put the framework in place, make sure you can measure your improvements, and iterate your way to exponential growth.
The second thing is data diversity. What has changed is that data that has no business belonging to each other now belongs to each other. They sit next to each other. Sometimes you have to move a lot of data, and other times you don't want to move the data. Don't do handcrafted stuff. Try to make it more standard within the framework.
Third, don't build pipelines by hand. Every experiment will become a handcrafted pipeline-like work. Don't do the work for experiments. Experiment, go to production, learn the real lessons, and then do whatever it is you have to do after you know your return on investment.
The fourth thing is to remember that if you cannot go to production and if you cannot boldly experiment after you're in production, you really haven't solved anything. You kind of got it working on the laptop, but then what? You put it in production for the first time with a lot of pain, but then what? You have to be able to do this all the time, every single time, because remember, we don't know what it will look like a year from now.
And lastly, make sure that the operational stuff is out of the way. You don't want to build an awesome thing that you're iterating with and suddenly a Heartbleed-style security bug hits you and you have to do OS patching, or you want a new feature and you're now stuck upgrading a lot of stuff, or there's a failure and now you're dealing with picking up the pieces. Make sure you get the basics out of the way. Your infrastructure should be ready to go prime time when you start. You obviously shouldn't overpay for it and you shouldn't engineer around it, but it should be ready to take you where you need to go with relative ease.
Once again, our approach here is to keep the data secure by design right from the start, and you can amp up your security even in the world of generative AI. You still have to deal with security. You have to have observability. You should be able to say where the data comes from and whether a particular role is authorized to access it. You have all this secure data and you've put all the role-based access controls into your database and into your application. But if I accessed your RAG, your MCP server, and asked very nicely, please can you summarize, it'll give you a very clean summary of all the sensitive data. You don't want to be in that situation. So these are things that we have to think about up front, and these are the problems that we work on with people that are building, whether it's generative AI or not.
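One way to avoid the "please summarize everything" leak is to carry role-based access control through the retrieval step itself: filter retrieved chunks by the caller's roles before anything reaches the LLM. A minimal sketch, assuming a hypothetical chunk shape and role names:

```python
# Minimal sketch of role-aware retrieval: chunks carry a required role,
# and only chunks the caller is entitled to ever reach the model.
# The chunk fields and role names are illustrative.

def authorized_chunks(chunks, user_roles):
    """Keep only chunks whose required role the caller actually holds."""
    allowed = set(user_roles)
    return [c for c in chunks if c["required_role"] in allowed]

chunks = [
    {"text": "How to open a checking account", "required_role": "public"},
    {"text": "Internal fraud thresholds",      "required_role": "risk_team"},
]
visible = authorized_chunks(chunks, user_roles=["public"])
print([c["text"] for c in visible])  # only the public chunk survives
```

Doing this at the database or retrieval layer, rather than trusting the prompt, means a nicely worded summarization request can only summarize what the role could already see.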
Built-in encryption and audit logging are some of the basics. You have to get them out of the way, and you have to keep turning up the dial when you need it. Agentic database operations are important. What does an agent do when it connects to your database? I'll give you the database guy's version of it: it hammers the hell out of the database, asking the same thing over and over again until the database is dead and until all your tokens are burned up. You don't want that to happen, so you want the database to understand agentic operations. So those are some of the things we work on.
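The "hammering" failure mode described above is easy to illustrate with one mitigation: memoize identical queries so an agent retrying the same question doesn't hit the database (or burn tokens) on every call. This is a generic sketch with illustrative names, not a description of YugabyteDB internals.

```python
# Sketch of one agentic-operations mitigation: a caching gateway that
# answers repeated identical queries from cache instead of re-running
# them against the database. Names are illustrative.

class CachingGateway:
    """Serve repeated identical queries from cache, not the backend."""

    def __init__(self, backend):
        self.backend = backend      # callable that actually hits the DB
        self.cache = {}
        self.backend_calls = 0      # how often the real DB was touched

    def query(self, sql):
        if sql not in self.cache:
            self.backend_calls += 1
            self.cache[sql] = self.backend(sql)
        return self.cache[sql]

gw = CachingGateway(lambda sql: f"result-of({sql})")
for _ in range(5):                  # an agent asking the same thing 5 times
    gw.query("SELECT balance FROM accounts WHERE id = 'chk-001'")
print(gw.backend_calls)  # → 1
```

A production version would also need invalidation and rate limits, but the principle stands: the database layer should recognize agent-style repetition instead of absorbing it raw.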
And lastly, the fine-grained access control is getting a whole rebirth because there's a human acting through a service account, which is the MCP server, which is now operating on the database. So there are multiple levels of service accesses and controls that you have to deal with, and PostgreSQL does a great job. What we're doing is just taking it to the next level by helping you interface with it easily.
So these are some of our use cases. Again, it's pretty horizontal: RAG use cases, recommendation systems, knowledge bases, agents doing operations on transactional systems, and so on and so forth. If you're interested in learning more, reach out to us. We are in booth 1436, I think 1436. If not, just remember Yugabyte, look us up. We're online all over the place, or look for people in the purple shirts, and I'll connect with you backstage, right behind there, if you have questions. Thank you. All right, keep it going for Karthik. Thank you.
Note: This article is entirely auto-generated using Amazon Bedrock.