🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.
Overview
📖 AWS re:Invent 2025 - AI-powered SaaS Observability using OpenSearch (ISV313)
In this video, Ulli Hinze and Rueben Jimenez demonstrate how AI enhances troubleshooting in multi-tenant SaaS applications using Amazon OpenSearch Service. They showcase three AI use cases: natural language query generation in OpenSearch Dashboards to quickly identify rate limiting issues, semantic log search using Amazon Titan text embeddings V2 to find relevant logs without knowing exact keywords, and Model Context Protocol (MCP) with Claude CLI to automate root cause analysis and remediation. Using a modified OpenTelemetry demo application deployed on Amazon EKS, they diagnose a noisy neighbor problem where the eagle tenant overloads shared shipping services, causing checkout failures across all tenants. The demonstration includes configuring OpenSearch Ingestion pipelines for log sampling, connecting OpenSearch to Amazon Bedrock for vector embeddings, and using an MCP server to automatically scale the shipping service from 1 to 5 replicas, successfully resolving the issue.
; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Introduction: AI-Powered Troubleshooting with Amazon OpenSearch Service
In this talk, we're going to explore how AI can help us become more effective at troubleshooting. We'll look at how we can restore service to our customers more quickly, how we can find the root cause of a problem, and then mitigate it more easily and faster. This is a level 300 code talk, which means we're going to show you hands-on demonstrations using Amazon OpenSearch Service. We'll show you code, but we're not necessarily going into the level 400 detail of explaining every single line of what we're doing. This is more meant to be an inspirational session about what you can do in this area and how you can achieve it using OpenSearch.
We have prepared a thorough guide for you that we'll link to at the end. It's a GitHub gist which will contain all of the references, all of the code snippets, and all of the material that you will need to dive deeper on this yourself afterwards. There's no need for you to take photographs or note things down and research on the fly. We'll reference everything afterwards. My name is Ulli Hinze, and I'm a Senior Solutions Architect based in Berlin, Germany. My primary job is to work with software companies running SaaS on AWS, and I'm also part of the OpenSearch community here at AWS.
Today I'm very happy to be accompanied by Rueben Jimenez, also a Senior Solutions Architect dedicated to the cloud ops community, with a focus on observability and optimization. Similar to me, Rueben works with SaaS customers, including SaaS customers operating at scale. Before we go into the hands-on part, let me give you a brief tour of the demo application that we've developed for this talk. When we talk about observability and do a hands-on demo for that, we need an application that generates observability data. What we've done here for you as background is we've taken the OpenTelemetry demo application and modified it.
The OpenTelemetry demo application, if you don't know it, is an open source project you can check out on GitHub. It's basically a web shop where you can buy telescopes, binoculars, and similar items. This web shop is backed by a series of different microservices that perform different tasks such as shipping, billing, checkout, accounting, and so on. The main modification we made to that application is to transform it into a multi-tenant SaaS application. We're pretending here that we're a SaaS provider that offers these web shops to different web shop providers in a multi-tenant fashion.
What we did for that is we duplicated this demo application across our different tenants. When you see these animal names, these are our tenant identifiers. We have ten of those, and we also extracted some shared services from these tenants, which is a pattern we often see in multi-tenant applications. Some things are run locally for each tenant, but we also have shared services that all of our tenants have access to. Now this demo application is deployed into Kubernetes and Amazon EKS. There is another component to be aware of, which is the OpenTelemetry collector component.
The different tenant applications and the shared services export their observability data—logs, metrics, and traces—into this OpenTelemetry collector. The OpenTelemetry collector then forwards this data into Amazon OpenSearch Service. If you're not super familiar with OpenSearch and Amazon OpenSearch Service, the main database or the thing that stores the data in OpenSearch is called an index. In OpenSearch Service, we also have a component called OpenSearch Ingestion, which is an ETL tool aimed at OpenSearch. OpenSearch Ingestion takes the data from the OpenTelemetry Collector, transforms it, and puts it into OpenSearch.
Last but not least, we have OpenSearch Dashboards, which is an exploration and visualization component in the OpenSearch ecosystem that we're going to use to look at our data. What you see here is a very classic architecture for observability. Many AWS customers running their applications in Amazon EKS and running their observability using OpenSearch have such an architecture. This is nothing new.
Three AI Use Cases and Discovering the Checkout Problem
What we want to talk about today is the addition of AI to this whole setup, and we're going to cover three different AI use cases with the goal of helping us speed up the search for the root cause of an issue. The first thing that we'll talk about is natural language query generation in OpenSearch Dashboards, which will help us query our log data quicker so that we can get to the root of the problem faster.
Secondly, we'll talk about semantic log search, which will enable us to find log data even when we don't know what it looks like. Remember that if you look for a specific log line, you have to know how that log line looks. You have to know the specific words that are mentioned in the log line to actually surface it. But sometimes you just don't know what you're looking for, and this is what the second section will be about.
And last but not least, we're going to tie everything together using MCP, or Model Context Protocol, where we have an AI agent, in our case Kiro CLI, perform the semantic search on our log data and handle the natural language interaction for us, so that we not only find the root cause but also mitigate the underlying issue. So let me switch over to our demo environment here. I brought up the same architecture that I just showed you on the left, so this should be familiar. This is our OTel demo application running in a multi-tenant fashion, and I just want to quickly show you how this looks on a technical level.
If you're not familiar with Kubernetes, kubectl is the standard tooling to interact with Kubernetes and Amazon EKS. What we can do here is say kubectl get namespaces, and then we'll see exactly what you see here. We have our different animals, which are our tenants. We have this shared namespace down here, and we also have our OpenTelemetry collector namespace up here. So this is all currently deployed in Amazon EKS and running.
We can also have a deeper look into these namespaces. We can say kubectl get pods and then take one of our tenants, let's take the Falcon tenant and just have a look at what's running in there. You can see there are these different microservices that are powering this webshop that I talked about. You see here things like the cart and the product catalog and so on. Same thing with our shared services. We can have a look at those as well, and you see here the shared services that all of our tenants are centrally accessing.
Now you see here that we have an error, but this has already been there for 44 days, so nothing is going on right now. Everything else seems to be running smoothly. No big issues to see so far. But unfortunately, not everything is good. In fact, just before our session, we got a call from our CEO, and he's really upset because our customers are really upset. Actually, the webshops are not working. None of our customers' webshops are working at the moment.
He said that people are able to shop on the webshops, like they're able to put things into their carts and so on, but they're not actually able to purchase anything, which of course is bad for any webshop. So we should go and fix that as soon as possible. As a first step in troubleshooting this, I'll do what probably an ops person would do as well and try to reproduce the problem on my own system. The way I do this is using kubectl port-forward, which is basically a tool to bring an application that is running in Kubernetes to my localhost so I can access it in a browser. I'll do that again using the Falcon tenant, and now I can access this demo application from the Falcon tenant at port 8080 of my localhost.
So let's go ahead and do that. Localhost 8080. And this is the webshop I was talking about. This is the classic OTel demo application, and down here you have different products at your disposal that you can purchase. Let's take this telescope here. This looks good, and then let's scroll down here to place the order.
Nothing happens, right? I click this and nothing actually happens. Let's open the developer console of the browser to see what's going on in terms of network activity here. We see that there's a call to checkout, and this is still pending, so this is taking a long time. Let's see if that comes back. It's actually timed out now with a 504, right? So there seems to be something wrong. A 504 is a gateway timeout HTTP error, right? So something is timing out somewhere.
Natural Language Query Generation: Identifying Rate Limiting and the Noisy Neighbor
In a distributed system like this, it's quite hard to find the root cause of this now. So maybe, Rueben, we can have a look at OpenSearch Dashboards to figure out what the problem is. Awesome, thank you Ulli. So as Ulli mentioned, what we can do is go into the console, go into the OpenSearch Service, and open OpenSearch Dashboards. As you can see, this is an OpenSearch UI that has connectivity into our multi-tenant cluster. What we have here is an application endpoint that we've configured, and what we'll do is launch this dashboard. We have our particular cluster, and we can go ahead and click on that. Can you zoom in a bit? How are we doing there? Good, maybe a little too much.
What we have is the logs from the multi-tenant cluster showing up here. For most of you who have been working with OpenSearch or OpenSearch Dashboards, this look and feel should be very familiar. We have the histogram here showing the activity of the logs that are put into that particular index. I'll shrink it just a little bit so we can get down here. From there, what we're really looking at is how we get from noise to signal. We have a lot of logs coming through, so how do we make this more palatable, how do we start searching for the root cause, and how do we find that needle in the haystack?
If we were to just peruse the logs here, which we all do on a daily basis, especially from an ops perspective, it's really a lot of information, right? Nothing that we can really key in on or put our finger on. So what we want to ask ourselves is: could we use AI to help us do that? Could we use AI to give us a helping hand in figuring out what's going on?
So what we're going to do now is put in a prompt and try to narrow things down, right? Narrow the logs down and narrow down what's happening overall. I have a cheat sheet here for that, so I'll copy this, and this is really the prompt that I'm putting in for the AI. I'll paste it in here, and what I'm drawing attention to is how we can narrow this down and get to the cause.
In the past, I've worked with this before, and since I'm familiar with the infrastructure and have worked with these teams, I can say this is really barking like a dog and acting like an exceeded rate limit, right? We've had this in the past and I've seen it. So I'm heading in that direction to see whether I can get confirmation for it or not. So I'll go ahead and hit this prompt here.
What's nice about the AI as well is that it not only narrowed things down a little for me in terms of the logs and what's happening; if I open it up a bit, you can also see some of the instrumentation that went on here, looking for different things. And it actually picked it up, right? Rate limit exceeded, and it looks like the ship order call is affected, or maybe there's something going on with a particular tenant that is causing problems for the other tenants.
Overall, it could be a tenant that's oversubscribed and causing that shared service to have issues. It also, as I mentioned before, generated a PPL query for us. Not only did it give us AI recommendations, it also gave us a PPL query, so if we weren't familiar with how to craft or execute one, it's provided right there for us. We have a reference that we can actually use and run, and it's very similar to the PPL queries that you would normally run in your operations.
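As a rough illustration of the kind of query that comes back, here is a hypothetical PPL example run through the OpenSearch PPL endpoint from Python. The domain endpoint, index name, and field names are assumptions for the sketch, not values from the demo:

```python
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

region = "us-east-1"                                              # assumed region
host = "search-observability-demo.us-east-1.es.amazonaws.com"     # hypothetical domain endpoint
auth = AWSV4SignerAuth(boto3.Session().get_credentials(), region, "es")
client = OpenSearch(hosts=[{"host": host, "port": 443}], http_auth=auth,
                    use_ssl=True, connection_class=RequestsHttpConnection)

# Hypothetical PPL query of the kind the assistant generates: count error-level
# log events per service to see which service is struggling.
ppl = "source = otel-logs | where severityText = 'ERROR' | stats count() by serviceName"
result = client.transport.perform_request("POST", "/_plugins/_ppl", body={"query": ppl})
print(result)
```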
If we go to the AI summary, which gives us a little bit of help here, it's taking a bit to generate. Again, it's saying we did a sample of this and it looks like it's the rate limiter in the shared namespace that could be the smoking gun. It gives us a bit more detail and says it could actually affect these downstream processes or these other shared services because of high traffic and inability to process it overall. We're now formulating a picture of what that looks like. We're starting to really hone in on what is that needle in the haystack overall.
We could stop here with some confirmation, but what we want to do is look at this through a visualization, which I find pretty useful to work through, and I want to show you how this works. We can actually use a natural language query to build a visualization. It's not really ideal in terms of where it shows things, and I think that might be an upgrade that's needed. But we're going to put in a natural language query here, see where it ends up, and see whether the resulting visualization confirms that our shared services are having a problem because of one of the tenants.
Let me come back here. For this visualization, I'm going to ask it to chart the activity of these particular tenants and show me whether there is, or could potentially be, a problem. I like to think of it as a picture being worth a thousand words; you can really see what's happening. As we can see here, it generated a visualization, and you can see the eagle tenant is really busy, really trying to process, while the other tenants are competing for the same resources. The conclusion we can draw from this is a noisy neighbor problem. Many of us who run EKS clusters and work in the operational space know how often we run into the noisy neighbor problem, especially in a multi-tenant environment.
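Under the hood, a chart like that boils down to a per-tenant activity count over time. As a hedged sketch of the kind of aggregation that could produce it (the index and field names are assumptions; the demo builds the chart through the natural language feature in Dashboards instead), reusing the authenticated `client` from the earlier sketch:

```python
# Count log events per tenant, bucketed per minute, over the raw log index.
activity = client.search(index="otel-logs", body={
    "size": 0,
    "aggs": {
        "per_tenant": {
            "terms": {"field": "tenant", "size": 10},           # assumed tenant keyword field
            "aggs": {
                "over_time": {
                    "date_histogram": {"field": "@timestamp",   # assumed timestamp field
                                       "fixed_interval": "1m"}
                }
            }
        }
    }
})
for bucket in activity["aggregations"]["per_tenant"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```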
So we started with a lot of noise in the logs, but now we're able to make the funnel more finite. We have some answers, or at least some areas that we can start to dive deep into. These are just a couple of ways to make sense out of the chaos, in terms of where we can start moving and how we can start troubleshooting, and it gives us a bit of insight into the AI perspective. I'll hand it over to Ulli at this point, and then we'll go a little deeper into the next mechanism or tool in our toolbox, semantic search, and really hone in on what could be going on with the actual multi-tenant clusters.
Semantic Log Search: Finding Issues Without Knowing What to Look For
To summarize this piece, we've now figured out both the issue and its cause. There's a rate limiting problem in our shipping service in our shared namespace, and it's caused by a noisy neighbor problem coming from our eagle tenant. Apparently they're doing a big sale or something, overloading our system and affecting everyone else. In this next section on semantic search, I want to take a step back. You saw that Rueben at the very beginning put in a prompt like "Is there a log mentioning rate limit exceeded?", which basically tells you that he already knew what he was looking for. He had seen this problem in the past and knew roughly what to dig for.
But in lots of cases, you don't really know. There's something weird going on in your system and you don't really know. There's this 504 gateway timeout and maybe you've never seen that before. You may be completely blank, and in this case, semantic log search could be of good use.
Now for semantic log search, I will pull up this architecture here. First, a quick show of hands: who of you knows in general what semantic search is?
Okay, so that's about a third of the room. Let me briefly explain semantic search in general. Semantic search is an evolution of classic lexical search. With lexical search, we compare words with each other. We have a user query and we compare this user query word by word with what we have in our database, which in this case is log data, but it could be other types of data as well.
With semantic search, we evolve this a bit and make search available by meaning. This means we're not only looking at the words that the user put in, but we're looking at the meaning of what the user put in and comparing that to the meaning of things that we have in our database. Semantic search is powered heavily by vector embedding models. We have a couple of them running on Amazon Bedrock, and for this particular purpose, we chose the most cost-effective one available, which is Amazon Titan text embeddings V2. I'll show you how the output of those models looks later on.
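The speakers show the model output later through OpenSearch itself; purely for orientation, here is a minimal sketch of what calling Titan Text Embeddings V2 directly via the Bedrock runtime API with boto3 could look like (the region and the sample text are assumptions):

```python
import json
import boto3

# Embed a single piece of text with Amazon Titan Text Embeddings V2.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed region
response = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": "rate limit exceeded, request queued"}),
)
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # 1024 floats by default; the vector encodes the text's meaning
```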
For semantic log search, we have to consider two specifics about log data versus other types of search. The first specific is that log data is typically very large in volume. It's not just a product catalog where we have a couple of thousand entries. Even in our little demo application, we're ingesting thousands of log lines per minute already, and of course in production systems it will be orders of magnitude more than that.
We cannot just run every single log line against this embedding model because it's an AI model and this will stack up quite a bill if we do that for every single log line. What we have to do instead is work with a subset of our logs and just do embeddings for those. We're doing that using a technique called sampling, and I'll dive into that.
On the other hand, log lines are typically not very different from each other semantically. When you look at one specific microservice and the log lines it produces, it typically repeats itself over and over again. It's the same log lines, maybe with different IDs or different numbers such as how long something took, but that doesn't change anything semantically. It doesn't change the meaning of the log line; it's only a different identifier or value. So we're fine to do this sampling. We shouldn't miss out on too much information, because it's the same things repeating themselves all the time anyway.
Let me first show you how we actually achieve this sampling. For that, let's dive into OpenSearch Ingestion, which I mentioned in the beginning. This is the ETL tool that we have available in the OpenSearch context.
Now let's dive into the logs ingestion pipeline, which takes logs from our OpenTelemetry collector, transforms them, and puts them into OpenSearch. The log data that we've been working with so far was our raw log data, and you see that we have a little sub-pipeline here that is basically just putting every single log line into OpenSearch. What Ruben showed you before was working with this raw data. But we have a different sub-pipeline here, which we call the sampled sub-pipeline.
When we look at this definition down here, we see that we've introduced a rate limiter in here. This rate limiter does the sampling for us. It takes only ten events per second—ten log lines per second per service—and drops the rest. This basically means we have a good little selection of log lines from each and every single service that we have in our system. Then what we're doing with those sampled logs is ingesting them into a new index in OpenSearch called sampled logs.
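The actual sampling lives in the OpenSearch Ingestion pipeline definition, which is included in the linked gist. To make the idea concrete, here is a small, purely illustrative Python sketch of per-service rate-based sampling, keeping at most ten events per service per second and dropping the rest:

```python
import time
from collections import defaultdict

class PerServiceSampler:
    """Keep at most `rate` log events per service per one-second window."""

    def __init__(self, rate: int = 10):
        self.rate = rate
        self.windows = defaultdict(lambda: [0.0, 0])  # service -> [window_start, count]

    def keep(self, service: str) -> bool:
        now = time.monotonic()
        start, count = self.windows[service]
        if now - start >= 1.0:            # a new one-second window begins
            self.windows[service] = [now, 1]
            return True
        if count < self.rate:             # still within this window's budget
            self.windows[service][1] = count + 1
            return True
        return False                      # budget exhausted: drop the event

sampler = PerServiceSampler(rate=10)
logs = [{"service": "shipping", "body": "rate limit exceeded, request queued"}] * 25
kept = [log for log in logs if sampler.keep(log["service"])]
print(len(kept))  # 10 of the 25 burst events survive; these would go to sampled_logs
```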
So we now have two indexes. We have the main one that stores all of our logs, and then we have the sampled logs that contain a subset. But with the subset of logs, we're not able to do semantic search yet. We need to do some configuration on the OpenSearch side to actually be able to do this kind of semantic search. So let's do that together.
Configuring Semantic Search: Connecting OpenSearch to Amazon Bedrock
I'll use the OpenSearch dev tools, which is basically a lightweight developer environment where you have HTTP requests on the left-hand side and the results of those requests on the right-hand side. We can fire off HTTP requests and see what they come back with. Now, setting up semantic search involves three steps that we'll walk through one by one. The first step is connecting Amazon OpenSearch Service to Amazon Bedrock because Amazon Bedrock hosts the actual embedding model that we need, and OpenSearch will be the database and handle the communication to Bedrock on our behalf.
The requests I'll fire off here are pretty boilerplate, and I'm not going to dive too much into the details. What we first need to create in OpenSearch is a connector to Amazon Bedrock, and we'll specify the model here that we want to use in Bedrock. Then we'll take this connector ID that OpenSearch generated and put it into our register call. We also need to create this so-called model group and we'll take that ID and put it into our register call. I'm going through that a bit quicker, but it's all in the GitHub if you want to follow up in more detail.
When registering the model, we get this model ID which we'll use for the remainder of this section. We need to deploy this model first so it's known throughout our OpenSearch domain, and then we can actually go and predict, which means we're testing this model now. The cool thing is we're still talking to the OpenSearch API, so through the OpenSearch API, this will call Amazon Bedrock under the hood and give us a response again through the OpenSearch API. The two systems are now connected with each other.
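The exact request bodies are in the gist; as a hedged outline of those same three steps driven from Python instead of the Dev Tools console (the abbreviated connector body, names, and IDs below are assumptions, and `client` is the authenticated opensearch-py client from the earlier sketch), the flow looks roughly like this:

```python
# 1. Create a connector to Bedrock. The body is abbreviated here; the full
#    blueprint (IAM role, request template, pre/post-processing) is in the gist.
connector = client.transport.perform_request(
    "POST", "/_plugins/_ml/connectors/_create",
    body={
        "name": "bedrock-titan-embed-v2",
        "protocol": "aws_sigv4",
        "parameters": {"region": "us-east-1", "model": "amazon.titan-embed-text-v2:0"},
        # ... credential and actions sections omitted ...
    },
)

# 2. Create a model group, then register the remote model against the connector.
group = client.transport.perform_request(
    "POST", "/_plugins/_ml/model_groups/_register",
    body={"name": "bedrock-embedding-models"},
)
registered = client.transport.perform_request(
    "POST", "/_plugins/_ml/models/_register",
    body={
        "name": "titan-text-embeddings-v2",
        "function_name": "remote",
        "model_group_id": group["model_group_id"],
        "connector_id": connector["connector_id"],
    },
)
model_id = registered["model_id"]  # depending on version, this may come back via a task instead

# 3. Deploy the model, then test it: OpenSearch calls Bedrock under the hood.
client.transport.perform_request("POST", f"/_plugins/_ml/models/{model_id}/_deploy")
prediction = client.transport.perform_request(
    "POST", f"/_plugins/_ml/models/{model_id}/_predict",
    body={"parameters": {"inputText": "what is the meaning of life"}},
)
```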
This embedding model produces a long list of floating-point numbers, and this is the output that an embedding model generates. In a mathematical way, it encodes the meaning of what I put in. I put in here "what is the meaning of life," and this question is now encoded into this long list of floating-point numbers, which is our vector embedding. We have established the connection now, but our sampled logs index is still not able to do semantic search, so we need to configure the sampled logs index to perform this embedding automatically on our behalf to enable semantic search.
Let's go ahead and configure this index. When I look at the index so far and just spit out the definition of this index here, we don't find anything in terms of embedding. It's just containing all of our plain log lines. What I would do first is create a so-called ingest pipeline, which is a configuration object that will enable OpenSearch to automatically embed log lines that are flowing into the index. I'm creating this ingest pipeline, and what this does is process all of the log lines that are flowing in, taking the log body and producing a new field called body_embedding that will store this array of floating-point numbers, which is the vector embedding we saw before.
Then we're going to create an index template for our sampled logs which contains this pipeline, basically now saying please prepare this index for automatically embedding all of the log lines that are flowing in. You can see here that I also specified this new body_embedding field which is a vector field in OpenSearch. What I'll then do is delete the sampled logs index, and what's happening now under the hood is that OpenSearch ingestion will continue to ingest data into this index. Since the index does not exist, OpenSearch will recreate it automatically applying the configuration that we've set up here. So now the index will be recreated using this embedding pipeline and the new embedding field.
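Sketching those two configuration objects from Python, still as a hedged outline: the pipeline and template names, the index pattern, and the 1024-dimension mapping for Titan v2 are assumptions, and `client` and `model_id` carry over from the previous sketch.

```python
# Ingest pipeline: embed the incoming log body into body_embedding using the
# Bedrock-backed model registered above.
client.transport.perform_request(
    "PUT", "/_ingest/pipeline/sampled-logs-embedding",
    body={
        "description": "Embed log bodies for semantic search",
        "processors": [{
            "text_embedding": {
                "model_id": model_id,
                "field_map": {"body": "body_embedding"},
            }
        }],
    },
)

# Index template: when sampled_logs is recreated, it comes back as a k-NN index
# with the vector field and the embedding pipeline attached by default.
client.transport.perform_request(
    "PUT", "/_index_template/sampled-logs",
    body={
        "index_patterns": ["sampled_logs*"],
        "template": {
            "settings": {
                "index.knn": True,
                "index.default_pipeline": "sampled-logs-embedding",
            },
            "mappings": {
                "properties": {
                    "body": {"type": "text"},
                    "body_embedding": {"type": "knn_vector", "dimension": 1024},
                }
            },
        },
    },
)
```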
Let's have a look at this sampled logs index now. It's not yet found, so apparently OpenSearch ingestion has not ingested new log lines yet. Let me repeat that a few times. It should usually take just a few seconds, but sometimes it can take a bit more.
And there it is back up. When I now search for embedding, we have this body_embedding field here that is now part of my index definition. When I go down here, I also have this pipeline as part of my index setting. What is now happening is exactly what I wanted: all of the new log lines that are flowing in are automatically embedded by OpenSearch using Amazon Bedrock. This is the last piece of setup I needed.
I can now perform semantic search on this index. We have been talking so much about setting this up that it is worth recapping what the value of this actually is. Let me take a step back. Remember we have this 504 gateway timeout. I know that when there is a 504 gateway timeout, it indicates that something just takes very long to process. I do not know anything about rate limiting. I do not know about "exceeded" or "queued" or anything like that. I do not know how the specific log line would look, but I have an idea that something is just taking very long to process.
Let us do that. Let us search. I can put in something very vague here. I can ask for something like something is taking too long, and I can do a semantic search for that to find log lines that are semantically relevant for this query. Let us have a look at what it came back with. It is exactly again the shared shipping service that we saw before. It has found the log line rate limit exceeded request queued. You see that lexically those things are not related. Something is taking too long is not contained anywhere in this log line, but in meaning, in semantics, they are very similar to each other.
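In query form, that vague description goes into a neural query against the vector field. Roughly, and again assuming the `client` and `model_id` from the earlier sketches plus an assumed field layout:

```python
results = client.search(index="sampled_logs", body={
    "size": 5,
    "_source": ["body", "serviceName"],   # serviceName is an assumed field name
    "query": {
        "neural": {
            "body_embedding": {
                "query_text": "something is taking too long",
                "model_id": model_id,
                "k": 5,
            }
        }
    },
})
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"])
```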
I have now gone from a kind of blank page problem where I have no idea what to even look for to the log entry that is going in the right direction. Now I can have a deeper look at my shipping service to figure out what exactly that problem could be. Having finished our semantic search, let us do the third part of the session and do everything together: natural language processing and semantic search, bringing it all together using Model Context Protocol, or MCP.
Model Context Protocol (MCP): Building an AI Agent with Claude CLI
So as mentioned, this is really the latest and greatest. How can we have an agent process these particular searches for us with a prompt? How can we use that same prompt to get information back? How can we use it as a tool in order to interrogate that particular OpenSearch service? As we are moving into that agentic framework, and I am sure you are hearing throughout the conference what that agentic enterprise or agentic framework looks like, we have standalone agents, we have autonomous agents, and things like that.
What we are going to do is use Claude CLI as our client. We are going to interact with an MCP server that we have built in Python. It has all of the scaffolding for using the model, using the embeddings, and then using that reasoning to bring back a response. Everything so far was the connective tissue leading up to this piece. I am going to go into the configuration of the actual MCP server itself, but the gist also contains the scaffolding for how you would set it up and configure it in Claude CLI, so if you want to dive deeper into those configuration settings, it is in the gist. We will share it up there, and you can look at it, try it out, and kick the tires on it.
I am a huge fan of Claude if nobody has noticed. I am going to go into the file itself in terms of where the actual downloads are or where we have that MCP server. It is kind of painful for you guys to watch me type here.
Let me add a main.py file here. What we have is the setup where we're configuring this, specifically the main.py file itself. We have our import for the boto3 client, and we're referencing FastMCP for that particular component. Then we're looking at the OpenSearch connection; there's documentation available that you can look at and reference for this. We have the host, which is the actual OpenSearch endpoint that we're going to query, and we have the region as well. Of course, we have the services and credentials: we use a boto3 session from the AWS account so we can use SigV4 access to query that information.
Then we use the OpenSearch Python client, which takes the host and the variables above to make that connection to the endpoint. Then we're defining the MCP server itself; FastMCP seems to be the best way in Python to build this and query that endpoint. If we really boil it down, it only takes a few lines of code for the MCP server to dynamically fetch the embedding model that was initially set up. There really isn't a whole lot of work, because we already have the plumbing that makes that connectivity and talks to that endpoint.
What we have here is the stanza defining the MCP tool. We can say we want it to do that semantic search and give us the log body information back. As shown before, we have the embedding that will return a response based on that very finite query over those sampled logs. We also dynamically pull in the model ID so that we can use it within the semantic search. And again, as referenced before, we have the sampled logs index here that we can leverage.
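The full main.py is in the gist; a stripped-down sketch of the same shape could look like the following, where the endpoint, region, model ID, and tool name are placeholders rather than the demo's actual values:

```python
# main.py - minimal MCP server exposing semantic log search over OpenSearch.
import boto3
from mcp.server.fastmcp import FastMCP
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

REGION = "us-east-1"                                            # assumed region
HOST = "search-observability-demo.us-east-1.es.amazonaws.com"   # hypothetical endpoint
MODEL_ID = "replace-with-registered-model-id"                   # embedding model from earlier

auth = AWSV4SignerAuth(boto3.Session().get_credentials(), REGION, "es")
client = OpenSearch(hosts=[{"host": HOST, "port": 443}], http_auth=auth,
                    use_ssl=True, connection_class=RequestsHttpConnection)

mcp = FastMCP("os-logs")

@mcp.tool()
def semantic_log_search(question: str, k: int = 10) -> list[str]:
    """Semantic search over the sampled_logs index; returns matching log bodies."""
    response = client.search(index="sampled_logs", body={
        "size": k,
        "_source": ["body"],
        "query": {"neural": {"body_embedding": {
            "query_text": question, "model_id": MODEL_ID, "k": k}}},
    })
    return [hit["_source"]["body"] for hit in response["hits"]["hits"]]

if __name__ == "__main__":
    mcp.run()   # serves the tool over stdio to the MCP client
```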
What I'm going to do now is kick off our client, the MCP client. What this does is read the MCP.json settings file with all of the MCP servers that I have configured already, and one of them is os-logs, which is the main.py we just looked at. As you can see, it printed its welcome message and loaded successfully. You can type mcp in the client to validate that it actually did load into your environment within your MCP client through Claude.
What I'm going to do now is type in a natural language prompt and see what I actually get back. I'll come down to my cheat sheet one more time. I'm going back a little bit to what I know in terms of the exceeded rate limits. We could get there sight unseen, but I want to see what we get back with a very targeted prompt. And with everything here, we should ask: do we really trust it, and do we want it to do what it says it wants to do? So we'll push it through here again.
Automated Remediation: Scaling the Shipping Service and Verifying the Fix
And here we go. So it's actually showing us that it found five hundred rate limit logs from the shared.shipping service. Through the whole talk, we've been dancing around the question of how we really kind of close the funnel or actually start to hone in on the particular root cause.
We see it again: it's a problem with the shared shipping service, and it's actually a noisy neighbor issue affecting that component altogether. We have confirmation at this point. So what we also want to do is figure out how to remediate this. We can ask it directly: remediate this issue for me.
Kiro and the whole AI component here will go ahead and give me suggestions on what we can do to remediate this. If I didn't really know or understand what was going on with that piece, it would show me exactly what could be happening. What we see at this point is that it's actually scaling the shipping service for me. It looked at the problem and recommended scaling that shared shipping service, then asked: if we go ahead and scale it, will it remediate the problem? It kicked off the remediation for us, and we said OK, trust it, go ahead and do it. It's very similar to an "are you sure you want to delete" kind of confirmation; you want to make sure you're reviewing and responding to it. So it actually scaled, showing the actions here: it scaled the shipping deployment from 1 to 5 replicas, and it did a few other things there.
This is what it fixed here, and it gives you an overview of what it did and what it didn't do at that point. So what we're going to do now is go back into the application to see if it actually was successful. We'll go back to our command line and see what happened. We'll quit that piece and move forward. We'll take a quick look at the shared services and see if it actually did scale. Yes, that actually happened: Kiro did what we recommended it could do to solve the problem. What we want to do now is come over here and port-forward. As Ulli did before, we just port-forward to one of these front ends, and then we'll test the application to see if it's functioning the way we want it to.
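Before checking the shop in the browser, it's worth noting what that remediation amounts to: a deployment scale operation. A hedged sketch of the equivalent call with the Kubernetes Python client (the agent used its own tooling; the deployment and namespace names here are assumptions based on the demo) could look like this:

```python
from kubernetes import client, config

# Load the same kubeconfig that kubectl uses, then scale the shared shipping
# deployment from 1 to 5 replicas.
config.load_kube_config()
apps = client.AppsV1Api()
apps.patch_namespaced_deployment_scale(
    name="shipping",          # assumed deployment name
    namespace="shared",       # the shared-services namespace from the demo
    body={"spec": {"replicas": 5}},
)
```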
We'll go back here to the OTel demo shop, exit out of that, and continue shopping. We'll come back in and choose this particular product. We'll stay with what we wanted before; our nephew wanted this particular thing, so let's see if we can actually complete it. We'll add it to the cart as we did before and walk it down here. Everything else is pretty much the same, and we'll see if we can actually place that order. Boom, successful.
Conclusion and Q&A: Resources and Cross-Region Capabilities
Now we've really come full circle. We started with natural language queries and really narrowed things down. We moved to semantic search when we didn't understand what the issue could be, homing in more and more on the needle in the haystack. Then we used MCP and an external agent tool to get that same confirmation, actually fix it, and move forward. With that, I'll hand it back over to Ulli to take us home. This already brings us to the end of the session.
This is the GitHub gist we promised a few times during the session, so do check it out. It contains all of the code snippets, documentation references, and other references and explanations that we talked about during this session. Let's have a look at the time; I think we have time for one or two spontaneous questions. If anyone has a question that we can answer for the room, you can raise your hand now. Anyone here? OK, yes.

Just repeating the question for everyone: the question was about the connection between OpenSearch and Bedrock, and whether we can have a cross-region connection for that. Yes, in fact you can. In our case we did it in the same region, but you can have it cross-region as well. Basically, how it works is that OpenSearch assumes a role and calls the Bedrock service endpoint on your behalf. Any other questions from the ops side of the house? Go ahead. The question was: when we were demoing all those tools, what were they part of? Is that part of OpenSearch? Yes, it's a standard part of OpenSearch Dashboards. OpenSearch is fully HTTP API based, in contrast to some other search tools out there, and the dev tools are the main playground that people use in OpenSearch to work with the different endpoints. This is nothing new; it has been around pretty much forever.

Cool. OK, so with that, let's close off the session. Thanks a lot for spending your valuable time with us. We can take further questions in the hallway; we just need to move out in a bit for the next speakers, but we're happy to take any questions offline. Please do fill out the session survey. I know everyone reminds you of that, but please do for this session as well, especially if you liked it. We read every single piece of feedback that you put in there. Have a great rest of re:Invent, enjoy the conference, and hopefully see you all soon. Thanks, folks.
; This article is entirely auto-generated using Amazon Bedrock.