Kazuya

AWS re:Invent 2025 - Architecting multicloud solutions from data mesh to generative AI (HMC210)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Architecting multicloud solutions from data mesh to generative AI (HMC210)

In this video, Don Simpson and Deevanshu Budhiraja from AWS discuss architecting multicloud solutions, focusing on data mesh and generative AI. They address multicloud scenarios driven by M&A, regulatory requirements, and differentiated capabilities. Key architectural patterns covered include federated queries, materialized views, and data mesh for managing data across clouds. The session explores challenges like data gravity, governance, and skill gaps, presenting solutions through AWS services like Athena, S3, and Bedrock. The latter portion examines multicloud generative AI patterns including RAG (Retrieval Augmented Generation), structured RAG with databases, graph RAG with Neptune, and agent architectures using Model Context Protocol (MCP) and LLM gateways for unified model access across cloud providers.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction: Architecting Multicloud Solutions from Data Mesh to Generative AI

All right, can everyone hear me? Great. If you have any trouble, just get the attention of someone in the back and they will come over to help you. First of all, thank you very much for coming to our session today. We are going to talk about architecting multicloud solutions from data mesh to generative AI. My name is Don Simpson, and I am a principal technologist on our cloud and AI innovation team. Our team focuses on things like multicloud, data mesh, generative AI, and many other things.

I want to give a chance to my colleague Deevanshu to introduce himself. Hi everyone, my name is Deevanshu Budhiraja, and I work as a senior solutions architect with our financial services customers. I have been dealing with many financial services customers and insurance customers who are in multicloud situations and wanted to build an effective data strategy. So we are here today to help with that.

Thumbnail 60

All right, so we are going to get started. I want to set some expectations. Today we are going to focus primarily on cloud-agnostic patterns. Secondarily, we will talk about AWS services that relate to the patterns we are discussing. Then at the very end, since this is a silent session, we will have an opportunity afterwards to talk in the hallway if you have questions.

Thumbnail 90

The first thing we want to talk about is our agenda. We are going to go through a data strategy overview. We are going to level set. We are going to go through voice of the customer, things that we are hearing from customers. We are going to talk about scenarios and challenges, and then we are going to go into recommended design and architectural patterns. Lastly, I think we could not get away with not having a generative AI component, so the very last part of this will be about multicloud generative AI.

Thumbnail 110

Defining Multicloud: Maturity Models, Primary Drivers, and Data Strategy Fundamentals

First of all, we want to level set here on what we mean by multicloud. When we talk about multicloud at AWS, there are a few things that we think about. One is where you have multiple workloads, IT solutions, or applications that are running on more than one cloud service provider. We do not necessarily put things like SaaS into that, so I think it is good for us to have that understanding.

The other is that we base our understanding and what we will talk about today on the Cloud Maturity Model, which is part of the Open Alliance for Cloud Adoption. It is also important for us to understand the primary and secondary drivers for multicloud. One example of a primary driver would be mergers and acquisitions, or M&A. A lot of times we have customers that end up with a multicloud operating environment because of an acquisition or multiple acquisitions. Oftentimes those acquisitions come with sudden and unexpected transitions, and there are transition challenges that we see with customers going from diligence to integration.

The other is that we see customers looking at multicloud for differentiating capabilities, so differentiated services. We also see it from a regulatory perspective, and that could be data sovereignty or cloud concentration risk. These are things that we hear from customers that are coming via regulators. So as we are talking about this definition of multicloud, I just want to do a quick show of hands from the audience here. How many of you have data that spans multiple clouds, and that can be siloed or that you are trying to integrate and correlate? And then also, if you answered yes to that question, how many of you have challenges actually achieving your business outcomes because of it being spread across multiple clouds? Almost everybody who raised their hand the first time.

Thumbnail 250

When we also look at that multicloud approach that I mentioned a minute ago, the cloud maturity model, this is focused on people, process, and technology. We believe that a top-down vision and strategy of creating data as a strategic asset, or value that is created by either efficiency of operations or additional business value, is extremely important. When we look at people, one of the areas that we find our customers are struggling with is the multicloud skill gap. From a people standpoint, it is important for us to upskill our employees and augment them where we can with agentic AI and AI assistance. When we look at the process, the maturity model tends to look at organizations based on how they've constructed their Cloud Center of Excellence, or CCOE, in terms of specialization.

We look at whether they have specialization at the individual cloud service provider level across certain areas like architecture, platform engineering, and technology. We tend to look at technology first because we're technologists, but multi-cloud architecture, multi-cloud services, and some recent announcements like our Interconnect for multi-cloud that we just announced are steps that AWS is taking to offer services that allow our customers to operate in a multi-cloud environment better than they could before.

Thumbnail 350

Let me go through a few definitions here. I like to put up these two definitions because I think they complement each other, but they're also different enough. The first is from Gartner, and this data strategy definition focuses on a few things that are really important. It describes a highly dynamic process, and I think we've all experienced this. You build something today, and you find out tomorrow that you have additional requirements or something has changed. This definition tends to focus on the acquisition, organization, analysis, and delivery of data, so it's really about that process.

Thumbnail 390

The AWS definition here shows some similar themes: people, process, technology, and rules. As you think about this, one is a little bit higher level, while the other drives into the technology. You'll start to see that in our discussion later today. We'll get into things like data mesh and federated query, which are patterns you can use. I'll use a few examples with some voice of the customer. In this particular case, I've highlighted the things that I think are relevant to these quotes.

Thumbnail 420

We talked about M&A a minute ago, and I mentioned the integration phase. You hear this often from customers—this is really where they struggle with the integration. They've acquired a company and they're at day zero or day one, depending on how you look at it, and they've started their process. There's a lot of business process associated with M&A, things like workload placement strategy—where do my workloads go? I have certain workloads in an acquisition that are on one cloud service provider. Do I migrate those? Do I repurpose them? Do we re-platform them? What do I do with that?

Thumbnail 480

Then we get into technology integration, like whether I have specific integration patterns that allow me to bring in an acquisition. For example, we can use AWS PrivateLink to transfer data safely between the two organizations without granting full network access. What's most important here is to come away with a proactive strategy. I mentioned things like workload placement, having intent when we make acquisitions so that we can actually be proactive about how we integrate them.

Thumbnail 500

The next area we're going to focus on is multi-cloud architecture. This is a differentiated capability—having the right data and technology in the right cloud. There are some things here that aren't said but that are actually present, such as data gravity, which is a real challenge. The more data we have in a particular location, the more data and the more applications and services we attract to that location. This becomes a challenge for us when dealing with multiple clouds, and it's about leveraging differentiating capabilities.

Thumbnail 530

When we're thinking about differentiating capabilities, we like to come in with a good understanding of why we're doing something, and I think that's extremely important when we're thinking about multi-cloud. The last quote here is an interesting observation about a lot of talk regarding dynamically moving from cloud to cloud and throttling compute up and down, which is overhyped. The reality is that for some workloads in some industries this may be okay, but for a highly regulated customer this might not be the right strategy.

Thumbnail 550

Thumbnail 580

We often hear that everyone else is doing this, so we need to consider and do that as well. We just like to say start with a good intention. What is the business outcome you're trying to achieve, and work backwards from that. Ensure grounded guidance—make sure that you have done your research and that you know whether this is the right strategy for you. At this point, I'm going to pass it over to Deevanshu Budhiraja to talk about multi-cloud data scenarios and challenges. Thank you.

Thumbnail 620

Thumbnail 630

Multicloud Data Scenarios and Challenges: From Data Gravity to Governance

Now that Don has covered the definition of multi-cloud, we're going to talk about some of the scenarios related to data and the challenges, and then how we solve these challenges. Let's start by discussing some of the scenarios.

We just talked about data gravity, which is a real challenge for multi-cloud. Although you would want to use more than one cloud for different reasons—maybe because of cost, regional availability, or best-of-breed capabilities—the reality is that data has gravity. Wherever you have your key applications, that cloud pulls all your data towards it. This data gravity challenge discourages you from using multi-cloud effectively, so we're going to talk about how to solve such challenges.

Thumbnail 670

Thumbnail 690

With multiple clouds, you also have data governance and control challenges. These include how to implement governance and security policies and how to implement controls, since each cloud might have different ways of doing things and different APIs. Don also talked about mergers and acquisitions, which I sometimes call a forced multi-cloud situation even if you didn't want to do it. The intention was to get the best out of another business that has been acquired, but the data you wanted to extract from that business and the time to realize or monetize that data sometimes becomes a challenge because you don't have the right data architecture or controls for the two clouds.

Thumbnail 740

The scenarios we discussed don't have a one-to-one relationship with the challenges. For example, mergers and acquisitions can be responsible for all these challenges you see on your right side, like gravity and governance. There's really not a one-to-one mapping between them. Any of these scenarios could give you the problems listed here. Let's talk about some of the capabilities for dealing with these challenges.

Thumbnail 770

First, I wanted to ask you: has anyone heard about materialized views or federated queries? Some of you probably have. That's great. For those who don't know, let me give you a real-world analogy for understanding materialized views. I like to give the example of pre-cooked meals. On Sunday night we cook meals for the rest of the week so that we don't have to gather the ingredients and cook again on Monday, Tuesday, and the following days.

In the case of multi-cloud and materialized views, we have raw data in different clouds. Instead of computing and joining the data from different sources, we pre-calculate and pre-join that data and call it a snapshot of that data. Whenever we have to refer to that data, we don't have to go to these clouds again and we don't have to recompute everything because any data access requires compute. To save that processing, we precompute it and create a materialized view out of it, which gives us the benefits of saving time. It does cost some amount of space because you have to store that precomputed data, but you have to balance between cost and time. In most cases, in fact, time is of the essence.
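To make the pre-compute trade-off concrete, here is a minimal sketch. It assumes two in-memory SQLite tables stand in for raw data held in two clouds; the table names (`policies_cloud_a`, `claims_cloud_b`, `mv_claims_by_region`) are hypothetical. SQLite has no native materialized views, so the sketch emulates one with `CREATE TABLE ... AS SELECT` and an explicit refresh function.

```python
import sqlite3


def load_raw_data(conn: sqlite3.Connection) -> None:
    # Two tables stand in for raw data held in two different clouds (assumption).
    conn.executescript("""
        CREATE TABLE policies_cloud_a (policy_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE claims_cloud_b (claim_id INTEGER PRIMARY KEY, policy_id INTEGER, amount REAL);
        INSERT INTO policies_cloud_a VALUES (1, 'EU'), (2, 'US');
        INSERT INTO claims_cloud_b VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
    """)


def refresh_materialized_view(conn: sqlite3.Connection) -> None:
    # Pay the join and aggregation cost once, then store the result as a snapshot.
    conn.executescript("""
        DROP TABLE IF EXISTS mv_claims_by_region;
        CREATE TABLE mv_claims_by_region AS
            SELECT p.region AS region, SUM(c.amount) AS total_amount
            FROM policies_cloud_a AS p
            JOIN claims_cloud_b AS c ON c.policy_id = p.policy_id
            GROUP BY p.region;
    """)


def total_for_region(conn: sqlite3.Connection, region: str) -> float:
    # Reads hit only the snapshot; no cross-source join happens at query time.
    row = conn.execute(
        "SELECT total_amount FROM mv_claims_by_region WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else 0.0


conn = sqlite3.connect(":memory:")
load_raw_data(conn)
refresh_materialized_view(conn)
```

The snapshot costs extra storage and can go stale: a claim inserted after the last refresh is invisible to `total_for_region` until `refresh_materialized_view` runs again, which is the time-versus-freshness balance described above.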

Federated query is another way we can solve the data gravity problem. Think about a booking app like Booking.com or ClearTrip, or whatever you're using. Booking.com gets data from all the different airline providers and hotel providers without actually moving that data to any central location. It works like a federation layer that reaches all these data sources through APIs, or whatever mechanism it uses, and gives you the data without moving any of it. Even though you have a data gravity problem, you don't have to fight data gravity by moving the data out of your primary cloud, whichever cloud you designate as primary.
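The booking-app analogy can be sketched as a thin federation layer. This is an illustration only: the provider names and the `make_connector` helper are hypothetical, and in-process dictionaries stand in for the remote APIs a real federation layer would call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Offer:
    provider: str
    item: str
    price: float


# A connector wraps one remote source and answers a search in place.
Connector = Callable[[str], List[Offer]]


def make_connector(provider: str, catalog: Dict[str, float]) -> Connector:
    def search(item: str) -> List[Offer]:
        # The filter runs at the source, so only matching rows are returned.
        return [Offer(provider, item, catalog[item])] if item in catalog else []
    return search


def federated_search(connectors: List[Connector], item: str) -> List[Offer]:
    # Fan the same request out to every source and merge the answers in memory.
    results: List[Offer] = []
    for connector in connectors:
        results.extend(connector(item))
    return sorted(results, key=lambda offer: offer.price)


cloud_a = make_connector("cloud_a", {"room": 120.0, "flight": 300.0})
cloud_b = make_connector("cloud_b", {"room": 95.0})
```

Calling `federated_search([cloud_a, cloud_b], "room")` returns offers from both sources, cheapest first, while each catalog stays where it is.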

Thumbnail 920

Thumbnail 950

You can still access the data without a lot of back-and-forth movement. To address data governance, data monetization, and time to value, another effective approach is to build decentralized architectures, and we'll dive deeper into them. How can you build decentralized architectures? What are the different architecture patterns to do that? In essence, you have data distributed across multiple clouds, and the question is how to create a governance layer and a data access layer that can talk to these different sources in an effective way.

Thumbnail 970

Thumbnail 990

In order to capture the data lineage, which is very important because you should know where the data is originating from, how do you trust that data, you need to have an effective metadata strategy and management to solve the problem of data lineage across multiple clouds. One of the concepts I was just talking about is federated query, which serves as a base for materialized views and, in some cases, decentralized architectures as well. Federated query enables these architecture patterns.

Thumbnail 1030

Let's talk about practical examples of materialized views. Today, different cloud providers have different ways of defining materialized views. For example, Google Cloud has BigQuery Omni to define materialized views. On AWS, you can use Amazon S3 and Athena to support materialized views. There is also support for Apache Iceberg tables, wherein it can refresh your materialized views. Federated queries can be built with Athena Federated Query, which has a built-in connector for Google BigQuery.

Thumbnail 1070

You can define and choose BigQuery as the data source from within AWS. Using the Athena connector, you can access that data from the AWS cloud without moving it out of Google Cloud. You could do the same thing from Google Cloud. It's an either-or choice, depending on which cloud is your primary, which is your secondary, and where the majority of your data sits. I mentioned decentralized architecture earlier, and you may have heard the terms data mesh or data fabric.
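As a rough sketch of what querying BigQuery through Athena can look like from code, the helper below builds the arguments for Athena's `StartQueryExecution` call, which boto3 exposes as `start_query_execution`. It assumes the BigQuery connector is already deployed and registered as an Athena data catalog; the catalog name `bigquery_catalog`, the dataset `sales`, the table `orders`, and the S3 output location are all hypothetical. `RecordingClient` is a stand-in so the sketch can be dry-run without AWS credentials.

```python
from typing import Any, Dict, List


def build_federated_query_request(
    catalog: str, database: str, table: str, output_s3: str, limit: int = 10
) -> Dict[str, Any]:
    # Names are assumed to come from trusted configuration, not user input.
    sql = f'SELECT * FROM "{catalog}"."{database}"."{table}" LIMIT {int(limit)}'
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Catalog": catalog, "Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3},
    }


def start_federated_query(athena_client: Any, request: Dict[str, Any]) -> str:
    # athena_client is expected to behave like boto3.client("athena").
    response = athena_client.start_query_execution(**request)
    return response["QueryExecutionId"]


class RecordingClient:
    """Offline stand-in for the Athena client (hypothetical, for dry runs)."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def start_query_execution(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append(kwargs)
        return {"QueryExecutionId": "dry-run-0001"}


request = build_federated_query_request(
    "bigquery_catalog", "sales", "orders", "s3://example-athena-results/", limit=5
)
```

With real credentials, passing `boto3.client("athena")` instead of `RecordingClient()` would submit the query; the rows stay in BigQuery and only the result set is written to the S3 output location.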

Thumbnail 1110

Data mesh defines a way to share data between different cloud providers or between different data sources by establishing data domain owners and treating data as a product, wherein each domain owner creates their own data products. Consumers then consume those data products through the data mesh architecture, and we'll talk about this in more detail in the next few slides. The metadata catalog, as we just discussed, is how you effectively manage data across different cloud providers. You should track the metadata for that data, such as its size, its origin, and how to access it, using cloud-native tools or third-party tools that work across different clouds.

Thumbnail 1130

Thumbnail 1140

Thumbnail 1150

Data Mesh Architecture: Enabling Boundaryless Data Sharing with Trade-offs

Let's dive a little bit deeper into the data mesh architecture. In a typical data mesh architecture, you create a central data marketplace. The producers of the data create the datasets. I'm going to take the example of an insurance company again because I've been dealing with a lot of insurance customers. Suppose an insurance company wants to create a data mesh. Think of the data producers as the teams responsible for managing customer data, policies, and claims. They're creating a lot of data about the customers and the insurance policies.

Thumbnail 1180

Then on the right-hand side, there are consumers of that data, perhaps the pricing team or the marketing team, who want to consume that data and see how effectively they can sell insurance policies to a new region or a new customer base. The team in the middle enables both the producers and consumers with that data marketplace so that they can all share data without boundaries. Even though the data is sitting in different clouds, different sources, or on-premises, the abstraction layer we are creating with the data mesh enables boundaryless data sharing.
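The producer, marketplace, and consumer roles can be sketched with a small in-memory registry. This is a simplified illustration, not a reference design: the `DataProduct` fields, the `cloud_b://` location string, and the team names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class DataProduct:
    name: str
    domain: str                       # owning domain team, e.g. "claims"
    location: str                     # where the data physically lives
    allowed_consumers: FrozenSet[str]


class Marketplace:
    """Central catalog: producers publish, consumers discover and resolve."""

    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        # Rejecting name collisions is one small guard against duplicates.
        if product.name in self._products:
            raise ValueError(f"data product {product.name!r} already exists")
        self._products[product.name] = product

    def discover(self, consumer: str) -> List[str]:
        return sorted(
            p.name for p in self._products.values() if consumer in p.allowed_consumers
        )

    def resolve(self, consumer: str, name: str) -> str:
        # Consumers get a pointer to the data; the data itself is not copied.
        product = self._products[name]
        if consumer not in product.allowed_consumers:
            raise PermissionError(f"{consumer!r} may not read {name!r}")
        return product.location


market = Marketplace()
market.publish(
    DataProduct("claims_history", "claims", "cloud_b://claims/history", frozenset({"pricing"}))
)
```

Here the pricing team can discover and resolve `claims_history` even though it lives in another cloud, while a team that is not on the allow list sees nothing.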

Thumbnail 1220

There are some trade-offs with this architecture pattern, with pros and cons that show up especially if it is not done correctly. The autonomy of self-service BI or self-service data access through data mesh is an advantage, but it can come at the cost of data silos. Different domain owners, like the customer relationship management team or the policy or underwriting team in our previous example, sometimes create overlapping or duplicate data sets that already exist, and sometimes they create data sets that other teams are not able to use due to strict compliance issues.

We have to be very careful when we design a data mesh that we have agreed-upon principles and agreed-upon standards for how we want to create data sets and how we want to share them. It should not become a technological or systematic challenge for the other teams to consume the data produced by the data producing team. If we do not do that, then we end up creating many data sets that could be duplicates and create silos. Likewise, for standardization, the data mesh creates a standard for creating, sharing, and producing data and for the marketplace, and all of that requires a lot of upfront engineering complexity to begin with. Oftentimes there is a lack of leadership and technical support, and there are limitations around capabilities. All these things must be considered before we jump in to create a data mesh strategy.

The data mesh gives you a lot of control and flexibility compared with a central data architecture, where you are dependent on the IT teams. As a domain owner or data owner, you can define and create your own data sets and let other teams access them. This control and flexibility can come at the cost of inconsistent implementation between different data teams and data owners, because one team might want to apply some security control or encryption, whereas another team may not be able to follow those controls due to its own limitations.
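One way to surface inconsistent implementation early is to check each data product against a mesh-wide baseline. The sketch below assumes a hypothetical baseline of three controls and a hypothetical mapping from data product name to the controls its owning team declares.

```python
from typing import Dict, Iterable, List

# Mesh-wide baseline agreed by all domains (hypothetical control names).
REQUIRED_CONTROLS = frozenset({"encryption_at_rest", "owner_contact", "published_schema"})


def audit_data_products(declared: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Return, per data product, the baseline controls it has not declared."""
    findings: Dict[str, List[str]] = {}
    for product, controls in declared.items():
        missing = sorted(REQUIRED_CONTROLS - set(controls))
        if missing:
            findings[product] = missing
    return findings


declared_controls = {
    "claims_history": ["encryption_at_rest", "owner_contact", "published_schema"],
    "marketing_leads": ["owner_contact"],
}
```

Running `audit_data_products(declared_controls)` flags only `marketing_leads`, which gives the central team a concrete list to work through with that domain.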

Thumbnail 1380

Thumbnail 1390

Thumbnail 1410

Federated Query Patterns: Core Components, Advantages, and Implementation Considerations

Let us talk about the other architecture pattern which is federated query. In case of federated query, we submit a query to a federated query engine, and that engine has some core components. For example, metadata management and discovery is one of the key components. This query engine maintains the metadata or the technical data about your data sets, things like where data exists, what is the size of the data, and how data can be accessed. All these things are defined here. Because this information is already there in the federated query, your query actually knows how to fetch that data.

Similarly, it also has a data access and security layer. For example, suppose I have my primary workload in Amazon and a secondary workload, maybe a data analytics workload, that I want to push out to another cloud, let us say Google Cloud in this case. Then I should have the right controls configured in Google Cloud and in Amazon within this layer, like the right keys, certificates, and API keys, so that I can access data seamlessly without manual intervention and without depending on the IT teams to configure permissions manually.

Thumbnail 1460

The last layer or the last core component of the federated query engine is the optimization and the performance, which is very important in this case, especially in case of multi-cloud because there could be a lot of back and forth with data traveling over the network. There has to be the right set of configuration settings in order to optimize how we let that data travel faster and in the most cost effective way, saving the egress cost and the network cost when we want to enable federated query in case of multi-cloud.

Thumbnail 1490

As a result, when the query travels through these core components, these core components function in tandem with each other, and you get the result of the query. I would highly encourage you to try this out with Athena connector for Google BigQuery. It is seamless and easy to configure.
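The three core components can be sketched as one small class. This is a toy model under stated assumptions: `RemoteSource` is a hypothetical stand-in for a data store in another cloud, the API key check represents the access and security layer, and predicate pushdown represents the optimization layer.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]
Predicate = Optional[Callable[[Row], bool]]


class RemoteSource:
    """Stand-in for a data store in another cloud (hypothetical)."""

    def __init__(self, api_key: str, tables: Dict[str, List[Row]]) -> None:
        self._api_key = api_key
        self._tables = tables
        self.rows_sent = 0  # counts rows that would cross the network

    def scan(self, table: str, api_key: str, predicate: Predicate = None) -> List[Row]:
        if api_key != self._api_key:
            raise PermissionError("invalid credential for remote source")
        rows = [r for r in self._tables[table] if predicate is None or predicate(r)]
        self.rows_sent += len(rows)
        return rows


class FederatedQueryEngine:
    def __init__(self) -> None:
        self._catalog: Dict[str, str] = {}      # metadata: table -> source name
        self._sources: Dict[str, RemoteSource] = {}
        self._credentials: Dict[str, str] = {}  # access layer: source name -> key

    def register(self, name: str, source: RemoteSource, api_key: str, tables: List[str]) -> None:
        self._sources[name] = source
        self._credentials[name] = api_key
        for table in tables:
            self._catalog[table] = name

    def query(self, table: str, predicate: Predicate = None) -> List[Row]:
        source_name = self._catalog[table]        # 1. metadata and discovery
        api_key = self._credentials[source_name]  # 2. access and security
        # 3. optimization: push the filter to the source so fewer rows travel.
        return self._sources[source_name].scan(table, api_key, predicate)


remote = RemoteSource(
    "key-123",
    {"claims": [{"id": 1, "amount": 900}, {"id": 2, "amount": 40}, {"id": 3, "amount": 1200}]},
)
engine = FederatedQueryEngine()
engine.register("cloud_b", remote, "key-123", ["claims"])
```

Because the predicate is evaluated at the source, `engine.query("claims", lambda r: r["amount"] > 500)` transfers two rows instead of three, which is the kind of saving that matters when egress is billed.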

Thumbnail 1520

You will appreciate the advantages of federated query, but like every architecture pattern, it comes with its own pros and cons. There is less data movement with a federated query because we are just pulling the data from a remote source rather than copying it from that source. However, this low data movement can come with high latency if it is not configured properly from a network point of view. We just announced network connectivity between Amazon and Google Cloud, so I would highly encourage you to take advantage of that. We have experts here who can talk about this newly launched capability after this session if you have any questions.

When we create a catalog, the metadata that goes into that catalog sometimes has sync issues. For example, suppose you dropped a table or altered its definition in another cloud, and that change was not synchronized to the metadata catalog in your primary cloud. Your federated query engine would not know the data has changed. If you don't have a mechanism to synchronize that metadata, you end up in a situation where your query might give you incorrect results or runtime errors.
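A simple guard against this kind of drift is to compare the schema recorded in the catalog with the schema the remote source currently reports. The sketch below assumes both schemas are available as plain column-to-type mappings; the column names and type strings are hypothetical.

```python
from typing import Dict, List


def detect_schema_drift(
    catalog_schema: Dict[str, str], source_schema: Dict[str, str]
) -> Dict[str, List[str]]:
    """Compare cataloged columns with the columns the source reports now."""
    catalog_cols = set(catalog_schema)
    source_cols = set(source_schema)
    return {
        "missing_in_source": sorted(catalog_cols - source_cols),
        "new_in_source": sorted(source_cols - catalog_cols),
        "type_changed": sorted(
            col for col in catalog_cols & source_cols
            if catalog_schema[col] != source_schema[col]
        ),
    }


cataloged = {"claim_id": "bigint", "amount": "double", "region": "string"}
reported = {"claim_id": "bigint", "amount": "decimal(10,2)", "channel": "string"}
```

Running a check like this on a schedule, or before a query is planned, turns a silent wrong answer into an explicit signal that the catalog needs to be refreshed.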

Thumbnail 1650

The same applies to access control. When permissions change between data sources, you need a mechanism in place to update those permission sets. It's sometimes complex to manage credentials, rotate certificates, rotate keys, and update keys. You should consider all these factors before you create a federated query architecture pattern. At this point, we will talk about data augmentation from an AI standpoint, and I'm going to hand it over to Don.

Thumbnail 1670

Multicloud RAG Architectures: Retrieval Augmented Generation, Structured RAG, and Graph RAG

Thank you. First, we're going to level set. Many of you may be aware of what RAG, or Retrieval Augmented Generation, is. To walk through this for people who do not know, we're going to take a user query in step one. We're actually going to do a vector embedding of that query to achieve and execute a semantic search against a knowledge base. We're going to do that similarity search based on cosine or Euclidean distance, and we're actually going to come back with document chunks that are relevant. These can be fed in as facts into the context, and ultimately we'll get an augmented prompt that comes out of the RAG engine. You can think of that RAG engine in this discussion today as being an AI application RAG engine. Those are the basic concepts we want to get across first.
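The retrieval flow just described can be sketched in plain Python. One loud assumption: `embed` here is a toy word-count vector, not a real embedding model, so the similarity scores only illustrate the mechanics of cosine-based ranking and prompt augmentation.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a sparse word-count vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    # Similarity search: rank every chunk against the embedded query.
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


def augment(query: str, chunks: List[str], k: int = 2) -> str:
    # Retrieved chunks are fed in as facts ahead of the user's question.
    facts = "\n".join(f"- {chunk}" for chunk in retrieve(query, chunks, k))
    return f"Context:\n{facts}\n\nQuestion: {query}"


knowledge_base = [
    "claims are settled within thirty days",
    "egress fees apply to cross cloud traffic",
    "policies renew annually",
]
```

Swapping `embed` for a real embedding model and the list for a vector store changes the quality of the ranking, not the shape of the flow.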

Thumbnail 1750

Thumbnail 1760

Thumbnail 1770

Thumbnail 1780

Next, we want to start talking about what that could look like in a multicloud context. As we talk about these patterns, we always want to start with a backdrop of security being the top priority. Things that we need to think about in terms of design are security and performance, including, as mentioned earlier, latency in the network. One approach we can take is that if CSP1 is where our AI application or RAG engine is, we can have our knowledge bases in CSP2. Due to data gravity, we don't necessarily want to move the data, or cannot, so we can leave it in CSP2. However, there are some key considerations that we need to be aware of. One is consistency of the embedding model that we're using across clouds. The second is the vector store, and the vector stores don't necessarily have to be the same. But there's opportunity here, as we think about highly available systems and resilience, for that vector store to be replicated. We could replicate it either to the same vector store on the same CSP, or to something like S3 Vectors, which is a cost-efficient mechanism for storing vectors and one potential source or destination for that.

Thumbnail 1830

Minimizing data movement is crucial, similar to the strategy we use with federated query approaches. However, we must account for considerations such as network latency and security. Importantly, you must maintain a consistent embedding model across your two cloud service providers. Additionally, you need to consider resiliency through vector store replication and potential sharding.
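The embedding consistency requirement lends itself to a cheap pre-flight check. The sketch below assumes each side can describe its embedding setup with a model identifier and a vector dimension; the `EmbeddingConfig` type and the model names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EmbeddingConfig:
    provider: str         # which CSP this side runs in
    embedding_model: str  # hypothetical model identifier
    dimension: int


def embedding_mismatches(query_side: EmbeddingConfig, store_side: EmbeddingConfig) -> List[str]:
    """Vectors are only comparable if both sides used the same model."""
    problems: List[str] = []
    if query_side.embedding_model != store_side.embedding_model:
        problems.append(
            f"model mismatch: {query_side.embedding_model} vs {store_side.embedding_model}"
        )
    if query_side.dimension != store_side.dimension:
        problems.append(f"dimension mismatch: {query_side.dimension} vs {store_side.dimension}")
    return problems


rag_engine = EmbeddingConfig("csp1", "example-embed-v2", 1024)
vector_store = EmbeddingConfig("csp2", "example-embed-v1", 768)
```

An empty list means the query vectors and the stored vectors live in the same space; anything else should block the search rather than return misleading neighbors.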

Thumbnail 1860

Thumbnail 1870

Let's discuss a couple of other additional types of RAG. First, we'll cover structured RAG, often referred to as SRAG. In this case, we follow a similar flow to what we discussed before. A user query comes into our structured RAG system. We perform text-to-SQL conversion in this case. Currently, as one example, we have databases and data warehouses that could be accessed through a federated query engine. Today, Bedrock is an example where our knowledge bases support structured RAG with Redshift, which represents a particular service you could implement.

Thumbnail 1890

Thumbnail 1900

Thumbnail 1910

Thumbnail 1920

Thumbnail 1930

Thumbnail 1940

In this scenario, we execute the query and retrieve records. We then create an augmented prompt from those results. There are significant benefits here. Instead of only performing semantic search across unstructured data, we can now bring in structured data from our enterprise. We're also leveraging the investments we've made in database assets. We don't necessarily have to move this data, which becomes particularly valuable in scenarios like mergers and acquisitions, providing one mechanism to start experimenting. Of course, security, network reliability, and performance remain important considerations.
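The structured RAG flow can be sketched end to end with SQLite. The text-to-SQL step is normally done by a model; here `text_to_sql` is a hard-coded placeholder, and the `claims` table and its contents are hypothetical.

```python
import sqlite3


def text_to_sql(question: str) -> str:
    # Placeholder for an LLM text-to-SQL step (assumption): one fixed template.
    lowered = question.lower()
    if "claims" in lowered and "region" in lowered:
        return "SELECT region, COUNT(*) AS claims FROM claims GROUP BY region ORDER BY region"
    raise ValueError("question not supported by this sketch")


def structured_rag_prompt(conn: sqlite3.Connection, question: str) -> str:
    sql = text_to_sql(question)
    # Guardrail: generated SQL should be read-only before it is executed.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    records = [dict(zip(columns, row)) for row in cursor.fetchall()]
    facts = "\n".join(str(record) for record in records)
    return f"Records:\n{facts}\n\nQuestion: {question}"


db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, region TEXT);
    INSERT INTO claims VALUES (1, 'EU'), (2, 'EU'), (3, 'US');
""")
```

The retrieved records take the place of document chunks in the augmented prompt, so the model answers from enterprise data that never left its database.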

Thumbnail 1990

Thumbnail 2010

I want to touch on another RAG pattern: graph RAG. What's interesting about this approach is that we gain access to concepts within a graph that can relate across different document chunks, allowing us to span beyond document scope. We can perform hops, iterate over relationships, and traverse connections. Similar to structured RAG, a user query comes in and we translate it into a graph query. This graph query could be written in Cypher, Gremlin, or SPARQL, and we execute it against the graph database. On AWS, we have Amazon Neptune as a graph database you can leverage. Currently, this is supported with Bedrock knowledge bases, which offers the ability to use Graph RAG with Neptune as your vector store. We then retrieve the graph and move to the augmented prompt.

Thumbnail 2050

Thumbnail 2060

Thumbnail 2090

One other scenario to consider in a multicloud graph RAG situation involves the graph, its concepts, and the relationships it contains. We don't necessarily have to move all the data. You could think of this as creating what customers sometimes refer to as a semantic layer, where you build connective tissue across data that may not live in the same cloud service provider. This represents a really good implementation and pattern for certain use cases where you want to achieve this capability while leaving data where it resides. The primary benefits include the ability to traverse entities and relationships. You can think of spanning and iterating over relationships that could apply to financial services scenarios such as anti-money laundering or know-your-customer checks, where you traverse from an individual through associated identities, bank accounts, locations, and other connected information to ground your analysis.
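The hop-by-hop traversal in the know-your-customer example can be sketched with a breadth-first search over a small adjacency list. A graph database would express this in a query language such as Gremlin; the plain Python below only illustrates the idea, and the node names and relationship labels are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

# node -> list of (relationship, neighbor); nodes may live in different clouds.
Graph = Dict[str, List[Tuple[str, str]]]


def traverse(graph: Graph, start: str, max_hops: int) -> Dict[str, int]:
    """Breadth-first walk returning each reachable node and its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop budget
        for _relationship, neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


kyc_graph: Graph = {
    "person:alice": [("uses_identity", "identity:a1")],
    "identity:a1": [("owns", "account:123")],
    "account:123": [("registered_at", "location:x")],
}
```

With a budget of two hops from `person:alice`, the walk reaches the associated identity and bank account but stops before the location, which shows how the hop limit bounds what gets pulled into the prompt.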

Thumbnail 2100

Some key considerations remain consistent across these approaches. The same considerations tend to emerge repeatedly. When we talk about network reliability and performance, and we've mentioned this at least twice, we'll say it again: in multicloud scenarios, our interconnect product and the reliability of your network are critical factors.

Thumbnail 2120

Thumbnail 2130

The performance of that is something we think is extremely important and beneficial to our customers. So now we're going to talk a little bit about multi-cloud from an agent perspective. We built up to this in a very structured way because data is key to our agents in terms of their ability to perform, generate accurate results, and reduce hallucination.

Agentic AI in Multicloud: Model Context Protocol, LLM Gateway Routing, and Next Steps

We're going to talk about Model Context Protocol, or MCP, as one standard, a very popular one, for allowing agents to leverage tools. MCP is something many of you have probably heard about. In this scenario, let's assume a few things. We talked about federated query before and some of the trade-offs, and one thing you'll have to consider is what happens if your data source is not supported by your federated query engine. Let's say Amazon Athena, which allows you to do federated query, doesn't have a data connector for a particular database. In that scenario, you either have to build a custom connector or, as an alternative, you could consider an MCP server that implements tools and allows you to expose data from that data source.

Thumbnail 2200

On the far left here, I'm going to fill in the picture a bit. We have an agent, and we could use Strands as the agent framework. We have an MCP server, and in this case we're putting the MCP server in close proximity to the data store. We want to localize and optimize efficiency there by taking advantage of the server and the data sitting within the same network boundary, the same partition. So associated with CSP 1 we've got an MCP server, and let's say we've got another one over here, MCP server 2, that sits with our data store 2.

Thumbnail 2250

MCP has a client, so we have a client for each one of those servers. You can imagine that our agent is now capable of leveraging MCP to federate access to the data. Tools can go beyond data access, but in this context we're really talking about gaining access to the data. It could be read or write, depending on the MCP server. Because just about any agent out there is going to support MCP, leveraging the protocol gives us access and interoperability, and we're also leveraging the streamable HTTP transport. Everything here is standard, and we're leveraging the best of MCP servers and the wide support we have across many different cloud service providers.
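The client-per-server arrangement can be sketched roughly as follows. This is an in-process simulation, not an MCP SDK: the two "servers" are plain Python objects standing in for remote servers reached over streamable HTTP, and the tool names and return values are hypothetical. The request does follow the JSON-RPC 2.0 envelope and `tools/call` method that MCP uses for tool invocation.

```python
import itertools
import json


class InProcessServer:
    """Stand-in for a remote MCP server; a real one would sit behind
    streamable HTTP next to its data store. Tools here are hypothetical."""

    def __init__(self, tools):
        self.tools = tools  # tool name -> Python callable

    def handle(self, request):
        params = request["params"]
        payload = self.tools[params["name"]](**params["arguments"])
        # MCP tool results carry a list of content items.
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "result": {"content": [{"type": "text", "text": json.dumps(payload)}]},
        }


class ToolClient:
    """One client per server, as in the diagram."""

    def __init__(self, server):
        self.server = server
        self._ids = itertools.count(1)

    def call_tool(self, name, arguments):
        # MCP uses JSON-RPC 2.0; tool invocations use the "tools/call" method.
        request = {
            "jsonrpc": "2.0",
            "id": next(self._ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
        response = self.server.handle(request)
        return json.loads(response["result"]["content"][0]["text"])


# Each server lives with its own data store in its own provider.
server_1 = InProcessServer(
    {"get_customer": lambda customer_id: {"id": customer_id, "tier": "gold"}}
)
server_2 = InProcessServer(
    {"get_orders": lambda customer_id: [{"customer": customer_id, "total": 42}]}
)

# The agent keeps a routing table from tool name to the client that serves it.
ROUTES = {"get_customer": ToolClient(server_1), "get_orders": ToolClient(server_2)}


def agent_call(tool_name, **arguments):
    """Dispatch a tool call to whichever server owns that tool."""
    return ROUTES[tool_name].call_tool(tool_name, arguments)
```

From the agent's point of view, `agent_call("get_customer", customer_id="c1")` and `agent_call("get_orders", customer_id="c1")` look identical even though each request is served next to a different data store.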

Thumbnail 2300

Let's go through some of the challenges. We talked about data gravity, which is our constraint and what we're trying to work around here. I talked a little bit about data source compatibility. Take that one step further: there may be a discrepancy in data type support. One source may not support a blob, just as an example. These are scenarios where you may have to consider something else. In this case, the agents can access the relevant data through the MCP server, and we keep control over what happens in that server.
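One possible way to handle a type mismatch such as blobs at the tool boundary is to normalize values before they leave the server. The sketch below assumes the tool returns JSON and encodes binary columns as base64 text. The marker object used to flag encoded fields is an assumption made for this example, not part of any standard.

```python
import base64
import json


def to_json_safe(row):
    """Return a copy of `row` where binary values are base64 text.

    Binary fields are wrapped in a small marker object so the caller can
    tell them apart from ordinary strings. The marker shape is an assumption
    made for this sketch.
    """
    safe = {}
    for column, value in row.items():
        if isinstance(value, (bytes, bytearray, memoryview)):
            encoded = base64.b64encode(bytes(value)).decode("ascii")
            safe[column] = {"encoding": "base64", "data": encoded}
        else:
            safe[column] = value
    return safe


def from_json_safe(row):
    """Inverse of to_json_safe: turn marked fields back into bytes."""
    restored = {}
    for column, value in row.items():
        if isinstance(value, dict) and value.get("encoding") == "base64":
            restored[column] = base64.b64decode(value["data"])
        else:
            restored[column] = value
    return restored
```

A row such as `{"id": 7, "thumbnail": b"\x00\x01\xff"}` survives a round trip through `json.dumps` and `json.loads` once it has been passed through `to_json_safe`.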

Thumbnail 2330

Thumbnail 2350

Thumbnail 2370

On the benefits side, as I mentioned, we have a standard and widely adopted protocol, locality optimization, and what we call federated tool execution and processing. All right, next we want to talk about a scenario where you have a model that may or may not be supported on AWS in Bedrock. Maybe it's not supported in a particular region, or maybe there's a capacity issue, so you want to gain access to that LLM in a different cloud service provider. We're introducing the concept of how an LLM gateway and router can help us in a multicloud context. This LLM gateway and router provides a standard, unified interface on the input, the request side. Commonly we'll see the OpenAI request format as the standard for that. We also have LiteLLM and AgentCore Gateway. So there are a couple of different options we could leverage here.
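As an illustration of the translation step a gateway performs, here is a minimal sketch that converts an OpenAI-style chat request into the argument shape used by the Amazon Bedrock Converse API. It is not how LiteLLM or any particular gateway implements this. It handles only plain-text messages plus two sampling parameters, and the function name is hypothetical.

```python
def openai_to_converse(request, model_id):
    """Translate an OpenAI-style chat request into Converse-style kwargs.

    Handles only plain-text messages and two sampling parameters; tool calls,
    images, and streaming are out of scope for this sketch.
    """
    system_blocks = []
    messages = []
    for message in request["messages"]:
        if message["role"] == "system":
            # Converse takes system prompts in a separate top-level field.
            system_blocks.append({"text": message["content"]})
        else:
            messages.append(
                {"role": message["role"], "content": [{"text": message["content"]}]}
            )

    inference_config = {}
    if "max_tokens" in request:
        inference_config["maxTokens"] = request["max_tokens"]
    if "temperature" in request:
        inference_config["temperature"] = request["temperature"]

    converse_kwargs = {"modelId": model_id, "messages": messages}
    if system_blocks:
        converse_kwargs["system"] = system_blocks
    if inference_config:
        converse_kwargs["inferenceConfig"] = inference_config
    return converse_kwargs
```

The returned dictionary is shaped so that it could be unpacked into the `converse` call of a boto3 `bedrock-runtime` client. How a gateway maps the caller's model name to a provider-specific model ID is left out here and would be a configuration decision.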

Thumbnail 2410

Thumbnail 2430

We have our agent communicating with and leveraging different LLMs and different models that run in different cloud service providers. As I mentioned, the driving factors are model selection, availability, and capacity. We can also round-robin across the various LLMs, and if one is unavailable or being throttled, we can send the request to another LLM. These are the primary benefits of this pattern.
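The round-robin and failover behavior can be sketched as follows. This is a simplified illustration, not any gateway's actual implementation: the backends are plain Python callables standing in for model endpoints in different providers, and the `Throttled` exception is a hypothetical signal for rate limiting or lack of capacity.

```python
class Throttled(Exception):
    """Raised by a backend when it is rate limited or out of capacity."""


class FailoverRouter:
    """Round-robin across backends, skipping any that are throttled."""

    def __init__(self, backends):
        self.backends = backends  # list of (name, callable) pairs
        self._next = 0  # index of the backend to try first next time

    def complete(self, prompt):
        count = len(self.backends)
        for offset in range(count):
            index = (self._next + offset) % count
            name, call = self.backends[index]
            try:
                answer = call(prompt)
            except Throttled:
                continue  # try the next backend in the rotation
            self._next = (index + 1) % count
            return name, answer
        raise RuntimeError("all backends are throttled")


def throttled_backend(prompt):
    """Hypothetical backend that is always out of capacity."""
    raise Throttled()


def echo_backend(label):
    """Hypothetical backend that returns a labeled echo of the prompt."""
    def call(prompt):
        return f"{label}: {prompt}"
    return call
```

With three backends where the second is always throttled, successive calls alternate between the first and third; the throttled one is skipped without the caller noticing.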

I write to one unified request format specification, and the gateway translates it for me. I'm also prioritizing and routing, so one of the other things we could do with this gateway, depending on how much functionality we want to put in there, is introduce cost-based routing. Based on cost and on the features and capabilities a request needs, we can route it to a different model. Then, as I mentioned, there's dynamic load balancing.
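A minimal sketch of cost-based routing might look like the following. The model catalog, prices, context limits, and capability flags are all hypothetical placeholders; a real gateway would load them from configuration and would likely weigh more factors than these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    provider: str
    cost_per_1k_tokens: float  # hypothetical price, arbitrary currency
    max_context_tokens: int
    supports_tools: bool


# Hypothetical catalog spanning two providers.
CATALOG = [
    ModelOption("small-model", "csp1", 0.10, 8_000, False),
    ModelOption("medium-model", "csp2", 0.50, 32_000, True),
    ModelOption("large-model", "csp1", 2.00, 200_000, True),
]


def pick_model(prompt_tokens, needs_tools, catalog=CATALOG):
    """Return the cheapest model that satisfies the request's requirements."""
    candidates = [
        option
        for option in catalog
        if option.max_context_tokens >= prompt_tokens
        and (option.supports_tools or not needs_tools)
    ]
    if not candidates:
        raise ValueError("no model satisfies the request")
    return min(candidates, key=lambda option: option.cost_per_1k_tokens)
```

Filtering on capabilities first and then minimizing cost keeps the policy easy to reason about: a short prompt with no tool use goes to the cheapest model, while a long prompt is routed to whichever model can actually fit it.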

Thumbnail 2480

Thumbnail 2490

To bring the whole picture together, both the LLM gateway and router and the MCP client-to-server communication that we talked about on the previous slide can be used by a single agent. We'll talk about a couple of next steps and resources that we have. There are many different sessions available. We have 101-level sessions starting with best practices from a multicloud perspective. We also have deeper dives: this is a 200-level conversation today, but we have a 300-level session that will drill down into more detail on the things we just discussed with respect to generative and agentic AI.

Thumbnail 2510

We also have security best practices and network best practices sessions. We have a kiosk in the expo hall that you can visit. We have demos and use case deep dives, and we can have conversations with you. As part of the multicloud program, we spend a lot of time talking to customers like you about multicloud strategy, so we're happy to engage with you via your account teams. Feel free to reach out to us, and if you have any questions, we'll be available in the hallway to answer them. Thank you.


This article is entirely auto-generated using Amazon Bedrock.
