Kazuya

AWS re:Invent 2025 - Scale agent tools with Amazon Bedrock AgentCore Gateway (AIM3313)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Scale agent tools with Amazon Bedrock AgentCore Gateway (AIM3313)

In this video, AWS Principal Generative AI Architect Dhawal Patel and Principal Engineer Nick Aldridge introduce Amazon Bedrock AgentCore Gateway, a purpose-built AI gateway that provides a unified communication point for agent-to-tool interactions with one-click MCPification of enterprise APIs. The session demonstrates how to scale from a few MCP tools to thousands across hundreds of agents, addressing challenges like fine-grained access control, multi-tenancy, and infrastructure management. Sumo Logic's Kui shares their production deployment of Dojo AI agents using AgentCore Gateway, showcasing autonomous security investigation agents that achieved 50% faster analysis time and 75% reduction in MTTR. Key features covered include semantic tool discovery, Lambda and OpenAPI target integration, MCP server support, gateway interceptors for custom authorization, and VPC connectivity. The presentation emphasizes best practices like working backwards from user queries to identify appropriate tools, optimizing tool descriptions to reduce LLM context size, implementing delegation-based security models, and maintaining a tools registry for governance.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Introduction: The Challenge of Scaling AI Agents and MCP Tools

Good day, everyone. It's amazing to be here. This is such a great venue, and I'm grateful you've all joined us in person. Let me start with a quick show of hands. How many of you are building AI agents? Almost everyone, right? How many of you have deployed agents in production? Wow, perfect. Most of you. How many of you are developing MCP tools? A lot of you, great. And how many of you have more than 10 MCP tools? A few of you, about 25 percent. How about more than 50 MCP tools? Just two folks. Excellent, thank you.

Thumbnail 0

My name is Dhawal Patel, and I'm the Tech Lead and Principal Generative AI Architect at AWS. In this session, you're going to learn how to scale from those first few tools to thousands of tools across hundreds of agents, securely at enterprise grade without operational headache.

Thumbnail 80

Across industry verticals and multiple companies, including healthcare, finance, security operations, and observability, companies are transforming their customer experience using agentic AI. We're honored to have one of our customers, Sumo Logic, who will share their journey in agentic AI within the security operations space. I'm really looking forward to their story as well.

Here's the critical point: your AI agents need real data, your data. The data could be sitting in knowledge bases, private data stores, enterprise APIs, or even on-premises. The magic happens in the agent-to-tool data integration layer. This is really critical for you to scale and succeed in your agentic AI journey.

Thumbnail 150

From Proof of Concept to Production: The Reality of Agent-to-Tool Integration

Let me walk you through a scenario. Imagine you're developing a customer support agent for retail. Users can shop, get support, and check order status. You develop a great proof of concept that works well with a few user queries. Your stakeholders are impressed, and now they ask you to deploy it in production at scale. That's when the reality hits.

You need to scale from those first few users to thousands of users. You need to manage infrastructure scaling, but before that, you need to ensure your AI agents can talk to real data and integrate with your enterprise APIs so they can accurately respond to user queries without hallucinations.

Thumbnail 210

So what do you do? You start integrating with existing enterprise APIs that can help your customer support agent, like product catalog, pricing, and customer details. You slowly and steadily integrate these tools with your AI agent. Then you think about integrating MCP tools or MCP servers as well, or maybe having agents talk to other agents using the agents-as-tools pattern. Now you try to decouple agents and tools to scale, creating a clear distinction between them.

Thumbnail 250

Thumbnail 270

You might have 20, 30, or 50 tools to manage, which is still manageable. But then other team members start onboarding and developing different agents, like a pricing agent or shopping agent. You decompose that large customer support agent into manageable specialized agents, or you develop new product features. Now you have N number of agents to manage with M number of tools, creating exponentially growing complexity. That's exactly why you need to scale, and that's where protocols like MCP come in.

Thumbnail 310

The Complexity Problem: Managing Hundreds of Agents, Thousands of Tools, and Multi-Tenancy

You have the Model Context Protocol, which you are all aware of, and which really takes care of interoperability and ensures unified communication between agents and tools. However, this is clearly not enough because you still have a sprawling architecture where agents are talking to a number of tools, and you need to handle governance, fine-grained access control, and security. You need that single unified communication point where you can govern all these things and scale at the enterprise level.

Thumbnail 360

When your customer base grows, you are going to have a lot of users, maybe thousands or hundreds of thousands of users, accessing these agents. Then you bring multi-tenancy into play where you are developing these agents as a service, and now you have to manage hundreds of tenants. You have different permutations and combinations of fine-grained access control determining who gets to access which tools. It is really a lot of heavy lifting for you to manage.

In this diagram, I am showing a pricing agent being accessed by two different users. You want some of those users to have access to promotional discounts while you do not want other users to have access to those promotional discounts. How do you do that? How do you do that dynamically as well? You need to scale that as well.

Thumbnail 410

Here is the approach: you start with a business use case and user queries, you define your agent goals, you decompose your agent goals into manageable agentic tasks, and then you work backwards to identify existing enterprise APIs or data that you can use to satisfy those agentic tasks. As you continue working backwards, you identify the data custodians and the existing microservices that can satisfy those tasks, and then you map it out and work backwards to identify the right MCP-targeted tools for your agents.

Thumbnail 460

What you end up with is lots of MCP servers, maybe hundreds of MCP servers. On the other end, you have lots of agents, maybe hundreds of agents accessing these MCP servers. It is clearly a lot of heavy lifting again: you are going to have to manage the infrastructure, scaling, and containerization of all these MCP servers.

Thumbnail 490

What do we do? What is the answer? Before we go into the solution, here is the thing: you need to scale to maybe hundreds of agents with thousands of tools, with hundreds of MCP servers to manage, and with different API types to manage and MCPify. Then add multi-tenancy into the mix, and you will probably end up with hundreds of thousands of tenants and hundreds of thousands of permutations and combinations of tool policies to manage dynamically.

Thumbnail 520

How do we scale? What you really need is a single unified communication point for all of your agent-to-tool interactions. When I say tool, I do not mean only the APIs or data stores; I also mean other agents acting as tools for other agents. What you really need is a very fast, quick way of MCPifying your existing APIs. You need purpose-built features like semantic tool discovery, fine-grained access control, integration with your existing identity provider for authorization, integration with observability, and, do not forget, evaluations. This is what you need in order to scale.

Thumbnail 580

Introducing Amazon Bedrock AgentCore Gateway: A Unified Solution for Agent-to-Tool Communication

That is exactly how we built Amazon Bedrock AgentCore Gateway. AgentCore Gateway offers that single unified communication point. It is a purpose-built AI gateway for all of your agent-to-tool interactions, with one-click MCPification of your existing enterprise APIs. You can also MCPify your AWS Lambda functions, and you can integrate your existing MCP servers and unify them behind that single endpoint.

The unified endpoint handles all the heavy lifting of MCP protocol handshakes for you, so you're not in the business of managing and upgrading your containers and servers to new protocol versions. It also addresses the fine-grained access control and all the problems I shared before, which you need as MCP complexity increases. It's completely serverless and pay-as-you-go, and you can connect from your private VPC for secure communication with your AgentCore Gateway.

Technical Deep Dive: Gateway Architecture and Integration Methods

I'm going to hand this over to Nick to dive deeper into AgentCore Gateway features. Hello, can everyone hear me? Fantastic. My name is Nick Aldridge. I'm one of the principal engineers in the Agentic AI organization at AWS. I've been at Amazon for about seven years and have worked on a bunch of AI products like Amazon Lex and Amazon Bedrock. I led the team that built knowledge bases, and now I've helped create AgentCore. I also happen to be one of the core maintainers of MCP and serve on the steering committee of the Agentic AI Alliance.

Thumbnail 690

What I want to do is dive a little deeper into some of the things that Dhawal mentioned and talk about how this works from a technical lens. What you see on the screen is a high-level architecture diagram of the gateway. On the left-hand side you see your agent, and on the other side the other agents, tools, and resources it interacts with. The gateway acts as an intermediary and also as a translator, providing a standard interface for all of those tools, APIs, and agents to talk to. The cool thing is that those tools, APIs, and agents can live on any cloud. They might be on GCP, they might be on AWS, they could be running on EKS, or they could be running on EC2. The gateway is fronting all of those things.

On the left-hand side, you can see the gateway is providing a few operations: list tools, search tools, and invoke tool. These are standard MCP operations. It's also providing a standard authorization interface, so you can decide whether you want to use OAuth or IAM. The gateway will perform that standard authorization check, talk to your identity provider if necessary, and then perform routing to figure out whether the API you need to talk to is running on this server or that server, and it will make the appropriate call with the appropriate credential. The gateway will also perform credential exchange, and all of those calls will have standard logging and auditing.
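To make those standard operations concrete, here is a minimal, hypothetical sketch of an agent-side MCP client calling a gateway over streamable HTTP with the open-source `mcp` Python SDK. The gateway URL, access token, and tool name are placeholders, not values from the session.

```python
# Minimal sketch: an MCP client performing the gateway's standard operations
# (initialize, list tools, invoke a tool). URL, token, and tool name are
# placeholders, not real values.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

GATEWAY_URL = "https://<gateway-id>.gateway.bedrock-agentcore.us-east-1.amazonaws.com/mcp"
ACCESS_TOKEN = "<oauth-access-token-from-your-idp>"

async def main():
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
    async with streamablehttp_client(GATEWAY_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])
            # Invoke a hypothetical tool exposed by one of the gateway's targets.
            result = await session.call_tool("get_order_status", {"orderId": "123"})
            print(result)

asyncio.run(main())
```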

Thumbnail 810

As a business that wants to solve specific business problems, you can use the gateway to not worry about the integration points. You can use the gateway to give your developers who are building agents that standard interface so they can just focus on agent building, and the gateway can manage the integration points. There are different ways that you can provide these tools, APIs, and agents to the gateway. One way you can provide them is by hooking up a Lambda function. You can hook up any arbitrary Lambda function, and the gateway will be able to expose that Lambda function as an MCP tool or group of MCP tools. The gateway will be able to do the translation necessary to authorize to that Lambda function by assuming an IAM role, all the while you're getting that log and auditability that I talked about. Something similar works for APIs.
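As a rough illustration of hooking up a Lambda target, here is a hypothetical boto3 call against the `bedrock-agentcore-control` client. Treat the request shape as an assumption to verify against the current API; the gateway ID, function ARN, and tool schema are all placeholders.

```python
# Sketch: registering a Lambda function as a gateway target so its logic is
# exposed as MCP tools. The request shape is an approximation to verify;
# names and ARNs are placeholders.
import boto3

control = boto3.client("bedrock-agentcore-control")

control.create_gateway_target(
    gatewayIdentifier="<gateway-id>",
    name="order-tools",
    targetConfiguration={
        "mcp": {
            "lambda": {
                "lambdaArn": "arn:aws:lambda:us-east-1:123456789012:function:order-tools",
                "toolSchema": {
                    "inlinePayload": [
                        {
                            "name": "get_order_status",
                            "description": "Look up the status of an order by ID.",
                            "inputSchema": {
                                "type": "object",
                                "properties": {"orderId": {"type": "string"}},
                                "required": ["orderId"],
                            },
                        }
                    ]
                },
            }
        }
    },
    # The gateway assumes an IAM role to invoke the function, as described above.
    credentialProviderConfigurations=[{"credentialProviderType": "GATEWAY_IAM_ROLE"}],
)
```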

Thumbnail 850

When I talk to customers about OpenAPI specs, they often think about these as existing APIs that are production grade. This is not necessarily a spec for a production-grade API. This could be a spec that defines the input to your server that's running on EKS with Kubernetes. It could be an API spec for something that's running on GCP. It could even be a spec for another company's API which you don't own. All of those API specs can be hooked up to the gateway. You can choose from different credential types like API key or OAuth token, and then the gateway will again provide that standard interface to those APIs for your agent.
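For an OpenAPI target, only the target configuration and credential provider change relative to the Lambda sketch above. Again, the shapes below are assumptions to verify; the S3 URI, provider ARN, and header name are placeholders.

```python
# Sketch: an OpenAPI target backed by a spec in S3, authenticating with an
# API key stored in AgentCore Identity. Shapes and ARNs are illustrative.
openapi_target_configuration = {
    "mcp": {
        "openApiSchema": {
            "s3": {"uri": "s3://my-bucket/specs/pricing-api.yaml"}
        }
    }
}

api_key_credentials = [
    {
        "credentialProviderType": "API_KEY",
        "credentialProvider": {
            "apiKeyCredentialProvider": {
                "providerArn": "arn:aws:bedrock-agentcore:us-east-1:123456789012:token-vault/default/apikeycredentialprovider/pricing",
                "credentialLocation": "HEADER",
                "credentialParameterName": "x-api-key",
            }
        },
    }
]
# Pass these as targetConfiguration and credentialProviderConfigurations in
# the same create_gateway_target call shown earlier.
```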

Thumbnail 900

The third way you can provide APIs and tools is through a Smithy model or Smithy specification. This may not be a familiar term to you all.

Thumbnail 950

But this is how we define APIs at AWS. We use this specification called Smithy. The amazing thing about this integration is that there are 400+ AWS services that offer these Smithy specifications on GitHub. You can actually take those specifications, plug them into Gateway, and get an interface to any of the APIs that those services use. Suddenly your agents can do things like download files from S3 buckets, execute jobs on Transcribe, and do all sorts of other things. This is an incredibly powerful integration and one that a lot of teams in AWS are also very excited about.

Thumbnail 1010

MCP Server Integration and Semantic Search: Scaling Tool Discovery

We had those integrations and were excited about them, and customers told us that was all well and good, but they have MCP servers inside their organization today, plus MCP servers outside their organization, that they want to connect with. Ideally, they want your fancy gateway thing to compose all of them behind a single interface. In response to that problem, we launched MCP server as a target.

Thumbnail 1030

What MCP server as a target does is let you attach any MCP server, as many as you want, to a gateway. We will include all of their tools in the list of tools alongside the API spec ones and others. When a customer invokes a tool, we invoke the corresponding tool on the MCP server, and we even support search over those tools. When you are listing tools, we do this by caching the tool schema of all of the tools that your APIs and MCP servers expose.

Thumbnail 1040

Thumbnail 1080

Gateway first caches all of those tools, your agent can ask for the list, and we respond with the list without making additional downstream calls. The agent can then decide it wants to invoke a tool, and Gateway will facilitate that invocation while also doing the appropriate credential exchange. Gateway also offers search: it relies on this cached copy of tools to create an internal index which it can search in real time in response to an agent query. What this unlocks is that rather than giving an agent a list of a few hundred tools, you can give it just the ten most relevant tools.

Thumbnail 1120

There are two ways that Gateway actually does this synchronization of your tool set, and the fact that we cache tool sets at all is based on feedback from customers as well. Customers told us they do not want the tools to be dynamically changing under the hood. They want those tools to be static. They want to control which tools are exposed, when they are exposed, and what the interfaces are. They want to evaluate those interfaces with their agent and make sure they are getting the appropriate accuracy. So we cache those tools.

Thumbnail 1130

One way we do that is implicit synchronization: whenever you create or update a target, we cache the tool set. The other way is something called explicit synchronization, where you can explicitly invoke an API and we will look at all of the tools available on your MCP server and cache them so that they can be used in search and listing.
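In code, explicit synchronization might look like the following hypothetical boto3 call; the operation and parameter names here are my assumptions, so verify them against the current SDK.

```python
# Sketch: explicitly re-syncing the gateway's cached tool set after tools on
# an attached MCP server change. Operation and parameter names are assumed.
import boto3

control = boto3.client("bedrock-agentcore-control")
control.synchronize_gateway_targets(
    gatewayIdentifier="<gateway-id>",
    targetIdList=["<mcp-server-target-id>"],
)
```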

Thumbnail 1140

Now, to dive a little deeper on search: a lot of customers are building MCP servers with a lot of tools. At the beginning, a bunch of you raised your hands to say you had dozens of tools in your MCP servers, and that will likely increase as you continue to scale up your applications. You will have more and more tools, and what we found from customers is that when they had more and more tools, it was bloating their context window.

Thumbnail 1180

They had agents which suddenly had something like 300 tools and were spending hundreds of thousands of tokens of context just loading up the tools, which was not only slow and very expensive but also hurt accuracy. The agents were not able to orchestrate effectively over so many tools. So we introduced this thing called search. We may even have been first to market with this feature, where we allow you to call a search tool and get back the ten most relevant tools. This drastically reduces latency and cost and improves accuracy. It is also pretty remarkable that we are able to do this across so many targets.

Thumbnail 1220

We are managing all that heavy lifting of pulling in all the tools from all of the different targets so that you can search them together. Otherwise, if you hook up many servers to your agent, it will have to do search across them, which is complicated.
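Continuing the earlier hypothetical client sketch, calling the gateway's search might look like this. The built-in search tool name shown is my assumption of what the gateway advertises when semantic search is enabled; list the gateway's tools to confirm it.

```python
# Inside the same `async with ClientSession(...)` block as the earlier sketch.
# The search tool name below is assumed, not confirmed by the session.
search_result = await session.call_tool(
    "x_amz_bedrock_agentcore_search",
    {"query": "refund a customer order"},
)
# The result carries roughly the ten most relevant tools, which you can hand
# to the model instead of the full tool list.
print(search_result)
```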

Thumbnail 1260

Security and Connectivity: VPC Integration and Authorization Models

The other thing that customers told us is that if they are going to have a unified layer that integrates with all their agents, tools, and resources, then they absolutely need it to connect to their VPC. We quickly launched AWS PrivateLink integration so that you can communicate with the gateway over the secure AWS network backbone, ensuring that traffic never goes out over the public internet to reach the gateway.

Thumbnail 1280

As important as secure inbound connectivity is, secure outbound connectivity is just as important. I want to be able to access APIs that may live inside a VPC. With our Lambda integration, you can call any API in any VPC and have that call travel over the secure AWS network backbone, so the entire request lifecycle never goes over the public internet.

Thumbnail 1350

The reason I think Gateway is so amazing is that I really believe these protocols are going to become like TCP and UDP. In other words, developers are not going to be diving deep into protocols like MCP in the future. My stack is going to be increasingly cross-cloud and cross-framework, and I want an infrastructure product that can solve all of that integration work. In this picture, you can see just how powerful Gateway is. We have hooked up AWS services, a private MCP server running in a VPC, a private agent running in a VPC, another AgentCore Gateway that another team in our organization has built, public MCP servers which we may have discovered online, and APIs, some of which may be running on AgentCore Runtime and some on EC2, and we are able to expose all of that in a single interface.

Thumbnail 1410

Diving a little deeper into how we do this authorization, which is part of the magic, we offer two types of ingress authorization. One is OAuth, and the awesome thing about the OAuth integration is that it uses your proprietary identity provider to do authorization. When a caller tries to access a gateway, the gateway will say you need to go talk to authorization server X, which you own, and then the client can go talk to that authorization server, get an access token, and then make calls through the gateway, which the gateway will authorize against your identity provider. So you have this complete loop which can be using your proprietary identity provider instead of IAM.
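As an illustration of that loop, a minimal client-credentials token fetch against your own identity provider might look like this; the URL, client ID, secret, and scope are all placeholders.

```python
# Sketch: fetching an OAuth access token (client-credentials flow) from your
# identity provider, then using it as the bearer token for gateway calls.
import requests

TOKEN_URL = "https://your-idp.example.com/oauth2/token"  # placeholder

resp = requests.post(
    TOKEN_URL,
    data={
        "grant_type": "client_credentials",
        "client_id": "<client-id>",
        "client_secret": "<client-secret>",
        "scope": "<gateway-scope>",  # placeholder scope
    },
    timeout=10,
)
resp.raise_for_status()
access_token = resp.json()["access_token"]
# This token goes into the "Authorization: Bearer ..." header of every MCP
# call, and the gateway validates it against your identity provider.
```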

Thumbnail 1450

The other approach for those of you that are deep in the IAM ecosystem is to use IAM. Gateway supports IAM and we really did this in response to customers who were saying they love IAM today and use it for a bunch of their machine-to-machine communication. We said that is fine. We know MCP does not support IAM today, so we added IAM support to the gateway and we actually launched a secure proxy that you can run on the client side to make your MCP client IAM or SigV4 compatible, and you can discover that on GitHub as well.

Thumbnail 1480

The other thing that customers started telling us was that they want to put the gateway in front of their real end users, and those callers may not have OAuth credentials. They might want to make this thing public. We actually ended up launching a mode of gateway that does not require OAuth. In fact, many of you may have seen that AWS launched an MCP server for our documentation; it uses no-OAuth gateways to make that public documentation available. We will also see later how you can integrate this with a custom Lambda authorizer to run your own custom auth.

On the egress side, I mentioned that we do secure credential exchange. What Gateway is essentially doing is picking the correct token to call the downstream service by calling our AgentCore Identity service, which is responsible for securely storing your credentials, whether those are API keys, tokens, and so on, and also caching the actual access tokens so that when you make those downstream calls, you have the right credentials available.

Interceptors: Enabling Fine-Grained Access Control and Custom Logic

This eliminates the overhead of asking your users again and again for access to the same downstream API. Now, that was all well and good, but when we talked to most of our customers, they said they have fine-grained access control logic they want to run. They have multi-tenant scenarios that require custom logic. They want to prefill some parameters and not others. In response to this overwhelming demand, we launched last week something called Interceptors.

Interceptors are groundbreaking because they allow you or your central teams to run standard logic in between the agent invocation and the tool invocation. This allows you to do things like fine-grained access control, filtering of parameters, or schema translation. The feature is powered by Lambda today. A request comes in and we first pass it through this Lambda, which gets the MCP request and outputs another MCP request, which can be totally mutated. The Lambda can even say that this caller is not authorized to access this tool. If the request is authorized, we'll actually invoke the target, get back the result, and send the result through the Lambda as well. You can choose whether to do this or not, but on the result side, you can also do filtering or inject additional parameters. It's an incredibly powerful feature.

This is what it would look like if you just did it for fine-grained access control. I'm going to run that Lambda, look at the caller information which we provide, look at the tool name, and say: okay, this caller is in the finance group and they're trying to access a non-finance tool, denied. Or: this caller is allowed to access this tool, so I'll let it through. You can also do this in conjunction with ListTools, and you can decide independently whether to run the Lambda on ListTools or not. In this case, we're showing the Lambda not running on ListTools, so the list of tools is effectively public and the Lambda runs only on invocation, essentially just preventing you from incorrectly invoking a tool which you don't have access to.
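To make that concrete, here is a hypothetical interceptor Lambda performing exactly this kind of check. The event and response shapes below are assumptions for illustration only; consult the Interceptors documentation for the real contract.

```python
# Sketch: an interceptor Lambda doing fine-grained access control on tool
# invocation. Event and response shapes are assumed for illustration.
FINANCE_TOOLS = {"get_invoice", "apply_promotional_discount"}

def handler(event, context):
    caller_groups = set(event.get("callerContext", {}).get("groups", []))  # assumed field
    mcp_request = event.get("mcpRequest", {})                              # assumed field
    tool_name = mcp_request.get("params", {}).get("name", "")

    # Deny finance tools to callers outside the finance group.
    if tool_name in FINANCE_TOOLS and "finance" not in caller_groups:
        return {"authorized": False, "reason": "caller is not in the finance group"}

    # Optionally mutate the request here (filter or prefill parameters)
    # before letting the invocation through.
    return {"authorized": True, "mcpRequest": mcp_request}
```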

Alternatively, you could run this Lambda on ListTools as well. You could proactively filter out tools that the caller will not have access to, or proactively remove parameters which you don't want the agent to fill. It gives you newfound control over the interface you are providing to the agent, and you can make it totally custom. This also works for search, so you can filter the search results down to only the tools the caller has access to. Many customers are interested in using this for things like token propagation: they want to inject their own downstream token and take over some of that secure credential exchange. Your Lambda can add an authorization header with a new token so that you can do federated calls downstream.

This is a real game changer and is really helping customers take this thing live in production. The last thing I want to quickly mention is that we continuously hear from customers that setting up your tool set to expose it to an agent is actually pretty hard. Nobody wants to expose their APIs as is. Everybody wants to write some code on top of them, and then you have to edit your descriptions and figure out the right format and all this stuff. We actually built a custom agent which helps you do that. This custom agent, which you can find on GitHub, walks you through the process of producing a schema that is useful to an agent, that is valid, and that can actually be integrated with the gateway, starting from whatever internal schema you may be using. This really simplifies the onboarding process.

Sumo Logic's Journey: Building Dojo AI for Security Operations with AgentCore

I know that was a lot of information, and with that I want to turn it over to Kui to talk a little bit about how Sumo Logic has deployed this in production and how they're solving some of these problems for real. Thanks. Thank you, Nick. Great presentation to kick off the use case of AgentCore Gateway as well as dive into the nuts and bolts of how Agent Core Gateway works.

Thumbnail 1810

Thumbnail 1820

For the next 15 minutes or so, I'm going to take you through our journey to Agentic AI using Amazon Bedrock Agent Core.

Thumbnail 1860

Let me start by asking a few questions. How many of you have heard of or used Sumo Logic before? How many of you have solved reliability and observability issues with your workloads? How many of you have worked with security investigations, things like triage or investigation for vulnerabilities? That's a good ratio here. For all the agentic AI builders, this is our experience that I'd like to share with you.

Before I show the demos, I want to give you a brief highlight about Sumo Logic. We are a leading cloud-scale intelligent operations platform, and we were born in the AWS cloud 10 years ago. Right now we are running on 10 AWS regions worldwide. To give you a sense of our scale, we ingest multi-petabytes of new telemetry data every day and scan multi-exabytes of data every day. That gives you a sense of the scale Sumo Logic can handle. We have thousands of enterprise-grade customers, and the demo I'm going to show you is how our agent works.

Thumbnail 1930

Thumbnail 1940

I'm going to show you two parts of the experience. The first part demonstrates how several autonomous investigation agents work simultaneously. Secondly, I want to show you how to leverage your everyday apps beyond the Sumo Logic console, and how the agent works seamlessly with your day-to-day applications thanks to Agent Core Gateway. Let's show the demos. Can everyone hear the voice? Good.

Thumbnail 1970

Our security operations with Sumo Logic Dojo AI running on Amazon Bedrock AgentCore is displayed here. We have the Sumo Logic console, where a security analyst has alerts firing. Before performing any manual investigation, the SOC analyst agent has automatically triaged each insight with a verdict assigned, such as benign, suspicious, or malicious. Jumping into an insight, AI investigation details are immediately surfaced. This provides concise context to help understand the event, all thanks to the SOC analyst agent. Along with the AI verdicts, the agent recommends an adjusted severity level using evidence-backed reasoning from insight data, signals, and enrichments.

Thumbnail 2000

Thumbnail 2010

Thumbnail 2030

By synthesizing the related signals, a concise narrative-style summary is generated, allowing analysts to quickly understand what happened, who was involved, and why it matters. Specific detailed key findings are also presented, providing analysts an immediate sense of threat impact. When deeper analysis is required, the SOC analyst agent supports hypothesis-based investigation using your log data and the help of Mobot launching directly from the insight. We can begin to use natural language to investigate further. Notice that the context of our incident is already inherited.

Thumbnail 2050

Thumbnail 2060

To dive a little deeper, we will ask Mobot to run a quick investigation. Leveraging our security data, the agent will begin to automatically execute its own analysis by running several search queries and summarizing its findings. With the help of the SOC analyst agent running on AgentCore, we can bring agentic AI reasoning directly into your security operations workflows and enable analysts to effectively capture the information they need. Here we have another cloud sign-in insight that has notified us in Slack.

Thumbnail 2070

Thumbnail 2080

Thumbnail 2090

Before requesting more details about this event, we will take a look at what Sumo Logic tools are available from our MCP server by prompting it directly from our Slack thread. In the background, our Slack application is leveraging an orchestrator agent hosted on AgentCore Runtime, which then communicates with AgentCore Gateway via the standard MCP protocol. The AgentCore Gateway is fully managed and securely integrated with Sumo Logic platform APIs and Dojo AI agents. Several actions are ready to use, like capturing more details related to insights and running queries.

Thumbnail 2120

Thumbnail 2130

Thumbnail 2140

Thumbnail 2150

Thumbnail 2160

Thumbnail 2170

Thumbnail 2180

Since multi-tool calls are supported, we will ask the Mobot agent to provide any triage details that the SOC agent generated and update the status of this insight to in progress. We quickly see our triage details summarized, and the response confirms that our change has been reflected. To capture more details, we will request a list of related entities tied to this event. Here we can see any user account, IP, device information, and other entities associated with this insight. We can even run a search directly against our raw logs in Sumo Logic to see if other users may have experienced similar activity in the last three days. Upon retrieving the results from our log search, we see that other users have in fact been impacted. To wrap this up, we will add a comment to the insight saying that other users appear to be affected. Hopefully this gave you a clearer picture of how you can leverage Dojo AI either directly in the Sumo Logic console or from within your favorite apps and tools, all thanks to the help of Amazon Bedrock AgentCore.

Thumbnail 2190

Sumo Logic Platform Architecture: Multi-Agent System Design and Integration Patterns

That was a pretty interesting demo, wasn't it? For the next couple of slides, I'm going to show you how we build this. Let me start with the Sumo Logic platform. From the left-hand side, you can see we ingest our customers' data and telemetry data regardless of where they sit, whether in Amazon Cloud, Google Cloud, or whatever cloud you're using. We seamlessly ingest this at a super high scale into our platform, and then we have a single pane of glass to operate, monitor, and analyze your data for your applications.

Thumbnail 2250

Thumbnail 2260

In the middle, as you can see, we have a rich set of analytics toolsets. We automatically convert the raw telemetry data into meaningful insights and recommendations so the team can make decisions. On the right-hand side, we have hundreds of out-of-the-box integrations for actions such as reporting, ticketing, and remediation or containment. Last but not least, there is the top tier which I highlighted: the Dojo AI. It's the multi-AI agent system that we're building and launching this year. So what is Sumo Logic Dojo AI?

Dojo is a Japanese word that evokes a collection of intelligent personas living and working together and honing their skills as they go. I think that's a perfect fit for our agentic AI vision. Dojo AI in a nutshell is a multi-AI-agent system for proactively doing security and incident analysis and response. There are four important principles in how we build Dojo AI. First, we treat the Dojo AI agent as a digital teammate for existing security operations teams. They can work seamlessly with humans and other agents using natural, multi-turn conversational language interfaces.

Thumbnail 2360

This is very important, and it is why we decided to use agentic AI versus just vanilla generative AI. We want each of our agents to be fully context-aware so they can seamlessly share and update context between agents. The agents are also smart enough to automatically decide which tool to use and discover tools as they go. Domain knowledge is very important in our use cases, so the agents can dynamically access it and stay grounded in it. Last but not least, all our agents are secure by design. Let me show you the architecture of how this whole thing looks.

Thumbnail 2380

Let me start with the Sumo Logic platform tier. There are three pillars of resources exposed as MCP-based resources. The first one is what we call the platform APIs. We have hundreds of open, public APIs that expose the functionality of the core data platform. In the middle is the new family of Sumo Logic agents we call the Dojo AI foundries. As you saw at this re:Invent, we announced three agents. They include our query agents, which can seamlessly translate natural language into highly scalable DSQL, which we call the Sumo Logic query language. This is how we are able to take natural language questions from Slack or from our console and compile them directly into queries.

Results can be brought back and summarized in natural language as well. The summarization agent is another one. Then the interesting part is the more autonomous SOC analyst agent, which you see in our triage agent as well as our investigation agent. You can fire off a pretty high-level question, say, run a quick investigation for me, or run a deep investigation, and the agent behind it, leveraging reasoning models, is able to create an investigation plan and then generate actionable steps. Each step can be automatically run by our platform and bring the result back. The human is also in the loop throughout, which makes these things very interesting.

Thumbnail 2460

Thumbnail 2470

Thumbnail 2480

The third pillar is our MCP servers. Why do we need MCP servers? We see more and more new context emerging during investigations, and we want to be able to expose it as a resource to the other tiers. That is pretty much our Sumo Logic platform tier. Now let's look at our client tier, which is quite interesting as well. Among our default personas, we have developers, or the so-called SREs. They are looking at code, and they want to be able to, without leaving their own IDEs, whether that is VS Code, Cursor, or Claude Code, directly leverage our platform APIs or use our AI agents remotely through the MCP protocol, which is pretty interesting.

Thumbnail 2530

Thumbnail 2560

The other set of personas is our so-called IT analysts and security analysts, and they want to stay within their favorite tool, Slack for example. They want to be able to do the same thing: how can I reach the powerful Sumo Logic platform to build integrations and use cases? Each of those customer types is also building their own custom AI agent. Last but not least, look at the partner tier. More and more partners are also building their own MCP servers that are publicly addressable and securely managed, so we can enable end-to-end integration using them. But we're missing something here. How do we make this super complex, multi-tier, cross-functional thing work together? How do we enable agent-to-agent collaboration using MCP?

Thumbnail 2570

The answer is AgentCore Gateway. There are at least two integration patterns enabled by this architecture. The first one we saw comes from the client side: the client is able to reach out to AgentCore Gateway using the standard MCP protocol to leverage either our APIs or our Dojo AI agents, or to reach our MCP servers. The other direction is also quite interesting: our agents, using the same patterns, are able to reach out to the rich ecosystem of the partner tier. For example, Amazon has launched hundreds of MCP servers with very rich tool sets, and it is similar for our threat intel partners and our action automation partners. So our Dojo AI agents can now leverage that rich ecosystem to build very rich integrations.

Thumbnail 2630

Three Integration Patterns and Business Impact: APIs, Agents, and MCP Servers as Tools

Next, I'm going to take you down one more level. I want to show you three patterns for how to make these things happen. First, let's start with the client tier. As I said, we have a Slack app, and Slack is not AI-native the way Cursor or VS Code are, so we built a client-side AI agent, just to emulate how our customers are building their own custom AI agents. This AI agent is hosted on AgentCore Runtime, which is a great way to simplify a lot of things.

Thumbnail 2660

Thumbnail 2690

Why do we need this agent with tools? We use LangGraph to build this orchestration agent, and there are three important capabilities: planner nodes, execution nodes, and summarization nodes. Now let's jump in. My first pattern is what we call APIs as tools. Typically there are two types of APIs. One is APIs exposed through the OpenAPI specification, and AgentCore Gateway happens to natively support these using the OpenAPI target. Then you have another type, perhaps an older or legacy API.

Thumbnail 2710

For legacy API endpoints, such as those using GraphQL, we can use a Lambda translator in between to translate the protocols and then wire it up with AgentCore Gateway using the Lambda target. That's the first pattern: expose APIs as tools. A hypothetical sketch of such a translator follows.
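Here is that sketch: a Lambda target that receives an MCP tool invocation from the gateway and turns it into a GraphQL call against a legacy endpoint. The endpoint, query, and field names are placeholders, and the event shape assumes the gateway passes the tool's arguments as the payload.

```python
# Sketch: a Lambda "translator" target that converts a tool invocation into a
# GraphQL request. Endpoint, query, and fields are placeholders.
import json
import urllib.request

GRAPHQL_URL = "https://legacy.example.com/graphql"  # placeholder

def handler(event, context):
    # Assumption: the gateway passes the tool's input arguments as the event.
    insight_id = event["insightId"]
    query = """
      query GetInsight($id: ID!) {
        insight(id: $id) { id severity status summary }
      }
    """
    body = json.dumps({"query": query, "variables": {"id": insight_id}}).encode()
    req = urllib.request.Request(
        GRAPHQL_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as response:
        data = json.loads(response.read())
    # Return plain JSON; the gateway surfaces it as the MCP tool result.
    return data["data"]["insight"]
```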

Thumbnail 2730

The second pattern I want to mention is exposing AI agents as tools. This is where you have a number of AI-native agents. Using a simple Lambda target, you can expose their functionality through AgentCore Gateway to the client-tier agent, and the client-tier agent will be able to discover this AI-native functionality through this pattern, as in the sketch below.
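A sketch of this pattern might look like the following: a Lambda target that forwards the tool call to an agent hosted on AgentCore Runtime. The runtime ARN and payload contract are placeholders, and the response handling assumes the AgentCore data-plane `invoke_agent_runtime` call returns a streaming body.

```python
# Sketch: a Lambda target exposing a hosted agent as a tool by forwarding the
# question to AgentCore Runtime. ARN and payload contract are placeholders.
import json

import boto3

agentcore = boto3.client("bedrock-agentcore")

AGENT_RUNTIME_ARN = "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/triage-agent"

def handler(event, context):
    prompt = event.get("question", "")
    resp = agentcore.invoke_agent_runtime(
        agentRuntimeArn=AGENT_RUNTIME_ARN,
        payload=json.dumps({"prompt": prompt}).encode(),
    )
    # Assumption: the response body is a stream; read it and return the text
    # as the tool result.
    return resp["response"].read().decode()
```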

Thumbnail 2760

Thumbnail 2770

The third pattern I want to highlight is interesting because we see growing demand for MCP servers, either third party or first party. In that case, AgentCore Gateway has done a fantastic job of doing this heavy lifting for us so we don't have to reinvent the wheel. It has native MCP target support, and even better, it has fine-grained access control. So in case you want to apply more custom access control policies, fine-grained access control is your friend. Those are the three powerful patterns I wanted to share with this audience.

Thumbnail 2800

Thumbnail 2810

Thumbnail 2830

Thumbnail 2850

To wrap up, I want to highlight that we have already adopted several important AgentCore Gateway features and the patterns on the left. We are in the process of working with the wonderful service teams from AWS Bedrock AgentCore to evaluate and test the fine-grained access control. We are also working closely on end-to-end OAuth integrations. For next year, we are exploring more advanced features, such as how to leverage AgentCore Memory, because it is very important to share memory and context across the agents in our Dojo agent family. We are also exploring how to enable more advanced collaboration using the A2A protocol. There are a lot of exciting things coming up.

To wrap up my talk: we found AgentCore is a great fit for us as agentic application builders. We can now focus on our core value proposition: building the agentic stack of intelligence for our customers. To summarize the business impact of shipping Dojo AI via AgentCore, we see about 50 percent faster analysis time and up to 75 percent reduction in MTTR (mean time to resolution), and from that we see millions of dollars in savings from incidents. With that, if you want to see more demos, we have a live demo booth in the expo. Feel free to explore. Our demo team will show you live demos, and you can try asking all kinds of tricky questions; our agent may surprise you.

Thumbnail 2930

Best Practices and Getting Started: Security, Performance, and Tool Management at Scale

Now let's go through the key takeaways and some of the best practices I want to share as you scale your MCP tools and agents. Always start with your user queries and your agent goals. Decompose your agentic goals into agentic tasks, then work backwards to identify the right set of enterprise APIs and data stores, and start MCPifying them. Do not convert all of your enterprise APIs into MCP tools. Agents are autonomous and nondeterministic, and you want to make sure they invoke the right tools based on your user queries and agentic goals.

When you create a new MCP tool, be deliberate about the kind of tool you create; perhaps consolidate multiple APIs into a targeted MCP tool that is aligned to your agentic tasks. Now, there are lots of questions about how to improve the performance of MCP tools and overall agents. The number one thing to take into consideration is how effective your MCP tools are in driving accuracy and reducing hallucinations.

While delivering business value and achieving agent goals, it is extremely important to have a continuous feedback loop. You need to understand how well your agents are invoking your tools, how many times they invoke your tools, and what impact these tools have on accuracy. This feedback loop must be closed, and the way to do it is to stream the metrics of the MCP tools into AgentCore Observability. You need end-to-end observability with traceability and a full set of metrics to understand how well your agents are invoking these tools.

Now, talking about performance: the performance of AI agents depends on various factors, but one of the major ones is the LLM context. One of the major complaints from our customers is that their agents are slow and not responding on time. The first question I ask is how many MCP tools the agent has, and how much context each tool brings in and accumulates in the overall LLM context. What do your MCP tool descriptions look like? When I see pages and pages of MCP tool descriptions, I recommend being very cautious about how you describe your MCP tools. You should prompt engineer these tools because tool descriptions are extremely important in driving accuracy. However, they can also introduce different types of attack vectors like command injection, tool poisoning, command and control, or different types of prompt injection attacks. Make sure that you are very deliberate about what you put in a tool's description.

By reducing the LLM context size and making sure that you have effective, right-sized descriptions for all of your tools, you can improve performance. There are many other factors contributing to overall agentic performance, but I am focusing on tool-specific recommendations for now. There are other things you should also consider while scaling your MCP tools. Security is job zero, and when it comes to agents it is extremely important because they are autonomous and nondeterministic. There are different ways of handling security for agents, including the delegation model and the impersonation model.

The impersonation model is a well-known model where you trust an upstream service and then call downstream services by sharing credentials, as in a typical microservices world. However, that will not work for agentic AI because these agents are autonomous and nondeterministic. You need fine-grained access control with a delegation model where your agents act on behalf of the end user's persona. They assume the role dynamically from the end user and act on the user's behalf without passing the end user's credentials or JWT tokens to the downstream systems. Make sure you have this act-on-behalf-of delegation model in place.

In terms of fine-grained access control, use gateway interceptors to control dynamically who is able to call which tool. Customers tell me that they have a growing number of agents and tools and ask how to scale. The first thing I ask is whether they have a single source of truth for their enterprise-approved tools and agents in one place. This is where the agent registry and tools registry come into play. Whenever a developer pushes a new tool or a new agent, have your development and deployment pipeline publish it to the tools or agent registry, which performs tool checking statically, or dynamically when the tools are executed.

When I say static tool checking, I mean checking for security vulnerabilities, the confused deputy problem, command injection, and tool poisoning. Make sure you use the semantic search capability that Nick introduced, which is a native feature within AgentCore Gateway. When your number of tools grows, your ListTools MCP operations need to be optimized.

Thumbnail 3330

You're going to have large payloads being sent to your MCP clients. Use semantic search to take care of that and improve the performance of your MCP tools. There are lots of other best practices that I want to quickly highlight: the importance of having a single source of truth in your tools and agents registry, and bringing multi-tenancy into play. Use gateway interceptors to isolate tenants if you're sharing an AgentCore Gateway across multiple tenants.

Thumbnail 3350

Here's the getting started guide. You can use the AWS Console, Boto3, and the AWS CLI. I really recommend you try the AgentCore starter toolkit. It's the easiest way to get started with AgentCore Gateway and to integrate with the other AgentCore primitives as well.
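If you go the Boto3 route, creating a gateway before attaching targets might look roughly like this; the role ARN, discovery URL, and client ID are placeholders for your own identity provider setup, and the request shape is an assumption to verify against the current SDK.

```python
# Sketch: creating a gateway with a custom JWT (OAuth) authorizer, then
# printing its MCP endpoint. All identifiers are placeholders.
import boto3

control = boto3.client("bedrock-agentcore-control")

gateway = control.create_gateway(
    name="customer-support-gateway",
    roleArn="arn:aws:iam::123456789012:role/AgentCoreGatewayRole",
    protocolType="MCP",
    authorizerType="CUSTOM_JWT",
    authorizerConfiguration={
        "customJWTAuthorizer": {
            "discoveryUrl": "https://your-idp.example.com/.well-known/openid-configuration",
            "allowedClients": ["<client-id>"],
        }
    },
)
print(gateway["gatewayUrl"])
```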

Thumbnail 3370

There are fifty-plus resources that you can go and check out. It's not just tutorials; there are also end-to-end use cases. You'll find lots of use cases in this repository. And then, obviously, do check out the schema repair agent Nick introduced. It'll help you MCPify your existing OpenAPI APIs and make sure that your schemas are valid.

Thumbnail 3410

Thank you very much. Do not forget to visit the Sumo Logic booth and I really appreciate the time. If you have any questions, we're going to be available outside of this hall. I'm happy to talk and happy to answer your questions. Thank you very much.


This article is entirely auto-generated using Amazon Bedrock.
