Overview
In this post, I share how to build an AI agent that leverages logging tools such as Splunk to investigate transactions, identify errors, and analyze application logs. Splunk is a critical component in enterprise environments, enabling organizations to gain real-time visibility into their systems and quickly diagnose issues at scale. This agent is developed using the Strands Agents SDK and integrates the MCP (Model Context Protocol) to connect to Splunk, allowing it to query logs programmatically. The agent is deployed using AgentCore Runtime, a serverless environment purpose-built for hosting AI agents. We will also use AgentCore Observability to trace the agent's execution flow and verify that Splunk queries are executed correctly.
Installing the Splunk MCP Server
Model Context Protocol (MCP) has transformed the agentic AI landscape by enabling agents to connect to tools seamlessly and securely. Although you can build your own MCP server implementation, many vendors now provide implementations that adhere to the protocol specification. Splunk is one such provider, offering an MCP server with tools for generating SPL queries, retrieving index information, and even converting natural language questions into SPL queries. Fortunately, we don't need to host the MCP server ourselves, as Splunk offers several ways to install it on their platform. Splunk Cloud ships the MCP server automatically in AWS regions, while Splunk Enterprise requires manual installation. This section covers the latter process.
First, download the Splunk MCP server from Splunkbase. It is packaged as a compressed TAR archive (.tgz); note where you save it. Next, log in to the Splunk Enterprise platform and navigate to Apps > Manage Apps.
Then click the Install App From File button and select the Splunk MCP Server .tgz file.
After successful installation, the Splunk MCP Server will appear in the Apps page. The app page displays the MCP server status, endpoint URL, and configuration details.
With the MCP server running, it's ready to be integrated with an AI agent. To connect and list tools, generate an authentication token with a role that has the mcp_tool_execute capability. Note that in production environments, ensure tokens are scoped minimally and stored securely.
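Token creation can also be scripted against Splunk's REST API. The sketch below only builds the request rather than sending it; the endpoint path follows Splunk's token-authentication REST reference, but verify it against your Splunk version, and the host, token name, and audience values are placeholders.

```python
from urllib.parse import urlencode


def build_token_request(splunk_host: str, token_name: str, audience: str) -> tuple[str, str]:
    """Build the (url, form_body) pair for Splunk's token-creation endpoint.

    The request would be POSTed to the management port (8089 by default)
    using admin credentials; the response JSON contains the bearer token.
    """
    url = f"https://{splunk_host}:8089/services/authorization/tokens?output_mode=json"
    body = urlencode({"name": token_name, "audience": audience})
    return url, body


url, body = build_token_request("splunk.example.com", "mcp-agent", "mcp")
```

The returned URL and form body can then be sent with any HTTP client authenticated as a user whose role carries the mcp_tool_execute capability.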
Testing the Splunk MCP Server via the Inspector
To verify connectivity before integration, we test the server using the MCP Inspector.
We can copy the configuration shown on the Splunk MCP server app page, adding the token we generated as the Authorization header. Once the configuration is correct, we can list the tools and run one to confirm that the server is running and the connection is stable.
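For reference, a typical MCP client configuration for this setup looks like the following; the endpoint URL below is a placeholder (copy the real one from the Splunk MCP server app page), and field names vary slightly between MCP clients:

```json
{
  "mcpServers": {
    "splunk": {
      "type": "streamable-http",
      "url": "https://your-splunk-host:8089/services/mcp",
      "headers": {
        "Authorization": "Bearer <your-token>"
      }
    }
  }
}
```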
Developing the AI Agent
Now for the fun part: building the agent itself. I used the Strands Agents SDK as the agentic orchestration framework because it is easy to learn yet capable of powering complex agents. At its simplest, all you need is a model, a set of tools, and a prompt to create an AI agent. It also supports multi-agent patterns, MCP tool execution, and memory management, and, as an AWS-native framework, it integrates seamlessly with Amazon Bedrock.
Creating the MCP Client
The first step is to create an MCP client class that bridges and manages the connection to the MCP server.
from __future__ import annotations

from contextlib import AbstractAsyncContextManager, asynccontextmanager
from typing import AsyncIterator

import httpx
from anyio.streams.memory import MemoryObjectReceiveStream, MemoryObjectSendStream
from mcp.client.streamable_http import GetSessionIdCallback, streamable_http_client
from mcp.shared.message import SessionMessage
from strands.tools.mcp.mcp_client import MCPClient

StreamTuple = tuple[
    MemoryObjectReceiveStream[SessionMessage | Exception],
    MemoryObjectSendStream[SessionMessage],
    GetSessionIdCallback,
]


class SplunkMCPClient:
    def __init__(self, base_url: str, token: str, *, timeout: float):
        if not base_url:
            raise ValueError("SPLUNK_MCP_URL is required")
        if not token:
            raise ValueError("SPLUNK_TOKEN is required")
        self._base_url = base_url
        self._token = token
        self._timeout = timeout

    @asynccontextmanager
    async def open_transport(self) -> AsyncIterator[StreamTuple]:
        async with httpx.AsyncClient(
            headers={"Authorization": f"Bearer {self._token}"},
            timeout=self._timeout,
        ) as client:
            async with streamable_http_client(self._base_url, http_client=client) as streams:
                yield streams

    @asynccontextmanager
    async def open_session(self) -> AsyncIterator[MCPClient]:
        def factory() -> AbstractAsyncContextManager[StreamTuple]:
            return self.open_transport()

        client = MCPClient(factory)
        with client:
            yield client

    def list_tools(self, client: MCPClient):
        tools = []
        pagination_token = None
        while True:
            page = client.list_tools_sync(pagination_token=pagination_token)
            tools.extend(page)
            pagination_token = page.pagination_token
            if pagination_token is None:
                break
        return tools
Its constructor takes a base URL, a bearer token, and a timeout, the configuration values needed to reach the Splunk MCP server.
The open_transport and open_session methods work together to establish and manage a connection using the Streamable HTTP transport method.
The former creates the raw HTTP client with the authentication header, while the latter wraps it in an MCPClient instance that the Strands SDK can use; both handle cleanup automatically via context management.
The list_tools method accepts an MCPClient instance and queries the server for available tools, handling pagination automatically to return a complete list of tools from the Splunk MCP server.
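That pagination loop is easy to sanity-check against a stub that mimics list_tools_sync. The classes below are stand-ins invented for the demonstration, not the real Strands MCPClient:

```python
# A stub page type: a list of tool names carrying a pagination_token,
# mimicking the paginated result returned by list_tools_sync.
class FakePage(list):
    def __init__(self, items, pagination_token):
        super().__init__(items)
        self.pagination_token = pagination_token


# A stub client serving two pages, standing in for Strands' MCPClient.
class FakeMCPClient:
    def __init__(self):
        self._pages = {
            None: FakePage(["tool_a", "tool_b"], "page2"),
            "page2": FakePage(["tool_c"], None),
        }

    def list_tools_sync(self, pagination_token=None):
        return self._pages[pagination_token]


def list_tools(client):
    # The same loop as SplunkMCPClient.list_tools above.
    tools, pagination_token = [], None
    while True:
        page = client.list_tools_sync(pagination_token=pagination_token)
        tools.extend(page)
        pagination_token = page.pagination_token
        if pagination_token is None:
            break
    return tools


print(list_tools(FakeMCPClient()))  # ['tool_a', 'tool_b', 'tool_c']
```

The loop terminates as soon as a page comes back with no pagination token, so a server with a single page of tools is handled by the same code path.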
Creating the SplunkAgent class
With the MCP client in place, we can move on to the SplunkAgent class, which orchestrates the interaction between a Strands agent and the Splunk MCP server.
import logging

from strands import Agent
from strands.models import BedrockModel

from src.splunk_mcp_client import SplunkMCPClient


class SplunkAgent:
    def __init__(
        self,
        model: BedrockModel,
        mcp_client: SplunkMCPClient,
        system_prompt: str,
    ):
        self.logger = logging.getLogger(__name__)
        self.model = model
        self.mcp_client = mcp_client
        self.system_prompt = system_prompt

    async def invoke(self, prompt: str) -> str:
        if not prompt:
            return "Missing prompt"
        try:
            async with self.mcp_client.open_session() as mcp_client:
                tools = self.mcp_client.list_tools(mcp_client)
                agent = Agent(
                    model=self.model,
                    tools=tools,
                    system_prompt=self.system_prompt,
                )
                response = agent(prompt)
                return response.message["content"][0]["text"]
        except Exception as exc:
            self.logger.exception("Agent invocation failed")
            return f"Error: {exc}"
The class above accepts three injected dependencies: a BedrockModel from strands.models, the SplunkMCPClient described in the previous step, and a string containing the system prompt for the Splunk agent.
The invoke method orchestrates the workflow: it opens a session with Splunk, retrieves available tools, creates a Strands agent, and invokes it with the user prompt. As you can see, creating the agent itself is straightforward: only the list of tools, the model, and the prompt are required, though the agent can be customized further with additional arguments.
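Note that response.message["content"][0]["text"] assumes the first content block is text. A slightly more defensive extraction, written against the same message shape, might look like this (the sample message contents are invented):

```python
def extract_text(message: dict) -> str:
    """Return the first text block from a message in the content-list
    shape used above; empty string if there is none."""
    for block in message.get("content", []):
        if "text" in block:
            return block["text"]
    return ""


# A message shaped like the one the agent returns above:
msg = {"role": "assistant", "content": [{"text": "No failed events found for TXN-42."}]}
print(extract_text(msg))  # No failed events found for TXN-42.
```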
Wiring it all together with Bedrock AgentCore
The app.py file is the entry point where all components are instantiated and wired together. Environment variables are loaded first (the values themselves are not shown here for obvious reasons, but the load_environment() call ensures they are available before any component accesses them), then a BedrockModel, SplunkMCPClient, and SplunkAgent are created with their dependencies injected.
import logging

from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands.models import BedrockModel

from src.config_loader import load_environment

load_environment()

from src.settings import (
    BEDROCK_MODEL_ID,
    MCP_HTTP_TIMEOUT,
    SPLUNK_MCP_URL,
    SPLUNK_TOKEN,
    SPLUNK_WEB_URL,
)
from src.splunk_mcp_client import SplunkMCPClient
from src.splunk_agent import SplunkAgent

logging.basicConfig(level=logging.INFO)

model = BedrockModel(model_id=BEDROCK_MODEL_ID)

mcp_client = SplunkMCPClient(
    base_url=SPLUNK_MCP_URL,
    token=SPLUNK_TOKEN,
    timeout=MCP_HTTP_TIMEOUT,
)

agent = SplunkAgent(
    model=model,
    mcp_client=mcp_client,
    system_prompt=f"""
You are a splunk investigator agent that has access to splunk logs regarding a payment microservice written in java.
You also have tools at your disposal to run SPL queries through Splunk. Given a transaction reference number, search through the logs related to
the user query. Always use the payments index. (i.e. index="payments").
Create a brief investigation/analysis report.
Do not include sensitive data (such as card data or technical links) on your investigation report.
At the end of your response, recreate the Splunk URL. The Splunk base url is {SPLUNK_WEB_URL}.
Do not offer specific recommendations or next steps unless asked to. Your tone should be neutral.
""".strip(),
)

app = BedrockAgentCoreApp()


@app.entrypoint
async def invoke(payload, context=None):
    prompt = payload.get("prompt", "")
    return await agent.invoke(prompt)


if __name__ == "__main__":
    app.run()
The app.entrypoint decorator is essential: it registers the invoke function as the request handler for Bedrock AgentCore. When the runtime receives a request, it calls this handler, which extracts the prompt and delegates to SplunkAgent's invoke method.
The BedrockAgentCoreApp() instance integrates the agent code with the AgentCore Runtime. It registers the entrypoint handler and enables the containerized application to process user inputs and maintain context within AWS's managed runtime environment.
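The handler's control flow can also be exercised locally with a stand-in agent, without AgentCore or Strands installed. StubAgent below is invented for the demonstration and mirrors only SplunkAgent's empty-prompt guard:

```python
import asyncio


class StubAgent:
    """Stand-in for SplunkAgent, mirroring its empty-prompt guard."""

    async def invoke(self, prompt: str) -> str:
        if not prompt:
            return "Missing prompt"
        return f"Report for {prompt}"


async def handle(payload: dict, agent) -> str:
    # Mirrors the @app.entrypoint handler: extract the prompt, delegate.
    prompt = payload.get("prompt", "")
    return await agent.invoke(prompt)


print(asyncio.run(handle({"prompt": "TXN-123"}, StubAgent())))  # Report for TXN-123
print(asyncio.run(handle({}, StubAgent())))                     # Missing prompt
```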
And that's it! We now have a working AI agent that can query and analyze logs from Splunk.
Deploying to Bedrock AgentCore Runtime
Now that we have our code, and after locally testing it, we can deploy to Bedrock AgentCore Runtime. AgentCore Runtime is a serverless and secure way of hosting AI agents in AWS. It provides a managed infrastructure that handles containerization, scaling, context and session management, and integration with AWS services like CloudWatch for monitoring. Agents deployed to AgentCore Runtime are immediately available for invocation and can process requests at scale without requiring manual infrastructure management.
Agents can be deployed using various methods, including the AWS CDK, IaC tools like CloudFormation and Terraform, or programmatically with the Bedrock AgentCore starter toolkit. For simplicity and demonstration, this guide uses the AgentCore CLI, which wraps the toolkit's functionality into convenient CLI commands.
The first step is to run the agentcore configure command to set up the agent's deployment configuration. This captures essential settings, including the entrypoint file, agent name, requirements file, IAM execution role, ECR repository details, deployment type (container or direct code deploy), memory, and other options. It generates a .bedrock_agentcore.yaml file that serves as the blueprint for all subsequent deployments, eliminating the need to specify these parameters repeatedly during the launch phase.
The next step is to deploy the agent via the agentcore launch command. This command reads the configuration from .bedrock_agentcore.yaml and initiates the deployment pipeline, using CodeBuild by default to build the container image in the cloud without requiring local Docker. Once the build and provisioning complete, the agent is deployed to the AgentCore Runtime and is ready to receive requests. The terminal then shows a Deployment Success message with details such as the agent name, its ARN, and the GenAI Observability Dashboard, which can be used for tracing and monitoring.
To test the deployed AI agent, we can use the agentcore invoke command. This command sends a JSON payload containing the user's prompt to the deployed agent on AgentCore Runtime.
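The payload itself is a small JSON document; a prompt for this agent might look like the following (the transaction reference is invented for illustration):

```json
{
  "prompt": "Investigate transaction reference TXN-2024-0001 and report any errors."
}
```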
By invoking the agent, we can see that it successfully produces a response after retrieving and querying Splunk via the MCP server. The response is based on logs from a demo microservice simulating payment transaction logs.
We can confirm this by going to the AWS Management Console and opening the AgentCore Runtime page, which lists the deployed agents; the agent we just deployed appears in Ready status.
Tracing Agent calls with AgentCore Observability
To further confirm that the agent called the Splunk MCP Server and ran an SPL query, we can examine AgentCore Observability. It provides detailed visualizations of each step in the agent workflow, so that we can inspect the execution path and intermediate outputs. The CloudWatch console provides trace visualizations, performance graphs, and error breakdowns, allowing us to verify that the MCP server was invoked, trace the exact SPL queries executed and verify the agent response.
This can be viewed under CloudWatch's GenAI Observability: Bedrock AgentCore Observability dashboard.
From the trace, we can see that the agent indeed called the run_splunk_query tool from the Splunk MCP Server. Alongside this trace, we can see other details such as metrics, the model used, and the agent's steps and timeline.
Conclusion
This post demonstrated how MCP, the Strands Agents SDK, and Amazon Bedrock AgentCore can be combined to build an investigative AI agent that queries and analyzes application logs in Splunk. We showed how to connect to the Splunk MCP Server, deploy the agent using AgentCore Runtime, and trace tool execution and agent behavior using AgentCore Observability in CloudWatch.