Danilo Poccia for AWS

Building Production-Ready AI Agents with LangGraph and Amazon Bedrock AgentCore

In this fifth and final deep dive of our multi-framework series, I'll show you how to build a production-ready AI agent using LangGraph and deploy it using Amazon Bedrock AgentCore. The complete code for this implementation, along with examples for other frameworks, is available on GitHub at agentcore-multi-framework-examples.

LangGraph takes a different approach to agent workflows. Rather than linear chains of prompts or simple tool loops, LangGraph models agent behavior as a state graph where nodes perform actions and edges define transitions. This graph-based approach enables control flows with cycles, conditional branching, and human-in-the-loop interactions—capabilities that work well with the AgentCore persistent memory system.

LangGraph makes state management explicit—every node in the graph receives the current state, transforms it, and passes it along. Having explicit state makes debugging easier and helps integrate with AgentCore Memory for state persistence across sessions.

Setting Up the Development Environment

I'll begin by navigating to the LangGraph project in our repository:

cd agentcore-multi-framework-examples/agentcore-lang-graph
uv sync
source .venv/bin/activate

The project builds on the LangChain ecosystem with these dependencies:

langgraph                # Graph-based agent framework
langchain                # Core LangChain library
langchain-aws            # AWS integrations including Bedrock
langchain-tavily         # Tavily search integration
bedrock-agentcore        # AgentCore SDK
bedrock-agentcore-starter-toolkit  # Deployment tools

Understanding the LangGraph State Machine Architecture

LangGraph introduces a fundamentally different way of building agents through its StateGraph abstraction. Instead of imperatively calling tools or chaining prompts, I define a graph where each node represents a step in the agent's reasoning process.

Defining Agent State

The foundation of any LangGraph agent is its state definition. I use a TypedDict to define what information flows through the graph:

from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

The add_messages annotation is special—it tells LangGraph to append new messages to the list rather than replacing it. This creates a growing conversation history as the graph executes, maintaining context throughout the workflow.
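To see what this reducer does in isolation, here is a minimal sketch (the message contents and ids are made up for illustration): add_messages can be called directly as a function that merges an update into the existing list.

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

# Existing state and an update produced by a node
existing = [HumanMessage(content="What is LangGraph?", id="1")]
update = [AIMessage(content="A graph-based agent framework.", id="2")]

# The reducer appends new messages instead of overwriting the list;
# a message that shares an id with an existing one would be updated in place.
merged = add_messages(existing, update)
print([m.content for m in merged])  # both messages are preserved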

Building the Graph

The graph construction follows a declarative pattern that clearly shows the agent's decision flow:

from langchain.chat_models import init_chat_model
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode, tools_condition

# Initialize the graph with our State type
graph_builder = StateGraph(State)

# Configure the LLM with tools (the tools list is defined in the Tavily section below)
llm = init_chat_model(
    "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    model_provider="bedrock_converse",
)
llm_with_tools = llm.bind_tools(tools)

# Define the chatbot node: call the model with the full message history
def chatbot(state: State):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

# Add nodes to the graph
graph_builder.add_node("chatbot", chatbot)
tool_node = ToolNode(tools=tools)
graph_builder.add_node("tools", tool_node)

The init_chat_model function provides a unified interface for initializing chat models across providers. By specifying model_provider="bedrock_converse", I'm using Amazon Bedrock's Converse API, which provides consistent behavior across different foundation models.
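Before wiring the model into the graph, a quick standalone call can confirm the Bedrock configuration works. This is a minimal sketch; it assumes AWS credentials with Bedrock access are available in the environment.

# One-off invocation outside the graph to verify model access
reply = llm.invoke("Reply with a single word: hello")
print(reply.content)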

Conditional Edges and Control Flow

The control flow in LangGraph is defined through its edges:

# Add conditional edge from chatbot
graph_builder.add_conditional_edges(
    "chatbot",
    tools_condition,
    {"tools": "tools", END: END},
)

# Add edge from tools back to chatbot
graph_builder.add_edge("tools", "chatbot")

# Add edge from START to chatbot
graph_builder.add_edge(START, "chatbot")

# Compile the graph
graph = graph_builder.compile()

The tools_condition is a pre-built function that examines the chatbot's output. If the model called a tool, it routes to the tools node; otherwise, it ends the conversation. This creates a loop where the agent can make multiple tool calls before providing a final answer.
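For readers who prefer to see the routing logic spelled out, here is a simplified sketch of a custom condition that behaves roughly like tools_condition (not the library implementation):

from langgraph.graph import END

def route_tools(state: State):
    last_message = state["messages"][-1]
    # If the model requested one or more tool calls, continue to the tools node
    if getattr(last_message, "tool_calls", None):
        return "tools"
    # Otherwise the model produced a final answer, so end the run
    return END

# Equivalent wiring with the custom condition:
# graph_builder.add_conditional_edges("chatbot", route_tools, {"tools": "tools", END: END})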

Integrating Tavily Search

For this implementation, I've integrated Tavily Search as the primary tool, demonstrating how LangGraph agents can access real-time information:

from langchain_tavily import TavilySearch

tool = TavilySearch(max_results=2)
tools = [tool]

Tavily provides an AI-optimized search API that returns clean, relevant snippets rather than full web pages. This reduces the overhead of parsing HTML when agents need current information. The integration is seamless—LangGraph automatically handles tool invocation and result incorporation into the conversation flow.
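To inspect what the model receives from the tool, you can invoke it directly outside the graph. This is a quick sketch; it assumes the TAVILY_API_KEY environment variable is set, and the exact result shape may vary by version.

# Direct tool invocation to inspect the search output
results = tool.invoke({"query": "latest developments in multi-agent systems"})
print(results)  # typically a dict with a "results" list of title/url/content snippets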

AgentCore Runtime and Memory Integration

The integration with AgentCore Runtime provides the production infrastructure for the LangGraph agent. Let me explain how the entrypoint function processes requests and manages memory.

The Entrypoint Function

from typing import Any, Dict, Optional
import logging

from bedrock_agentcore import BedrockAgentCoreApp
from bedrock_agentcore.runtime.context import RequestContext

logger = logging.getLogger(__name__)

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload: Dict[str, Any], context: Optional[RequestContext] = None) -> Dict[str, Any]:
    """Main entrypoint with AgentCore memory integration."""

    logger.info("LangGraph invocation started")

The @app.entrypoint decorator marks this function as the handler for incoming requests. The function receives:

  • payload: Contains the request data, including the user's prompt
  • context: A RequestContext object providing session management
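For example, a request payload might look like this (hypothetical values; actor_id and session_id fall back to defaults when omitted, and context.session_id takes priority when present):

# A hypothetical invocation payload (field values are illustrative only)
payload = {
    "prompt": "What did I say about fruit?",
    "actor_id": "user-123",       # optional, falls back to DEFAULT_ACTOR_ID
    "session_id": "session-abc",  # optional, overridden by context.session_id when present
}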

Memory Enhancement Before Graph Execution

Before executing the graph, I retrieve relevant memories and add them to the input:

    # Extract parameters with context priority for session_id
    actor_id = payload.get("actor_id", DEFAULT_ACTOR_ID)
    session_id = context.session_id if context and context.session_id else payload.get("session_id", DEFAULT_SESSION_ID)

    prompt = payload.get("prompt", "No prompt found in input")

    # Enhance prompt with AgentCore memory context
    memory_context = memory_manager.get_memory_context(
        user_input=prompt,
        actor_id=actor_id,
        session_id=session_id
    )
    enhanced_prompt = f"{memory_context}\n\nCurrent user message: {prompt}" if memory_context else prompt

    # Create messages for LangGraph
    messages = {"messages": [{"role": "user", "content": enhanced_prompt}]}

The memory context retrieval uses two AgentCore Memory operations:

  1. get_last_k_turns to load recent conversation history
  2. retrieve_memories to search for relevant long-term memories
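The memory_manager helper wraps these calls. Below is a simplified sketch of what such a helper might look like; the method names follow the bedrock-agentcore MemoryClient, but the exact signatures, the MEMORY_ID placeholder, and the format_memory_context function are assumptions for illustration, not the repository's actual implementation.

from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient()
MEMORY_ID = "<your-memory-id>"  # placeholder

def get_memory_context(user_input: str, actor_id: str, session_id: str) -> str:
    # Recent turns from the current session for short-term context
    turns = memory_client.get_last_k_turns(
        memory_id=MEMORY_ID, actor_id=actor_id, session_id=session_id, k=5
    )
    # Semantic search over long-term memories relevant to the new prompt
    memories = memory_client.retrieve_memories(
        memory_id=MEMORY_ID, namespace=f"/users/{actor_id}", query=user_input
    )
    # Format both into a text block the agent can use as context
    return format_memory_context(turns, memories)  # hypothetical formatting helper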

Executing the Graph and Storing Results

After the graph processes the enriched input, I store the conversation and return the result:

try:
    # Invoke the LangGraph
    response = graph.invoke(messages)
    response_message = response['messages'][-1].content

    # Store conversation in AgentCore memory
    memory_manager.store_conversation(
        user_input=prompt,  # Store original prompt, not enhanced version
        response=response_message,
        actor_id=actor_id,
        session_id=session_id
    )

    return {"result": response_message}
except Exception as e:
    # Simplified error handling for completeness: log and surface the failure
    logger.exception("LangGraph invocation failed")
    return {"error": str(e)}

The store_conversation() method calls the create_event API, which:

  • Stores the raw conversation
  • Triggers memory strategies to extract preferences, facts, and summaries
  • Makes these insights available for future retrievals

I store the original user prompt, not the enhanced version with memory context. This prevents recursive memory expansion where each retrieval would include previous retrievals, keeping the memory focused on actual conversation content.

The function returns a dictionary that AgentCore Runtime automatically serializes to JSON for the HTTP response.

Advanced Graph Patterns

While our implementation uses a simple tool-calling pattern, LangGraph enables more complex workflows that work well with AgentCore Memory.

Human-in-the-Loop Workflows

LangGraph's interrupt and resume capabilities allow building agents that pause for human input:

from langgraph.checkpoint.memory import MemorySaver

# Add a node that handles human approval before critical decisions
graph_builder.add_node("human_approval", human_approval_node)
graph_builder.add_edge("chatbot", "human_approval")

# Add checkpointing for state persistence and pause before the approval node
checkpointer = MemorySaver()
graph = graph_builder.compile(
    checkpointer=checkpointer,
    interrupt_before=["human_approval"],
)

Combined with AgentCore Memory, this enables workflows where agents remember not just conversations but also approval patterns and human feedback over time.

Multi-Agent Collaboration

LangGraph excels at orchestrating multiple agents working together:

# Define specialized agents as subgraphs
research_agent = create_research_graph()
analysis_agent = create_analysis_graph()

# Compose them in a parent graph
parent_builder = StateGraph(State)
parent_builder.add_node("research", research_agent)
parent_builder.add_node("analysis", analysis_agent)

# Route based on task type
def route_to_agent(state):
    if "research" in state["messages"][-1].content.lower():
        return "research"
    return "analysis"

parent_builder.add_conditional_edges(START, route_to_agent)

Each subgraph can maintain its own memory namespace in AgentCore, enabling specialized agents with distinct knowledge bases while sharing conversation context.

Parallel Execution

LangGraph supports parallel node execution for concurrent processing:

from langgraph.graph import END

# Add nodes that can run in parallel
graph_builder.add_node("search_web", search_web_node)
graph_builder.add_node("search_memory", search_memory_node)
graph_builder.add_node("synthesize", synthesize_node)

# Both searches run in parallel
graph_builder.add_edge("chatbot", "search_web")
graph_builder.add_edge("chatbot", "search_memory")

# Both complete before synthesis
graph_builder.add_edge("search_web", "synthesize")
graph_builder.add_edge("search_memory", "synthesize")

# Synthesis produces the final answer
graph_builder.add_edge("synthesize", END)

This pattern is particularly useful for combining web search with memory retrieval, letting the agent gather information from multiple sources simultaneously.

Deploying the Agent

The deployment process leverages the same AgentCore Starter Toolkit workflow used throughout this series, with some LangGraph-specific considerations.

Configuration and Local Testing

First, I configure the agent for AgentCore:

agentcore configure -n langgraphagent -e main.py

For local testing with Tavily search, I need to provide the API key:

agentcore launch --local --env TAVILY_API_KEY=<YOUR_TAVILY_API_KEY>

This launches the containerized agent locally. Testing with memory context:

agentcore invoke --local '{"prompt": "AI multi-agent architectures - Also, what did I say about fruit?"}'

The agent performs a web search for current information about AI architectures while also retrieving the stored memory about fruit preferences, demonstrating the power of combining real-time data with persistent context.

Production Deployment

Deploying to AWS requires passing the Tavily API key as an environment variable:

agentcore launch --env TAVILY_API_KEY=<YOUR_TAVILY_API_KEY>

AgentCore securely stores the API key in AWS Secrets Manager and injects it into the agent's runtime environment, keeping credentials out of code and configuration files.

Monitoring the deployment:

aws logs tail /aws/bedrock-agentcore/runtimes/<AGENT_ID_ENDPOINT_ID> --follow

The structured logging from both LangGraph and AgentCore makes it easy to trace the graph execution flow and debug any issues.

Observability and Debugging

LangGraph provides excellent visibility into agent execution through its graph structure. I can even visualize the graph by running this code locally:

# Generate a Mermaid diagram of the graph
image = graph.get_graph().draw_mermaid_png()
with open("graph_diagram.png", "wb") as f:
    f.write(image)

This visualization helps understand the agent's decision flow and is invaluable for debugging complex workflows.

Tracing Graph Execution

LangGraph's execution is fully traceable through LangSmith or custom callbacks:

from langchain_core.callbacks import StdOutCallbackHandler

# Add tracing to graph execution
response = graph.invoke(
    messages,
    config={"callbacks": [StdOutCallbackHandler()]}
)

Combined with CloudWatch Logs and AgentCore's built-in observability, this provides end-to-end visibility from infrastructure to application logic.

Production Considerations

Running LangGraph agents in production with AgentCore has taught me several important lessons.

State Management Strategies

LangGraph's explicit state management requires careful consideration of what to persist. While the graph maintains state during execution, AgentCore Memory handles cross-session persistence. I've found it best to keep graph state focused on the current task while using AgentCore Memory for long-term context.
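One concrete way to keep graph state lean is to trim older messages from the state while relying on AgentCore Memory for recall. Here is a sketch using the RemoveMessage marker, which the add_messages reducer interprets as a deletion; the node name and the cutoff of six messages are arbitrary choices for illustration.

from langchain_core.messages import RemoveMessage

def trim_history(state: State):
    # Keep only the most recent messages in graph state; long-term context
    # comes from AgentCore Memory instead of an ever-growing message list
    to_remove = state["messages"][:-6]
    return {"messages": [RemoveMessage(id=m.id) for m in to_remove]}

# The node can be wired in wherever it fits the workflow, for example:
# graph_builder.add_node("trim_history", trim_history)
# graph_builder.add_edge("chatbot", "trim_history")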

Comparing Graph-Based to Traditional Approaches

After implementing agents with multiple frameworks, I can see where LangGraph's graph-based approach works best. Traditional agent frameworks often struggle with complex, multi-step workflows that require backtracking or parallel processing. LangGraph makes these patterns natural and explicit.

The graph structure also makes it easier to implement safety constraints. By controlling edges and adding validation nodes, I can make agents follow approved workflows even when using powerful foundation models. This matters in production environments where predictability is as important as capability.

However, this power comes with complexity. Simple question-answering agents might be overengineered as graphs. The key is choosing the right tool for the job—LangGraph for complex workflows, simpler frameworks for straightforward tasks.

What We've Learned

This LangGraph implementation completes our journey through five different agent frameworks, all unified through Amazon Bedrock AgentCore. The graph-based approach offers unique advantages for complex workflows while maintaining the same production-grade memory and deployment infrastructure we've used throughout the series.

AgentCore's flexibility enables each framework to use its strengths while providing consistent operational excellence. Whether you're building simple tool-calling agents with Strands, multi-agent systems with CrewAI, type-safe applications with Pydantic AI, data-centric agents with LlamaIndex, or complex workflows with LangGraph, AgentCore handles the production challenges so you can focus on agent logic.

The shared memory architecture we've used across all frameworks shows the benefits of standardization. By creating a common memory interface, we've enabled true portability—agents built with different frameworks can share memories and even hand off conversations to each other.

Next Steps and Future Directions

While this series focused on Runtime and Memory, AgentCore offers additional services that further enhance production deployments. Future explorations could include using AgentCore Identity for inbound and outbound agent authentication and authorization, implementing AgentCore Gateways to transform existing APIs and AWS Lambda functions into MCP servers, or leveraging AgentCore Monitoring for advanced observability.

The complete code for all five frameworks is available on GitHub. I encourage you to explore the implementations, experiment with combining different frameworks, and build your own production-ready agents.

The AI agent ecosystem is evolving rapidly, with new tools and patterns emerging constantly. What remains constant is the need for production-grade infrastructure that can adapt to these changes. Amazon Bedrock AgentCore provides that foundation, enabling you to experiment with cutting-edge agent technologies while maintaining enterprise-ready deployment and operations.

Thank you for joining me on this multi-framework journey. Now it's your turn to build something amazing!
