Building Advanced AI Agents with LangChain's DeepAgents: A Hands-On Guide


When Simple Tool-Calling Isn't Enough—Building Agents That Actually Think


TL;DR

From my experience working with LLM agents, the biggest challenge isn't getting them to call a function—it's getting them to handle complex, multi-step workflows without falling apart. Traditional agents are like assistants who forget what they were doing mid-task. LangChain's DeepAgents changes this completely. In my opinion, it's the first framework that truly enables agents to plan strategically, remember context persistently, delegate to specialists, and iterate toward quality. This guide walks you through building a real AI policy research agent that I designed to demonstrate these capabilities. You'll see actual code, understand the design decisions behind each piece, and get a working system that produces professional research reports autonomously.


Introduction

Let me tell you about a problem that drove me crazy for months.

Building an agent that calls a single tool? Easy. Getting an LLM to search the web or query a database? Done it a hundred times. But when I tried to build something more sophisticated—an agent that could research a complex topic, synthesize findings, review its own work, and produce a polished report—everything broke down.

The agents I built were what I now call "shallow." They'd execute one step, maybe two, then lose track of what they were doing. Token limits would overflow. Context would get muddled. Quality would suffer because there was no review process. In my experience, this is the wall most developers hit when moving from demos to production AI systems.

Then I discovered LangChain's DeepAgents library, and honestly, it changed how I think about agent architecture entirely.

What struck me first was the philosophy behind it. The DeepAgents team studied production systems like Claude Code and Deep Research—real applications handling genuinely complex workflows—and extracted the patterns that made them work. The result is a framework that gives agents four critical capabilities that shallow agents lack:

Planning tools that let agents break down tasks strategically before diving in. File system access that provides persistent memory outside the conversation context. Sub-agent creation that enables delegation to focused specialists. And long-term memory through LangGraph's Store that maintains state across sessions.

In my view, these aren't just nice features—they're fundamental architectural requirements for any agent doing serious work.

So I decided to build something real to prove it out. Not a toy demo, but an actual policy research system that could rival human analysis. The kind of agent I wish I'd had when I was doing regulatory research manually. In this guide, I'll walk you through exactly how I built it, the decisions I made along the way, and why each piece matters.


What's This Article About?

This article is my attempt to show you—through actual working code—how to build an AI agent that thinks strategically rather than just reacting to inputs.

Here's What We're Building:

Based on my experience with various agent frameworks, I designed a policy research system that demonstrates every key capability of the DeepAgents architecture. This isn't theoretical—it's a complete implementation that I've tested extensively.

The agent can:

  • Accept complex research questions about AI regulations (like "What's the latest on the EU AI Act?")
  • Break down the research into logical steps using planning tools
  • Delegate the actual investigation to a specialized research sub-agent
  • Save intermediate work to files so context never overflows
  • Invoke a critique sub-agent to review draft quality
  • Iterate based on feedback to produce professional reports

What You'll Learn (From My Mistakes and Successes):

Through building this, I learned several critical patterns:

  1. Strategic Planning: In my opinion, the write_todos tool is underrated. I initially skipped it, thinking agents could just "figure it out." Wrong. Explicit planning transforms chaotic execution into methodical workflow.

  2. Context Management: This one burned me hard. My early agents would hit token limits mid-research and forget everything. File system operations (read_file, write_file, edit_file) solve this completely. In my experience, this is non-negotiable for complex tasks.

  3. Sub-Agent Delegation: I used to cram everything into one mega-prompt. Bad idea. Specialized sub-agents—each with focused responsibilities—produce dramatically better results. One agent researches, another critiques. Clean separation of concerns.

  4. Custom System Prompts: Generic prompts produce generic results. I learned to design detailed, workflow-specific instructions that guide agents step-by-step through complex processes.

  5. Tool Integration: External capabilities like web search aren't add-ons—they're core to agent functionality. I'll show you how I integrated Tavily search seamlessly.

  6. Model Flexibility: One thing I love about DeepAgents is model-agnostic design. I've run this same system on OpenAI, Gemini, and Anthropic models interchangeably.

The Technical Architecture (How I Designed It):

When I sat down to architect this system, I thought about three layers:

Layer 1: Main Orchestrator

  • Receives research queries
  • Plans the workflow
  • Coordinates sub-agents
  • Manages file system state
  • Delivers final output

Layer 2: Specialized Sub-Agents

  • Research Sub-Agent: Conducts deep investigation using web search
  • Critique Sub-Agent: Reviews outputs for quality, accuracy, completeness

Layer 3: Infrastructure

  • File system for persistent state
  • LangGraph Store for long-term memory
  • Tavily API for real-time information gathering

From my perspective, this layered approach is what enables scalability. Each component has a single, clear responsibility.

Why This Pattern Matters:

In my experience building production AI systems, I've learned that architecture matters more than model choice. A well-designed agent with GPT-3.5 will outperform a poorly designed agent with GPT-4. The patterns you'll learn here—planning, delegation, state management, quality control—apply regardless of which LLM you're using or what domain you're working in.

Whether you're building content creation pipelines, code generation systems, data analysis workflows, or customer service automation, these fundamentals remain constant.


Tech Stack

  • Agent Framework: LangChain DeepAgents, the core library for building deep, planful agents with context management
  • LLM Provider (Primary): OpenAI GPT-4o, the main language model for agent reasoning and generation
  • LLM Provider (Alternative): Google Gemini 2.5 Flash, a fully interchangeable model option
  • Default Model: Claude Sonnet 4.5, DeepAgents' internal default when no model is specified
  • Web Search: Tavily API, a real-time internet search tool for research gathering
  • State Management: LangGraph Store, providing long-term memory and session persistence
  • Model Initialization: LangChain init_chat_model, a unified interface for multiple LLM providers
  • File Operations: built-in file tools (read_file, write_file, edit_file, ls) for context management
  • Planning: built-in write_todos for task breakdown and progress tracking
  • Sub-Agent Management: built-in task tool for creating and delegating to specialized sub-agents
  • Environment Management: Python os.environ for API key and configuration management
  • Development Environment: Jupyter Notebook for interactive development and testing

Why Read It?

If you're building AI agents that do more than simple tool calls, this article is essential reading. Here's why:

1. Move Beyond Basic Agents

Most tutorials show you how to build agents that can call a function or search the web. But when you try to scale those patterns to real-world applications—research assistants, code generation systems, multi-step workflows—you hit a wall. This article shows you how to break through that limitation with architectures designed for complexity.

2. Solve Real Production Challenges

You'll learn solutions to problems you'll actually face:

  • Context overflow: How to handle tasks that exceed token limits through file-based state management
  • Task planning: How to make agents think strategically instead of reactively
  • Quality control: How to build self-reviewing systems through sub-agent delegation
  • Memory management: How to maintain state across sessions for long-running projects
  • Modularity: How to break complex agents into focused, maintainable components

3. Complete, Working Implementation

This isn't pseudo-code or theoretical concepts. You get:

  • Full source code for a production-quality research agent
  • Step-by-step explanations of each component
  • Design decisions explained in context
  • Multiple LLM provider options (OpenAI, Gemini, Anthropic)
  • Ready-to-run Jupyter notebook with all dependencies
  • Professional prompt engineering examples

4. Learn Patterns You Can Reuse

The patterns demonstrated here apply far beyond policy research:

  • Content creation systems with review workflows
  • Code generation with testing and refinement
  • Data analysis with iterative exploration
  • Customer service with escalation and specialization
  • Report generation with research and synthesis
  • Any multi-stage workflow requiring planning and quality control

5. Understand the "Why" Behind Design Choices

Each section explains not just what the code does, but why it's structured that way:

  • Why file systems prevent context overflow
  • Why sub-agents improve focus and quality
  • Why custom prompts are critical for complex tasks
  • Why planning tools enable strategic execution
  • Why this architecture scales where basic agents don't

6. Stay Current with AI Agent Evolution

The AI agent landscape is evolving rapidly from simple tool-calling to sophisticated, planful systems. DeepAgents represents the cutting edge of this evolution, incorporating lessons from production systems like Claude Code. Understanding this architecture prepares you for where the field is heading.

7. Practical Business Value

These techniques have immediate business applications:

  • Automating research and analysis workflows
  • Building intelligent content creation pipelines
  • Creating self-improving code generation systems
  • Developing sophisticated customer support agents
  • Implementing complex decision-making systems

Whether you're a developer building production AI systems, a researcher exploring agent architectures, or a technical leader evaluating AI capabilities, this guide provides practical knowledge you can apply immediately.


Let's Design

The Architecture Philosophy

When I set out to build this policy research agent, the fundamental question was: how do we create an AI system that thinks strategically rather than reactively? Traditional agents process tasks linearly—receive input, call tools, return output. But complex research requires something different: planning, delegation, iteration, and quality control.

The DeepAgents architecture addresses this through four interconnected capabilities that work together to enable sophisticated behavior:

1. Strategic Planning Layer

The Problem: Basic agents jump straight into action without considering the best approach. They can't break down complex tasks or track progress across multiple steps.

The Solution: DeepAgents provides a write_todos tool that enables agents to:

  • Decompose large research questions into specific subtasks
  • Create actionable checklists before starting work
  • Track which steps are complete and which remain
  • Adjust the plan dynamically as new information emerges

In Our Implementation: The main agent first saves the research question to question.txt, then creates a todo list outlining the research workflow: gather information, analyze findings, write the report, critique the draft, and finalize. This planning step transforms reactive execution into strategic orchestration.
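To make this concrete, here's roughly the kind of plan the agent produces when it calls write_todos for our query. This is an illustrative sketch of the content, not DeepAgents' literal internal schema:

# Illustrative only: the shape of the agent's self-written plan
todos = [
    {"content": "Save the research question to question.txt", "status": "completed"},
    {"content": "Delegate investigation to policy-research-agent", "status": "in_progress"},
    {"content": "Synthesize findings into final_report.md", "status": "pending"},
    {"content": "Request review from policy-critique-agent", "status": "pending"},
    {"content": "Revise and deliver the final report", "status": "pending"},
]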

2. Persistent Context Management

The Problem: LLMs have token limits. Complex tasks generate large amounts of intermediate data—research findings, draft content, notes—that quickly overflow the context window. Once that happens, the agent loses track of its work.

The Solution: DeepAgents integrates file system operations (read_file, write_file, edit_file, ls) that allow agents to:

  • Store intermediate results outside the conversation context
  • Retrieve specific information when needed
  • Build up complex outputs incrementally
  • Continue work across multiple sessions

In Our Implementation: The agent uses three key files:

  • question.txt: Stores the original research query for reference
  • final_report.md: Holds the evolving research report through drafts and revisions
  • Working memory in file system: Prevents context overflow even with extensive research

This file-based approach means the agent can handle research projects of any size without hitting token limits.
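In practice these are ordinary tool calls the agent issues against its virtual file system. Here's a hedged sketch of the sequence during one run (the tool names are the DeepAgents built-ins; the call signatures are simplified for illustration):

# Simplified illustration of the agent's own file-tool calls during a run
write_file("question.txt", query)              # persist the question outside the context window
write_file("final_report.md", first_draft)     # store the draft as it grows
draft = read_file("final_report.md")           # reload only what's needed, when it's needed
edit_file("final_report.md", revisions)        # apply targeted revisions after critique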

3. Specialized Sub-Agent Delegation

The Problem: Trying to do everything in one agent leads to bloated context, unclear responsibilities, and lower quality outputs. Research requires different skills than critique. Gathering information is different from synthesizing it.

The Solution: DeepAgents allows creation of focused sub-agents via the task tool. Each sub-agent has:

  • Its own specialized system prompt defining clear responsibilities
  • Dedicated tools appropriate to its function
  • Isolated context that doesn't clutter the main agent
  • Single, well-defined output that returns to the main agent

In Our Implementation: We use two sub-agents:

Policy Research Sub-Agent:

  • Purpose: Conduct in-depth investigation of AI regulations and policies
  • Tools: Internet search via Tavily API
  • Instructions: Find key updates, cite sources, compare global approaches, write professionally
  • Output: Comprehensive research findings passed back to main agent

Policy Critique Sub-Agent:

  • Purpose: Quality control and editorial review
  • Tools: File reading to access the draft report
  • Instructions: Check accuracy, verify citations, assess balance and tone
  • Output: Constructive feedback without direct modification

This separation of concerns means each sub-agent can focus deeply on its specialty without distractions.

4. Intelligent Workflow Orchestration

The Problem: Who coordinates all these pieces? How does the main agent know when to delegate, when to write, when to revise?

The Solution: A carefully crafted custom system prompt that serves as the "brain" of the operation. This prompt:

  • Defines the overall workflow step-by-step
  • Specifies when to invoke each sub-agent
  • Enforces quality standards and formatting requirements
  • Provides context about the agent's role and capabilities

In Our Implementation: The policy_research_instructions prompt creates a clear five-step workflow:

  1. Save the Question: Write user query to question.txt for reference
  2. Delegate Research: Invoke policy-research-agent to gather comprehensive information
  3. Synthesize Report: Write findings to final_report.md with proper structure and citations
  4. Quality Review: Optionally invoke policy-critique-agent for editorial feedback
  5. Finalize: Revise based on feedback and output the complete professional report

The prompt also enforces standards:

  • Markdown formatting with clear section headers
  • Citation style using [Title](URL) format
  • Professional, neutral tone suitable for policy briefings
  • Sources section at the end

The Complete Flow

Here's how everything works together when a user asks: "What are the latest updates on the EU AI Act and its global impact?"

User Query
    ↓
Main Deep Agent
    ↓
1. Saves question to question.txt (context management)
    ↓
2. Creates todo list (planning)
    ↓
3. Invokes Policy Research Sub-Agent
    ↓
    Research Sub-Agent:
    - Uses Tavily search for EU AI Act updates
    - Finds regulations, news, analysis
    - Compares global approaches
    - Formats findings professionally
    - Returns comprehensive research to Main Agent
    ↓
4. Main Agent writes draft to final_report.md
    ↓
5. Invokes Policy Critique Sub-Agent
    ↓
    Critique Sub-Agent:
    - Reads final_report.md
    - Checks accuracy and citations
    - Verifies balanced analysis
    - Returns constructive feedback to Main Agent
    ↓
6. Main Agent revises draft based on feedback
    ↓
7. Outputs final professional policy report

Why This Architecture Scales

This design handles complexity through:

  • Modularity: Each component has a single, clear responsibility
  • Extensibility: Easy to add new sub-agents for different research domains
  • Robustness: File system prevents context overflow regardless of task size
  • Quality: Built-in review cycle ensures professional outputs
  • Flexibility: Works with any LLM provider (OpenAI, Gemini, Anthropic, etc.)
  • Maintainability: Clear separation between orchestration, execution, and review

Unlike basic agents that collapse under complexity, this architecture gets stronger as tasks become more sophisticated. The planning layer ensures strategic execution, the file system prevents memory issues, sub-agents maintain focus, and the workflow orchestration keeps everything coordinated.

This is the fundamental difference between shallow and deep agents: the ability to think, plan, delegate, and iterate rather than just react and respond.


Let's Get Cooking

Now let's build this system step by step, understanding each component and why it matters.

Step 1: Install Dependencies and Setup Environment

First, we need the core libraries that power our deep agent system.

!pip install deepagents tavily-python langchain-google-genai langchain-openai

What We're Installing:

  • deepagents: The core LangChain library providing planning, file tools, and sub-agent capabilities
  • tavily-python: Client for Tavily web search API (our research tool)
  • langchain-google-genai: Integration for Google's Gemini models (alternative LLM)
  • langchain-openai: Integration for OpenAI's GPT models (primary LLM)

Why These Dependencies: DeepAgents is model-agnostic, so we install multiple LLM provider options. Tavily provides the real-time web search capability our research sub-agent needs.


Step 2: Configure API Keys

import os
from getpass import getpass

# Required for web search functionality
os.environ['TAVILY_API_KEY'] = getpass('Enter Tavily API Key: ')

# Choose your preferred LLM provider
os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key: ')

# Optional: If using Gemini instead
# os.environ['GOOGLE_API_KEY'] = getpass('Enter Google API Key: ')

Why This Matters:

  • The Tavily API key enables our research sub-agent to search the web for real-time information
  • You can use either OpenAI or Google (or other providers)—DeepAgents works seamlessly with all of them
  • Using getpass keeps your API keys secure and out of your code

Getting API Keys:

  • Tavily: Sign up at tavily.com for web search access
  • OpenAI: Get your key from platform.openai.com
  • Google: Access through Google AI Studio

Step 3: Import Core Libraries

import os
from typing import Literal
from tavily import TavilyClient
from deepagents import create_deep_agent

# Initialize the search client
tavily_client = TavilyClient()

Why We Import These:

  • typing.Literal: Enables type hints for our search function parameters
  • TavilyClient: Provides the interface to web search functionality
  • create_deep_agent: The main factory function for building our deep agent system

Design Note: We initialize the Tavily client at the module level because it will be used inside our tool function, which gets called by the sub-agent.
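If you prefer explicit configuration over environment-variable pickup, tavily-python also accepts the key directly:

import os
from tavily import TavilyClient

# Equivalent to TavilyClient() when TAVILY_API_KEY is already set in the environment
tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])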


Step 4: Define the Web Search Tool

from typing import Literal

def internet_search(
    query: str,
    max_results: int = 5,
    topic: Literal["general", "news", "finance"] = "general",
    include_raw_content: bool = False,
):
    """Run a web search and return relevant results.

    This tool allows agents to gather real-time information from the internet.
    """
    search_docs = tavily_client.search(
        query,
        max_results=max_results,
        include_raw_content=include_raw_content,
        topic=topic,
    )
    return search_docs

What This Tool Does:

  • Accepts a search query string and optional parameters
  • Calls Tavily's search API to find relevant web content
  • Returns structured search results (titles, URLs, snippets, content)
  • Supports different topic categories (general, news, finance)

Why We Structure It This Way:

  • Clear docstring: Helps the LLM understand when and how to use this tool
  • Typed parameters: The Literal type hint tells the agent exactly what topic values are valid
  • Sensible defaults: 5 results and general search work for most cases
  • Raw content option: Can retrieve full page content when needed for deep research

How the Agent Uses It: The research sub-agent will automatically call this function when it needs to gather information about AI policies, regulations, or related topics.
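Before handing the tool to the agent, it's worth sanity-checking it on its own (this assumes your TAVILY_API_KEY is already configured):

# Quick standalone check of the search tool
results = internet_search("EU AI Act enforcement timeline", max_results=3, topic="news")
for doc in results.get("results", []):
    print(doc["title"], "-", doc["url"])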


Step 5: Create the Research Sub-Agent Configuration

sub_research_prompt = """
You are a specialized AI policy researcher.
Conduct in-depth research on government policies, global regulations, and ethical frameworks related to artificial intelligence.

Your answer should:
- Provide key updates and trends
- Include relevant sources and laws (e.g., EU AI Act, U.S. Executive Orders)
- Compare global approaches when relevant
- Be written in clear, professional language

Only your FINAL message will be passed back to the main agent.
"""

research_sub_agent = {
    "name": "policy-research-agent",
    "description": "Used to research specific AI policy and regulation questions in depth.",
    "system_prompt": sub_research_prompt,
    "tools": [internet_search],
}

Breaking This Down:

The System Prompt:

  • Defines the sub-agent's identity and expertise (AI policy researcher)
  • Specifies output requirements (updates, sources, comparisons)
  • Sets quality standards (clear, professional language)
  • Reminds the agent that only the final message returns to the parent

Why This Prompt Works:

  • Focused role: "Specialized AI policy researcher" gives clear identity
  • Concrete requirements: The bullet points tell the agent exactly what to include
  • Examples: Mentioning "EU AI Act, U.S. Executive Orders" helps the agent understand scope
  • Important constraint: "Only your FINAL message will be passed back" prevents verbose intermediate steps

The Configuration Dictionary:

  • name: Identifier the main agent uses to invoke this sub-agent
  • description: Helps the main agent decide when to delegate to this specialist
  • system_prompt: The instructions that define this sub-agent's behavior
  • tools: List of functions this sub-agent can call (just internet search in this case)

Design Philosophy: This sub-agent has ONE job: research AI policies thoroughly. It has the internet search tool to do that job well, and its prompt focuses it entirely on that task. No distractions, no scope creep.


Step 6: Create the Critique Sub-Agent Configuration

sub_critique_prompt = """
You are a policy editor reviewing a report on AI governance.
Check the report at `final_report.md` and the question at `question.txt`.

Focus on:
- Accuracy and completeness of legal information
- Proper citation of policy documents
- Balanced analysis of regional differences
- Clarity and neutrality of tone

Provide constructive feedback, but do NOT modify the report directly.
"""

critique_sub_agent = {
    "name": "policy-critique-agent",
    "description": "Critiques AI policy research reports for completeness, clarity, and accuracy.",
    "system_prompt": sub_critique_prompt,
}

The Critique System Prompt:

  • Defines role as editorial reviewer (not researcher)
  • Specifies exactly what files to check
  • Lists concrete quality criteria to evaluate
  • Explicitly prohibits direct modification (feedback only)

Why No Tools Here: Unlike the research sub-agent, the critique agent doesn't need internet search. It has access to the file system (built into all DeepAgents) to read the draft report and provide feedback. That's all it needs.

The Review Criteria:
Each bullet point gives the critic something specific to check:

  • Accuracy and completeness: Are the facts right? Is anything missing?
  • Proper citation: Are sources properly attributed?
  • Balanced analysis: Are regional differences fairly represented?
  • Clarity and neutrality: Is the tone appropriate for policy work?

Why "Do NOT modify": This is crucial. We want the critique agent to identify issues and suggest improvements, but leave the actual editing to the main agent. This separation prevents the critic from overstepping and ensures the main agent maintains control of the final output.

The Workflow: Main agent → writes draft → critique agent reviews → provides feedback → main agent revises. Clean separation of concerns.


Step 7: Design the Main Agent System Prompt

policy_research_instructions = """
You are an expert AI policy researcher and analyst.
Your job is to investigate questions related to global AI regulation, ethics, and governance frameworks.

1️⃣ Save the user's question to `question.txt`
2️⃣ Use the `policy-research-agent` to perform in-depth research
3️⃣ Write a detailed report to `final_report.md`
4️⃣ Optionally, ask the `policy-critique-agent` to critique your draft
5️⃣ Revise if necessary, then output the final, comprehensive report

When writing the final report:
- Use Markdown with clear sections (## for each)
- Include citations in [Title](URL) format
- Add a ### Sources section at the end
- Write in professional, neutral tone suitable for policy briefings
"""

This Is the Brain of the Operation. Let me explain why each part matters:

Identity and Purpose ("You are an expert AI policy researcher..."):

  • Establishes the agent's high-level role
  • Sets expectations for quality and expertise
  • Provides context for decision-making

The Numbered Workflow (Steps 1-5):
This is the most important part. It gives the agent a clear execution plan:

  1. Save the question: Creates a persistent reference the agent can check later
  2. Delegate to research sub-agent: Uses the specialist to gather information
  3. Write the report: Synthesizes findings into a structured document
  4. Get critique: Optionally invokes the editor for quality review
  5. Finalize: Revises based on feedback and delivers the result

Why Numbered Steps Work: They create a mental model for the LLM. The agent knows there's a sequence, knows what comes next, and can track progress.

The Formatting Requirements (Markdown, citations, etc.):
These aren't just style preferences—they're quality controls:

  • Markdown sections: Ensures structured, navigable reports
  • Title citations: Makes sources clickable and verifiable
  • Sources section: Consolidates references for easy checking
  • Professional tone: Appropriate for the policy analysis domain

Why This Prompt Architecture Works:

  1. Clear role: The agent knows who it is
  2. Explicit workflow: The agent knows what to do
  3. Quality standards: The agent knows how to do it well
  4. File-based state: The agent can handle any complexity
  5. Delegation model: The agent knows when to get help

This is the difference between an agent that wanders aimlessly and one that executes strategically.


Step 8: Initialize the Main Deep Agent

from langchain.chat_models import init_chat_model
from deepagents import create_deep_agent

# Initialize with OpenAI GPT-4o
model = init_chat_model(model="openai:gpt-4o")

# Alternative: Use Google Gemini instead
# model = init_chat_model(model="google_genai:gemini-2.5-flash")

# Create the deep agent
agent = create_deep_agent(
    model=model,
    tools=[internet_search],
    system_prompt=policy_research_instructions,
    subagents=[research_sub_agent, critique_sub_agent],
)

What's Happening Here:

Model Initialization:

model = init_chat_model(model="openai:gpt-4o")
  • init_chat_model is LangChain's unified interface for any LLM
  • Format is "provider:model_name"
  • Switching providers is as simple as changing this string
  • If you don't specify a model, DeepAgents defaults to Claude Sonnet 4.5

Creating the Deep Agent:

agent = create_deep_agent(...)

This single function call assembles the entire complex system:

  • model: The LLM that powers reasoning and decision-making
  • tools: Functions the main agent can call directly (internet search)
  • system_prompt: The workflow instructions we defined above
  • subagents: The specialized agents available for delegation

What create_deep_agent Does Internally:

  1. Sets up the planning system (todo management)
  2. Configures file system tools (read, write, edit, ls)
  3. Registers the sub-agents for delegation
  4. Integrates the custom tools we provided
  5. Wraps everything in a LangGraph workflow
  6. Connects to the LLM for execution

Why This Is Powerful: In these few lines, we've created an agent that can:

  • Plan and track complex tasks
  • Search the web for information
  • Delegate to specialized sub-agents
  • Manage persistent state through files
  • Maintain long-term memory
  • Execute multi-step workflows

All the complexity is abstracted away by create_deep_agent.

Model Flexibility Note: Notice how easy it is to switch between OpenAI and Gemini (or any other provider). This is intentional—DeepAgents is designed to be model-agnostic. The architecture works the same regardless of which LLM powers it.
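For instance, pointing the same system at Anthropic is one line. The model string below is illustrative, and you'd need langchain-anthropic installed plus an ANTHROPIC_API_KEY set:

# Illustrative: requires `pip install langchain-anthropic` and ANTHROPIC_API_KEY
model = init_chat_model(model="anthropic:claude-sonnet-4-5")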


Step 9: Invoke the Agent with a Research Query

query = "What are the latest updates on the EU AI Act and its global impact?"

result = agent.invoke({"messages": [{"role": "user", "content": query}]})

What Happens When You Run This:

The Invocation Format:

{"messages": [{"role": "user", "content": query}]}

This is the standard LangChain message format. The agent receives it as if it were a chat conversation.
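One more useful detail: DeepAgents keeps its virtual file system in the same state dictionary, so you can seed files at invocation time and read them back from the result. A sketch following the state convention the library documents (the notes.txt content here is just an example):

# Seed the virtual file system alongside the messages, then inspect it afterward
result = agent.invoke({
    "messages": [{"role": "user", "content": query}],
    "files": {"notes.txt": "Prior research context goes here."},
})
print(list(result["files"].keys()))  # e.g. notes.txt, question.txt, final_report.md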

The Execution Flow (what happens behind the scenes):

  1. Main Agent Receives Query

    • Reads the question: "What are the latest updates on the EU AI Act..."
    • Consults its system prompt (the numbered workflow)
    • Decides to start at step 1

  2. Step 1: Save the Question

    # Agent calls: write_file(path="question.txt", content=query)

    • Uses the built-in file tool to persist the question
    • Creates a reference point for later use

  3. Step 2: Delegate to Research Sub-Agent

    # Agent calls: task(agent="policy-research-agent",
    #                   instruction="Research the EU AI Act updates and global impact")

    • Research sub-agent receives the task
    • Calls internet_search("EU AI Act latest updates")
    • Calls internet_search("EU AI Act global impact")
    • Calls internet_search("AI regulations worldwide comparison")
    • Synthesizes findings into a comprehensive report
    • Returns its final message to the main agent

  4. Step 3: Write the Draft Report

    # Agent calls: write_file(path="final_report.md", content=research_findings)

    • Takes the sub-agent's research
    • Structures it into Markdown sections
    • Includes citations in [Title](URL) format
    • Writes to the file system

  5. Step 4: Get Editorial Critique

    # Agent calls: task(agent="policy-critique-agent",
    #                   instruction="Review the draft report for quality")

    • Critique sub-agent reads final_report.md
    • Checks accuracy, citations, balance, tone
    • Provides constructive feedback
    • Returns its review to the main agent

  6. Step 5: Revise and Finalize

    # Agent calls: edit_file(path="final_report.md", edits=improvements)

    • Main agent incorporates feedback
    • Refines sections based on critique
    • Ensures all quality standards are met
    • Outputs the final comprehensive report

The Result: result contains the complete execution trace, including:

  • All intermediate steps taken
  • Tool calls made by each agent
  • Sub-agent invocations and responses
  • The final policy report
  • File system changes

Viewing the Output:

# Get the final message (messages are LangChain message objects, so use .content)
print(result["messages"][-1].content)

# Or read the report from the agent's virtual file system, which DeepAgents
# returns under the "files" key of the result state
# print(result["files"]["final_report.md"])

What Makes This Different from Basic Agents:

  • Basic agent: Search → Generate → Done (one step, no planning)
  • Deep agent: Plan → Delegate → Research → Write → Critique → Revise → Deliver (strategic, multi-stage)

The deep agent produces higher quality results because it:

  • Plans before acting
  • Uses specialists for specific tasks
  • Iterates based on feedback
  • Manages complexity through file system
  • Maintains focus through sub-agent delegation

This is the power of the DeepAgents architecture in action.


Let's Setup

Prerequisites

Before running this implementation, ensure you have:

1. Python Environment

  • Python 3.8 or higher
  • pip package manager
  • Jupyter Notebook or JupyterLab (recommended for interactive development)

2. API Access

  • Tavily API key for web search (sign up at tavily.com)
  • An LLM provider key: OpenAI (platform.openai.com) or Google (Google AI Studio) if using Gemini

3. Development Environment

  • Code editor (VS Code, PyCharm, or similar)
  • Terminal access for pip installations
  • Stable internet connection for API calls

Installation Steps

Step 1: Set Up Virtual Environment (Recommended)

# Create a virtual environment
python -m venv deepagents-env

# Activate it (Windows)
deepagents-env\Scripts\activate

# Activate it (Mac/Linux)
source deepagents-env/bin/activate

Step 2: Install Core Dependencies

pip install deepagents tavily-python langchain-google-genai langchain-openai

Step 3: Verify Installation

# Test imports
import deepagents
from tavily import TavilyClient
from langchain.chat_models import init_chat_model

print("✅ All dependencies installed successfully")

Step 4: Configure Environment Variables

Create a .env file in your project directory:

TAVILY_API_KEY=your_tavily_key_here
OPENAI_API_KEY=your_openai_key_here
# GOOGLE_API_KEY=your_google_key_here  # If using Gemini

Or set them programmatically:

import os
from getpass import getpass

os.environ['TAVILY_API_KEY'] = getpass('Enter Tavily API Key: ')
os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key: ')
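If you go the .env route, a small loader sketch (assumes python-dotenv is installed via pip install python-dotenv):

from dotenv import load_dotenv

# Reads the .env file in the current directory and populates os.environ
load_dotenv()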

Step 5: Test Your Setup

Run this quick test to verify everything works:

from tavily import TavilyClient
from langchain.chat_models import init_chat_model
from deepagents import create_deep_agent

# Test Tavily connection
client = TavilyClient()
test_search = client.search("test query", max_results=1)
print("✅ Tavily search working")

# Test model initialization
model = init_chat_model(model="openai:gpt-4o")
print("✅ LLM connection working")

# Test DeepAgents
simple_agent = create_deep_agent(model=model, tools=[])
print("✅ DeepAgents framework working")

print("\n🎉 Setup complete! You're ready to build deep agents.")

Project Structure

Organize your project like this:

deepagents-project/
│
├── deepagents_research.ipynb    # Main notebook (from GitHub)
├── .env                          # Your API keys (DO NOT commit to git)
├── .gitignore                    # Ignore .env and other sensitive files
├── requirements.txt              # Dependency list
└── README.md                     # Project documentation

Common Setup Issues and Solutions

Issue: "ModuleNotFoundError: No module named 'deepagents'"

  • Solution: Make sure you activated your virtual environment before installing

Issue: "Invalid API key" errors

  • Solution: Double-check your API keys, ensure no extra spaces, verify they're active

Issue: "Rate limit exceeded" from Tavily or OpenAI

  • Solution: Check your API usage limits, consider upgrading your plan, or add rate limiting

Issue: File permission errors

  • Solution: Ensure you have write permissions in your working directory

Quick Start Checklist

  • [ ] Python 3.8+ installed
  • [ ] Virtual environment created and activated
  • [ ] Dependencies installed (pip install deepagents tavily-python langchain-google-genai langchain-openai)
  • [ ] Tavily API key obtained and configured
  • [ ] LLM provider API key obtained and configured
  • [ ] Test imports successful
  • [ ] Tavily search test successful
  • [ ] Model initialization test successful
  • [ ] Ready to run the full implementation

Next Steps

Once your setup is complete, you can:

  1. Download the complete code from the GitHub repository
  2. Run through the implementation step by step
  3. Experiment with different research queries
  4. Modify sub-agent prompts for different domains
  5. Add additional sub-agents for specialized tasks
  6. Integrate with your own applications

Let's Run

Running the Complete System

Now that everything is set up, let's execute the full policy research workflow.

Step 1: Start Jupyter Notebook

jupyter notebook

Navigate to your deepagents_research.ipynb file and open it.

Step 2: Run Initialization Cells

Execute cells in order to:

  1. Import dependencies
  2. Configure API keys
  3. Define the search tool
  4. Create sub-agent configurations
  5. Set up the main agent prompt
  6. Initialize the deep agent

Step 3: Execute a Research Query

Run the invocation cell:

query = "What are the latest updates on the EU AI Act and its global impact?"
result = agent.invoke({"messages": [{"role": "user", "content": query}]})

What You'll See During Execution:

The agent will output its thought process and actions:

💭 Thought: I need to research the EU AI Act updates systematically.

🔧 Action: write_file
   Path: question.txt
   Content: What are the latest updates on the EU AI Act...

✅ Observation: File created successfully

💭 Thought: Now I'll delegate the research to my specialized sub-agent.

🔧 Action: task
   Agent: policy-research-agent
   Instruction: Research latest EU AI Act updates and global impact

   [Sub-agent starts executing...]

   🔍 Searching: "EU AI Act 2025 updates"
   🔍 Searching: "EU AI Act global impact"
   🔍 Searching: "AI regulations worldwide comparison"

   ✅ Sub-agent complete: [Returns comprehensive research findings]

💭 Thought: I have the research. Now I'll structure it into a report.

🔧 Action: write_file
   Path: final_report.md
   Content: [Structured Markdown report with citations]

✅ Observation: Report written

💭 Thought: Let me get editorial feedback to ensure quality.

🔧 Action: task
   Agent: policy-critique-agent
   Instruction: Review the draft report

   [Critique sub-agent executes...]

   ✅ Sub-agent complete: [Returns constructive feedback]

💭 Thought: Based on the feedback, I'll make final revisions.

🔧 Action: edit_file
   Path: final_report.md
   Edits: [Improvements based on critique]

✅ Final report complete!

Step 4: View the Results

Access the final report:

# Option 1: From the result object (messages are LangChain message objects)
final_message = result["messages"][-1].content
print(final_message)

# Option 2: Read from the virtual file system, returned under the
# "files" key of the result state
report = result["files"]["final_report.md"]
print(report)

# Option 3: View all generated files
print("Files created:", list(result["files"].keys()))

Example Output Structure:

The final report will look something like this:

# Latest Updates on the EU AI Act and Global Impact

## Executive Summary

The EU AI Act, formally adopted in [date], represents the world's first 
comprehensive regulatory framework for artificial intelligence...

## Key Updates

### Regulatory Timeline
- Final text published: [date]
- Implementation begins: [date]
- Full enforcement: [date]

### Core Provisions
- Risk-based classification system
- Prohibited AI practices
- High-risk AI requirements
- Transparency obligations

## Global Impact

### Regional Responses

**United States**: The U.S. has responded with executive orders...

**China**: China's AI regulations focus on...

**United Kingdom**: The UK has taken a different approach...

### International Standards

The EU AI Act is influencing global AI governance through...

## Industry Implications

Organizations worldwide are adapting to these regulations by...

## Sources

- [EU AI Act Official Text](https://example.com)
- [Global AI Policy Tracker](https://example.com)
- [Industry Analysis Report](https://example.com)

Step 5: Try Different Queries

Experiment with various research questions:

# Example 1: Different topic
query1 = "How are different countries regulating AI in healthcare?"
result1 = agent.invoke({"messages": [{"role": "user", "content": query1}]})

# Example 2: Specific comparison
query2 = "Compare AI ethics frameworks between US, EU, and China"
result2 = agent.invoke({"messages": [{"role": "user", "content": query2}]})

# Example 3: Narrow focus
query3 = "What are the key compliance requirements for AI systems under the EU AI Act?"
result3 = agent.invoke({"messages": [{"role": "user", "content": query3}]})

Step 6: Inspect the Workflow

View the complete execution trace:

# See all steps taken (messages are LangChain message objects)
for msg in result["messages"]:
    role = getattr(msg, "type", "unknown")            # e.g. "human", "ai", "tool"
    content = str(getattr(msg, "content", ""))[:200]  # first 200 chars
    print(f"\n[{role}]: {content}...")

# Count AI messages that issued tool calls
tool_calls = [msg for msg in result["messages"] if getattr(msg, "tool_calls", None)]
print(f"\nMessages with tool calls: {len(tool_calls)}")

# See sub-agent invocations (delegations go through the built-in `task` tool)
sub_agent_calls = [
    call for msg in tool_calls for call in msg.tool_calls if call["name"] == "task"
]
print(f"Sub-agent invocations: {len(sub_agent_calls)}")
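If you'd rather watch the steps as they happen instead of reconstructing them afterward, the compiled agent (a LangGraph graph) also supports streaming; a minimal sketch:

# Stream the full state after each step; the newest message shows what just happened
for chunk in agent.stream(
    {"messages": [{"role": "user", "content": query}]},
    stream_mode="values",
):
    chunk["messages"][-1].pretty_print()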

Performance Metrics:

Monitor agent performance:

import time

start_time = time.time()
result = agent.invoke({"messages": [{"role": "user", "content": query}]})
end_time = time.time()

execution_time = end_time - start_time
print(f"⏱️  Total execution time: {execution_time:.2f} seconds")

# Token usage (availability varies by provider; recent LangChain versions
# attach usage_metadata to AI messages when the provider reports it)
usage = getattr(result["messages"][-1], "usage_metadata", None)
if usage:
    print(f"🎫 Tokens used: {usage}")

Debugging Tips:

If something goes wrong:

  1. Check API Keys: Verify all keys are set correctly

   print("Tavily:", "✅" if os.getenv('TAVILY_API_KEY') else "❌")
   print("OpenAI:", "✅" if os.getenv('OPENAI_API_KEY') else "❌")

  2. Test Components Individually:

   # Test the search function
   test_result = internet_search("test query", max_results=1)
   print("Search working:", "✅" if test_result else "❌")

   # Test the model
   test_response = model.invoke([{"role": "user", "content": "Hi"}])
   print("Model working:", "✅" if test_response else "❌")

  3. Enable Verbose Logging:

   import logging
   logging.basicConfig(level=logging.DEBUG)

  4. Check the File System:

   # List all files the agent created (the "files" key of the result state)
   print("Files created:", list(result["files"].keys()))

   # Read any file to debug
   print("Question file:", result["files"]["question.txt"])

Expected Behavior:

  • Normal execution: 30-90 seconds depending on query complexity
  • Sub-agent calls: Typically 2 (research + critique)
  • File operations: 3-5 (write question, write report, edit report)
  • Search queries: 3-7 (depending on research depth)

Success Indicators:

✅ Agent completes all workflow steps
✅ Final report is well-structured Markdown
✅ Sources are properly cited
✅ Report addresses the original question
✅ Professional tone maintained throughout
✅ No errors or crashes

Next Experiments:

Once you have it working:

  1. Modify sub-agent prompts for different domains (technology, finance, healthcare)
  2. Add additional sub-agents (fact-checker, summarizer, translator); a fact-checker sketch follows this list
  3. Adjust search parameters (more results, different topics)
  4. Try different LLM models (compare GPT-4 vs Gemini vs Claude)
  5. Integrate with your own data sources or APIs
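As a concrete starting point for the second experiment, here's how I'd sketch a fact-checker in the same configuration style we used earlier. The name and prompt are my own illustrations, not part of the original system:

fact_check_prompt = """
You are a fact-checker reviewing an AI policy report.
Read `final_report.md`, verify each factual claim with web search,
and report any statements that are unsupported or outdated.
Do NOT modify the report; return findings only.
"""

# Illustrative sub-agent config, reusing the internet_search tool from Step 4
fact_check_sub_agent = {
    "name": "policy-fact-check-agent",
    "description": "Verifies factual claims in policy reports against current sources.",
    "system_prompt": fact_check_prompt,
    "tools": [internet_search],
}

# Register it alongside the existing sub-agents
agent = create_deep_agent(
    model=model,
    tools=[internet_search],
    system_prompt=policy_research_instructions,
    subagents=[research_sub_agent, critique_sub_agent, fact_check_sub_agent],
)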

Closing Thoughts

We've just built something remarkable: an AI agent that doesn't just respond to queries but thinks strategically, delegates to specialists, manages complex state, and iterates toward high-quality outputs. This isn't the future of AI agents—it's what's possible right now with LangChain's DeepAgents.

What We Accomplished

In this hands-on guide, you learned how to:

  • Transform a basic tool-calling agent into a sophisticated planning system
  • Leverage file-based context management to handle tasks of any complexity
  • Design and coordinate specialized sub-agents for focused execution
  • Implement iterative quality control through automated review workflows
  • Build production-ready research systems that rival human analysis

The policy research agent we built demonstrates patterns that extend far beyond this specific use case. Whether you're building code generation systems, content creation pipelines, data analysis workflows, or customer service automation, the principles remain the same: plan strategically, delegate intelligently, manage state persistently, and iterate toward quality.

The Deeper Implications

The evolution from shallow to deep agents represents a fundamental shift in how we build AI systems. We're moving from tools that execute individual tasks to systems that orchestrate complex workflows. The key insight is that intelligence emerges not just from powerful models, but from thoughtful architecture—planning layers, memory systems, delegation patterns, and quality controls.

DeepAgents embodies this philosophy. By providing built-in planning tools, file system access, sub-agent creation, and long-term memory, it gives developers the building blocks to create genuinely sophisticated AI systems without reinventing infrastructure.

What's Next for AI Agents

The trajectory is clear: agents are becoming more capable, more modular, and more specialized. We're heading toward ecosystems where:

  • Agent swarms collaborate on complex problems, each bringing specialized expertise
  • Persistent memory allows agents to maintain context across days, weeks, or months of work
  • Self-improvement loops enable agents to learn from feedback and enhance their own prompts
  • Multi-modal capabilities combine text, code, images, and data seamlessly
  • Human-AI collaboration reaches new levels as agents become true thought partners

The foundation you've learned here—planning, delegation, state management, quality control—will remain relevant as these capabilities evolve.

Practical Next Steps

Where should you go from here?

Immediate Experiments:

  1. Adapt this architecture to your own domain (swap policy research for tech analysis, financial reports, or medical literature reviews)
  2. Add more specialized sub-agents (fact-checkers, translators, data analyzers)
  3. Integrate with your own data sources, databases, or APIs
  4. Experiment with different LLM models to find the best performance/cost balance
  5. Build quality metrics to measure agent performance over time

Production Considerations:

  • Implement proper error handling and retry logic (a minimal sketch follows this list)
  • Add monitoring and logging for production deployments
  • Build evaluation frameworks to assess agent output quality
  • Consider cost optimization strategies (caching, smaller models for sub-agents)
  • Design human-in-the-loop workflows for critical decisions
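For the first item, here's a minimal retry wrapper I'd start from: plain exponential backoff around the invoke call. In production you'd likely reach for a library like tenacity and catch your provider's specific transient exceptions instead of the broad Exception used here:

import time

def invoke_with_retries(agent, payload, max_attempts=3, base_delay=2.0):
    """Retry agent.invoke with exponential backoff on transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return agent.invoke(payload)
        except Exception as exc:  # narrow this to your provider's transient errors
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1))
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)

result = invoke_with_retries(agent, {"messages": [{"role": "user", "content": query}]})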

Learning More:

  • Explore the DeepAgents GitHub repository for advanced examples
  • Study the LangGraph documentation for state management patterns
  • Join the LangChain community to learn from other practitioners
  • Experiment with Claude Code and Deep Research to see these patterns at scale

The Bigger Picture

What excites me most about this technology isn't just what it can do today, but what it enables tomorrow. As agents become more sophisticated, the barrier to building intelligent systems continues to fall. Complex workflows that once required teams of specialists can now be orchestrated by thoughtfully designed agent systems.

This democratization of AI capabilities means:

  • Small teams can build products that previously required large organizations
  • Individuals can leverage AI to amplify their expertise and productivity
  • New categories of applications become possible
  • The focus shifts from AI implementation to AI orchestration

We're entering an era where the key skill isn't writing every algorithm from scratch, but knowing how to compose powerful systems from intelligent components.

Final Thoughts

The agent you built today is more than a research tool—it's a pattern for building intelligent systems. The planning layer, the delegation model, the context management, the quality control—these aren't specific to policy research. They're fundamental architectural principles for any complex AI workflow.

As you apply these patterns to your own projects, remember: the goal isn't to replace human intelligence, but to augment it. The best AI systems are those that handle complexity strategically, maintain context persistently, leverage specialization effectively, and iterate toward quality relentlessly.

That's what DeepAgents enables. That's what you now know how to build.

The future of AI isn't about smarter models alone—it's about smarter architectures. And you're now equipped to create them.


Want to go deeper?

Share your builds: I'd love to see what you create with DeepAgents. Tag your projects on social media or share them in the LangChain community.

Now go build something remarkable. 🚀
