LangChain's Deep Agents provide a new way to build structured, multi-agent systems that can plan, delegate and reason across multiple steps.
They come with planning, a filesystem for context and subagent spawning built in. But connecting such an agent to a live frontend and actually showing what’s happening behind the scenes in real time is still surprisingly hard.
Today, we will build a Deep Agents-powered research assistant using Tavily and connect it to a live Next.js UI with CopilotKit, so every step the agent takes streams to the frontend in real time.
You will find the architecture, the key patterns, how state flows between the UI and the agent, and a step-by-step guide to building this from scratch.
Let's build it.
What is covered?
In summary, we are covering these topics in detail:
- What are Deep Agents?
- Core Components
- What are we building?
- Building Frontend
- Building Backend (FastAPI + Deep Agents + AG-UI)
- Running Application
- Data flow (frontend ↔ Agent)
Here are the GitHub repository, deployed link and official docs if you want to explore them yourself.
1. What are Deep Agents?
Most agents today are just “LLM in a loop + tools”. That works but it tends to be shallow: no explicit plan, weak long-horizon execution and messy state as runs get longer.
Popular agents like Claude Code, Deep Research and Manus get around this by following a common pattern: they plan first, externalize working context (often via files or a shell) and delegate isolated pieces of work to sub-agents.
Deep Agents package those primitives into a reusable agent runtime.
Instead of designing your own agent loop from scratch, you call create_deep_agent(...) and get a pre-wired execution graph that already knows how to plan, delegate and manage state across many steps.
At a practical level, a Deep Agent created via create_deep_agent is just a LangGraph graph. There’s no separate runtime or hidden orchestration layer.
The "context management" in Deep Agents is also very practical -- they offload large tool payloads to the filesystem and only fall back to summarization when token usage approaches the model’s context window. You can read more in LangChain's Context Management for Deep Agents blog post.
The mental model (how it runs)
Conceptually, the execution flow looks like this:
User goal
↓
Deep Agent (LangGraph StateGraph)
├─ Plan: write_todos → updates "todos" in state
├─ Delegate: task(...) → runs a subagent with its own tool loop
├─ Context: ls/read_file/write_file/edit_file → persists working notes/artifacts
↓
Final answer
That gives you a usable structure for “plan → do work → store intermediate artifacts → continue” without inventing your own plan format, memory layer or delegation protocol.
You can check the official docs for more detail.
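To make the "explicit state" idea concrete, here is a minimal sketch of running one turn and reading that state back. It assumes OPENAI_API_KEY is set and that the returned state exposes the todos and files keys described above; treat the exact keys and defaults as an assumption and check your deepagents version.
# Minimal sketch: run one turn and inspect the explicit state
from deepagents import create_deep_agent

agent = create_deep_agent(
    model="openai:gpt-4o",
    tools=[],  # planning + file tools are added by the default middleware
    system_prompt="You are a helpful research assistant. Plan before you act.",
)

result = agent.invoke(
    {"messages": [{"role": "user", "content": "Outline a 3-step research plan on RISC-V adoption."}]}
)

print(result.get("todos"))              # plan written via write_todos
print(list(result.get("files", {})))    # any artifacts written via write_file
print(result["messages"][-1].content)   # final answer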
Where CopilotKit Fits
Deep Agents push key parts into explicit state (e.g. todos + files + messages), which makes runs easier to inspect. That explicit state is also what makes the CopilotKit integration possible.
CopilotKit is a frontend runtime that keeps UI state in sync with agent execution by streaming agent events and state updates in real time (using AG-UI under the hood).
This middleware (CopilotKitMiddleware) is what allows the frontend to stay in lock-step with the agent as it runs. You can read the docs at docs.copilotkit.ai/langgraph/deep-agents.
from deepagents import create_deep_agent
from copilotkit import CopilotKitMiddleware

agent = create_deep_agent(
    model="openai:gpt-4o",
    tools=[get_weather],
    middleware=[CopilotKitMiddleware()],  # for frontend tools and context
    system_prompt="You are a helpful research assistant.",
)
2. Core Components
Here are the core components that we will be using later on:
1) Planning Tools (built-in via Deep Agents) - built-in planning/to‑do behavior so the agent can break the workflow into steps without you writing a separate planning tool.
# Conceptual example (not required in the codebase)
from typing import List
from langchain_core.tools import tool

@tool
def todo_write(tasks: List[str]) -> str:
    """Create a todo list from the given tasks."""
    formatted = "\n".join([f"- {task}" for task in tasks])
    return f"Todo list created:\n{formatted}"
2) Subagents - let the main agent delegate focused tasks into isolated execution loops. Each sub-agent has its own prompt, tools and context.
subagents = [
{
"name": "job-search-agent",
"description": "Finds relevant jobs and outputs structured job candidates.",
"system_prompt": JOB_SEARCH_PROMPT,
"tools": [internet_search],
}
]
3) Tools - this is how the agent actually does things. Here, finalize() signals completion.
from langchain_core.tools import tool

@tool
def finalize() -> dict:
    """Signal that the agent is done."""
    return {"status": "done"}
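To see how these pieces fit together, here is a hedged sketch that wires the subagent config and the finalize tool from above into a single create_deep_agent call. It reuses the subagents list, finalize and internet_search defined in the snippets above; the subagents parameter name follows the deepagents docs, but double-check the exact signature for your installed version.
# Continuing from the snippets above (subagents, finalize, internet_search)
from deepagents import create_deep_agent

agent = create_deep_agent(
    model="openai:gpt-4o",
    tools=[finalize],        # custom tools; planning + file tools come built in
    subagents=subagents,     # the "job-search-agent" defined above
    system_prompt="Plan first, delegate research to the subagent, then call finalize.",
)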
How Deep Agents are implemented (Middleware)
If you are wondering how create_deep_agent() actually injects planning, files and subagents into a normal LangGraph agent, the answer is middleware.
Each feature is implemented as a separate middleware. By default, three are attached:
- To-do list middleware - adds the `write_todos` tool and instructions that push the agent to explicitly plan and update a live todo list during multi-step tasks.
- Filesystem middleware - adds file tools (`ls`, `read_file`, `write_file`, `edit_file`) so the agent can externalize notes and artifacts instead of stuffing everything into chat history.
- Subagent middleware - adds the `task` tool, allowing the main agent to delegate work to subagents with isolated context and their own prompts/tools.
This is what makes Deep Agents feel “pre-wired” without introducing a new runtime. If you want to go deeper, the middleware docs show the exact implementation details.
3. What are we building?
Let's create an agent that:
- Accepts a research question from the user
- Uses Deep Agents to plan multi-step and orchestrate sub-agents
- Searches the web using Tavily
- Writes intermediate research artifacts using the filesystem middleware
- Streams tool results back to the UI via CopilotKit (AG-UI)
The interface is a two-panel app where the left side is a CopilotKit chat UI and the right side is a live workspace that shows the agent’s plan, generated files and sources as the agent works.
Here's a simplified call request → response flow of what will happen:
[User asks research question]
↓
Next.js Frontend (CopilotChat + Workspace)
↓
CopilotKit Runtime → LangGraphHttpAgent
↓
Python Backend (FastAPI + AG-UI)
↓
Deep Agent (research_assistant)
├── write_todos (planning, built-in)
├── write_file (filesystem, built-in)
├── read_file (filesystem, built-in)
└── research(query)
└── internal Deep Agent [thread-isolated]
└── internet_search (Tavily)
We will see the concepts in action as we build the agent.
4. Frontend: wiring the agent to the UI
Let's first build the frontend part. This is how our directory will look.
The src directory hosts the Next.js frontend, including the UI, shared components and the CopilotKit API route (/api/copilotkit) used for agent communication.
.
├── src/ ← Next.js frontend
│ ├── app/
│ │ ├── page.tsx
│ │ ├── layout.tsx ← CopilotKit provider
│ │ └── api/
│ │ └── copilotkit/route.ts ← CopilotKit AG-UI runtime
│ ├── components/
│ │ ├── FileViewerModal.tsx ← Markdown file viewer
│ │ ├── WorkSpace.tsx ← Research progress display
│ │ └── ToolCard.tsx ← Tool call visualizer
├── lib/
│ └── types.ts
├── package.json
├── next.config.ts
└── README.md
If you don’t have a frontend, you can create a new Next.js app with TypeScript.
# creates a new Next.js app
npx create-next-app@latest .
Step 1: CopilotKit Provider & Layout
Install the necessary CopilotKit packages.
npm install @copilotkit/react-core @copilotkit/react-ui @copilotkit/runtime
- `@copilotkit/react-core` provides the core React hooks and context that connect your UI to an AG-UI compatible agent backend.
- `@copilotkit/react-ui` offers ready-made UI components like `<CopilotChat />` to build AI chat or assistant interfaces quickly.
- `@copilotkit/runtime` is the server-side runtime that exposes an API endpoint and bridges the frontend with an AG-UI compatible backend (e.g., a LangGraph HTTP agent).
The <CopilotKit> component must wrap the Copilot-aware parts of your application. In most cases, it's best to place it around the entire app, like in layout.tsx.
import type { Metadata } from "next";
import { CopilotKit } from "@copilotkit/react-core";
import "./globals.css";
import "@copilotkit/react-ui/styles.css";
export const metadata: Metadata = {
title: "Deep Research Assistant | CopilotKit Deep Agents Demo",
description: "A research assistant powered by Deep Agents and CopilotKit - demonstrating planning, memory, subagents, and generative UI",
};
export default function RootLayout({
children,
}: Readonly<{
children: React.ReactNode;
}>) {
return (
<html lang="en">
<body className="antialiased">
<CopilotKit runtimeUrl="/api/copilotkit" agent="research_assistant">
{children}
</CopilotKit>
</body>
</html>
);
}
Here, runtimeUrl="/api/copilotkit" points to the Next.js API route CopilotKit uses to talk to the agent backend.
Each page is wrapped in this context so UI components know which agent to invoke and where to send requests.
Step 2: Next.js API Route (Proxy to FastAPI)
This Next.js API route acts as a thin proxy between the browser and the Deep Agents backend. It:
- Accepts CopilotKit requests from the UI
- Forwards them to the agent over AG-UI
- Streams agent state and events back to the frontend
Instead of letting the frontend talk to the FastAPI agent directly, all requests go through a single endpoint /api/copilotkit.
import {
CopilotRuntime,
ExperimentalEmptyAdapter,
copilotRuntimeNextJSAppRouterEndpoint,
} from "@copilotkit/runtime";
import { LangGraphHttpAgent } from "@copilotkit/runtime/langgraph";
import { NextRequest } from "next/server";
// Empty adapter since the LLM is handled by the remote agent
const serviceAdapter = new ExperimentalEmptyAdapter();
// Configure CopilotKit runtime with the Deep Agents backend
const runtime = new CopilotRuntime({
agents: {
research_assistant: new LangGraphHttpAgent({
url: process.env.LANGGRAPH_DEPLOYMENT_URL || "http://localhost:8123",
}),
},
});
export const POST = async (req: NextRequest) => {
const { handleRequest } = copilotRuntimeNextJSAppRouterEndpoint({
runtime,
serviceAdapter,
endpoint: "/api/copilotkit",
});
return handleRequest(req);
};
Here's a simple explanation of the above code:
- The code above registers the `research_assistant` agent.
- `LangGraphHttpAgent`: defines a remote LangGraph agent endpoint. It points to the Deep Agents backend running on FastAPI.
- `ExperimentalEmptyAdapter`: a simple no-op adapter used when the agent backend handles its own LLM calls and orchestration.
- `copilotRuntimeNextJSAppRouterEndpoint`: a small helper that adapts the Copilot runtime to a Next.js App Router API route and returns a `handleRequest` function.
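The `LANGGRAPH_DEPLOYMENT_URL` read in this route tells the runtime where the FastAPI agent lives. If your backend runs somewhere other than localhost:8123, set it in an env file at the project root (the `.env.local` filename here is the usual Next.js convention, not something specific to this repo):
# .env.local (frontend) -- optional, falls back to http://localhost:8123
LANGGRAPH_DEPLOYMENT_URL=http://localhost:8123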
Step 3: Types (Research State)
Before building components, let's define shared state for todos, files, and sources in lib/types/research.ts. These are the contracts between the tool results from the agent and the local React state.
// Uses local state + useDefaultTool instead of CoAgent (avoids type issues with Python FilesystemMiddleware)
export interface Todo {
id: string;
content: string;
status: "pending" | "in_progress" | "completed";
}
export interface ResearchFile {
path: string;
content: string;
createdAt: string;
}
// Sources found via internet_search (includes content)
export interface Source {
url: string;
title: string;
content?: string;
status: "found" | "scraped" | "failed";
}
export interface ResearchState {
todos: Todo[];
files: ResearchFile[];
sources: Source[];
}
export const INITIAL_STATE: ResearchState = {
todos: [],
files: [],
sources: [],
};
Instead of dumping raw tool JSON into chat, each tool result routes into a dedicated state slot - write_todos updates todos, write_file appends to files and research appends to sources. This becomes the foundation of the Workspace panel.
Step 4: Building Key Components
I'm only covering the core logic behind each component since the overall code is huge. You can find all the components in the repository at src/components.
✅ ToolCard Component
This is the client component that renders every tool call inline inside chat. It has two modes:
- `SpecializedToolCard` for known tools (`write_todos`, `research`, `write_file`, `read_file`) with icons, status indicators and result previews
- `DefaultToolCard` for unknown tools, which falls back to expandable JSON.
"use client";
import { useState } from "react";
import { Pencil, ClipboardList, Search, Save, BookOpen, Check, ChevronDown } from "lucide-react";
const TOOL_CONFIG = {
write_todos: {
icon: Pencil,
getDisplayText: () => "Updating research plan...",
getResultSummary: (result, args) => {
const todos = (args as { todos?: unknown[] })?.todos;
if (Array.isArray(todos)) {
return `${todos.length} todo${todos.length !== 1 ? "s" : ""} updated`;
}
return null;
},
},
research: {
icon: Search,
getDisplayText: (args) =>
`Researching: ${((args.query as string) || "...").slice(0, 50)}${(args.query as string)?.length > 50 ? "..." : ""}`,
getResultSummary: (result) => {
if (result && typeof result === "object" && "sources" in result) {
const { sources } = result as { summary: string; sources: unknown[] };
return `Found ${sources.length} source${sources.length !== 1 ? "s" : ""}`;
}
return "Research complete";
},
},
write_file: {
icon: Save,
getDisplayText: (args) => {
const path = args.path as string | undefined;
const filename =
path?.split("/").pop() || (args.filename as string | undefined);
return `Writing: ${filename || "file"}`;
},
getResultSummary: (_result, args) => {
const content = args.content as string | undefined;
if (content) {
const firstLine = content.split("\n")[0].slice(0, 50);
return firstLine + (content.length > 50 ? "..." : "");
}
return "File written";
},
},
// read_todos, read_file follow the same pattern
};
export function ToolCard({ name, status, args, result }: ToolCardProps) {
const config = TOOL_CONFIG[name];
if (config) {
return (
<SpecializedToolCard
name={name}
status={status}
args={args}
result={result}
config={config}
/>
);
}
return (
<DefaultToolCard name={name} status={status} args={args} result={result} />
);
}
Here's a brief explanation:
- `research` and `write_todos` are expandable: clicking reveals the full query + findings or the live todo checklist with `pending` / `in_progress` / `completed` states.
- `resultSummary` appears as a small green line below the display text so you can glance at the output without expanding.
- `DefaultToolCard` handles any tool not in `TOOL_CONFIG`, showing collapsible args and result JSON.
In this component, ExpandedDetails handles the per-tool expanded view. research and write_todos get structured layouts; everything else falls back to a JSON pre block.
function ExpandedDetails({ name, result, args }) {
if (name === "research") {
const summary = typeof result === "object" && result && "summary" in result
? (result as any).summary
: "";
return (
<div>
<p>Query: {args.query as string}</p>
<p>{summary}</p>
</div>
);
}
if (name === "write_todos") {
const todos = (args as any)?.todos;
return (
<div>
{todos?.map((todo: any, i: number) => (
<div key={todo.id || i}>
<span>{todo.status === "completed" ? "✓" : todo.status === "in_progress" ? "●" : "○"}</span>
<span>{todo.content}</span>
</div>
))}
</div>
);
}
// Fallback
return (
<pre>
{typeof result === "string" ? result : JSON.stringify(result, null, 2)}
</pre>
);
}
Check out the complete code at src/components/ToolCard.tsx.
✅ Workspace Component
This is the right-side panel that displays research progress as it happens. It has three collapsible sections: Research Plan, Files and Sources - each with a live badge count and an empty state when nothing has arrived yet.
export function Workspace({ state }: { state: ResearchState }) {
const { todos, files, sources } = state;
const fileCount = files.length;
const todoCount = todos.length;
const sourceCount = sources.length;
// State for file viewer modal
const [selectedFile, setSelectedFile] = useState<ResearchFile | null>(null);
return (
<div className="workspace-panel p-6">
<div className="mb-6">
<h2 className="text-xl font-bold">Workspace</h2>
<p className="text-sm">Research progress and artifacts</p>
</div>
<Section title="Research Plan" icon={ListTodo} badge={todos.length}>
<TodoList todos={todos} />
</Section>
<Section title="Files" icon={FileText} badge={files.length}>
<FileList files={files} onFileClick={setSelectedFile} />
</Section>
<Section title="Sources" icon={Globe} badge={sources.length}>
<SourceList sources={sources} />
</Section>
<FileViewerModal
file={selectedFile}
onClose={() => setSelectedFile(null)}
/>
</div>
);
}
function TodoList({ todos }: { todos: Todo[] }) {
if (todos.length === 0) return <div>...</div>; // empty state
return (
<div className="space-y-1">
{todos.map((todo) => (
<div
key={todo.id}
className={`todo-item ${
todo.status === "completed"
? "todo-item-completed"
: todo.status === "in_progress"
? "todo-item-inprogress"
: "todo-item-pending"
}`}
>
<span>{/* Check / CircleDot / Circle icon based on status */}</span>
<span className="text-sm">{todo.content}</span>
</div>
))}
</div>
);
}
// FileList — same pattern, items are clickable (onFileClick)
// each row has a download button with e.stopPropagation()
function FileList({
files,
onFileClick,
}: {
files: ResearchFile[];
onFileClick: (file: ResearchFile) => void;
}) {
if (files.length === 0) return <div>...</div>; // empty state
return (
<div className="space-y-2">
{files.map((file, i) => (
<div
key={`${file.path}-${i}`}
className="file-item"
onClick={() => onFileClick(file)}
>
<div className="flex items-center gap-3">
{/* FileText icon */}
<div>
<p className="text-sm font-medium">
{file.path.split("/").pop()}
</p>
<p className="text-xs">{file.path}</p>
</div>
</div>
<button
onClick={(e) => {
e.stopPropagation();
downloadFile(file);
}}
>
{/* Download icon */}
</button>
</div>
))}
</div>
);
}
// SourceList — same pattern, colour-codes by source.status
// (scraped → green check, failed → red X, found → grey circle)
// each source links to source.url
function SourceList({ sources }: { sources: Source[] }) {
if (sources.length === 0) return <div>...</div>; // empty state
return (
<div className="space-y-2">
{sources.map((source, i) => (
<div
key={`${source.url}-${i}`}
className={`file-item ${source.status === "failed" ? "source-failed" : ""}`}
>
<span>{/* Check / X / Circle based on source.status */}</span>
<div className="flex-1 min-w-0">
<p className="text-sm font-medium truncate">
{source.title || new URL(source.url).hostname}
</p>
<a
href={source.url}
target="_blank"
rel="noopener noreferrer"
className="text-xs truncate block"
>
{source.url}
</a>
</div>
</div>
))}
</div>
);
}
A brief explanation:
- `TodoList` renders each todo with a `pending` / `in_progress` / `completed` icon. Completed items get a strikethrough.
- `FileList` items are clickable: they open `FileViewerModal` with full Markdown rendering and a download button.
- `SourceList` colour-codes sources: green check for scraped, red X for failed, grey circle for found-but-not-yet-scraped.
Check out the complete code at src/components/WorkSpace.tsx.
✅ FileViewerModal Component
This component is a modal that displays the contents of a research file written by the Deep Agent, rendering the file content as formatted Markdown using the react-markdown package.
Install it using the following command.
npm install react-markdown
Here is the full core implementation:
"use client";

import { useCallback, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import type { ResearchFile } from "@/types/research";

interface FileViewerModalProps {
  file: ResearchFile | null;
  onClose: () => void;
}

export function FileViewerModal({ file, onClose }: FileViewerModalProps) {
const handleKeyDown = useCallback(
(e: KeyboardEvent) => { if (e.key === "Escape") onClose(); },
[onClose]
);
useEffect(() => {
if (file) {
document.addEventListener("keydown", handleKeyDown);
document.body.style.overflow = "hidden";
}
return () => {
document.removeEventListener("keydown", handleKeyDown);
document.body.style.overflow = "";
};
}, [file, handleKeyDown]);
if (!file) return null;
const filename = file.path.split("/").pop() || file.path;
const handleDownload = () => {
const blob = new Blob([file.content], { type: "text/markdown" });
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = filename;
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
URL.revokeObjectURL(url);
};
return (
<div className="fixed inset-0 z-50 flex items-center justify-center p-4">
<div className="absolute inset-0 bg-black/30 backdrop-blur-sm" onClick={onClose} aria-hidden="true" />
<div className="relative max-w-3xl w-full max-h-[85vh] flex flex-col" role="dialog" aria-modal="true">
{/* Header -- filename + download + close */}
<div className="flex items-center justify-between border-b ...">
<div className="flex items-center gap-3">
{/* FileText icon */}
<h2 className="truncate max-w-md">{filename}</h2>
</div>
<div className="flex items-center gap-2">
<button onClick={handleDownload}>...</button>
<button onClick={onClose}>...</button>
</div>
</div>
{/* Scrollable Markdown body */}
<div className="flex-1 overflow-y-auto p-8">
<div className="prose prose-sm prose-slate max-w-none">
<ReactMarkdown>{file.content}</ReactMarkdown>
</div>
</div>
{/* Footer -- full file path */}
<div className="border-t ...">
<code>{file.path}</code>
</div>
</div>
</div>
);
}
Here's a brief explanation:
- `useEffect` registers the `Escape` listener and locks body scroll only while a file is open, and cleans both up on close.
- `handleDownload` creates a temporary anchor element to trigger a browser download without any server round-trip.
Check out the complete code at src/components/FileViewerModal.tsx.
Step 5: Connecting the Chat UI to the Agent
At this point, all the pieces are in place. This page owns ResearchState, passes it to Workspace, uses the CopilotChat component for the conversational UI, and runs useDefaultTool to intercept every tool call and route the results into state:
- `research` → appends to `sources`
- `write_todos` → replaces `todos`
- `write_file` → appends to `files`
Here's the code.
"use client";
import { useState, useRef } from "react";
import { CopilotChat } from "@copilotkit/react-ui";
import { useDefaultTool } from "@copilotkit/react-core";
import { Workspace } from "@/components/Workspace";
import { ResearchState, INITIAL_STATE, Todo } from "@/types/research";
import { ToolCard } from "@/components/ToolCard";
export default function Page() {
const [state, setState] = useState<ResearchState>(INITIAL_STATE);
const processedKeysRef = useRef<Set<string>>(new Set());
useDefaultTool({
render: (props) => {
const { name, status, args, result } = props;
// Deduplicate on re-renders
if (status === "complete") {
const resultStr = result ? JSON.stringify(result) : "";
const resultHash = resultStr
? `${resultStr.length}-${resultStr.slice(0, 100)}`
: "";
const key = `${name}-${JSON.stringify(args)}-${resultHash}`;
if (processedKeysRef.current.has(key)) return <ToolCard {...props} />;
processedKeysRef.current.add(key);
}
if (name === "research" && status === "complete" && result) {
const researchResult = result as {
summary: string;
sources: Array<{
url: string;
title: string;
content?: string;
status: "found" | "scraped" | "failed";
}>;
};
if (researchResult.sources && researchResult.sources.length > 0) {
queueMicrotask(() =>
setState((prev) => ({
...prev,
sources: [...prev.sources, ...researchResult.sources],
})),
);
}
}
if (name === "write_todos" && status === "complete" && args?.todos) {
const todosWithIds = (
args.todos as Array<{ id?: string; content: string; status: string }>
).map((todo, index) => ({
...todo,
id: todo.id || `todo-${Date.now()}-${index}`,
}));
queueMicrotask(() =>
setState((prev) => ({ ...prev, todos: todosWithIds as Todo[] })),
);
}
if (name === "write_file" && status === "complete" && args?.file_path) {
queueMicrotask(() =>
setState((prev) => ({
...prev,
files: [
...prev.files,
{
path: args.file_path as string,
content: args.content as string,
createdAt: new Date().toISOString(),
},
],
})),
);
}
return <ToolCard {...props} />;
},
});
return (
<div className="relative min-h-screen">
<main className="relative z-10 h-screen flex overflow-hidden">
{/* Chat panel -- left side */}
<div className="w-[38%] h-full border-r flex flex-col">
{/* Header */}
<div style={{ flex: 1, minHeight: 0, overflow: "hidden" }}>
<CopilotChat
className="h-full"
labels={{
title: "Deep Research Assistant",
initial: "What topic would you like me to research?",
placeholder: "Ask me to research any topic...",
}}
/>
</div>
</div>
{/* Workspace panel -- right side */}
<div className="w-[62%] h-full overflow-hidden">
<Workspace state={state} />
</div>
</main>
</div>
);
}
A few things worth noting:
- `useDefaultTool` intercepts every tool call and renders a `ToolCard` inline inside `CopilotChat`.
- `queueMicrotask` defers `setState` so it never fires mid-render.
- `processedKeysRef` deduplicates results since the render callback fires multiple times as status updates stream in.
5. Backend: Building the Agent Service (FastAPI + Deep Agents + Tavily)
We will now build the FastAPI backend that hosts our Deep Agent.
Under the /agent directory lives a FastAPI server that runs the research agent. Here's the project structure of the backend.
agent/
├── main.py ← FastAPI server + AG-UI
├── agent.py ← Deep Agents graph
├── tools.py ← Tavily search tools
├── pyproject.toml ← Python deps (uv)
├── uv.lock
└── .env
At a high level, the backend:
- Builds a Deep Agent graph
- Uses Tavily for real-time web search
- Wraps research in an isolated sub-agent
- Exposes a CopilotKit-compatible AG-UI endpoint
- Streams tool calls to the frontend
The backend uses uv for dependency management. Install it if you don't have it:
pip install uv
Initialize a new uv project. This will generate a fresh pyproject.toml.
cd agent
uv init
uv python pin 3.12
Then install the dependencies:
uv add copilotkit deepagents fastapi langchain langchain-openai python-dotenv tavily-python "uvicorn[standard]" ag-ui-langgraph
- `copilotkit`: connects agents to a frontend with streaming, tools and shared state
- `deepagents`: planning-first agent framework for multi-step execution
- `fastapi`: web framework that exposes the agent API
- `langchain`: agent and tool orchestration layer
- `langchain-openai`: OpenAI model integration for LangChain
- `python-dotenv`: loads environment variables from the .env file
- `tavily-python`: web search for real-time research
- `ag-ui-langgraph`: AG-UI protocol adapter for LangGraph
- `uvicorn[standard]`: ASGI server to run FastAPI
Now run the following command to generate a uv.lock file pinned with exact versions.
uv sync
Add necessary API Keys
Create a .env file under the agent directory and add your OpenAI API Key and Tavily API Key to the file. I have attached the docs link so it's easy to follow.
OPENAI_API_KEY=sk-proj-...
TAVILY_API_KEY=tvly-dev-...
OPENAI_MODEL=gpt-5.2
Step 1: Implement Research Tools
Let's define two tools (tools.py):
- `internet_search` is the low-level Tavily wrapper. It runs a search and returns up to `max_results` formatted results.
- `research` wraps an internal Deep Agent that runs in a separate thread to prevent its streaming events from leaking to the frontend via LangChain callback propagation.
Here is the code.
import os
from typing import Any
from concurrent.futures import ThreadPoolExecutor
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
from tavily import TavilyClient
def _do_internet_search(query: str, max_results: int = 5) -> list[dict[str, Any]]:
tavily_key = os.environ.get("TAVILY_API_KEY")
if not tavily_key:
raise RuntimeError("TAVILY_API_KEY not set")
try:
client = TavilyClient(api_key=tavily_key)
results = client.search(
query=query,
max_results=max_results,
include_raw_content=False,
topic="general",
)
return [
{
"url": r.get("url", ""),
"title": r.get("title", ""),
"content": (r.get("content") or "")[:3000],
}
for r in results.get("results", [])
]
except Exception as e:
return [{"error": str(e)}]
@tool
def internet_search(query: str, max_results: int = 5) -> list[dict[str, Any]]:
"""Search the web and return results with content."""
return _do_internet_search(query, max_results)
@tool
def research(query: str) -> dict:
"""Research a topic using web search. Returns structured data with sources."""
from deepagents import create_deep_agent
from langchain_openai import ChatOpenAI
def _run_research_isolated():
search_results = []
def internet_search_tracked(query: str, max_results: int = 5):
"""Search the web and return results with content."""
results = _do_internet_search(query, max_results)
search_results.extend(results)
return results
llm = ChatOpenAI(
model=os.environ.get("OPENAI_MODEL", "gpt-5.2"),
temperature=0.7,
api_key=os.environ.get("OPENAI_API_KEY"),
)
research_agent = create_deep_agent(
model=llm,
system_prompt="""You are a Research Specialist.
Use internet_search to find information. Return a prose summary of findings.
Rules:
- Call internet_search ONCE with a focused query
- Return a brief summary (2-3 sentences) of key findings
- No JSON, no code blocks, just prose""",
tools=[internet_search_tracked],
)
result = research_agent.invoke({"messages": [HumanMessage(content=query)]})
summary = result["messages"][-1].content
sources = [
{
"url": r["url"],
"title": r.get("title", ""),
"content": r.get("content", "")[:3000],
"status": "found",
}
for r in search_results
if "url" in r and not r.get("error")
]
return {"summary": summary, "sources": sources}
with ThreadPoolExecutor(max_workers=1) as executor:
future = executor.submit(_run_research_isolated)
return future.result()
Here's what's happening:
- `_do_internet_search` caps each result at 3000 chars to stay within token limits and returns an error dict instead of crashing.
- `internet_search_tracked` extends `search_results` on every call, building a clean source list to return alongside the final summary.
- `research_agent` is constrained to call `internet_search` once, which prevents looping and keeps latency predictable.
- `ThreadPoolExecutor` isolates the subagent in a separate thread so its stream events don't leak into the frontend chat.
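You can exercise the research tool on its own before wiring it into the main agent. A minimal sketch, assuming OPENAI_API_KEY and TAVILY_API_KEY are exported; since research is a LangChain tool, it is called via invoke with a dict of arguments:
# Quick manual check of tools.py (run from the agent/ directory)
from tools import research

out = research.invoke({"query": "What changed in WebGPU browser support this year?"})
print(out["summary"])                      # prose summary from the internal Deep Agent
print([s["url"] for s in out["sources"]])  # tracked Tavily sources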
Step 2: Define the Agent Behavior
This is the brain of the system. The agent is a Deep Agents graph that:
- Creates a research plan using `write_todos`
- Calls `research()` for each research question
- Writes a final report to `/reports/final_report.md`
Here is the main system prompt that enforces this workflow:
MAIN_SYSTEM_PROMPT = """You are a Deep Research Assistant, an expert at planning and
executing comprehensive research on any topic.
Hard rules (ALWAYS follow):
- NEVER output raw JSON, data structures, or code blocks in your messages
- Communicate with the user only in natural, readable prose
- When you receive data from research, synthesize it into insights
Your workflow:
1. PLAN: Create a research plan using write_todos with clear, actionable steps
2. RESEARCH: Use research(query) tool to investigate each topic
3. SYNTHESIZE: Write a final report to /reports/final_report.md using write_file
Important guidelines:
- Always start by creating a research plan with write_todos
- Call research() for each distinct research question
- The research tool returns prose summaries of findings
- You write all files - compile findings into a comprehensive report
- Update todos as you complete each step
Example workflow:
1. write_todos(["Research topic A", "Research topic B", "Synthesize findings"])
2. research("Find information about topic A") -> receives prose summary
3. research("Find information about topic B") -> receives prose summary
4. write_file("/reports/final_report.md", "# Research Report\n\n...")
Always maintain a professional, comprehensive research style."""
Here is the core agent graph (build_agent):
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from deepagents import create_deep_agent
from langgraph.checkpoint.memory import MemorySaver
from copilotkit import CopilotKitMiddleware
from tools import research
load_dotenv()
def build_agent():
api_key = os.environ.get("OPENAI_API_KEY")
if not api_key:
raise RuntimeError("Missing OPENAI_API_KEY environment variable")
tavily_key = os.environ.get("TAVILY_API_KEY")
if not tavily_key:
raise RuntimeError("Missing TAVILY_API_KEY environment variable")
model_name = os.environ.get("OPENAI_MODEL", "gpt-5.2")
llm = ChatOpenAI(
model=model_name,
temperature=0.7,
api_key=api_key,
)
agent_graph = create_deep_agent(
model=llm,
system_prompt=MAIN_SYSTEM_PROMPT,
tools=[research],
middleware=[CopilotKitMiddleware()],
checkpointer=MemorySaver(),
)
return agent_graph.with_config({"recursion_limit": 100})
Here's what's happening:
- `CopilotKitMiddleware()` enables AG-UI streaming to the frontend
- `MemorySaver()` enables stateful execution across the run
- The recursion limit is increased to support multi-step research
- `research` is the only explicit tool, since `write_todos`, `read_file` and `write_file` are built into Deep Agents automatically
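Before exposing the graph over HTTP, you can sanity-check it directly. A minimal sketch, assuming agent/.env is populated; the thread_id is needed because MemorySaver checkpoints state per conversation thread, and the hypothetical smoke_test.py filename is just for illustration:
# agent/smoke_test.py (hypothetical), run with: uv run python smoke_test.py
from langchain_core.messages import HumanMessage
from agent import build_agent

agent = build_agent()
result = agent.invoke(
    {"messages": [HumanMessage(content="Research the current state of solid-state batteries.")]},
    config={"configurable": {"thread_id": "local-test"}},
)

print(result["messages"][-1].content)   # final synthesized answer
print(list(result.get("files", {})))    # e.g. /reports/final_report.md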
Step 3: Expose the Agent via FastAPI + AG-UI
The last step is to wire everything together and expose it as a FastAPI app. main.py builds the agent graph, configures which tool calls to emit to the frontend and registers the AG-UI endpoint.
import os
import uvicorn
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from dotenv import load_dotenv
from ag_ui_langgraph import add_langgraph_fastapi_endpoint
from copilotkit import LangGraphAGUIAgent
from copilotkit.langgraph import copilotkit_customize_config
from agent import build_agent
load_dotenv()
app = FastAPI(title="Deep Research Assistant", version="1.0.0")
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/health")
def health():
return {"status": "ok", "service": "deep-research-agent"}
agent_graph = build_agent()
# Only emit main agent tools — suppress internal subagent tool calls
agui_config = copilotkit_customize_config(
emit_tool_calls=["research", "write_todos", "write_file", "read_file", "edit_file"]
)
agui_config["recursion_limit"] = 100
add_langgraph_fastapi_endpoint(
app=app,
agent=LangGraphAGUIAgent(
name="research_assistant",
description="A deep research assistant that plans, searches, and synthesizes research reports",
graph=agent_graph,
config=agui_config,
),
path="/",
)
def main():
uvicorn.run(
"main:app",
host=os.getenv("SERVER_HOST", "0.0.0.0"),
port=int(os.getenv("SERVER_PORT", "8123")),
reload=True,
)
if __name__ == "__main__":
main()
The emit_tool_calls config controls exactly which tool calls CopilotKit streams to the frontend. Without it, the subagent's internal internet_search calls leak through as raw JSON noise in the chat UI.
6. Running the Application
After completing all the parts of the code, it's time to run the app locally. Make sure you have added the credentials to agent/.env.
From the project root, navigate to the agent directory and start the FastAPI server:
cd agent
uv run python main.py
The backend will start on http://localhost:8123.
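You can confirm the agent service is up using the /health route defined in main.py:
curl http://localhost:8123/health
# {"status":"ok","service":"deep-research-agent"}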
In a new terminal window, start the frontend development server using:
npm run dev
Once both servers are running, open the frontend in your browser at http://localhost:3000/ to view it locally.

CopilotKit also provides the Agent Inspector, a live AG-UI runtime view that lets you inspect agent runs, state snapshots, messages and tool calls as they stream from the backend. It's accessible from a CopilotKit button overlaid on your app.
Here is the complete demo!
7. Data flow
Now that we have built both the frontend and the agent service, this is how data actually flows between them. This should be easy to follow if you have been building along so far.
[User sends research question]
↓
Next.js Frontend (CopilotChat + Workspace)
↓
POST /api/copilotkit (AG-UI protocol)
↓
CopilotKit Runtime → LangGraphHttpAgent (localhost:8123)
↓
FastAPI Backend (AG-UI endpoint "/")
↓
Deep Agent (research_assistant)
↓
Deep Agents orchestration
├── write_todos → todos state → Workspace (Research Plan)
├── research() → sources state → Workspace (Sources)
│ └── internal Deep Agent [thread-isolated]
│ └── internet_search (Tavily API)
└── write_file → files state → Workspace (Files)
↓
AG-UI streaming (SSE)
↓
useDefaultTool intercepts tool calls
↓
ToolCard renders in chat + Workspace updates live
That's it! 🎉
You now have a fully working Deep Agents research assistant that plans, searches, and writes reports, with every step visible in the frontend as it happens.
The real win here is visibility. CopilotKit's AG-UI layer turns what would otherwise be a black-box agent into something users can actually follow and trust.
You can check my work at anmolbaranwal.com. Thank you for reading! 🥰
Follow CopilotKit on Twitter and say hi, and if you'd like to build something cool, join the Discord community.













Top comments (2)
Really appreciated the thread isolation pattern inside the research tool. Running the internal Deep Agent in a separate thread to prevent callback propagation is such a clean way to avoid tool-call noise leaking into the frontend stream. Subtle detail, but super important for real-world DevX. 👏
This is literally a tutorial for building your own Manus, Claude code, and Deep Research style applications! really cool.