LangChain's Deep Agents provide a new way to build structured, multi-agent systems that can plan, delegate and reason across multiple steps.
They come with planning, a filesystem for context and subagent spawning built in. But connecting such an agent to a live frontend and actually showing what’s happening behind the scenes in real time is still surprisingly hard.
Today, we will build a Deep Agents-powered research assistant using Tavily and connect it to a live Next.js UI with CopilotKit, so every step the agent takes streams to the frontend in real time.
You will find the architecture, the key patterns, how state flows between the UI and the agent, and a step-by-step guide to building this from scratch.
Let's build it.
What is covered?
In summary, we are covering these topics in detail:
- What are Deep Agents?
- Core Components
- What are we building?
- Building Frontend
- Building Backend (FastAPI + Deep Agents + AG-UI)
- Running Application
- Data flow (frontend ↔ Agent)
Here are the GitHub repository, deployed link and official docs if you want to explore them yourself.
1. What are Deep Agents?
Most agents today are just “LLM in a loop + tools”. That works but it tends to be shallow: no explicit plan, weak long-horizon execution and messy state as runs get longer.
Popular agents like Claude Code, Deep Research and Manus get around this by following a common pattern: they plan first, externalize working context (often via files or a shell) and delegate isolated pieces of work to sub-agents.
Deep Agents package those primitives into a reusable agent runtime.
Instead of designing your own agent loop from scratch, you call create_deep_agent(...) and get a pre-wired execution graph that already knows how to plan, delegate and manage state across many steps.
At a practical level, a Deep Agent created via create_deep_agent is just a LangGraph graph. There’s no separate runtime or hidden orchestration layer.
The "context management" in Deep Agents is also very practical -- they offload large tool payloads to the filesystem and only fall back to summarization when token usage approaches the model’s context window. You can read more in LangChain's Context Management for Deep Agents blog post.
The mental model (how it runs)
Conceptually, the execution flow looks like this:
User goal
↓
Deep Agent (LangGraph StateGraph)
├─ Plan: write_todos → updates "todos" in state
├─ Delegate: task(...) → runs a subagent with its own tool loop
├─ Context: ls/read_file/write_file/edit_file → persists working notes/artifacts
↓
Final answer
That gives you a usable structure for “plan → do work → store intermediate artifacts → continue” without inventing your own plan format, memory layer or delegation protocol.
You can check the official docs for more detail.
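To make the "explicit state" idea concrete, here is a minimal sketch of running one turn and reading that state back. It assumes OPENAI_API_KEY is set and that the returned state exposes the todos and files keys described above; treat the exact keys and defaults as an assumption and check your deepagents version.
# Minimal sketch: run one turn and inspect the explicit state
from deepagents import create_deep_agent

agent = create_deep_agent(
    model="openai:gpt-4o",
    tools=[],  # planning + file tools are added by the default middleware
    system_prompt="You are a helpful research assistant. Plan before you act.",
)

result = agent.invoke(
    {"messages": [{"role": "user", "content": "Outline a 3-step research plan on RISC-V adoption."}]}
)

print(result.get("todos"))              # plan written via write_todos
print(list(result.get("files", {})))    # any artifacts written via write_file
print(result["messages"][-1].content)   # final answer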
Where CopilotKit Fits
Deep Agents push key parts into explicit state (e.g. todos + files + messages), which makes runs easier to inspect. That explicit state is also what makes the CopilotKit integration possible.
CopilotKit is a frontend runtime that keeps UI state in sync with agent execution by streaming agent events and state updates in real time (using AG-UI under the hood).
This middleware (CopilotKitMiddleware) is what allows the frontend to stay in lock-step with the agent as it runs. You can read the docs at docs.copilotkit.ai/langgraph/deep-agents.
from deepagents import create_deep_agent
from copilotkit import CopilotKitMiddleware

agent = create_deep_agent(
    model="openai:gpt-4o",
    tools=[get_weather],
    middleware=[CopilotKitMiddleware()],  # for frontend tools and context
    system_prompt="You are a helpful research assistant.",
)
2. Core Components
Here are the core components that we will be using later on:
1) Planning Tools (built-in via Deep Agents) - built-in planning/to‑do behavior so the agent can break the workflow into steps without you writing a separate planning tool.
# Conceptual example (not required in the codebase)
from typing import List
from langchain_core.tools import tool

@tool
def todo_write(tasks: List[str]) -> str:
    """Create a todo list from the given tasks."""
    formatted = "\n".join([f"- {task}" for task in tasks])
    return f"Todo list created:\n{formatted}"
2) Subagents - let the main agent delegate focused tasks into isolated execution loops. Each sub-agent has its own prompt, tools and context.
subagents = [
{
"name": "job-search-agent",
"description": "Finds relevant jobs and outputs structured job candidates.",
"system_prompt": JOB_SEARCH_PROMPT,
"tools": [internet_search],
}
]
3) Tools - this is how the agent actually does things. Here, finalize() signals completion.
from langchain_core.tools import tool

@tool
def finalize() -> dict:
    """Signal that the agent is done."""
    return {"status": "done"}
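To see how these pieces fit together, here is a hedged sketch that wires the subagent config and the finalize tool from above into a single create_deep_agent call. It reuses the subagents list, finalize and internet_search defined in the snippets above; the subagents parameter name follows the deepagents docs, but double-check the exact signature for your installed version.
# Continuing from the snippets above (subagents, finalize, internet_search)
from deepagents import create_deep_agent

agent = create_deep_agent(
    model="openai:gpt-4o",
    tools=[finalize],        # custom tools; planning + file tools come built in
    subagents=subagents,     # the "job-search-agent" defined above
    system_prompt="Plan first, delegate research to the subagent, then call finalize.",
)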
How Deep Agents are implemented (Middleware)
If you are wondering how create_deep_agent() actually injects planning, files and subagents into a normal LangGraph agent, the answer is middleware.
Each feature is implemented as a separate middleware. By default, three are attached:
- To-do list middleware - adds the `write_todos` tool and instructions that push the agent to explicitly plan and update a live todo list during multi-step tasks.
- Filesystem middleware - adds file tools (`ls`, `read_file`, `write_file`, `edit_file`) so the agent can externalize notes and artifacts instead of stuffing everything into chat history.
- Subagent middleware - adds the `task` tool, allowing the main agent to delegate work to subagents with isolated context and their own prompts/tools.
This is what makes Deep Agents feel “pre-wired” without introducing a new runtime. If you want to go deeper, the middleware docs show the exact implementation details.
3. What are we building?
Let's create an agent that:
- Accepts a research question from the user
- Uses Deep Agents to plan multi-step and orchestrate sub-agents
- Searches the web using Tavily
- Writes intermediate research artifacts using the filesystem middleware
- Streams tool results back to the UI via CopilotKit (AG-UI)
The interface is a two-panel app where the left side is a CopilotKit chat UI and the right side is a live workspace that shows the agent’s plan, generated files and sources as the agent works.
Here's a simplified call request → response flow of what will happen:
[User asks research question]
↓
Next.js Frontend (CopilotChat + Workspace)
↓
CopilotKit Runtime → LangGraphHttpAgent
↓
Python Backend (FastAPI + AG-UI)
↓
Deep Agent (research_assistant)
├── write_todos (planning, built-in)
├── write_file (filesystem, built-in)
├── read_file (filesystem, built-in)
└── research(query)
└── internal Deep Agent [thread-isolated]
└── internet_search (Tavily)
We will see the concepts in action as we build the agent.
4. Frontend: wiring the agent to the UI
Let's first build the frontend part. This is how our directory will look.
The src directory hosts the Next.js frontend, including the UI, shared components and the CopilotKit API route (/api/copilotkit) used for agent communication.
.
├── src/ ← Next.js frontend
│ ├── app/
│ │ ├── page.tsx
│ │ ├── layout.tsx ← CopilotKit provider
│ │ └── api/
│ │ └── copilotkit/route.ts ← CopilotKit AG-UI runtime
│ ├── components/
│ │ ├── FileViewerModal.tsx ← Markdown file viewer
│ │ ├── WorkSpace.tsx ← Research progress display
│ │ └── ToolCard.tsx ← Tool call visualizer
├── lib/
│ └── types.ts
├── package.json
├── next.config.ts
└── README.md
If you don’t have a frontend, you can create a new Next.js app with TypeScript.
# creates a new Next.js app
npx create-next-app@latest .
Step 1: CopilotKit Provider & Layout
Install the necessary CopilotKit packages.
npm install @copilotkit/react-core @copilotkit/react-ui @copilotkit/runtime
- `@copilotkit/react-core` provides the core React hooks and context that connect your UI to an AG-UI compatible agent backend.
- `@copilotkit/react-ui` offers ready-made UI components like `<CopilotChat />` to build AI chat or assistant interfaces quickly.
- `@copilotkit/runtime` is the server-side runtime that exposes an API endpoint and bridges the frontend with an AG-UI compatible backend (e.g., a LangGraph HTTP agent).
The <CopilotKit> component must wrap the Copilot-aware parts of your application. In most cases, it's best to place it around the entire app, like in layout.tsx.
import type { Metadata } from "next";
import { CopilotKit } from "@copilotkit/react-core";
import "./globals.css";
import "@copilotkit/react-ui/styles.css";
export const metadata: Metadata = {
title: "Deep Research Assistant | CopilotKit Deep Agents Demo",
description: "A research assistant powered by Deep Agents and CopilotKit - demonstrating planning, memory, subagents, and generative UI",
};
export default function RootLayout({
children,
}: Readonly<{
children: React.ReactNode;
}>) {
return (
<html lang="en">
<body className="antialiased">
<CopilotKit runtimeUrl="/api/copilotkit" agent="research_assistant">
{children}
</CopilotKit>
</body>
</html>
);
}
Here, runtimeUrl="/api/copilotkit" points to the Next.js API route CopilotKit uses to talk to the agent backend.
Each page is wrapped in this context so UI components know which agent to invoke and where to send requests.
Step 2: Next.js API Route (Proxy to FastAPI)
This Next.js API route acts as a thin proxy between the browser and the Deep Agents backend. It:
- Accepts CopilotKit requests from the UI
- Forwards them to the agent over AG-UI
- Streams agent state and events back to the frontend
Instead of letting the frontend talk to the FastAPI agent directly, all requests go through a single endpoint /api/copilotkit.
import {
CopilotRuntime,
ExperimentalEmptyAdapter,
copilotRuntimeNextJSAppRouterEndpoint,
} from "@copilotkit/runtime";
import { LangGraphHttpAgent } from "@copilotkit/runtime/langgraph";
import { NextRequest } from "next/server";
// Empty adapter since the LLM is handled by the remote agent
const serviceAdapter = new ExperimentalEmptyAdapter();
// Configure CopilotKit runtime with the Deep Agents backend
const runtime = new CopilotRuntime({
agents: {
research_assistant: new LangGraphHttpAgent({
url: process.env.LANGGRAPH_DEPLOYMENT_URL || "http://localhost:8123",
}),
},
});
export const POST = async (req: NextRequest) => {
const { handleRequest } = copilotRuntimeNextJSAppRouterEndpoint({
runtime,
serviceAdapter,
endpoint: "/api/copilotkit",
});
return handleRequest(req);
};
Here's a simple explanation of the above code:
- The code above registers the `research_assistant` agent.
- `LangGraphHttpAgent`: defines a remote LangGraph agent endpoint. It points to the Deep Agents backend running on FastAPI.
- `ExperimentalEmptyAdapter`: a simple no-op adapter used when the agent backend handles its own LLM calls and orchestration.
- `copilotRuntimeNextJSAppRouterEndpoint`: a small helper that adapts the Copilot runtime to a Next.js App Router API route and returns a `handleRequest` function.
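The `LANGGRAPH_DEPLOYMENT_URL` read in this route tells the runtime where the FastAPI agent lives. If your backend runs somewhere other than localhost:8123, set it in an env file at the project root (the `.env.local` filename here is the usual Next.js convention, not something specific to this repo):
# .env.local (frontend) -- optional, falls back to http://localhost:8123
LANGGRAPH_DEPLOYMENT_URL=http://localhost:8123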
Step 3: Types (Research State)
Before building components, let's define shared state for todos, files, and sources in lib/types/research.ts. These are the contracts between the tool results from the agent and the local React state.
// Uses local state + useDefaultTool instead of CoAgent (avoids type issues with Python FilesystemMiddleware)
export interface Todo {
id: string;
content: string;
status: "pending" | "in_progress" | "completed";
}
export interface ResearchFile {
path: string;
content: string;
createdAt: string;
}
// Sources found via internet_search (includes content)
export interface Source {
url: string;
title: string;
content?: string;
status: "found" | "scraped" | "failed";
}
export interface ResearchState {
todos: Todo[];
files: ResearchFile[];
sources: Source[];
}
export const INITIAL_STATE: ResearchState = {
todos: [],
files: [],
sources: [],
};
Instead of dumping raw tool JSON into chat, each tool result routes into a dedicated state slot - write_todos updates todos, write_file appends to files and research appends to sources. This becomes the foundation of the Workspace panel.
Step 4: Building Key Components
I'm only covering the core logic behind each component since the overall code is huge. You can find all the components in the repository at src/components.
✅ ToolCard Component
This is the client component that renders every tool call inline inside chat. It has two modes:
- `SpecializedToolCard` for known tools (`write_todos`, `research`, `write_file`, `read_file`) with icons, status indicators and result previews
- `DefaultToolCard` for unknown tools, which falls back to expandable JSON.
"use client";
import { useState } from "react";
import { Pencil, ClipboardList, Search, Save, BookOpen, Check, ChevronDown } from "lucide-react";
const TOOL_CONFIG = {
write_todos: {
icon: Pencil,
getDisplayText: () => "Updating research plan...",
getResultSummary: (result, args) => {
const todos = (args as { todos?: unknown[] })?.todos;
if (Array.isArray(todos)) {
return `${todos.length} todo${todos.length !== 1 ? "s" : ""} updated`;
}
return null;
},
},
research: {
icon: Search,
getDisplayText: (args) =>
`Researching: ${((args.query as string) || "...").slice(0, 50)}${(args.query as string)?.length > 50 ? "..." : ""}`,
getResultSummary: (result) => {
if (result && typeof result === "object" && "sources" in result) {
const { sources } = result as { summary: string; sources: unknown[] };
return `Found ${sources.length} source${sources.length !== 1 ? "s" : ""}`;
}
return "Research complete";
},
},
write_file: {
icon: Save,
getDisplayText: (args) => {
const path = args.path as string | undefined;
const filename =
path?.split("/").pop() || (args.filename as string | undefined);
return `Writing: ${filename || "file"}`;
},
getResultSummary: (_result, args) => {
const content = args.content as string | undefined;
if (content) {
const firstLine = content.split("\n")[0].slice(0, 50);
return firstLine + (content.length > 50 ? "..." : "");
}
return "File written";
},
},
// read_todos, read_file follow the same pattern
};
export function ToolCard({ name, status, args, result }: ToolCardProps) {
const config = TOOL_CONFIG[name];
if (config) {
return (
<SpecializedToolCard
name={name}
status={status}
args={args}
result={result}
config={config}
/>
);
}
return (
<DefaultToolCard name={name} status={status} args={args} result={result} />
);
}
Here's a brief explanation:
- `research` and `write_todos` are expandable: clicking reveals the full query + findings or the live todo checklist with `pending` / `in_progress` / `completed` states.
- `resultSummary` appears as a small green line below the display text so you can glance at the output without expanding.
- `DefaultToolCard` handles any tool not in `TOOL_CONFIG`, showing collapsible args and result JSON.
In this component, ExpandedDetails handles the per-tool expanded view. research and write_todos get structured layouts; everything else falls back to a JSON pre block.
function ExpandedDetails({ name, result, args }) {
if (name === "research") {
const summary = typeof result === "object" && result && "summary" in result
? (result as any).summary
: "";
return (
<div>
<p>Query: {args.query as string}</p>
<p>{summary}</p>
</div>
);
}
if (name === "write_todos") {
const todos = (args as any)?.todos;
return (
<div>
{todos?.map((todo: any, i: number) => (
<div key={todo.id || i}>
<span>{todo.status === "completed" ? "✓" : todo.status === "in_progress" ? "●" : "○"}</span>
<span>{todo.content}</span>
</div>
))}
</div>
);
}
// Fallback
return (
<pre>
{typeof result === "string" ? result : JSON.stringify(result, null, 2)}
</pre>
);
}
Check out the complete code at src/components/ToolCard.tsx.
✅ Workspace Component
This is the right-side panel that displays research progress as it happens. It has three collapsible sections: Research Plan, Files and Sources - each with a live badge count and an empty state when nothing has arrived yet.
export function Workspace({ state }: { state: ResearchState }) {
const { todos, files, sources } = state;
const fileCount = files.length;
const todoCount = todos.length;
const sourceCount = sources.length;
// State for file viewer modal
const [selectedFile, setSelectedFile] = useState<ResearchFile | null>(null);
return (
<div className="workspace-panel p-6">
<div className="mb-6">
<h2 className="text-xl font-bold">Workspace</h2>
<p className="text-sm">Research progress and artifacts</p>
</div>
<Section title="Research Plan" icon={ListTodo} badge={todos.length}>
<TodoList todos={todos} />
</Section>
<Section title="Files" icon={FileText} badge={files.length}>
<FileList files={files} onFileClick={setSelectedFile} />
</Section>
<Section title="Sources" icon={Globe} badge={sources.length}>
<SourceList sources={sources} />
</Section>
<FileViewerModal
file={selectedFile}
onClose={() => setSelectedFile(null)}
/>
</div>
);
}
function TodoList({ todos }: { todos: Todo[] }) {
if (todos.length === 0) return <div>...</div>; // empty state
return (
<div className="space-y-1">
{todos.map((todo) => (
<div
key={todo.id}
className={`todo-item ${
todo.status === "completed"
? "todo-item-completed"
: todo.status === "in_progress"
? "todo-item-inprogress"
: "todo-item-pending"
}`}
>
<span>{/* Check / CircleDot / Circle icon based on status */}</span>
<span className="text-sm">{todo.content}</span>
</div>
))}
</div>
);
}
// FileList — same pattern, items are clickable (onFileClick)
// each row has a download button with e.stopPropagation()
function FileList({
files,
onFileClick,
}: {
files: ResearchFile[];
onFileClick: (file: ResearchFile) => void;
}) {
if (files.length === 0) return <div>...</div>; // empty state
return (
<div className="space-y-2">
{files.map((file, i) => (
<div
key={`${file.path}-${i}`}
className="file-item"
onClick={() => onFileClick(file)}
>
<div className="flex items-center gap-3">
{/* FileText icon */}
<div>
<p className="text-sm font-medium">
{file.path.split("/").pop()}
</p>
<p className="text-xs">{file.path}</p>
</div>
</div>
<button
onClick={(e) => {
e.stopPropagation();
downloadFile(file);
}}
>
{/* Download icon */}
</button>
</div>
))}
</div>
);
}
// SourceList — same pattern, colour-codes by source.status
// (scraped → green check, failed → red X, found → grey circle)
// each source links to source.url
function SourceList({ sources }: { sources: Source[] }) {
if (sources.length === 0) return <div>...</div>; // empty state
return (
<div className="space-y-2">
{sources.map((source, i) => (
<div
key={`${source.url}-${i}`}
className={`file-item ${source.status === "failed" ? "source-failed" : ""}`}
>
<span>{/* Check / X / Circle based on source.status */}</span>
<div className="flex-1 min-w-0">
<p className="text-sm font-medium truncate">
{source.title || new URL(source.url).hostname}
</p>
<a
href={source.url}
target="_blank"
rel="noopener noreferrer"
className="text-xs truncate block"
>
{source.url}
</a>
</div>
</div>
))}
</div>
);
}
A brief explanation:
- `TodoList` renders each todo with a `pending` / `in_progress` / `completed` icon. Completed items get a strikethrough.
- `FileList` items are clickable: they open `FileViewerModal` with full Markdown rendering and a download button.
- `SourceList` colour-codes sources: green check for scraped, red X for failed, grey circle for found-but-not-yet-scraped.
Check out the complete code at src/components/WorkSpace.tsx.
✅ FileViewerModal Component
This component is a modal that displays the contents of a research file written by the Deep Agent, rendering the file content as formatted Markdown using the react-markdown package.
Install it using the following command.
npm install react-markdown
Here is the full core implementation:
"use client";

import { useCallback, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import type { ResearchFile } from "@/types/research";

interface FileViewerModalProps {
  file: ResearchFile | null;
  onClose: () => void;
}

export function FileViewerModal({ file, onClose }: FileViewerModalProps) {
const handleKeyDown = useCallback(
(e: KeyboardEvent) => { if (e.key === "Escape") onClose(); },
[onClose]
);
useEffect(() => {
if (file) {
document.addEventListener("keydown", handleKeyDown);
document.body.style.overflow = "hidden";
}
return () => {
document.removeEventListener("keydown", handleKeyDown);
document.body.style.overflow = "";
};
}, [file, handleKeyDown]);
if (!file) return null;
const filename = file.path.split("/").pop() || file.path;
const handleDownload = () => {
const blob = new Blob([file.content], { type: "text/markdown" });
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = filename;
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
URL.revokeObjectURL(url);
};
return (
<div className="fixed inset-0 z-50 flex items-center justify-center p-4">
<div className="absolute inset-0 bg-black/30 backdrop-blur-sm" onClick={onClose} aria-hidden="true" />
<div className="relative max-w-3xl w-full max-h-[85vh] flex flex-col" role="dialog" aria-modal="true">
{/* Header -- filename + download + close */}
<div className="flex items-center justify-between border-b ...">
<div className="flex items-center gap-3">
{/* FileText icon */}
<h2 className="truncate max-w-md">{filename}</h2>
</div>
<div className="flex items-center gap-2">
<button onClick={handleDownload}>...</button>
<button onClick={onClose}>...</button>
</div>
</div>
{/* Scrollable Markdown body */}
<div className="flex-1 overflow-y-auto p-8">
<div className="prose prose-sm prose-slate max-w-none">
<ReactMarkdown>{file.content}</ReactMarkdown>
</div>
</div>
{/* Footer -- full file path */}
<div className="border-t ...">
<code>{file.path}</code>
</div>
</div>
</div>
);
}
Here's a brief explanation:
- `useEffect` registers the `Escape` listener and locks body scroll only while a file is open, and cleans both up on close.
- `handleDownload` creates a temporary anchor element to trigger a browser download without any server round-trip.
Check out the complete code at src/components/FileViewerModal.tsx.
Step 5: Connecting the Chat UI to the Agent
At this point, all the pieces are in place. This page owns ResearchState, passes it to Workspace, uses the CopilotChat component for the conversational UI, and runs useDefaultTool to intercept every tool call and route the results into state:
- `research` → appends to `sources`
- `write_todos` → replaces `todos`
- `write_file` → appends to `files`
Here's the code.
"use client";
import { useState, useRef } from "react";
import { CopilotChat } from "@copilotkit/react-ui";
import { useDefaultTool } from "@copilotkit/react-core";
import { Workspace } from "@/components/Workspace";
import { ResearchState, INITIAL_STATE, Todo } from "@/types/research";
import { ToolCard } from "@/components/ToolCard";
export default function Page() {
const [state, setState] = useState<ResearchState>(INITIAL_STATE);
const processedKeysRef = useRef<Set<string>>(new Set());
useDefaultTool({
render: (props) => {
const { name, status, args, result } = props;
// Deduplicate on re-renders
if (status === "complete") {
const resultStr = result ? JSON.stringify(result) : "";
const resultHash = resultStr
? `${resultStr.length}-${resultStr.slice(0, 100)}`
: "";
const key = `${name}-${JSON.stringify(args)}-${resultHash}`;
if (processedKeysRef.current.has(key)) return <ToolCard {...props} />;
processedKeysRef.current.add(key);
}
if (name === "research" && status === "complete" && result) {
const researchResult = result as {
summary: string;
sources: Array<{
url: string;
title: string;
content?: string;
status: "found" | "scraped" | "failed";
}>;
};
if (researchResult.sources && researchResult.sources.length > 0) {
queueMicrotask(() =>
setState((prev) => ({
...prev,
sources: [...prev.sources, ...researchResult.sources],
})),
);
}
}
if (name === "write_todos" && status === "complete" && args?.todos) {
const todosWithIds = (
args.todos as Array<{ id?: string; content: string; status: string }>
).map((todo, index) => ({
...todo,
id: todo.id || `todo-${Date.now()}-${index}`,
}));
queueMicrotask(() =>
setState((prev) => ({ ...prev, todos: todosWithIds as Todo[] })),
);
}
if (name === "write_file" && status === "complete" && args?.file_path) {
queueMicrotask(() =>
setState((prev) => ({
...prev,
files: [
...prev.files,
{
path: args.file_path as string,
content: args.content as string,
createdAt: new Date().toISOString(),
},
],
})),
);
}
return <ToolCard {...props} />;
},
});
return (
<div className="relative min-h-screen">
<main className="relative z-10 h-screen flex overflow-hidden">
{/* Chat panel -- left side */}
<div className="w-[38%] h-full border-r flex flex-col">
{/* Header */}
<div style={{ flex: 1, minHeight: 0, overflow: "hidden" }}>
<CopilotChat
className="h-full"
labels={{
title: "Deep Research Assistant",
initial: "What topic would you like me to research?",
placeholder: "Ask me to research any topic...",
}}
/>
</div>
</div>
{/* Workspace panel -- right side */}
<div className="w-[62%] h-full overflow-hidden">
<Workspace state={state} />
</div>
</main>
</div>
);
}
A few things worth noting:
- `useDefaultTool` intercepts every tool call and renders a `ToolCard` inline inside `CopilotChat`.
- `queueMicrotask` defers `setState` so it never fires mid-render.
- `processedKeysRef` deduplicates results since the render callback fires multiple times as status updates stream in.
5. Backend: Building the Agent Service (FastAPI + Deep Agents + Tavily)
We will now build the FastAPI backend that hosts our Deep Agent.
Under the /agent directory lives a FastAPI server that runs the research agent. Here's the project structure of the backend.
agent/
├── main.py ← FastAPI server + AG-UI
├── agent.py ← Deep Agents graph
├── tools.py ← Tavily search tools
├── pyproject.toml ← Python deps (uv)
├── uv.lock
└── .env
At a high level, the backend:
- Builds a Deep Agent graph
- Uses Tavily for real-time web search
- Wraps research in an isolated sub-agent
- Exposes a CopilotKit-compatible AG-UI endpoint
- Streams tool calls to the frontend
The backend uses uv for dependency management. Install it if you don't have it:
pip install uv
Initialize a new uv project. This will generate a fresh pyproject.toml.
cd agent
uv init
uv python pin 3.12
Then install the dependencies:
uv add copilotkit deepagents fastapi langchain langchain-openai python-dotenv tavily-python "uvicorn[standard]" ag-ui-langgraph
- `copilotkit`: connects agents to a frontend with streaming, tools and shared state
- `deepagents`: planning-first agent framework for multi-step execution
- `fastapi`: web framework that exposes the agent API
- `langchain`: agent and tool orchestration layer
- `langchain-openai`: OpenAI model integration for LangChain
- `python-dotenv`: loads environment variables from the .env file
- `tavily-python`: web search for real-time research
- `ag-ui-langgraph`: AG-UI protocol adapter for LangGraph
- `uvicorn[standard]`: ASGI server to run FastAPI
Now run the following command to generate a uv.lock file pinned with exact versions.
uv sync
Add necessary API Keys
Create a .env file under the agent directory and add your OpenAI API Key and Tavily API Key to the file. I have attached the docs link so it's easy to follow.
OPENAI_API_KEY=sk-proj-...
TAVILY_API_KEY=tvly-dev-...
OPENAI_MODEL=gpt-5.2
Step 1: Implement Research Tools
Let's define two tools (tools.py):
- `internet_search` is the low-level Tavily wrapper. It runs a search and returns up to `max_results` formatted results.
- `research` wraps an internal Deep Agent that runs in a separate thread to prevent its streaming events from leaking to the frontend via LangChain callback propagation.
Here is the code.
import os
from typing import Any
from concurrent.futures import ThreadPoolExecutor
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
from tavily import TavilyClient
def _do_internet_search(query: str, max_results: int = 5) -> list[dict[str, Any]]:
tavily_key = os.environ.get("TAVILY_API_KEY")
if not tavily_key:
raise RuntimeError("TAVILY_API_KEY not set")
try:
client = TavilyClient(api_key=tavily_key)
results = client.search(
query=query,
max_results=max_results,
include_raw_content=False,
topic="general",
)
return [
{
"url": r.get("url", ""),
"title": r.get("title", ""),
"content": (r.get("content") or "")[:3000],
}
for r in results.get("results", [])
]
except Exception as e:
return [{"error": str(e)}]
@tool
def internet_search(query: str, max_results: int = 5) -> list[dict[str, Any]]:
"""Search the web and return results with content."""
return _do_internet_search(query, max_results)
@tool
def research(query: str) -> dict:
"""Research a topic using web search. Returns structured data with sources."""
from deepagents import create_deep_agent
from langchain_openai import ChatOpenAI
def _run_research_isolated():
search_results = []
def internet_search_tracked(query: str, max_results: int = 5):
"""Search the web and return results with content."""
results = _do_internet_search(query, max_results)
search_results.extend(results)
return results
llm = ChatOpenAI(
model=os.environ.get("OPENAI_MODEL", "gpt-5.2"),
temperature=0.7,
api_key=os.environ.get("OPENAI_API_KEY"),
)
research_agent = create_deep_agent(
model=llm,
system_prompt="""You are a Research Specialist.
Use internet_search to find information. Return a prose summary of findings.
Rules:
- Call internet_search ONCE with a focused query
- Return a brief summary (2-3 sentences) of key findings
- No JSON, no code blocks, just prose""",
tools=[internet_search_tracked],
)
result = research_agent.invoke({"messages": [HumanMessage(content=query)]})
summary = result["messages"][-1].content
sources = [
{
"url": r["url"],
"title": r.get("title", ""),
"content": r.get("content", "")[:3000],
"status": "found",
}
for r in search_results
if "url" in r and not r.get("error")
]
return {"summary": summary, "sources": sources}
with ThreadPoolExecutor(max_workers=1) as executor:
future = executor.submit(_run_research_isolated)
return future.result()
Here's what's happening:
- `_do_internet_search` caps each result at 3000 chars to stay within token limits and returns an error dict instead of crashing.
- `internet_search_tracked` extends `search_results` on every call, building a clean source list to return alongside the final summary.
- `research_agent` is constrained to call `internet_search` once, which prevents looping and keeps latency predictable.
- `ThreadPoolExecutor` isolates the subagent in a separate thread so its stream events don't leak into the frontend chat.
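You can exercise the research tool on its own before wiring it into the main agent. A minimal sketch, assuming OPENAI_API_KEY and TAVILY_API_KEY are exported; since research is a LangChain tool, it is called via invoke with a dict of arguments:
# Quick manual check of tools.py (run from the agent/ directory)
from tools import research

out = research.invoke({"query": "What changed in WebGPU browser support this year?"})
print(out["summary"])                      # prose summary from the internal Deep Agent
print([s["url"] for s in out["sources"]])  # tracked Tavily sources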
Step 2: Define the Agent Behavior
This is the brain of the system. The agent is a Deep Agents graph that:
- Creates a research plan using `write_todos`
- Calls `research()` for each research question
- Writes a final report to `/reports/final_report.md`
Here is the main system prompt that enforces this workflow:
MAIN_SYSTEM_PROMPT = """You are a Deep Research Assistant, an expert at planning and
executing comprehensive research on any topic.
Hard rules (ALWAYS follow):
- NEVER output raw JSON, data structures, or code blocks in your messages
- Communicate with the user only in natural, readable prose
- When you receive data from research, synthesize it into insights
Your workflow:
1. PLAN: Create a research plan using write_todos with clear, actionable steps
2. RESEARCH: Use research(query) tool to investigate each topic
3. SYNTHESIZE: Write a final report to /reports/final_report.md using write_file
Important guidelines:
- Always start by creating a research plan with write_todos
- Call research() for each distinct research question
- The research tool returns prose summaries of findings
- You write all files - compile findings into a comprehensive report
- Update todos as you complete each step
Example workflow:
1. write_todos(["Research topic A", "Research topic B", "Synthesize findings"])
2. research("Find information about topic A") -> receives prose summary
3. research("Find information about topic B") -> receives prose summary
4. write_file("/reports/final_report.md", "# Research Report\n\n...")
Always maintain a professional, comprehensive research style."""
Here is the core agent graph (build_agent):
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from deepagents import create_deep_agent
from langgraph.checkpoint.memory import MemorySaver
from copilotkit import CopilotKitMiddleware
from tools import research
load_dotenv()
def build_agent():
api_key = os.environ.get("OPENAI_API_KEY")
if not api_key:
raise RuntimeError("Missing OPENAI_API_KEY environment variable")
tavily_key = os.environ.get("TAVILY_API_KEY")
if not tavily_key:
raise RuntimeError("Missing TAVILY_API_KEY environment variable")
model_name = os.environ.get("OPENAI_MODEL", "gpt-5.2")
llm = ChatOpenAI(
model=model_name,
temperature=0.7,
api_key=api_key,
)
agent_graph = create_deep_agent(
model=llm,
system_prompt=MAIN_SYSTEM_PROMPT,
tools=[research],
middleware=[CopilotKitMiddleware()],
checkpointer=MemorySaver(),
)
return agent_graph.with_config({"recursion_limit": 100})
Here's what's happening:
- `CopilotKitMiddleware()` enables AG-UI streaming to the frontend
- `MemorySaver()` enables stateful execution across the run
- The recursion limit is increased to support multi-step research
- `research` is the only explicit tool, since `write_todos`, `read_file` and `write_file` are built into Deep Agents automatically
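Before exposing the graph over HTTP, you can sanity-check it directly. A minimal sketch, assuming agent/.env is populated; the thread_id is needed because MemorySaver checkpoints state per conversation thread, and the hypothetical smoke_test.py filename is just for illustration:
# agent/smoke_test.py (hypothetical), run with: uv run python smoke_test.py
from langchain_core.messages import HumanMessage
from agent import build_agent

agent = build_agent()
result = agent.invoke(
    {"messages": [HumanMessage(content="Research the current state of solid-state batteries.")]},
    config={"configurable": {"thread_id": "local-test"}},
)

print(result["messages"][-1].content)   # final synthesized answer
print(list(result.get("files", {})))    # e.g. /reports/final_report.md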
Step 3: Expose the Agent via FastAPI + AG-UI
The last step is to wire everything together and expose it as a FastAPI app. main.py builds the agent graph, configures which tool calls to emit to the frontend and registers the AG-UI endpoint.
import os
import uvicorn
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from dotenv import load_dotenv
from ag_ui_langgraph import add_langgraph_fastapi_endpoint
from copilotkit import LangGraphAGUIAgent
from copilotkit.langgraph import copilotkit_customize_config
from agent import build_agent
load_dotenv()
app = FastAPI(title="Deep Research Assistant", version="1.0.0")
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/health")
def health():
return {"status": "ok", "service": "deep-research-agent"}
agent_graph = build_agent()
# Only emit main agent tools — suppress internal subagent tool calls
agui_config = copilotkit_customize_config(
emit_tool_calls=["research", "write_todos", "write_file", "read_file", "edit_file"]
)
agui_config["recursion_limit"] = 100
add_langgraph_fastapi_endpoint(
app=app,
agent=LangGraphAGUIAgent(
name="research_assistant",
description="A deep research assistant that plans, searches, and synthesizes research reports",
graph=agent_graph,
config=agui_config,
),
path="/",
)
def main():
uvicorn.run(
"main:app",
host=os.getenv("SERVER_HOST", "0.0.0.0"),
port=int(os.getenv("SERVER_PORT", "8123")),
reload=True,
)
if __name__ == "__main__":
main()
The emit_tool_calls config controls exactly which tool calls CopilotKit streams to the frontend. Without it, the subagent's internal internet_search calls leak through as raw JSON noise in the chat UI.
6. Running the Application
After completing all the parts of the code, it's time to run the app locally. Make sure you have added the credentials to agent/.env.
From the project root, navigate to the agent directory and start the FastAPI server:
cd agent
uv run python main.py
The backend will start on http://localhost:8123.
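You can confirm the agent service is up using the /health route defined in main.py:
curl http://localhost:8123/health
# {"status":"ok","service":"deep-research-agent"}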
In a new terminal window, start the frontend development server using:
npm run dev
Once both servers are running, open the frontend in your browser at http://localhost:3000/ to view it locally.

CopilotKit also provides the Agent Inspector, a live AG-UI runtime view that lets you inspect agent runs, state snapshots, messages and tool calls as they stream from the backend. It's accessible from a CopilotKit button overlaid on your app.
Here is the complete demo!
7. Data flow
Now that we have built both the frontend and the agent service, this is how data actually flows between them. This should be easy to follow if you have been building along so far.
[User sends research question]
↓
Next.js Frontend (CopilotChat + Workspace)
↓
POST /api/copilotkit (AG-UI protocol)
↓
CopilotKit Runtime → LangGraphHttpAgent (localhost:8123)
↓
FastAPI Backend (AG-UI endpoint "/")
↓
Deep Agent (research_assistant)
↓
Deep Agents orchestration
├── write_todos → todos state → Workspace (Research Plan)
├── research() → sources state → Workspace (Sources)
│ └── internal Deep Agent [thread-isolated]
│ └── internet_search (Tavily API)
└── write_file → files state → Workspace (Files)
↓
AG-UI streaming (SSE)
↓
useDefaultTool intercepts tool calls
↓
ToolCard renders in chat + Workspace updates live
That's it! 🎉
You now have a fully working Deep Agents research assistant that plans, searches, and writes reports, with every step visible in the frontend as it happens.
The real win here is visibility. CopilotKit's AG-UI layer turns what would otherwise be a black-box agent into something users can actually follow and trust.
You can check my work at anmolbaranwal.com. Thank you for reading! 🥰
Follow CopilotKit on Twitter and say hi, and if you'd like to build something cool, join the Discord community.













Top comments (2)
Really appreciated the thread isolation pattern inside the research tool. Running the internal Deep Agent in a separate thread to prevent callback propagation is such a clean way to avoid tool-call noise leaking into the frontend stream. Subtle detail, but super important for real-world DevX. 👏
This is literally a tutorial for building your own Manus, Claude code, and Deep Research style applications! really cool.