Introduction
"LLMs shouldn't just talk about actions — they should actually execute them."
This is article No.33 in the "One Open Source Project a Day" series. Today's project is DeerFlow (GitHub).
Most AI Agent frameworks share a hidden limitation: they're good at suggesting, but not at doing. Generating code is easy — but actually running it, handling errors, iterating, and producing a deliverable artifact? That's the real challenge for complex research and automation tasks.
DeerFlow (Deep Exploration and Efficient Research Flow) is ByteDance's open-source answer to this problem. Completely rewritten in v2.0, it's no longer just a deep research framework — it's a general-purpose SuperAgent execution engine that runs code in real sandboxes, orchestrates parallel sub-agents, and handles tasks that take minutes to hours — from a single prompt all the way to a research report, a webpage, or a working program.
It hit #1 on GitHub Trending shortly after launch and now sits at 59k+ Stars, making it one of the most watched open-source projects in the AI Agent space.
What You'll Learn
- DeerFlow's core positioning and the v1 → v2 architectural evolution
- The SuperAgent execution flow: Lead Agent + parallel sub-agent orchestration
- How sandbox-isolated code execution works and its security design
- The Skills-as-Markdown extensibility mechanism
- Multi-model support strategy and Chinese model optimization
Prerequisites
- Basic understanding of LLM API calls (OpenAI-compatible interface format)
- Some Docker experience (recommended for deployment)
- Python basics (optional, for customization)
Project Background
What Is It?
DeerFlow stands for Deep Exploration and Efficient Research Flow, open-sourced by ByteDance. The project has gone through two major versions:
- v1.x: Positioned as a deep research framework — multi-round search, web scraping, and consolidated report generation
- v2.0 (February 2026, complete rewrite): Elevated to a general-purpose SuperAgent execution engine, introducing real sandbox code execution and a much broader range of supported task types
v2.0 shares no code with v1.x — it's a ground-up architectural rebuild, marking the project's transition from a "research assistance tool" to a "production-grade Agent execution engine."
About the Team
- Organization: ByteDance (official open-source project)
- Nature: Community-driven, led by ByteDance engineers, accepts external contributions
- Release Timeline: v1.0 released in early 2025; v2.0 released February 2026
- Milestone: Hit #1 on GitHub Trending on February 28, 2026
Project Stats
- ⭐ GitHub Stars: 59,200+
- 🍴 Forks: 7,500+
- 🐛 Open Issues: ~365
- 📄 License: MIT
- 🔄 Active Branches: main (v2.x), main-1.x (v1.x maintenance)
Key Features
Core Purpose
DeerFlow's fundamental value proposition is making AI Agents actually do things rather than just talk about things:
| Capability | Traditional Agent Frameworks | DeerFlow v2.0 |
|---|---|---|
| Code Execution | Generates code (doesn't run it) | Real execution in isolated sandbox |
| Task Duration | Seconds to minutes | Minutes to hours |
| Task Decomposition | Sequential execution | Parallel sub-agent orchestration |
| Output Type | Text suggestions | Real deliverables: files, pages, programs |
| Context Limits | Bound by single model window | Sub-agent divide-and-conquer |
Use Cases
- Deep Research Reports: Given a research topic, automatically performs multi-round search, web scraping, and data synthesis to produce a structured report
- Code Generation & Validation: From requirements to a working program — real execution and debugging in the sandbox, iterating until it works
- Data Analysis & Visualization: Upload a data file; the Agent writes analysis scripts, generates charts, and outputs a ready-to-use analytics report
- Web Development: Describe what you need; the Agent writes HTML/CSS/JS, validates it in the sandbox, and delivers a complete webpage
- Content Creation: Automatically generate slides, podcast summaries, technical blog posts, and other content formats
Quick Start
Recommended (Docker):
```bash
# Clone the repository
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

# Generate the configuration file
make config

# Edit the config — fill in your model API keys
# (supports OpenAI, Claude, DeepSeek, Qwen, Doubao, etc.)
vim config.yaml

# Initialize and start
make docker-init
make docker-start

# Access the web UI at http://localhost:2026
```
Local development mode:
```bash
# Check environment requirements (Python 3.12+, Node.js 22+)
make check

# Install dependencies (uv for Python, pnpm for JS)
make install

# Start development servers
make dev
```
Built-in Skills
DeerFlow ships with several production-ready skills out of the box:
| Skill | Functionality |
|---|---|
| Deep Research | Multi-round search + web scraping + consolidated research report |
| Report Generation | Formatted report generation |
| Slide Creation | Presentation slide creation |
| Web Page Development | Full webpage development |
| GitHub Deep Research | In-depth GitHub repository analysis |
How It Compares
| Dimension | DeerFlow | AutoGen | CrewAI | Manus |
|---|---|---|---|---|
| Real Code Execution | ✅ Sandbox isolated | ✅ | ❌ | ✅ (commercial) |
| Open Source | MIT | MIT | MIT | ❌ Closed |
| Chinese Model Support | ✅ First-class | Average | Average | ❌ |
| Production Validated | ✅ ByteDance | ❌ | ❌ | — |
| Skills Extensibility | ✅ Markdown | Python class | Python class | — |
| Deployment Complexity | Medium (Docker) | Low | Low | No self-hosting |
Why choose DeerFlow?
- Validated in ByteDance's production environments — reliability is battle-tested
- Sandbox execution produces actual deliverables, not just text suggestions
- First-class support for DeepSeek, Qwen, Doubao, and other Chinese models
- Skills-as-Markdown has the lowest extension barrier in its class
Deep Dive
System Architecture
DeerFlow supports two deployment modes, sharing the same frontend but differing in backend process count:
Standard Mode — Recommended for production
```
┌─────────────────────────────────────┐
│ Nginx (Reverse Proxy + Routing)     │
├──────────────┬──────────────────────┤
│ Frontend     │ Gateway API          │
│ (Web UI)     │ (REST + WebSocket)   │
│              ├──────────────────────┤
│              │ LangGraph Server     │
│              │ (Standalone Agent    │
│              │  Runtime)            │
└──────────────┴──────────────────────┘
```
Gateway Mode — Experimental, lighter deployment
```
┌─────────────────────────────────────┐
│ Nginx                               │
├──────────────┬──────────────────────┤
│ Frontend     │ Gateway API          │
│              │ (Embedded Agent      │
│              │  Runtime)            │
└──────────────┴──────────────────────┘
```
Core Execution Flow
DeerFlow's agent orchestration is a three-tier structure:
```
User Input (Prompt)
        │
        ▼
┌───────────────────────────────────────┐
│              Lead Agent               │
│  Task decomposition → Sub-task plan   │
│        → Result aggregation           │
└──────┬───────────┬───────────┬────────┘
       │           │           │
       ▼           ▼           ▼
  Researcher    Coder Agent    Reporter
  Sub-Agent     (Code Gen +    Sub-Agent
  (Search /     Sandbox Exec)  (Report
   Crawl)                      Synthesis)
       │           │
       ▼           ▼
  Search APIs   Docker Sandbox
  Web Scraping  bash / Python
                File System
```
The Lead Agent is the system's "brain", responsible for:
- Understanding task intent and breaking it into parallelizable sub-tasks
- Assigning each sub-task to the appropriate Sub-Agent
- Aggregating results from all Sub-Agents into the final output
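The decompose–dispatch–aggregate pattern is easy to see in plain asyncio. The sketch below is illustrative only — `plan`, `AGENTS`, and `run_sub_agent` are hypothetical names, not DeerFlow's actual internals — but it shows why parallel sub-agents beat sequential execution for I/O-bound work like search and sandbox runs:

```python
import asyncio

# Hypothetical sketch of a Lead Agent loop — names are illustrative,
# not DeerFlow's real API.
AGENTS = {
    "research": lambda task: f"[search results for: {task}]",
    "code": lambda task: f"[sandbox output for: {task}]",
}

def plan(prompt: str) -> list[tuple[str, str]]:
    """Break a prompt into (agent_kind, sub_task) pairs.
    A real planner would call the LLM; this stub fans out two sub-tasks."""
    return [("research", f"background on {prompt}"),
            ("code", f"prototype for {prompt}")]

async def run_sub_agent(kind: str, task: str) -> str:
    # Each sub-agent runs in its own coroutine, so I/O-bound work
    # (search calls, sandbox runs) overlaps instead of serializing.
    await asyncio.sleep(0)  # stand-in for real async work
    return AGENTS[kind](task)

async def lead_agent(prompt: str) -> str:
    sub_tasks = plan(prompt)
    results = await asyncio.gather(
        *(run_sub_agent(kind, task) for kind, task in sub_tasks)
    )
    # Aggregation step: a real Lead Agent would ask the LLM to synthesize.
    return "\n".join(results)

print(asyncio.run(lead_agent("quantum batteries")))
```

With `asyncio.gather`, all sub-tasks launch at once and the Lead Agent only resumes when every result is in — the same fan-out/fan-in shape DeerFlow implements with LangGraph's parallel nodes.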
Sandbox Execution
The sandbox is one of v2.0's most important technical breakthroughs. Real code isolation is achieved through Docker containers:
```python
# Simplified: Coder Sub-Agent's sandbox invocation
async def execute_in_sandbox(code: str, language: str = "python") -> ExecutionResult:
    """Execute code inside a Docker container, isolated from the host."""
    container = await docker_client.containers.create(
        image="deerflow-sandbox:latest",
        command=["python", "-c", code],
        volumes={
            "/mnt/user-data/workspace": {"bind": "/workspace", "mode": "rw"},
            "/mnt/user-data/outputs": {"bind": "/outputs", "mode": "rw"},
        },
        network_mode="bridge",   # Restricted network access
        mem_limit="2g",          # Memory cap
        cpu_period=100000,
        cpu_quota=50000,         # CPU capped at 50%
    )
    await container.start()
    result = await container.wait()   # Block until the container exits
    stdout, stderr = await container.logs()
    return ExecutionResult(
        stdout=stdout.decode(),
        stderr=stderr.decode(),
        exit_code=result["StatusCode"],
    )
```
Sandbox filesystem layout:
```
Inside the Docker container:
├── /mnt/user-data/uploads     # User-uploaded files (read-only)
├── /mnt/user-data/workspace   # Agent working directory (read-write)
└── /mnt/user-data/outputs     # Final output artifacts (read-write)
```
This design guarantees:
- Security isolation: Agent-generated code cannot access sensitive host files
- Reproducibility: Every task runs in a clean container, avoiding state contamination
- Real deliverables: Output files persist to the host machine, immediately usable by the user
Skills as Markdown
The Skills system is the crown jewel of DeerFlow's extensibility design. Unlike other frameworks that define Skills as Python classes, DeerFlow uses Markdown files — dramatically lowering the barrier to extension:
```
.claude/skills/deep-research/
├── SKILL.md                     # Skill description, trigger conditions, execution steps
└── references/
    ├── search-strategy.md       # Search strategy specifications
    ├── report-template.md       # Report template
    └── quality-checklist.md     # Quality checklist
```
A typical SKILL.md structure:
```markdown
# Deep Research Skill

## Trigger Conditions
Activate when the user needs to conduct deep research on a topic,
competitive analysis, or industry investigation.

## Execution Steps
1. Understand the research objective; break it into 3-5 key questions
2. Perform multi-round searches per question (minimum 3 rounds, diverse angles)
3. Crawl high-quality source pages; extract key information
4. Synthesize findings; identify consensus and contradictions
5. Generate structured output using the report template

## Output Format
- Executive summary (< 200 words)
- Deep-dive sections (500-1000 words each)
- Key findings summary
- Source reference list

## Load Resources
- load_skill_resource("references/search-strategy.md")
- load_skill_resource("references/report-template.md")
```
This design means non-engineers can write and customize skills — all you need is Markdown, no Python code required.
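Part of why Markdown skills are so cheap to support is that loading one is essentially text parsing. Here is a minimal parser sketch — illustrative only, not DeerFlow's actual loader — that splits a SKILL.md into its `##` sections:

```python
# Minimal sketch of a SKILL.md section parser — illustrative only,
# not DeerFlow's actual loader.
def parse_skill(markdown: str) -> dict[str, str]:
    """Map each '## Heading' to the text that follows it."""
    sections: dict[str, str] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}

skill_md = """\
# Deep Research Skill
## Trigger Conditions
Activate when the user needs deep research on a topic.
## Execution Steps
1. Break the objective into key questions
2. Search, crawl, synthesize, report
"""

sections = parse_skill(skill_md)
print(sections["Trigger Conditions"])
# → Activate when the user needs deep research on a topic.
```

The parsed sections can then be dropped straight into the system prompt — no plugin registration, class hierarchy, or redeploy, which is exactly the barrier-lowering the Skills design aims for.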
LangGraph Integration
DeerFlow chose LangGraph as the Agent orchestration layer rather than building its own state machine. LangGraph's key advantages:
- Explicit graph structure: Tasks and their dependencies are declared as nodes and edges, making the workflow easy to visualize and reason about
- Checkpoints: Supports Human-in-the-Loop — pause and wait for human approval at critical nodes
- Persistent State: Cross-session task state saving supports interruption and resumption of long-running tasks
- Parallel Execution: Native parallel node execution means Sub-Agents can truly run concurrently
```python
# DeerFlow's LangGraph workflow (simplified)
from langgraph.graph import StateGraph, END
from typing import TypedDict

class ResearchState(TypedDict):
    query: str
    sub_tasks: list[str]
    search_results: dict
    code_outputs: dict
    final_report: str

workflow = StateGraph(ResearchState)

# Add nodes
workflow.add_node("planner", lead_agent_plan)
workflow.add_node("researcher", researcher_agent)
workflow.add_node("coder", coder_agent)
workflow.add_node("reporter", reporter_agent)

# Define edges (execution order)
workflow.set_entry_point("planner")
workflow.add_edge("planner", "researcher")
workflow.add_edge("planner", "coder")      # Runs in parallel with researcher
workflow.add_edge("researcher", "reporter")
workflow.add_edge("coder", "reporter")
workflow.add_edge("reporter", END)

app = workflow.compile()
```
Multi-Model Strategy
DeerFlow is model-agnostic, with recommended selection criteria:
- Long context: 100k+ tokens (for processing large search results and codebases)
- Reasoning: Complex multi-step reasoning capability
- Tool calling: Reliable function calling / tool use
- Recommended Chinese models: Doubao-Seed-2.0-Code (ByteDance in-house), DeepSeek v3.2, Kimi 2.5
Configuration (config.yaml):
```yaml
# Any OpenAI-compatible endpoint works
llm:
  provider: openai_compatible
  base_url: "https://ark.cn-beijing.volces.com/api/v3"
  api_key: "${DOUBAO_API_KEY}"
  model: "doubao-seed-2-0-code-250605"
  max_tokens: 16384

# Or use DeepSeek:
# llm:
#   provider: openai_compatible
#   base_url: "https://api.deepseek.com"
#   api_key: "${DEEPSEEK_API_KEY}"
#   model: "deepseek-v3"
```
Resources
Official
- 🌟 GitHub: https://github.com/bytedance/deer-flow
- 📄 Docs: Project README
- 🐛 Issues: https://github.com/bytedance/deer-flow/issues
Related Projects
- LangGraph: The Agent orchestration framework powering DeerFlow's backend
- LangSmith / Langfuse: Observability tracing integrations
- OpenDeepResearch (OpenAI): Comparable competitor in the deep research space
Summary
Key Takeaways
- Positioning leap: From v1's deep research tool to v2's general SuperAgent execution engine — the core jump is real execution capability, not just text generation
- Docker sandbox: Real isolated code execution means Agents produce actual deliverables, not suggestions
- Sub-agent parallelism: The Lead Agent + Sub-Agent architecture breaks past single-model context limits, enabling genuinely complex long-running tasks
- Skills-as-Markdown: Lowest-barrier extensibility in its class — non-engineers can customize Agent behavior
- Chinese model first-class support: First-class support for Doubao, DeepSeek, and Qwen makes it the natural choice for developers in China
Who Should Use This
- Researchers / Analysts: Knowledge workers who need to aggregate and synthesize large amounts of information
- AI Engineers: Development teams building production-grade Agent applications that need a reliable execution engine
- Python Developers: Practitioners looking to learn LangGraph and multi-agent orchestration through a real-world codebase
- Enterprise Tech Teams: Teams exploring AI automation of complex tasks — research, reporting, code generation
Visit my personal site for more useful knowledge and interesting products