WonderLab
One Open Source Project a Day (No.33): DeerFlow - ByteDance's SuperAgent Execution Engine

Introduction

"LLMs shouldn't just talk about actions — they should actually execute them."

This is article No.33 in the "One Open Source Project a Day" series. Today's project is DeerFlow (GitHub).

Most AI Agent frameworks share a hidden limitation: they're good at suggesting, but not at doing. Generating code is easy — but actually running it, handling errors, iterating, and producing a deliverable artifact? That's the real challenge for complex research and automation tasks.

DeerFlow (Deep Exploration and Efficient Research Flow) is ByteDance's open-source answer to this problem. Completely rewritten in v2.0, it's no longer just a deep research framework — it's a general-purpose SuperAgent execution engine that runs code in real sandboxes, orchestrates parallel sub-agents, and handles tasks that take minutes to hours — from a single prompt all the way to a research report, a webpage, or a working program.

It hit #1 on GitHub Trending shortly after launch and now sits at 59k+ Stars, making it one of the most watched open-source projects in the AI Agent space.

What You'll Learn

  • DeerFlow's core positioning and the v1 → v2 architectural evolution
  • The SuperAgent execution flow: Lead Agent + parallel sub-agent orchestration
  • How sandbox-isolated code execution works and its security design
  • The Skills-as-Markdown extensibility mechanism
  • Multi-model support strategy and Chinese model optimization

Prerequisites

  • Basic understanding of LLM API calls (OpenAI-compatible interface format)
  • Some Docker experience (recommended for deployment)
  • Python basics (optional, for customization)

Project Background

What Is It?

DeerFlow stands for Deep Exploration and Efficient Research Flow, open-sourced by ByteDance. The project has gone through two major versions:

  • v1.x: Positioned as a deep research framework — multi-round search, web scraping, and consolidated report generation
  • v2.0 (February 2026, complete rewrite): Elevated to a general-purpose SuperAgent execution engine, introducing real sandbox code execution and a much broader range of supported task types

v2.0 shares no code with v1.x — it's a ground-up architectural rebuild, marking the project's transition from a "research assistance tool" to a "production-grade Agent execution engine."

About the Team

  • Organization: ByteDance (official open-source project)
  • Nature: Community-driven, led by ByteDance engineers, accepts external contributions
  • Release Timeline: v1.0 in early 2025; v2.0 released February 2026
  • Milestone: Hit #1 on GitHub Trending on February 28, 2026

Project Stats

  • ⭐ GitHub Stars: 59,200+
  • 🍴 Forks: 7,500+
  • 🐛 Open Issues: ~365
  • 📄 License: MIT
  • 🔄 Active Branches: main (v2.x), main-1.x (v1.x maintenance)

Key Features

Core Purpose

DeerFlow's fundamental value proposition is making AI Agents actually do things rather than just talk about things:

| Capability | Traditional Agent Frameworks | DeerFlow v2.0 |
| --- | --- | --- |
| Code Execution | Generates code (doesn't run it) | Real execution in isolated sandbox |
| Task Duration | Seconds to minutes | Minutes to hours |
| Task Decomposition | Sequential execution | Parallel sub-agent orchestration |
| Output Type | Text suggestions | Real deliverables: files, pages, programs |
| Context Limits | Bound by single model window | Sub-agent divide-and-conquer |

Use Cases

  1. Deep Research Reports

    • Given a research topic, automatically performs multi-round search, web scraping, and data synthesis to produce a structured report
  2. Code Generation & Validation

    • From requirements to a working program — real execution and debugging in the sandbox, iterating until it works
  3. Data Analysis & Visualization

    • Upload a data file; the Agent writes analysis scripts, generates charts, and outputs a ready-to-use analytics report
  4. Web Development

    • Describe what you need; the Agent writes HTML/CSS/JS, validates it in the sandbox, and delivers a complete webpage
  5. Content Creation

    • Automatically generate slides, podcast summaries, technical blog posts, and other content formats

Quick Start

Recommended (Docker):

# Clone the repository
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

# Generate configuration file
make config

# Edit config — fill in your model API keys
# Supports OpenAI, Claude, DeepSeek, Qwen, Doubao, etc.
vim config.yaml

# Initialize and start
make docker-init
make docker-start

# Access the web UI
# http://localhost:2026

Local development mode:

# Check environment requirements (Python 3.12+, Node.js 22+)
make check

# Install dependencies (uv for Python, pnpm for JS)
make install

# Start development servers
make dev

Built-in Skills

DeerFlow ships with several production-ready skills out of the box:

| Skill | Functionality |
| --- | --- |
| Deep Research | Multi-round search + web scraping + consolidated research report |
| Report Generation | Formatted report generation |
| Slide Creation | Presentation slide creation |
| Web Page Development | Full webpage development |
| GitHub Deep Research | In-depth GitHub repository analysis |

How It Compares

| Dimension | DeerFlow | AutoGen | CrewAI | Manus |
| --- | --- | --- | --- | --- |
| Real Code Execution | ✅ Sandbox isolated | — | — | ✅ (commercial) |
| Open Source | MIT | MIT | MIT | ❌ Closed |
| Chinese Model Support | ✅ First-class | Average | Average | — |
| Production Validated | ✅ ByteDance | — | — | — |
| Skills Extensibility | ✅ Markdown | Python class | Python class | — |
| Deployment Complexity | Medium (Docker) | Low | Low | No self-hosting |

Why choose DeerFlow?

  • Validated in ByteDance's production environments — reliability is battle-tested
  • Sandbox execution produces actual deliverables, not just text suggestions
  • First-class support for DeepSeek, Qwen, Doubao, and other Chinese models
  • Skills-as-Markdown has the lowest extension barrier in its class

Deep Dive

System Architecture

DeerFlow supports two deployment modes, sharing the same frontend but differing in backend process count:

Standard Mode — Recommended for production
┌─────────────────────────────────────┐
│  Nginx (Reverse Proxy + Routing)    │
├──────────────┬──────────────────────┤
│  Frontend    │  Gateway API         │
│  (Web UI)    │  (REST + WebSocket)  │
│              ├──────────────────────┤
│              │  LangGraph Server    │
│              │  (Standalone Agent   │
│              │   Runtime)           │
└──────────────┴──────────────────────┘

Gateway Mode — Experimental, lighter deployment
┌─────────────────────────────────────┐
│  Nginx                              │
├──────────────┬──────────────────────┤
│  Frontend    │  Gateway API         │
│              │  (Embedded Agent     │
│              │   Runtime)           │
└──────────────┴──────────────────────┘

Core Execution Flow

DeerFlow's agent orchestration is a three-tier structure:

User Input (Prompt)
        │
        ▼
┌───────────────────────────────────────┐
│          Lead Agent                   │
│  Task decomposition → Sub-task plan   │
│  → Result aggregation                 │
└──────┬─────────────┬─────────────┬────┘
       │             │             │
       ▼             ▼             ▼
  Researcher     Coder Agent    Reporter
  Sub-Agent      (Code Gen +    Sub-Agent
  (Search/Crawl)  Sandbox Exec) (Report Synthesis)
       │             │
       ▼             ▼
  Search APIs    Docker Sandbox
  Web Scraping   bash / Python
                 File System

The Lead Agent is the system's "brain", responsible for:

  1. Understanding task intent and breaking it into parallelizable sub-tasks
  2. Assigning each sub-task to the appropriate Sub-Agent
  3. Aggregating results from all Sub-Agents into the final output
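The three responsibilities above form a fan-out / fan-in pattern. It can be sketched with plain asyncio (a hypothetical illustration, not DeerFlow's actual code; the sub-agent functions are stand-ins for real LLM-backed agents):

```python
import asyncio

# Hypothetical stand-ins for DeerFlow's sub-agents, which in reality
# call LLMs, search APIs, and the sandbox.
async def researcher(task: str) -> str:
    return f"research findings for: {task}"

async def coder(task: str) -> str:
    return f"code output for: {task}"

SUB_AGENTS = {"research": researcher, "code": coder}

async def lead_agent(prompt: str) -> str:
    # 1. Decompose the prompt into sub-tasks (an LLM call in the real system).
    sub_tasks = [("research", f"{prompt} - background"),
                 ("code", f"{prompt} - prototype")]
    # 2. Dispatch each sub-task to the matching sub-agent; all run concurrently.
    results = await asyncio.gather(
        *(SUB_AGENTS[kind](task) for kind, task in sub_tasks))
    # 3. Aggregate the sub-agent results into one deliverable.
    return "\n".join(results)

print(asyncio.run(lead_agent("benchmark vector databases")))
```

The key property is step 2: `asyncio.gather` runs every sub-agent at once, which is what lets long tasks finish in minutes instead of hours of sequential tool calls.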

Sandbox Execution

The sandbox is one of v2.0's most important technical breakthroughs. Real code isolation is achieved through Docker containers:

# Simplified: Coder Sub-Agent's sandbox invocation
async def execute_in_sandbox(code: str, language: str = "python") -> ExecutionResult:
    """
    Execute code inside a Docker container, isolated from the host
    """
    container = await docker_client.containers.create(
        image="deerflow-sandbox:latest",
        command=["python", "-c", code],
        volumes={
            "/mnt/user-data/workspace": {"bind": "/workspace", "mode": "rw"},
            "/mnt/user-data/outputs": {"bind": "/outputs", "mode": "rw"},
        },
        network_mode="bridge",  # Restricted network access
        mem_limit="2g",         # Memory cap
        cpu_period=100000,
        cpu_quota=50000,        # CPU cap at 50%
    )

    await container.start()
    result = await container.wait()   # block until the code finishes; returns {"StatusCode": ...}
    stdout, stderr = await container.logs()

    return ExecutionResult(
        stdout=stdout.decode(),
        stderr=stderr.decode(),
        exit_code=result["StatusCode"]
    )

Sandbox filesystem layout:

Inside the Docker container:
├── /mnt/user-data/uploads    # User-uploaded files (read-only)
├── /mnt/user-data/workspace  # Agent working directory (read-write)
└── /mnt/user-data/outputs    # Final output artifacts (read-write)

This design guarantees:

  • Security isolation: Agent-generated code cannot access sensitive host files
  • Reproducibility: Every task runs in a clean container, avoiding state contamination
  • Real deliverables: Output files persist to the host machine, immediately usable by the user

Skills as Markdown

The Skills system is the crown jewel of DeerFlow's extensibility design. Unlike other frameworks that define Skills as Python classes, DeerFlow uses Markdown files — dramatically lowering the barrier to extension:

.claude/skills/deep-research/
├── SKILL.md              # Skill description, trigger conditions, execution steps
└── references/
    ├── search-strategy.md    # Search strategy specifications
    ├── report-template.md    # Report template
    └── quality-checklist.md  # Quality checklist

A typical SKILL.md structure:

# Deep Research Skill

## Trigger Conditions
Activate when the user needs to conduct deep research on a topic,
competitive analysis, or industry investigation.

## Execution Steps
1. Understand the research objective; break it into 3-5 key questions
2. Perform multi-round searches per question (minimum 3 rounds, diverse angles)
3. Crawl high-quality source pages; extract key information
4. Synthesize findings; identify consensus and contradictions
5. Generate structured output using the report template

## Output Format
- Executive summary (< 200 words)
- Deep-dive sections (500-1000 words each)
- Key findings summary
- Source reference list

## Load Resources
- load_skill_resource("references/search-strategy.md")
- load_skill_resource("references/report-template.md")

This design means non-engineers can write and customize skills — all you need is Markdown, no Python code required.
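Since a skill is plain Markdown, a loader only needs to split the file on its `## ` headings. A minimal sketch of that idea (hypothetical; DeerFlow's actual loader also handles trigger matching and `load_skill_resource()` resolution):

```python
# Parse a SKILL.md file into sections keyed by its "## " headings.
# Hypothetical sketch, not DeerFlow's real loader.

def parse_skill(markdown: str) -> dict[str, str]:
    sections: dict[str, str] = {}
    current = "title"
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()   # start a new section
            sections[current] = ""
        elif not line.startswith("# "):  # skip the top-level title line
            sections[current] = sections.get(current, "") + line + "\n"
    return {k: v.strip() for k, v in sections.items()}

skill = parse_skill("""# Deep Research Skill

## Trigger Conditions
Activate for deep research requests.

## Execution Steps
1. Break the objective into key questions
2. Search, crawl, synthesize, report
""")
print(skill["Trigger Conditions"])
```

Each section then becomes part of the Agent's prompt context, which is why editing skill behavior requires nothing more than editing prose.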

LangGraph Integration

DeerFlow chose LangGraph as the Agent orchestration layer rather than building its own state machine. LangGraph's key advantages:

  1. Explicit graph structure: Task dependencies and control flow are modeled as a stateful graph (cycles included) that is easy to visualize
  2. Checkpoints: Supports Human-in-the-Loop — pause and wait for human approval at critical nodes
  3. Persistent State: Cross-session task state saving supports interruption and resumption of long-running tasks
  4. Parallel Execution: Native parallel node execution means Sub-Agents can truly run concurrently

# DeerFlow's LangGraph workflow (simplified)
from langgraph.graph import StateGraph, END
from typing import TypedDict

class ResearchState(TypedDict):
    query: str
    sub_tasks: list[str]
    search_results: dict
    code_outputs: dict
    final_report: str

workflow = StateGraph(ResearchState)

# Add nodes
workflow.add_node("planner", lead_agent_plan)
workflow.add_node("researcher", researcher_agent)
workflow.add_node("coder", coder_agent)
workflow.add_node("reporter", reporter_agent)

# Define edges (execution order)
workflow.set_entry_point("planner")
workflow.add_edge("planner", "researcher")
workflow.add_edge("planner", "coder")       # Parallel
workflow.add_edge("researcher", "reporter")
workflow.add_edge("coder", "reporter")
workflow.add_edge("reporter", END)

app = workflow.compile()
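Points 2 and 3, checkpoints and persistent state, boil down to saving the graph state after each node so a long-running task can be interrupted and resumed. A toy illustration of that idea (deliberately not LangGraph's checkpointer API):

```python
import json
import tempfile
from pathlib import Path

# Toy persistence sketch; NOT LangGraph's actual checkpointer API.
# State is written to disk after every node, so a long task can be
# interrupted and resumed without redoing completed work.

def run_with_checkpoints(nodes, state, path):
    path = Path(path)
    if path.exists():                        # resume from the last checkpoint
        saved = json.loads(path.read_text())
        state, done = saved["state"], saved["done"]
    else:
        done = []
    for name, fn in nodes:
        if name in done:
            continue                         # node already ran in a prior session
        state = fn(state)
        done.append(name)
        path.write_text(json.dumps({"state": state, "done": done}))
    return state

nodes = [
    ("planner", lambda s: {**s, "sub_tasks": ["research", "report"]}),
    ("reporter", lambda s: {**s, "final_report": "done"}),
]
ckpt = Path(tempfile.mkdtemp()) / "checkpoint.json"
print(run_with_checkpoints(nodes, {"query": "demo"}, ckpt))
```

LangGraph generalizes the same idea: a checkpointer records the typed state at every graph step, which is also what makes Human-in-the-Loop pauses possible.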

Multi-Model Strategy

DeerFlow is model-agnostic, with recommended selection criteria:

  • Long context: 100k+ tokens (for processing large search results and codebases)
  • Reasoning: Complex multi-step reasoning capability
  • Tool calling: Reliable function calling / tool use
  • Recommended Chinese models: Doubao-Seed-2.0-Code (ByteDance in-house), DeepSeek v3.2, Kimi 2.5

Configuration (config.yaml):

# Any OpenAI-compatible endpoint works
llm:
  provider: openai_compatible
  base_url: "https://ark.cn-beijing.volces.com/api/v3"
  api_key: "${DOUBAO_API_KEY}"
  model: "doubao-seed-2-0-code-250605"
  max_tokens: 16384

# Or use DeepSeek
llm:
  provider: openai_compatible
  base_url: "https://api.deepseek.com"
  api_key: "${DEEPSEEK_API_KEY}"
  model: "deepseek-v3"
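All of these providers work because they expose the same OpenAI-style chat-completions wire format. As a sketch of what that interface looks like on the wire (stdlib only; the URL, key, and model below are placeholders, and the request is built but not sent):

```python
import json
import urllib.request

# Build (but do not send) an OpenAI-compatible chat request.
# base_url, api_key, and model are placeholders, not real credentials.

def build_chat_request(base_url, api_key, model, prompt):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 16384,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("https://api.deepseek.com", "sk-...", "deepseek-v3", "hi")
print(req.full_url)
```

Because every provider accepts this shape, switching from Doubao to DeepSeek in DeerFlow is only a `base_url` / `model` change in `config.yaml`, with no code changes.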

Resources

Related Projects

  • LangGraph: The Agent orchestration framework powering DeerFlow's backend
  • LangSmith / Langfuse: Observability tracing integrations
  • OpenDeepResearch (OpenAI): Comparable competitor in the deep research space

Summary

Key Takeaways

  1. Positioning leap: From v1's deep research tool to v2's general SuperAgent execution engine — the core jump is real execution capability, not just text generation
  2. Docker sandbox: Real isolated code execution means Agents produce actual deliverables, not suggestions
  3. Sub-agent parallelism: The Lead Agent + Sub-Agent architecture breaks past single-model context limits, enabling genuinely complex long-running tasks
  4. Skills-as-Markdown: Lowest-barrier extensibility in its class — non-engineers can customize Agent behavior
  5. Chinese model first-class support: First-class support for Doubao, DeepSeek, and Qwen makes it the natural choice for developers in China

Who Should Use This

  • Researchers / Analysts: Knowledge workers who need to aggregate and synthesize large amounts of information
  • AI Engineers: Development teams building production-grade Agent applications that need a reliable execution engine
  • Python Developers: Practitioners looking to learn LangGraph and multi-agent orchestration through a real-world codebase
  • Enterprise Tech Teams: Teams exploring AI automation of complex tasks — research, reporting, code generation

Visit my personal site for more useful knowledge and interesting products
