吴迦

Posted on Mar 2

A Practical Guide to the ReAct Pattern: Teaching AI Agents to "Think Before They Act"

#ai #react #agentai #llm

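ReAct is arguably the most widely used agent reasoning pattern in 2026. It gives an AI the most basic human approach to problem solving: work out what to do, do it, then look at the result and decide on the next step.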

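1. What Is ReAct, and Why Does It Matter?

1.1 From Chain-of-Thought to ReAct

In 2022, Chain-of-Thought (CoT) taught LLMs to "think": to lay out reasoning steps before answering. But CoT has a serious limitation: it can only think, not act. When a task requires looking up data, calling an API, or running code, pure reasoning gets nowhere.

ReAct (Reasoning + Acting) was proposed by Yao et al. (preprint in late 2022, published at ICLR 2023). The core idea is very simple: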

Thought → Action → Observation → Thought → Action → ... → Answer

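The LLM can call tools and fetch information in the middle of its reasoning, then continue reasoning from what it has learned.

1.2 The ReAct loop in detail

Each round of the ReAct loop has three phases:

Thought (reasoning): the agent analyzes the current state and decides on the next action

Thought: The user is asking about tomorrow's weather in Beijing, so I need to call the weather API

Action (acting): the agent calls a tool to get information

Action: weather_api(city="Beijing", date="tomorrow")

Observation (observing): the agent receives the tool's result

Observation: {"temp": 15, "condition": "sunny", "wind": "NE 3-4"}

The loop repeats until the agent judges that it has enough information and produces the final answer.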
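To make the loop concrete, here is a minimal sketch in plain Python. It is not taken from any framework: `scripted_model` is a hypothetical stand-in for an LLM call, `weather_api` is a stub that returns the observation shown above, and the regex-based parsing of the `Action:` line is an assumption about the text format.

```python
import json
import re

# Hypothetical stand-in for a real weather API; returns the observation from the example.
def weather_api(city: str, date: str) -> dict:
    return {"temp": 15, "condition": "sunny", "wind": "NE 3-4"}

TOOLS = {"weather_api": weather_api}

# Matches lines such as: Action: weather_api(city="Beijing", date="tomorrow")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\((.*)\)\s*$', re.MULTILINE)
ARG_RE = re.compile(r'(\w+)="([^"]*)"')

def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM call: acts once, then answers from the observation."""
    if "Observation:" not in transcript:
        return (
            "Thought: The user asks about the weather in Beijing tomorrow, "
            "so I need to call the weather API.\n"
            'Action: weather_api(city="Beijing", date="tomorrow")'
        )
    return "Thought: I have enough information.\nAnswer: Sunny, about 15 degrees."

def react(question: str, model=scripted_model, max_steps: int = 8) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        output = model(transcript)              # Thought (and maybe Action)
        transcript += "\n" + output
        match = ACTION_RE.search(output)
        if match is None:
            # No action requested: treat the text after "Answer:" as final.
            return output.split("Answer:", 1)[-1].strip()
        name, raw_args = match.groups()
        kwargs = dict(ARG_RE.findall(raw_args))
        observation = TOOLS[name](**kwargs)     # Action
        transcript += "\nObservation: " + json.dumps(observation)  # Observation
    return "Stopped: step limit reached without an answer."

print(react("What is the weather in Beijing tomorrow?"))
```

The whole pattern is one loop: call the model, run the requested tool, append the observation, repeat. Production frameworks replace the text parsing with structured tool calls, but the control flow is the same.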

2. Hands-On Code: Three ReAct Implementations

2.1 LangGraph implementation (recommended)

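from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, MessagesState, END
from langgraph.prebuilt import ToolNode

# `model`, `brave_search`, and `db` are assumed to be defined elsewhere:
# a chat model that supports tool calling, a search helper, and a database handle.

# Define the tools
@tool
def search_web(query: str) -> str:
    """Search the web for up-to-date information."""
    return brave_search(query)

@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    # WARNING: eval() on model-generated input is unsafe; restrict it in production
    return str(eval(expression))

@tool
def query_database(sql: str) -> str:
    """Run a SQL query against the database."""
    return str(db.execute(sql))

tools = [search_web, calculate, query_database]
model_with_tools = model.bind_tools(tools)  # bind once, not on every call

# Agent node: the model either answers or requests tool calls
def agent_node(state: MessagesState):
    response = model_with_tools.invoke(state["messages"])
    return {"messages": [response]}

# Decide whether to keep looping
def should_continue(state: MessagesState):
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tools"  # tool calls requested: continue
    return "end"  # no tool calls: finish

# Build the ReAct graph
graph = StateGraph(MessagesState)
graph.add_node("agent", agent_node)
graph.add_node("tools", ToolNode(tools))

graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue,
    {"tools": "tools", "end": END})
graph.add_edge("tools", "agent")  # tool results go back to the agent

app = graph.compile()

# Usage
result = app.invoke({
    "messages": [HumanMessage(content="What is the temperature difference between Beijing and Shanghai today?")]
})
print(result["messages"][-1].content)  # the final answer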
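The `calculate` tool above passes the model's string straight to `eval`, which will execute arbitrary Python. One possible replacement, sketched here with only the standard library, walks the parsed expression and allows nothing but numeric literals and a whitelist of arithmetic operators. The name `safe_calculate` and the choice of operators are assumptions for illustration; exponentiation is left out on purpose because very large powers can hang the process.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def safe_calculate(expression: str) -> str:
    """Evaluate basic arithmetic without calling eval()."""
    def _eval(node: ast.AST):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Only plain int/float literals are allowed (bools and strings are not).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    try:
        return str(_eval(ast.parse(expression, mode="eval")))
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError) as exc:
        # Return the error as text so the agent can read it and try again.
        return f"error: {exc}"

print(safe_calculate("2 + 3 * 4"))
print(safe_calculate("__import__('os').system('ls')"))
```

Returning errors as strings instead of raising keeps the failure inside the Observation, where the agent can react to it on the next Thought.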

2.2 OpenAI Agents SDK implementation

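# pip install openai-agents
from agents import Agent, Runner, function_tool

# Plain Python functions wrapped as SDK tools; `brave_search` is assumed to exist elsewhere
@function_tool
def search_web(query: str) -> str:
    """Search the web for up-to-date information."""
    return brave_search(query)

@function_tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    return str(eval(expression))  # unsafe on untrusted input; restrict it in production

weather_agent = Agent(
    name="Weather Assistant",
    instructions="""You are a weather assistant.
    Follow the ReAct pattern: analyze the request, call tools, then give a combined answer.""",
    tools=[search_web, calculate],
    model="gpt-4o",
)

result = Runner.run_sync(weather_agent, "Compare next week's weather in Beijing and Shanghai")
print(result.final_output)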

2.3 Amazon Bedrock implementation

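import boto3

bedrock = boto3.client('bedrock-agent-runtime')

# Bedrock Agents use ReAct-style orchestration by default
response = bedrock.invoke_agent(
    agentId='AGENT_ID',
    agentAliasId='ALIAS_ID',
    sessionId='session-001',
    inputText='Find the region with the highest EC2 cost on our AWS bill last month'
)

# The answer comes back as an event stream; join the text chunks
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)

# Bedrock runs the ReAct loop for you. With suitable action groups configured, for example:
# Thought → call the Cost Explorer API → Observation →
# Thought → need more detail → call CloudWatch → ... → final analysis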

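3. Tool Usage Analysis

In production, tool calls made by ReAct agents are concentrated in a few tools, roughly following a power-law distribution: web search accounts for about 35% of calls, code execution for about 25%, and the rest is spread across file reads, API calls, and other tools.

3.1 Tool design best practices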

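from langchain_core.tools import tool

# ❌ Poor tool definition: vague name, vague docstring, one opaque parameter
@tool
def do_stuff(input: str) -> str:
    """Process data"""
    pass

# ✅ Good tool definition: specific name, typed parameters, documented return value
@tool
def analyze_csv_data(
    file_path: str,
    columns: list[str] | None = None,
    aggregation: str = "mean"
) -> str:
    """Analyze the data in a CSV file.

    Args:
        file_path: Path to the CSV file
        columns: Names of the columns to analyze; None means all columns
        aggregation: Aggregation method - mean/sum/count/max/min

    Returns:
        Analysis result as JSON, including a statistical summary and outliers
    """
    pass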
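The well-defined tool above is only a stub. As an illustration of what a body matching that signature could look like, here is a sketch that uses only the standard library. It covers the aggregation part and leaves out outlier detection; the JSON field names (`aggregation`, `rows`, `result`) are assumptions, not a format the article specifies.

```python
import csv
import json
import statistics
from typing import Optional

AGGREGATIONS = {
    "mean": statistics.fmean,
    "sum": sum,
    "count": len,
    "max": max,
    "min": min,
}

def analyze_csv_data(file_path: str, columns: Optional[list] = None,
                     aggregation: str = "mean") -> str:
    """Aggregate numeric CSV columns and return the result as JSON."""
    if aggregation not in AGGREGATIONS:
        # Report the problem as data so the agent can correct its next call.
        return json.dumps({"error": f"unknown aggregation: {aggregation}",
                           "allowed": sorted(AGGREGATIONS)})
    with open(file_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    names = columns if columns is not None else (list(rows[0]) if rows else [])
    summary = {}
    for name in names:
        values = []
        for row in rows:
            try:
                values.append(float(row[name]))
            except (KeyError, TypeError, ValueError):
                continue  # skip missing or non-numeric cells
        if values:
            summary[name] = AGGREGATIONS[aggregation](values)
    return json.dumps({"aggregation": aggregation, "rows": len(rows), "result": summary})
```

Two choices here matter for agents specifically: invalid arguments produce a readable error that lists the allowed values, and the return value is a compact summary, not the raw file contents.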

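4. Performance Comparison

ReAct shows its largest advantage on tasks that need external information (QA, research), with gains of 26-37%.

4.1 Token cost analysis

The largest share of ReAct's cost is the reasoning (Thought) phase, which accounts for about 40% of tokens. Two ways to reduce it: compress the history and cache common reasoning chains.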
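History compression can be as simple as keeping the most recent steps intact and shrinking the older ones. The sketch below assumes each step is stored as a dict with `thought`, `action`, and `observation` keys, which is a hypothetical trace format, and it uses plain truncation where a real system might summarize with a model.

```python
def compress_history(steps: list, keep_last: int = 2, max_chars: int = 80) -> list:
    """Keep the newest steps intact; drop old thoughts and truncate old observations."""
    cutoff = max(len(steps) - keep_last, 0)
    compressed = []
    for i, step in enumerate(steps):
        if i >= cutoff:
            compressed.append(dict(step))  # recent step: keep everything
            continue
        observation = step["observation"]
        if len(observation) > max_chars:
            observation = observation[:max_chars] + "...[truncated]"
        # Old step: the thought is dropped, the action and a short observation remain.
        compressed.append({"action": step["action"], "observation": observation})
    return compressed
```

Dropping old Thought text targets the phase that the paragraph above identifies as the most expensive, while keeping each action and a short observation preserves the record of what the agent has already tried.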

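4.2 Iteration depth analysis

The median run takes 3.5 iterations. More than 5 usually means the task is poorly defined or the tools are insufficient. Setting max_iterations=8 is a reasonable upper bound.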
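Those two thresholds can be enforced in the driver loop itself. The following sketch assumes a hypothetical `step` callable that advances the agent by one iteration and returns `(new_state, done)`; the names `run_with_budget` and `warn_after` are illustrative.

```python
import logging

logger = logging.getLogger("react")

def run_with_budget(step, state, warn_after: int = 5, max_iterations: int = 8):
    """Call step(state) until it reports completion or the budget runs out.

    Returns (state, iterations_used, finished).
    """
    for iteration in range(1, max_iterations + 1):
        state, done = step(state)
        if done:
            return state, iteration, True
        if iteration == warn_after:
            # Not finished after warn_after iterations: flag the run for review.
            logger.warning(
                "still running after %d iterations; check the task definition "
                "and the available tools", iteration)
    return state, max_iterations, False
```

Returning a `finished` flag instead of raising lets the caller decide what to do with a run that hit the cap, for example return a partial answer or hand off to a human.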

5. Advanced Optimization Techniques

5.1 ReAct + Reflection

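# `model` (a chat model) and `react_loop` are assumed to be defined elsewhere.
def react_with_reflection(state):
    # Standard ReAct loop
    result = react_loop(state)

    # Reflection phase
    reflection = model.invoke(f"""
    Review your reasoning process:
    {state['trace']}

    Final result: {result}

    Evaluate:
    1. Is the reasoning chain complete?
    2. Is any information missing?
    3. Is the conclusion reliable?
    4. If you started over, what would you do differently?

    End with a single line: "VERDICT: REVISE" or "VERDICT: OK".
    """)

    # model.invoke returns a message, so read the verdict from its text
    critique = reflection.content
    if "VERDICT: REVISE" in critique:
        # Retry once, passing the critique in as guidance
        return react_loop(state, improved_strategy=critique)
    return result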

5.2 Parallel ReAct

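import asyncio

# `react_with_strategy` and `select_best` are assumed to be defined elsewhere.
STRATEGIES = ["depth-first", "breadth-first", "tool-heavy"]

async def parallel_react(question, num_paths=3):
    """Run several reasoning paths in parallel and keep the best result."""
    tasks = [
        react_with_strategy(question, strategy)
        for strategy in STRATEGIES[:num_paths]
    ]
    results = await asyncio.gather(*tasks)

    # Pick the best result by voting
    return select_best(results, criteria=["accuracy", "completeness"])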
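The snippet above leaves `react_with_strategy` and `select_best` undefined. Here is a self-contained sketch of the same idea in which selection is a simple majority vote. The canned answers stand in for real agent runs, and `majority_vote` is one possible selection rule among several, not the article's `select_best`.

```python
import asyncio
from collections import Counter

async def react_with_strategy(question: str, strategy: str) -> str:
    """Hypothetical stand-in for one ReAct run; a real version would call the agent."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    canned = {
        "depth-first": "4 degrees",
        "breadth-first": "4 degrees",
        "tool-heavy": "5 degrees",
    }
    return canned[strategy]

def majority_vote(answers: list) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]

async def parallel_react(question: str,
                         strategies=("depth-first", "breadth-first", "tool-heavy")) -> str:
    # return_exceptions=True: a failed path is returned as an exception object,
    # so it does not raise and discard the results of the other paths.
    results = await asyncio.gather(
        *(react_with_strategy(question, s) for s in strategies),
        return_exceptions=True,
    )
    answers = [r for r in results if not isinstance(r, BaseException)]
    if not answers:
        raise RuntimeError("all reasoning paths failed")
    return majority_vote(answers)

print(asyncio.run(parallel_react("Beijing vs Shanghai temperature difference?")))
```

Exact-match voting only works when answers are short and normalized. For free-text answers you would need to canonicalize them first or use a model as the judge. Running N paths also multiplies token cost by roughly N.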

5.3 Caching tool results

import hashlib
import json
import time

class CachedToolExecutor:
    def __init__(self, tools, ttl_seconds=3600):
        self.tools = tools  # maps tool name -> async callable
        self.cache = {}
        self.ttl = ttl_seconds

    async def execute(self, tool_name, args):
        # Build a stable key from the tool name and its arguments
        # (MD5 is acceptable here: it is a cache key, not a security boundary)
        cache_key = hashlib.md5(
            f"{tool_name}:{json.dumps(args, sort_keys=True)}".encode()
        ).hexdigest()

        entry = self.cache.get(cache_key)
        if entry is not None and time.time() - entry['timestamp'] < self.ttl:
            return entry['result']  # cache hit

        # Cache miss or expired entry: call the tool and store the result
        result = await self.tools[tool_name](**args)
        self.cache[cache_key] = {'result': result, 'timestamp': time.time()}
        return result

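6. Lessons from Production

Set a reasonable iteration cap (8) to prevent infinite loops
Write detailed tool descriptions: the LLM chooses tools based on them
Keep Observations short: do not dump the entire API response into the context
Add a reflection step so the agent reviews its own reasoning
Monitor token usage: ReAct costs 3-5x more than answering directly
Cache tool results to cut down on repeated calls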
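The point about keeping Observations short is easy to apply at the boundary between the tool and the agent. A minimal sketch, assuming the tool returns a dict and that you know which fields the agent actually needs (`trim_observation` and its parameters are hypothetical names):

```python
import json

def trim_observation(raw: dict, keep: list, max_chars: int = 200) -> str:
    """Keep only the listed fields and cap the serialized length."""
    slim = {key: raw[key] for key in keep if key in raw}
    text = json.dumps(slim, ensure_ascii=False)
    if len(text) > max_chars:
        text = text[:max_chars] + "...[truncated]"
    return text

api_response = {
    "temp": 15,
    "condition": "sunny",
    "wind": "NE 3-4",
    "request_id": "abc123",      # metadata the agent does not need
    "raw_html": "x" * 5000,      # bulky payload that would flood the context
}
print(trim_observation(api_response, keep=["temp", "condition"]))
```

Field selection does most of the work here. The character cap is only a backstop for fields that turn out to be larger than expected.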

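ReAct is not the flashiest pattern, but it is the most practical. Simple, reliable, and explainable: exactly what a production environment needs.

Author: JiaDe Wu | AWS Solutions Architect | sample-OpenClaw-on-AWS-with-Bedrock Owner | GitHub: github.com/JiaDe-Wu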

DEV Community