Why Your AI System Doesn't Need an LLM for Routing: The Reflex Routing Design Pattern Explained

#ai #architecture #llm #systemdesign

Introduction

When we build AI systems on top of large language models, a common instinct is: "Let the LLM decide everything." Got a user message? Hand it to the LLM to classify. Need to pick a tool? Let the LLM choose. Need to judge tone? The LLM has the final say.

But this instinct has three fatal flaws: latency, cost, and non-determinism.

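The Core Problem

Imagine that every time your AI assistant receives a message, it first spends 500ms–2s having the LLM decide "what type of request is this," and then another 1–3s generating the response. The latency the user perceives is the sum of the two LLM calls, roughly 1.5–5s end to end.

Worse, LLM routing is probabilistic. For the same sentence, "check the weather for me," the LLM might classify it as a "tool call" 95% of the time and occasionally as "chitchat." This non-determinism produces ghost bugs that are hard to debug.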

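Reflex Routing: Deterministic Intelligent Dispatch

Reflex routing borrows a concept from the human nervous system: not every decision needs to go through the cerebral cortex. When you touch something hot, a spinal reflex is enough to make you pull your hand away.

Four design principles:

Determinism First: keywords + regex → 100% reproducible
Layered Interception: Safety → Sovereignty → Cognition → Evolution → Integration
Load on Demand: inject only the knowledge modules that were hit
Fast Path: the entire routing step takes < 20ms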
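
To make the "keywords + regex" principle concrete, here is a minimal sketch of what a deterministic cluster detector could look like. The patterns and the `CLUSTER_PATTERNS` table are hypothetical placeholders, not the author's actual rules, and the binary 0.0/1.0 scoring is an assumption made to keep the example short.

```python
import re
from typing import Dict

# Hypothetical patterns, one list per tier; a real system would curate these.
CLUSTER_PATTERNS = {
    "tier_a": [re.compile(r"\b(delete everything|wipe the disk)\b", re.I)],
    "tier_b": [re.compile(r"\b(decide for me|just pick for me)\b", re.I)],
    "tier_c": [re.compile(r"\b(obviously|everyone knows)\b", re.I)],
    "tier_d": [re.compile(r"\b(experiment|prototype|try out)\b", re.I)],
    "tier_e": [re.compile(r"\b(roadmap|long[- ]term|quarterly)\b", re.I)],
}


def detect_clusters(message: str) -> Dict[str, float]:
    """Score a tier 1.0 if any of its patterns matches the message, else 0.0."""
    scores = {}
    for tier, patterns in CLUSTER_PATTERNS.items():
        matched = any(p.search(message) for p in patterns)
        scores[tier] = 1.0 if matched else 0.0
    return scores
```

Because the function uses only compiled regular expressions and no sampling, the same message always yields the same scores, which is the property the first principle asks for.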

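Five-Tier Reflex Clusters

Tier	Analogy	Function
A: Safety	Braking system	Intercepts danger signals immediately
B: Sovereignty	Steering-wheel protection	Prevents loss of decision-making autonomy
C: Cognition	Anti-self-deception	Detects cognitive biases and blind spots
D: Evolution	Controlled mistakes	Provides a safe space for experimentation
E: Integration	Slow-layer activation	Manages long-term strategy and rhythm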
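
One way to read the layered-interception idea is that tiers are checked in a fixed priority order and the first tier that fires wins. The sketch below assumes exactly that reading; the function name `first_intercepting_tier`, the `tier_a` to `tier_e` keys, and the 0.5 threshold are illustrative choices rather than details confirmed by the article.

```python
from typing import Dict, Optional

# Assumed priority order: Safety, Sovereignty, Cognition, Evolution, Integration.
TIER_PRIORITY = ["tier_a", "tier_b", "tier_c", "tier_d", "tier_e"]


def first_intercepting_tier(
    scores: Dict[str, float], threshold: float = 0.5
) -> Optional[str]:
    """Return the highest-priority tier whose score exceeds the threshold."""
    for tier in TIER_PRIORITY:
        if scores.get(tier, 0.0) > threshold:
            return tier
    return None  # no tier fired


# A safety hit outranks an evolution hit, regardless of dict order or score.
print(first_intercepting_tier({"tier_d": 0.9, "tier_a": 0.8}))
```

Checking tiers in a fixed list order, rather than picking the highest score, means a safety signal can never be outvoted by a stronger match in a lower-priority tier.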

Routing Flow

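from dataclasses import dataclass

# detect_clusters, select_modules, and select_tools are assumed to be
# defined elsewhere in the system.


# Assumed shape of RoutingResult; the original does not show its definition.
@dataclass
class RoutingResult:
    loop: str      # 'FAST', 'SLOW', or 'EXPLORATION'
    modules: list  # knowledge modules to inject
    tools: list    # tool schemas to expose


def route(message: str, history: list) -> RoutingResult:
    # history is accepted but not used in this excerpt.
    # Step 1: Cluster detection (keywords + regex, < 5ms)
    scores = detect_clusters(message)

    # Step 2: Loop selection
    if scores['tier_a'] > 0.5 or scores['tier_b'] > 0.5:
        loop = 'FAST'
    elif scores['tier_d'] > 0.5 or scores['tier_e'] > 0.5:
        loop = 'SLOW'
    else:
        # Everything else, including tier_c hits and messages with no hits
        loop = 'EXPLORATION'

    # Step 3: Module injection (only relevant knowledge)
    modules = select_modules(scores)  # 0-6 modules

    # Step 4: Tool routing (only needed tool schemas)
    tools = select_tools(scores)      # 5-15 vs 42+ total

    return RoutingResult(loop, modules, tools)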
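
Steps 3 and 4 rely on two selection helpers that are not shown. A minimal sketch of how they might work follows, assuming each tier maps to a fixed list of modules and tools. The registries `MODULES_BY_TIER` and `TOOLS_BY_TIER`, every name inside them, and the shared `_select` helper are hypothetical.

```python
from typing import Dict, List

# Hypothetical registries mapping each tier to knowledge modules and tools.
MODULES_BY_TIER = {
    "tier_a": ["safety_guidelines"],
    "tier_c": ["bias_checklist"],
    "tier_e": ["planning_playbook"],
}
TOOLS_BY_TIER = {
    "tier_a": ["escalate_to_human"],
    "tier_d": ["sandbox_run"],
    "tier_e": ["calendar_lookup", "notes_search"],
}


def _select(
    registry: Dict[str, List[str]], scores: Dict[str, float], threshold: float
) -> List[str]:
    """Collect registry entries for every tier scoring above the threshold."""
    selected: List[str] = []
    for tier in sorted(scores):  # sorted for a stable, reproducible order
        if scores[tier] > threshold:
            for name in registry.get(tier, []):
                if name not in selected:
                    selected.append(name)
    return selected


def select_modules(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    return _select(MODULES_BY_TIER, scores, threshold)


def select_tools(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    return _select(TOOLS_BY_TIER, scores, threshold)
```

The point of the design is that a message hitting no tier yields empty lists, so nothing extra is injected into the prompt for that request.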

Performance

Metric	LLM Routing	Reflex Routing
Latency	500–2000ms	< 20ms
Consistency	~95%	100%
Token Usage	200–500/call	0
Debuggability	Low (black box)	High (traceable)
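
Numbers like these depend heavily on the model, hardware, and rule set, so it is worth measuring your own router. The sketch below is one possible harness, using a hypothetical single-rule `reflex_route` function as the thing under test. It reports mean latency per call and checks that repeated calls agree.

```python
import re
import time
from typing import Callable

# Hypothetical single-rule router used only as a benchmark subject.
PATTERN = re.compile(r"\b(weather|forecast)\b", re.I)


def reflex_route(message: str) -> str:
    return "tool_call" if PATTERN.search(message) else "chat"


def mean_latency_ms(func: Callable[[str], str], message: str, runs: int = 1000) -> float:
    """Average wall-clock time per call, in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        func(message)
    return (time.perf_counter() - start) * 1000 / runs


def is_consistent(func: Callable[[str], str], message: str, runs: int = 100) -> bool:
    """True if every call returns the same label for the same message."""
    return len({func(message) for _ in range(runs)}) == 1


if __name__ == "__main__":
    msg = "Can you check the weather for me?"
    print(f"mean latency: {mean_latency_ms(reflex_route, msg):.4f} ms")
    print(f"consistent: {is_consistent(reflex_route, msg)}")
```

The same two helpers can be pointed at an LLM-backed classifier to get a like-for-like comparison on your own traffic.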

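When to Use LLM Routing?

Highly ambiguous intent: metaphor and sarcasm cannot be caught by keywords
Semantic understanding required: meaning depends on context rather than surface vocabulary
Very large classification space: 50+ intent categories

Best practice: a hybrid architecture, with reflex routing handling the 80% of cases that are deterministic and LLM routing handling the 20% that are ambiguous.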
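
A hybrid setup can be sketched as "try the reflex layer first, and call the LLM only when no rule fires with enough confidence." The example below assumes that reading. `hybrid_route`, the injected `reflex` and `llm_classify` callables, and the 0.5 threshold are illustrative. The LLM call is passed in as a plain function so that no specific vendor API is implied.

```python
from typing import Callable, Dict, Tuple


def hybrid_route(
    message: str,
    reflex: Callable[[str], Dict[str, float]],
    llm_classify: Callable[[str], str],
    threshold: float = 0.5,
) -> Tuple[str, str]:
    """Return (label, source), where source is 'reflex' or 'llm'."""
    scores = reflex(message)
    if scores:
        # Iterate keys in sorted order so ties resolve the same way every time.
        best = max(sorted(scores), key=lambda tier: scores[tier])
        if scores[best] > threshold:
            return best, "reflex"
    # No rule fired confidently: fall back to the slower, probabilistic path.
    return llm_classify(message), "llm"


# Usage with stand-in callables (both are stubs, not real classifiers).
def stub_reflex(message: str) -> Dict[str, float]:
    return {"tier_a": 1.0 if "danger" in message else 0.0}


def stub_llm(message: str) -> str:
    return "ambiguous_intent"


print(hybrid_route("danger ahead", stub_reflex, stub_llm))
print(hybrid_route("well, that went great...", stub_reflex, stub_llm))
```

Returning the source alongside the label keeps the routing decision traceable, so you can log how often the fallback is actually used and check whether the 80/20 split holds for your traffic.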

Conclusion

The engineering challenge of AI systems isn't "making the LLM do more"; it's precisely defining what the LLM should do. Use deterministic logic for deterministic problems, and save the compute for creativity and reasoning.

DEV Community