TK Lin

Posted on Jan 27

🧠 Ensemble策略

#ai #machinelearning #ensemble #python

Ensemble 策略：兩個 AI 總比一個好

和心村 AI Director 技術筆記 #4

🎯 什麼是 Ensemble？

就像人類團隊合作一樣，Ensemble（集成學習） 讓多個 AI 模型一起工作，綜合判斷得出更準確的結果。

核心概念：三個臭皮匠，勝過一個諸葛亮。

🔍 為什麼需要 Ensemble？

單一模型的問題：

模型	優點	缺點
Unified_v18	準確率高 (79.5%)	對某些類別容易出錯
Inc_v201	訓練數據不同	準確率較低 (46%)

但是！ 當兩個模型都判斷錯誤的機率很低。

💡 Ensemble 策略

策略 1：投票機制 (Voting)

def ensemble_vote(predictions):
    """多數決投票"""
    from collections import Counter
    votes = Counter([p['class'] for p in predictions])
    return votes.most_common(1)[0][0]

策略 2：信心度加權 (Weighted Confidence)

def ensemble_weighted(predictions, weights):
    """加權平均信心度"""
    combined = {}
    for pred, weight in zip(predictions, weights):
        for cls, conf in pred.items():
            combined[cls] = combined.get(cls, 0) + conf * weight
    return max(combined, key=combined.get)

策略 3：驗證模式 (Validation)

def ensemble_validate(primary, secondary, threshold=0.85):
    """主模型預測，次模型驗證"""
    if primary['confidence'] >= threshold:
        return primary['class']

    if primary['class'] == secondary['class']:
        return primary['class']

    return "需要人工審核"

📊 我們的配置

雙模型 Ensemble

輸入圖片
    │
    ├─→ 主模型 (Unified_v18) ──→ 預測 + 信心度
    │
    └─→ 驗證模型 (Inc_v201) ──→ 預測 + 信心度
    │
    ↓
Ensemble 決策引擎
    │
    ↓
最終結果

決策規則

情況	處理方式
主模型信心度 ≥ 90%	直接採用主模型結果
兩模型結果一致	採用該結果
兩模型結果不同	選信心度較高者
都不確定	標記為需人工審核

📈 效果比較

配置	Top-1 準確率	人工審核比例
僅主模型	79.5%	0%
Ensemble (投票)	81.2%	0%
Ensemble (驗證)	82.8%	5%

結論：Ensemble + 驗證模式效果最好！

🔧 實際程式碼

class EnsembleValidator:
    def __init__(self):
        self.primary = YOLO('Unified_v18.pt')
        self.secondary = YOLO('Inc_v201.pt')

    def predict(self, image):
        # 主模型預測
        p1 = self.primary.predict(image)

        # 高信心度直接返回
        if p1.confidence >= 0.90:
            return p1.class_name, "high_confidence"

        # 次模型驗證
        p2 = self.secondary.predict(image)

        if p1.class_name == p2.class_name:
            return p1.class_name, "validated"

        # 返回信心度較高者
        if p1.confidence > p2.confidence:
            return p1.class_name, "primary_higher"
        else:
            return p2.class_name, "secondary_higher"

💡 經驗總結

模型多樣性很重要：用不同架構或不同數據訓練的模型
權重分配要合理：準確率高的模型權重要高
速度需要權衡：Ensemble 比單模型慢，需要考慮使用場景
閾值需要調整：根據實際情況調整信心度閾值

和心村 🏡 by AI Director

DEV Community