NydarTrading

Posted on • Originally published at nydar.co.uk

Meta-Labeling: Filtering Bad Trades Before They Happen

The Problem with Raw Predictions

Imagine a model that's 55% accurate. That means 45% of its signals are wrong. If you follow every signal, you're taking a lot of bad trades alongside the good ones.

What if there was a way to know which predictions are likely to be correct — before the trade happens?

That's meta-labeling.


What Is Meta-Labeling?

Meta-labeling is a two-stage prediction framework popularised by Marcos Lopez de Prado in Advances in Financial Machine Learning. The concept is simple:

Stage 1 — Primary Model: Predicts the direction (bullish or bearish). This is our XGBoost model.

Stage 2 — Meta Model: Takes the primary model's prediction and asks: "Is this specific prediction likely to be correct?"

The meta model doesn't predict direction — it predicts the quality of the primary prediction. It outputs a confidence score. If the meta-confidence is below our threshold, we withhold the signal.

Think of It Like a Quality Filter

  • Primary model: "I think BTC will go up"
  • Meta model: "I'm 72% confident that prediction is correct" → Signal passes

vs.

  • Primary model: "I think ETH will go down"
  • Meta model: "I'm only 48% confident that prediction is correct" → Signal withheld

The result: fewer signals, but higher quality.
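The gate described above can be sketched in a few lines. This is a minimal illustration, not Nydar's actual code; the function and argument names (`gate_signal`, `primary_direction`, `meta_confidence`) are hypothetical, and the 0.60 default mirrors the threshold discussed later in this post.

```python
from typing import Optional

def gate_signal(primary_direction: str, meta_confidence: float,
                threshold: float = 0.60) -> Optional[str]:
    """Pass the primary model's signal only when the meta model is confident enough."""
    return primary_direction if meta_confidence >= threshold else None

# The two examples above:
print(gate_signal("up", 0.72))    # passes -> "up"
print(gate_signal("down", 0.48))  # withheld -> None
```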


How We Train the Meta Model

The training process uses a 70/30 split within each walk-forward window:

  1. Split the training data: 70% for the primary model, 30% for the meta model
  2. Train the primary model on the 70% portion
  3. Get primary predictions on the held-out 30%
  4. Create meta-labels: For each prediction, label it 1 (correct) or 0 (incorrect)
  5. Train the meta model on these correctness labels, using the original features plus the primary model's confidence as inputs

The meta model is a separate XGBoost classifier, more heavily regularised than the primary (shallower trees, stronger penalty terms) to avoid overfitting.

Critically, the meta model optimises for a different objective than the primary. The primary model optimises for direction prediction; the meta model optimises for knowing when the primary is right. These are different problems.
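The five steps above can be sketched with plain Python. This is a toy illustration under stated assumptions: the majority-class "primary model" and the constant 0.5 confidence placeholder stand in for the two XGBoost classifiers and `predict_proba`, and the data is random noise.

```python
import random

random.seed(0)

def split_70_30(rows):
    """Chronological 70/30 split -- no shuffling, so no look-ahead."""
    cut = int(len(rows) * 0.7)
    return rows[:cut], rows[cut:]

# Toy data: (feature_vector, true_direction) pairs; direction is 0 or 1
data = [([random.random()], random.choice([0, 1])) for _ in range(100)]
primary_rows, meta_rows = split_70_30(data)

# Stand-in "primary model": predicts the majority class of its training data
majority = round(sum(y for _, y in primary_rows) / len(primary_rows))
def predict_primary(x):
    return majority

# Meta-labels on the held-out 30%: 1 if the primary was correct, else 0
meta_labels = [int(predict_primary(x) == y) for x, y in meta_rows]

# Meta features = original features + primary confidence (placeholder 0.5
# here; the real system would append the primary's predicted probability)
meta_features = [x + [0.5] for x, _ in meta_rows]
```

The meta model would then be fit on `meta_features` against `meta_labels`, turning "was the primary right?" into an ordinary binary classification problem.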


Our Experiment Results

We tested meta-labeling across 10 cryptocurrencies at three confidence thresholds:

1-Hour Timeframe

| Threshold | Accuracy | Coverage | Improvement |
|---|---|---|---|
| No filter (baseline) | 54.9% | 100% | |
| Meta ≥ 0.55 | 56.0% | 53% | +1.1% |
| Meta ≥ 0.60 | 56.3% | 44% | +1.4% |
| Meta ≥ 0.65 | 56.1% | 38% | +1.2% |

4-Hour Timeframe

| Threshold | Accuracy | Coverage | Improvement |
|---|---|---|---|
| No filter | 52.3% | 100% | |
| Meta ≥ 0.55 | 52.5% | 48% | +0.2% |
| Meta ≥ 0.60 | 52.2% | 41% | -0.1% |
| Meta ≥ 0.65 | 52.8% | 36% | +0.5% |

Daily Timeframe

| Threshold | Accuracy | Coverage | Improvement |
|---|---|---|---|
| No filter | 51.2% | 100% | |
| Meta ≥ 0.55 | 54.4% | 52% | +3.2% |
| Meta ≥ 0.60 | 54.7% | 43% | +3.5% |
| Meta ≥ 0.65 | 55.7% | 39% | +4.5% |

The daily timeframe showed the strongest meta-labeling effect — a +4.5% improvement is substantial.


The Accuracy vs Coverage Tradeoff

This is the core tension in meta-labeling. Higher thresholds mean:

  • Higher accuracy on signals that pass
  • Lower coverage (fewer signals generated)

At threshold 0.65 on the daily timeframe, you only get signals ~39% of the time. The other 61% of periods, the meta model says "I'm not confident enough" and no signal is generated.

Is this a problem? It depends on your perspective:

  • For active traders who want constant signals: Yes, reduced coverage is frustrating
  • For quality-focused traders who prefer fewer, better trades: Meta-labeling is exactly what you want
  • For automated systems (like our trading bot): Fewer but higher-quality signals actually improve risk-adjusted returns

We chose threshold 0.60 as the default — it gives the best accuracy-to-coverage balance on the hourly timeframe where most of our signals are generated.
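The tradeoff is easy to compute for yourself. A sketch with made-up data: `preds` holds (meta_confidence, was_correct) pairs, and both metrics are taken over only the signals that clear the threshold (coverage) versus all signals.

```python
def accuracy_and_coverage(preds, threshold):
    """preds: list of (meta_confidence, correct) pairs, correct in {0, 1}."""
    passed = [correct for conf, correct in preds if conf >= threshold]
    coverage = len(passed) / len(preds)
    accuracy = sum(passed) / len(passed) if passed else 0.0
    return accuracy, coverage

# Six illustrative predictions
preds = [(0.70, 1), (0.65, 1), (0.62, 0), (0.58, 1), (0.50, 0), (0.45, 1)]
for t in (0.55, 0.60, 0.65):
    acc, cov = accuracy_and_coverage(preds, t)
    print(f"threshold {t}: accuracy {acc:.0%}, coverage {cov:.0%}")
```

Raising the threshold monotonically shrinks coverage, but accuracy only improves if the meta model's confidence is actually informative, which is what the tables above are testing.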


Per-Coin Results

Meta-labeling doesn't help every coin equally:

| Coin | Baseline | With Meta | Improvement |
|---|---|---|---|
| AUCTION | 54.8% | 63.9% | +9.2% |
| BTC (1d) | 50.6% | 66.1% | +15.5% |
| ETH (1d) | 55.0% | 62.7% | +7.7% |
| SOL | 53.7% | 54.9% | +1.2% |
| ETH (1h) | 55.6% | 55.2% | -0.5% |
| HIVE (1d) | 51.0% | 49.8% | -1.2% |

Some observations:

  • BTC daily showed the largest improvement (+15.5%), though this is on a small test set (60 samples per fold)
  • AUCTION was consistently the most improved by meta-labeling across timeframes
  • ETH on 1h actually got slightly worse — the meta model occasionally filtered out predictions that would have been correct
  • HIVE daily was slightly negative, suggesting the meta model doesn't generalise well for all low-cap altcoins

What the Meta Model Learns

The meta model's feature importance reveals what it's actually learning:

  1. Primary model confidence (the probability the primary assigns to its prediction) is the single most important feature — unsurprisingly, more confident predictions are more likely to be correct
  2. Volatility indicators (ATR, Bollinger Width) rank high — the primary model's predictions are less reliable during high-volatility periods
  3. Trend indicators (EMA alignment, MACD) — predictions during clear trends are more reliable
  4. Volume — higher volume periods produce more reliable predictions

In other words, the meta model learns to trust the primary model more when:

  • The primary is highly confident
  • Volatility is moderate (not extreme)
  • There's a clear trend (not choppy/ranging)
  • Volume confirms the move
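The ranking above can be inspected directly from a trained model. The scores below are invented purely to mirror the reported ordering; in practice they would come from the fitted meta model (XGBoost's scikit-learn wrapper exposes them as `feature_importances_`), and the feature names are illustrative.

```python
# Invented scores that mirror the ranking reported above -- in the real
# system these would be read from the trained meta model's
# feature_importances_ attribute.
importances = {
    "primary_confidence": 0.31,
    "atr": 0.14,
    "bollinger_width": 0.12,
    "ema_alignment": 0.10,
    "macd": 0.09,
    "volume": 0.08,
}
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{name:<20} {score:.2f}")
```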

Implementation Considerations

If you're building a similar system:

Train/meta split matters. We use 70/30 within each walk-forward window. Too little meta-training data (e.g., 90/10) makes the meta model unreliable. Too much (e.g., 50/50) starves the primary model.

The meta model should be more regularised. We use shallower trees (depth 5 vs 8) and higher regularisation. The meta model sees fewer samples and has an easier classification task.

Include primary confidence as a meta feature. This is the single most important feature for the meta model. Without it, meta-labeling performance drops significantly.

Walk-forward prevents leakage. The meta model must only be trained on data the primary model hasn't seen. Our 70/30 split within each walk-forward window ensures this.
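The leakage guarantee comes from how the windows are cut. A sketch of rolling walk-forward windows with the 70/30 split inside each one (window and step sizes here are illustrative, not Nydar's actual configuration):

```python
def walk_forward_windows(n_samples, train_size, test_size):
    """Yield (primary_idx, meta_idx, test_idx) ranges for each rolling window.

    Within every window: first 70% of training data fits the primary model,
    the next 30% fits the meta model, and the out-of-sample test block
    follows both -- so the meta model only ever sees predictions on data
    the primary was not trained on.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        cut = start + int(train_size * 0.7)
        primary_idx = range(start, cut)
        meta_idx = range(cut, start + train_size)
        test_idx = range(start + train_size, start + train_size + test_size)
        yield primary_idx, meta_idx, test_idx
        start += test_size  # roll the window forward by one test block

windows = list(walk_forward_windows(n_samples=1000, train_size=500, test_size=100))
```

Because each range starts where the previous one ends, the ordering primary < meta < test holds by construction in every window.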


Key Takeaways

  1. Meta-labeling improves accuracy by 1-5% depending on timeframe and threshold
  2. Coverage drops to 39-44% — you trade less often but with higher quality
  3. Daily timeframe benefits most (+4.5% at threshold 0.65)
  4. Not all coins benefit equally — works best on BTC, ETH, and mid-cap tokens
  5. The accuracy-coverage tradeoff is real — you need to decide what matters more for your strategy

Part of Our Research Series

  1. 13,500 Model Fits Later: What Actually Works — Overview
  2. Why We Chose XGBoost Over LSTM — Model comparison
  3. How Macro Indicators Predict Crypto Prices — Macro features
  4. This post — Meta-labeling and signal quality

Full methodology: How Our AI Works


AI trading signals are probabilistic predictions, not financial advice. Meta-labeling improves signal quality but does not eliminate risk. Past performance does not guarantee future results.


Originally published at Nydar. Nydar is a free trading platform with AI-powered signals and analysis.
