We Added 5 Regime Filters. They Don't Do Much. Here's Why That's Interesting.

#ai #python #tutorial #machinelearning

What we tested

This week we added 5 regime filters to the cohort API: same_vrp_bucket (variance risk premium), same_term_bucket (VIX term structure), same_credit_bucket (HYG/LQD credit spread proxy), same_curve_bucket (yield curve slope), and same_breadth_bucket (market breadth). The academic literature on return predictability says these should materially condition forward return distributions, with VRP specifically called out as the best single-factor regime predictor.

We ran the test honestly: 200 anchors with known 5d and 10d forward returns, six cohort modes per anchor (baseline plus each regime filter applied alone), and two horizons, for 2,400 total cohort runs. For each run, we measured interquartile range (IQR) width, [p10, p90] band width, and held-out coverage of actuals.
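To make the three measurements concrete, here is a minimal sketch of how they could be computed for one cohort. The function names and the dictionary keys are illustrative assumptions, not the cohort API's actual interface; the inputs are simply an array of cohort forward returns and the anchor's realized return.

```python
import numpy as np

def cohort_metrics(cohort_returns, actual_return):
    """Width and coverage statistics for one cohort of forward returns (in %)."""
    r = np.asarray(cohort_returns, dtype=float)
    p10, p25, p75, p90 = np.percentile(r, [10, 25, 75, 90])
    return {
        "iqr_width": p75 - p25,        # interquartile range width
        "band80_width": p90 - p10,     # [p10, p90] band width
        "covered": bool(p10 <= actual_return <= p90),  # did the actual land in the band?
    }

def empirical_coverage(runs):
    """Share of (cohort_returns, actual_return) pairs whose actual fell inside [p10, p90]."""
    return float(np.mean([cohort_metrics(c, a)["covered"] for c, a in runs]))
```

A well-calibrated [p10, p90] band should cover roughly 80% of held-out actuals, which is the yardstick the coverage numbers in the results are read against.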

The result

Across the 5- and 10-day horizons, the 5 regime filters produced distribution widths within 0.4 percentage points of baseline; the largest shift in the figures below is 0.37pp, on the 10d IQR. Empirical coverage shifted by 1-2 percentage points. The cohort size n barely changed: baseline drew 198 neighbors; every filtered version drew 199-200.

5d baseline IQR: 4.17%. same_vrp: 4.21%. same_curve: 4.07%. Max shift: 0.14pp.
5d baseline 80% band ([p10, p90]) width: 8.78%. Max shift across filters: 0.21pp.
10d baseline IQR: 5.88%. same_credit: 6.25%. Max shift: 0.37pp.
Empirical [p10,p90] coverage on held-out actuals: baseline 73.5% (5d) / 71.0% (10d), regime-filtered all within ±2pp.

Why the filters don't bite

The filters are real and the columns they reference are populated across 24M+ embeddings. But at the ±0.15 percentile bucketing we chose, the filter keeps roughly 70% of the base pool. When you already have 200 near-neighbors from a 24M-row kNN, dropping 30% of candidates barely changes the cohort: any neighbor that gets filtered out is replaced by the next-closest candidate, which sits only slightly further down the ranking.
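A toy simulation illustrates the size of that effect. It is a sketch under loud assumptions: a 100,000-row pool stands in for the 24M-row index, distances are uniform random numbers rather than real embedding distances, and the filter is assumed to pass candidates independently of distance. Under those assumptions the filtered cohort only has to reach to roughly rank 200 / 0.7 ≈ 286 of the unfiltered ranking to fill its 200 slots.

```python
import numpy as np

rng = np.random.default_rng(0)

pool_size = 100_000    # stand-in for the 24M-row index (assumption, kept small)
k = 200                # cohort size
keep_fraction = 0.70   # share of the pool that passes the filter

# Hypothetical distances from one anchor to every candidate, sorted nearest first.
distances = np.sort(rng.random(pool_size))

# Assumption: the filter passes candidates independently of their distance.
passes = rng.random(pool_size) < keep_fraction

baseline_ranks = np.arange(k)                # ranks 0..199 in the unfiltered pool
filtered_ranks = np.flatnonzero(passes)[:k]  # ranks of the 200 nearest survivors

overlap = np.intersect1d(baseline_ranks, filtered_ranks).size
deepest_rank = int(filtered_ranks[-1])

print(f"shared neighbors: {overlap} of {k}")
print(f"deepest unfiltered rank used by the filtered cohort: {deepest_rank}")
print(f"distance at rank {k - 1}: {distances[k - 1]:.5f}")
print(f"distance at rank {deepest_rank}: {distances[deepest_rank]:.5f}")
```

In a pool this dense, the replacement neighbors are almost as close to the anchor as the ones they displace, which is one way a filter can change cohort membership without moving the return distribution.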

There's a second, subtler reason: the kNN search is over shape embeddings that were computed from price + volume + volatility signals. Patterns that are shape-similar tend to already be drawn from similar regimes — you don't get a roaring-bull-market pattern and a 2008-crash pattern as nearest neighbors. The regime filter is redundant with information the embedding already captured.

What this tells an agent builder

The lesson isn't that regime doesn't matter — it's that regime matters implicitly once you retrieve by shape. If you're already using shape-based kNN, layering a loose regime filter on top buys you very little. The cases where regime filtering WILL bite are:

Tight bucketing (±0.05 percentile) instead of loose (±0.15). This drops cohort size materially and should move distributions — at the cost of higher variance on the remaining estimate.
Interaction filters (same_vrp AND same_term AND same_credit) that restrict to a specific regime combination — probably the correct default when an agent is reasoning about a specific macro setup.
Regime-stratified calibration: fit separate conformal offsets per regime bucket so the bands reflect 'what happens in high-VIX high-VRP environments specifically.' This is probably where the real win lives.
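The third option can be sketched with a split-conformal offset fitted separately per regime bucket. This is an illustration only: it uses a CQR-style score (how far the actual falls outside the raw [p10, p90] band), which is an assumption about the method rather than a description of the calibration Chart Library runs, and the function names and regime labels are hypothetical.

```python
import numpy as np

def conformal_offset(lo, hi, actual, alpha=0.2):
    """CQR-style offset: how far to widen [lo, hi] to target ~(1 - alpha) coverage."""
    lo, hi, actual = (np.asarray(x, dtype=float) for x in (lo, hi, actual))
    # Positive when the actual fell outside the band, negative when it was inside.
    scores = np.maximum(lo - actual, actual - hi)
    n = scores.size
    rank = min(n, int(np.ceil((n + 1) * (1 - alpha))))  # finite-sample corrected rank
    return float(np.sort(scores)[rank - 1])

def offsets_by_regime(lo, hi, actual, regime, alpha=0.2):
    """Fit one offset per regime bucket label on a calibration set."""
    lo, hi, actual, regime = map(np.asarray, (lo, hi, actual, regime))
    return {
        str(label): conformal_offset(
            lo[regime == label], hi[regime == label], actual[regime == label], alpha
        )
        for label in np.unique(regime)
    }
```

At prediction time the calibrated band for an anchor would be [p10 - q, p90 + q], where q is the offset for that anchor's regime bucket. A bucket whose raw bands under-cover gets a larger q than one whose raw bands are already about right.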

What we're doing about it

The filters ship as-is because they do still constrain the cohort (just mildly), they're cheap to apply, and they give agents a clean way to say 'only match within similar macro conditions.' Users who want stronger effects can stack them — our MCP tool documentation now reflects that.

The next experiment is interaction filters: same_vrp AND same_term AND same_credit simultaneously, at a ±0.10 bucket. That should materially change cohort composition. If it does, we'll publish the delta; if it doesn't, we'll publish that too.
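As a rough sketch of what stacking means mechanically, the interaction filter is a logical AND of one mask per regime column. Everything here is assumed for illustration: the column names (vrp_pct, term_pct, credit_pct), the reading of a ±0.10 bucket as "candidate percentile within 0.10 of the anchor's percentile", and the helper names. The API's actual bucket definition may differ.

```python
import numpy as np

def same_bucket(candidate_pct, anchor_pct, half_width):
    """True where a candidate's percentile is within half_width of the anchor's."""
    return np.abs(np.asarray(candidate_pct, dtype=float) - anchor_pct) <= half_width

def stacked_mask(candidates, anchor, columns, half_width=0.10):
    """AND together one same-bucket mask per regime column."""
    mask = np.ones(len(candidates[columns[0]]), dtype=bool)
    for col in columns:
        mask &= same_bucket(candidates[col], anchor[col], half_width)
    return mask

# Hypothetical percentile columns for four candidates and one anchor.
candidates = {
    "vrp_pct": [0.50, 0.55, 0.90, 0.45],
    "term_pct": [0.20, 0.50, 0.25, 0.22],
    "credit_pct": [0.70, 0.72, 0.71, 0.95],
}
anchor = {"vrp_pct": 0.50, "term_pct": 0.20, "credit_pct": 0.70}

mask = stacked_mask(candidates, anchor, ["vrp_pct", "term_pct", "credit_pct"])
print(mask.tolist())
```

Each candidate must pass every filter at once, so the stacked mask can only be as permissive as the strictest single filter. That is why the interaction is expected to cut the candidate pool much harder than any filter alone.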

This is the kind of audit agent builders should demand from any historical-pattern API. If a provider claims their filters condition distributions, ask for the IQR shift. If they can't produce it, the filters are decoration. Ours are documented at chartlibrary.io/calibration.

Originally published at chartlibrary.io. Chart Library is the stock-market memory for AI agents — free Sandbox tier at chartlibrary.io/developers.