The problem every stock-research agent has
If you've built an AI agent that answers questions like 'what usually happens after a breakout like this in NVDA', you've hit the same wall everyone does: the model confidently narrates a number that has no historical backing. The base rate is either invented or recalled from whatever the model saw before its training cut-off, not computed from real data conditioned on the actual setup.
The fix is structural, not prompt-engineered. You need a tool the agent calls that returns real conditional base rates — not 'on average, NVDA goes up X%' but 'given this chart shape, filtered by current regime and sector, in a corpus of historical analogs that includes delisted names, here's the distribution of forward returns.' One call, one number the agent can reason about, one sample size so it knows when to hedge.
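To make the "one number, one sample size" idea concrete, here is a minimal sketch of how an agent-side helper might turn a returned base rate into a sentence that hedges when the sample is thin. The BaseRate shape, the describe function, and the min_n threshold are all illustrative assumptions, not part of the Chart Library API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseRate:
    """A conditional base rate as an agent might hold it (hypothetical shape)."""
    hit_rate: float          # fraction of analogs above entry, 0.0 to 1.0
    n: int                   # number of historical analogs behind the rate
    horizon_days: int
    included_delisted: bool  # True if delisted names are in the cohort


def describe(rate: BaseRate, min_n: int = 30) -> str:
    """Render a sentence whose confidence depends on sample size.

    min_n is an illustrative threshold, not a value taken from the API.
    """
    pct = f"{rate.hit_rate * 100:.1f}%"
    scope = "including" if rate.included_delisted else "excluding"
    core = (f"{pct} of {rate.n} analogs were above entry after "
            f"{rate.horizon_days} days ({scope} delisted names)")
    if rate.n < min_n:
        return f"Low confidence, small sample: {core}."
    return f"Historically, {core}."
```

The point of the design is that the sentence can never be produced without n and the survivorship flag, so the agent has no path to quoting a bare percentage.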
The primitive: POST /api/v1/cohort
Chart Library's Conditional Distribution endpoint is the smallest composable unit for this pattern. You send an anchor (symbol + date) and optional filters, you get back a cohort of historical matches plus the distribution of outcomes at 1/5/10 day horizons:
- Request fields: anchor (symbol NVDA, date 2024-06-18), horizons, top_k
- Response fields: cohort_id (coh_...) and, per horizon (5 in this example), n, return_pct (p10, p50, p90), hit_rate (above_entry), plus included_delisted and total_matches
Every response includes a cohort_id that stays valid for 15 minutes, so you can refine the cohort progressively, and a survivorship flag (included_delisted) so the agent knows whether delisted names are part of the base rate.
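As a rough sketch of what calling the endpoint could look like from Python, the snippet below builds the request with the standard library and leaves the actual send out so it runs offline. The host, the Bearer auth header, and the assumption that horizons is a list of day counts are guesses for illustration; only the path and the field names come from the text above.

```python
import json
from urllib.request import Request

# Assumed host and auth scheme; neither is confirmed by this article.
BASE_URL = "https://chartlibrary.io"


def build_cohort_request(symbol, date, api_key, horizons=(1, 5, 10), top_k=None):
    """Build (but do not send) a POST /api/v1/cohort request."""
    payload = {
        "anchor": {"symbol": symbol, "date": date},
        "horizons": list(horizons),  # assumed to be a list of day counts
    }
    if top_k is not None:
        payload["top_k"] = top_k
    return Request(
        BASE_URL + "/api/v1/cohort",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed header format
        },
        method="POST",
    )


def summarize(response):
    """Pull out the fields an agent should read before quoting any number."""
    return {
        "cohort_id": response.get("cohort_id"),
        "total_matches": response.get("total_matches"),
        "included_delisted": response.get("included_delisted"),
    }


req = build_cohort_request("NVDA", "2024-06-18", api_key="YOUR_KEY")
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Passing the decoded JSON response to summarize gives the agent the cohort handle and the survivorship flag in one place before it looks at any percentile.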
Three filter dimensions that matter
The reason shape-only matching doesn't produce alpha on its own is that outcomes are conditional on context. The cohort API takes three filter dimensions that meaningfully shift the distribution:
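- same_as_anchor: the sector filter value (referred to as same_sector in the example that follows); going by its name, it restricts matches to the anchor's own sector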
- filters.regime.same_vix_bucket = true keeps only matches whose VIX regime is within ±15 percentile points of today's
- filters.regime.same_trend = true matches the sign of the SPY 20d trend at the match date
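The dotted names in the list above suggest a nested filters object. A small sketch of how that object might be assembled follows; the regime keys come from the list, while the "sector" key and the helper name build_filters are assumptions made for illustration.

```python
def build_filters(same_vix_bucket=False, same_trend=False, sector=None):
    """Assemble a nested filters dict, omitting anything that is switched off."""
    regime = {}
    if same_vix_bucket:
        regime["same_vix_bucket"] = True
    if same_trend:
        regime["same_trend"] = True

    filters = {}
    if regime:
        filters["regime"] = regime
    if sector is not None:
        # Assumed key name; the article only shows the value same_as_anchor.
        filters["sector"] = sector
    return filters


example = build_filters(same_vix_bucket=True, sector="same_as_anchor")
```

Leaving disabled filters out entirely, rather than sending false, keeps the unfiltered request identical to a request with no filters key at all.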
Real example: unfiltered, NVDA 2024-06-18 shows 54% up at 5 days across 492 analogs. With same_sector + same_vix_bucket applied, the 1-day hit rate drops to 48.6% up while the 10-day hit rate rises to 55.2%. That is a meaningful conditional pattern (short-term mean reversion, medium-term continuation) that is invisible in the unconditional stats.
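One way an agent can decide how hard to lean on a hit rate is to put an interval around it. The sketch below uses a Wilson score interval, a standard binomial interval that is my choice here rather than anything the API is stated to return. It treats analogs as independent draws, which is an assumption that overlapping historical windows may violate.

```python
import math


def wilson_interval(hit_rate, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    denom = 1 + z ** 2 / n
    centre = (hit_rate + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(hit_rate * (1 - hit_rate) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# The unfiltered figure quoted above: 54% up across 492 analogs.
low, high = wilson_interval(0.54, 492)
```

For the unfiltered example this gives a range of roughly 50% to 58%, which is a useful reminder that a 54% hit rate on a few hundred analogs is only modestly distinguishable from a coin flip.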
The edge-mining loop (where it gets powerful)
Single calls are fine. The real leverage is the loop: start broad, ask which filter matters, narrow, repeat. Three tools:
- POST /api/v1/cohort — the initial cohort. Returns cohort_id.
- GET /api/v1/cohort/{id}/explain — ranks candidate filters (VIX regime, trend, recent-5-years) by how much each one shifts the above-entry hit rate. Tells the agent which dimension is actually moving the distribution for this specific setup.
- POST /api/v1/cohort/{id}/filter — narrows the stored cohort with whichever filter was most informative. No kNN re-run (sub-second) and returns a new cohort_id so agents can branch.
This is how agents (and humans) discover conditional structure rather than pattern-match to a canned base rate. The cohort_id keeps the expensive embedding search cached, so each refinement skips the search and stays cheap. Fork, compare, and keep the branch whose distribution is clearest at a sample size you still trust.
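The loop described above can be sketched as a small driver that takes the three calls as plain functions, so it works with any HTTP client or a test stub. The response shapes are assumptions: create and apply_filter are taken to return a dict with "cohort_id" and "n", and explain a list of dicts with "name" and "shift". The min_n and max_steps stopping rules are illustrative choices.

```python
def mine_edge(create, explain, apply_filter, anchor, min_n=50, max_steps=3):
    """Start broad, apply the most informative filter, repeat.

    Returns a list of (cohort_id, n, filter_name) tuples, one per kept step.
    Stops when no unused filter remains or the cohort would drop below min_n.
    """
    cohort = create(anchor)
    history = [(cohort["cohort_id"], cohort["n"], None)]
    applied = set()

    for _ in range(max_steps):
        candidates = [c for c in explain(cohort["cohort_id"])
                      if c["name"] not in applied]
        if not candidates:
            break
        # Pick the filter that moves the above-entry hit rate the most.
        best = max(candidates, key=lambda c: abs(c["shift"]))
        narrowed = apply_filter(cohort["cohort_id"], best["name"])
        if narrowed["n"] < min_n:
            break  # too few analogs left to trust the distribution
        applied.add(best["name"])
        cohort = narrowed
        history.append((cohort["cohort_id"], cohort["n"], best["name"]))

    return history
```

Because every step yields a new cohort_id, the history doubles as a list of branch points: an agent can go back to any earlier id and try the second-ranked filter instead.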
MCP: one tool call in any agent framework
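Example arguments: symbol NVDA and date 2024-06-18 for the initial cohort call, then the returned cohort_id (coh_...) for each of the two follow-up calls.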
Drop the MCP server into your CrewAI, LangGraph, AutoGen, or Claude function-calling setup. The agent discovers the tool, calls it, and returns a number grounded in real historical base rates instead of a number it made up.
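Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. A minimal sketch of that message follows; the envelope matches the MCP specification, but the tool name "cohort" and its argument names are placeholders, since the real names are whatever the chartlibrary-mcp server advertises through tools/list.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call_message(tool_name, arguments):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "cohort" and the argument keys are hypothetical placeholders.
message = tool_call_message("cohort", {"symbol": "NVDA", "date": "2024-06-18"})
wire = json.dumps(message)
```

In practice the agent framework builds and sends this for you; seeing the raw message mainly helps when debugging why a tool call was rejected.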
Why this matters
The next wave of AI agents in finance will be judged on whether their answers are wrong in ways users can't detect. A hallucinated base rate is indistinguishable from a real one at the language-output level. The only structural defense is to ground every claim in a retrieval call backed by real data — conditional, explicit, sample-sized, and survivorship-aware.
Chart Library's cohort primitive is built for exactly that pattern. Free sandbox tier, $29 Builder, $299 Agent (with burst + session handles + 1K req/min), and the MCP server is one pip install away.
Ready to build? Grab an API key at chartlibrary.io/developers and the MCP server on PyPI (chartlibrary-mcp). The conditional distribution primitive is live on the Free tier.
Originally published at chartlibrary.io. Chart Library is the stock-market memory for AI agents — free Sandbox tier at chartlibrary.io/developers.