DEV Community

grahammccain

Posted on • Originally published at chartlibrary.io

3 Patterns for AI Agents That Analyze Stock Charts

Why these three

If you've shipped an AI agent that answers stock questions, you've hit a predictable set of failure modes: it invents base rates, stops at the first retrieval instead of probing conditional structure, and strips the narrative hooks users actually remember. This post names the three patterns we ship in our own API so you can apply them regardless of which chart-data provider you end up using.

All three map to the same underlying idea: a stock-research agent should expose composable, retrieval-first primitives to the LLM and force synthesis into the final turn. Everything here is domain-independent: the techniques port to any agent answering 'what usually happens after X' questions in finance, sports, operations, or science.

Pattern 1: Grounded base rates (no hallucinated statistics)

The failure: Claude/GPT is happy to answer 'what usually happens after an NVDA-style breakout' with invented percentiles and sample sizes. The numbers sound real because they're formatted like real numbers.

The fix: a single tool that returns real conditional distributions with sample size and survivorship flag, plus a system prompt that forbids inventing forward-return statistics. The agent MUST call the tool before making any claim about 'typically' or 'usually.'

  • Tool returns: percentile distribution of forward returns (p10/p25/p50/p75/p90)
  • Per-horizon MAE (max adverse excursion) and MFE (max favorable excursion)
  • Realized vol distribution (for options or vol-scaling)
  • Hit rates: above-entry, MFE > 1%, MAE < -1%
  • Sample size n AND survivorship flag (how many delisted names in cohort)
  • Every response gets a cohort_id for downstream tools

System prompt template: 'You are a stock-research assistant. If the user asks about forward returns, hit rates, drawdowns, or pattern outcomes, you MUST call get_cohort_distribution first. Quote the sample size in your answer. Disclose the survivorship flag. Never quote a percentile you did not see in tool output.'

Seems obvious, but agents written without this constraint invariably produce authoritative-sounding sentences with zero grounding. Add the tool + the constraint together; neither works alone.
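The contract can be sketched in a few lines. `get_cohort_distribution` itself is whatever your provider exposes; the mock response and the `validate_cohort_response` guard below are illustrative, with field names taken from the bullet list above:

```python
# Sketch of the grounding contract. The validator refuses any tool response
# that lacks the fields the agent is allowed to quote; the mock response is
# illustrative, shaped like the bullet list in the post.

REQUIRED_FIELDS = {
    "percentiles",        # forward-return distribution: p10/p25/p50/p75/p90
    "mae", "mfe",         # per-horizon max adverse / favorable excursion
    "realized_vol",       # realized vol distribution
    "hit_rates",          # above-entry, MFE > 1%, MAE < -1%
    "n",                  # sample size
    "survivorship_flag",  # delisted names in the cohort
    "cohort_id",          # handle for downstream tools
}

def validate_cohort_response(resp: dict) -> dict:
    """Reject a response the agent could not fully ground its claims on."""
    missing = REQUIRED_FIELDS - resp.keys()
    if missing:
        raise ValueError(f"ungrounded response, missing: {sorted(missing)}")
    return resp

# Mock tool output (illustrative numbers, not real data):
mock = {
    "percentiles": {"p10": -4.1, "p25": -1.2, "p50": 0.8, "p75": 2.9, "p90": 6.3},
    "mae": {"5d": -3.2}, "mfe": {"5d": 4.1},
    "realized_vol": {"5d": 2.4},
    "hit_rates": {"above_entry": 0.54, "mfe_gt_1pct": 0.61, "mae_lt_neg_1pct": 0.47},
    "n": 491, "survivorship_flag": 3,
    "cohort_id": "c_abc123",
}

resp = validate_cohort_response(mock)
print(f"{resp['hit_rates']['above_entry']:.0%} above entry, n={resp['n']}")
```

The validator is the programmatic half of the constraint; the system prompt is the behavioral half. Both are needed, as the post says.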

Pattern 2: The edge-mining loop

The failure: agents stop at the first retrieval. They call the base-rate tool once, get an answer, and write it up. Whatever conditional structure lives INSIDE the cohort — the part that actually matters for trading — never surfaces.

The fix: expose two more tools the agent can chain after the initial cohort. One ranks which additional filter would move the distribution most (explain). The other applies a filter to fork a narrower cohort (refine). Agents iterate until they've identified the dimension that actually matters.

  • cohort(anchor, filters) returns cohort_id
  • explain(cohort_id, horizon) ranks candidate filters by |shift on above-entry rate|
  • refine(cohort_id, filter) applies that filter, returns new cohort_id
  • Agent loops: cohort → explain → refine → maybe explain again → synthesize

Why this works: sub-second refinement on a stored cohort (no repeat retrieval) means the agent can fork 5 branches and compare. Agents trained on tool use will do this naturally once they have the primitives. Agents routed through a LangGraph StateGraph execute the loop deterministically and use the model only for the final synthesis step.
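The loop itself is short. Here is a sketch with the three tools stubbed in memory; in a real agent each would be a tool call, and the stopping rule (minimum shift, maximum depth) is an assumed policy rather than part of the pattern:

```python
# Edge-mining loop sketch: cohort -> explain -> refine -> synthesize.
# All three tools are in-memory stubs; the shift numbers are illustrative.

COHORTS = {"c0": {"above_entry": 0.54, "filters": []}}

def explain(cohort_id: str) -> list[tuple[str, float]]:
    """Rank candidate filters by |shift on above-entry rate| (stubbed)."""
    shifts = {"c0": [("same_vix_bucket", -0.06), ("earnings_window", 0.01)],
              "c1": [("high_volume", 0.005)]}
    return sorted(shifts.get(cohort_id, []), key=lambda s: abs(s[1]), reverse=True)

def refine(cohort_id: str, filt: str, shift: float) -> str:
    """Fork a narrower cohort by applying the filter (stubbed)."""
    base = COHORTS[cohort_id]
    new_id = f"c{len(COHORTS)}"
    COHORTS[new_id] = {"above_entry": base["above_entry"] + shift,
                       "filters": base["filters"] + [filt]}
    return new_id

cohort_id, MIN_SHIFT, MAX_DEPTH = "c0", 0.02, 3
for _ in range(MAX_DEPTH):
    ranked = explain(cohort_id)
    if not ranked or abs(ranked[0][1]) < MIN_SHIFT:
        break  # nothing left that moves the distribution: time to synthesize
    filt, shift = ranked[0]
    cohort_id = refine(cohort_id, filt, shift)

print(cohort_id, COHORTS[cohort_id])
```

Because refinement operates on a stored cohort_id, each iteration is cheap, which is what makes forking and comparing multiple branches practical.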

Outputs agents write when given these tools: 'Baseline: 54% above-entry across 491 analogs. Narrowing by same_vix_bucket drops to 48% at 5d but climbs to 55% at 10d — short-term mean reversion, medium-term continuation.' That's a real trading insight, and it only emerges from the loop.

Pattern 3: Named analogs for narrative

The failure: your API returns 10 similar historical patterns, each a (symbol, date) tuple with a number. The agent dutifully lists them. The user's eyes glaze over. The most valuable piece — 'one of these analogs was Silicon Valley Bank the week it collapsed' — is invisible.

The fix: attach a named_event field to any match that falls inside a curated window of a famous market moment. It's a small curation job (30-50 events cover most of what retail traders and content agents care about) but it turns every match row into a potential headline.

  • Curate notable (symbol, date-range, label, description) tuples
  • At match-return time, check each match against the catalog
  • Attach {slug, label, description} to matches inside a window
  • UI renders a small colored pill; content agents pull label into a headline
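A sketch of the catalog lookup, with two hypothetical entries (the window widths and descriptions are illustrative; only the field names follow the list above):

```python
# Named-analog annotation sketch. Catalog entries, dates, and window widths
# are illustrative stand-ins for a real curated catalog.
from datetime import date

CATALOG = [
    {"slug": "sivb-collapse", "label": "SIVB collapse",
     "symbol": "SIVB", "start": date(2023, 3, 6), "end": date(2023, 3, 17),
     "description": "Silicon Valley Bank failure"},
    {"slug": "gme-squeeze", "label": "GME squeeze",
     "symbol": "GME", "start": date(2021, 1, 22), "end": date(2021, 2, 5),
     "description": "Retail short squeeze"},
]

def annotate(match: dict) -> dict:
    """Attach a named_event if (symbol, date) falls inside a curated window."""
    for event in CATALOG:
        if (match["symbol"] == event["symbol"]
                and event["start"] <= match["date"] <= event["end"]):
            match["named_event"] = {k: event[k]
                                    for k in ("slug", "label", "description")}
            return match
    match["named_event"] = None
    return match

hit = annotate({"symbol": "SIVB", "date": date(2023, 3, 8), "distance": 0.12})
miss = annotate({"symbol": "AAPL", "date": date(2023, 3, 8), "distance": 0.30})
print(hit["named_event"]["label"], miss["named_event"])
```

A linear scan is fine here: the catalog stays small by design, so there's no need for an interval index.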

Events to seed: bank collapses (SIVB, FRC, SBNY, CS, PACW), M&A exits (TWTR, ATVI), macro inflection points (COVID crash + bottom, 2022 bear low, Russia invasion, peak CPI), narrative blowoffs (GME squeeze, 2021 growth peak), AI-era milestones (NVDA 2023-05 breakout, DeepSeek shock). 30 events get you 80% coverage.

Outputs: 'NVDA today maps onto SIVB 2023-03-08 (bank collapse, -71% in 10 days) as the 4th-closest analog' — that's the sentence a market writer needs, and it comes for free from a tiny curated catalog.

Putting it together

All three patterns are live in Chart Library's API and MCP server. The system prompt in our Claude Agent example enforces pattern 1. The LangGraph example at github.com/grahammccain/chart-library/tree/main/integrations/langgraph shows pattern 2 as a StateGraph. Pattern 3 is a single named_event field on every top_match in cohort responses.

But the patterns are bigger than us. If you're building a stock-research agent on a different stack (Bloomberg API, Polygon direct, your own embeddings), implement these three primitives the same way and you'll have a system that grounds claims, discovers conditional structure, and surfaces narrative hooks automatically.

All three patterns work on the free Sandbox tier (200 calls/day). Grab a key at chartlibrary.io/developers and try the full loop in under 20 lines of Python.


Chart Library is the stock-market memory for AI agents. Free Sandbox tier at chartlibrary.io/developers.
