I want to tell you about the moment I realized our pricing team was running the same failed experiment for the third time in six months.
Same hypothesis. Same price delta. Same result: churn goes up, revenue goes sideways, everyone acts surprised.
The problem wasn't that people were dumb. The problem was that the knowledge of what happened last time lived nowhere. It was buried in some Slack thread or a spreadsheet nobody opened. So when someone new proposed the idea, there was nothing to stop them.
That's what ExpTrack AI is built to fix. It's a pricing experiment tracker with persistent AI memory — so the next time someone proposes a 20% price hike, the system can say: "hey, we literally tried this in Q1 and churn jumped 18%."
The stack: FastAPI + Groq (qwen/qwen3-32b) + Hindsight memory SDK + vanilla HTML/CSS/JS frontend. No frameworks. Hackathon-ready.
The Architecture — How It All Fits Together
Before diving into code, here's the mental model you need. There are three layers:
- Memory layer (Hindsight) — stores every experiment as a searchable vector memory
- Intelligence layer (Groq LLM) — analyzes past memories and generates insight
- Interface layer (HTML/JS frontend) — where humans interact with the system
The key insight is the learning loop. Every experiment that gets logged enriches the memory. Every future query gets smarter because of it. It's not a chatbot — it's an institutional memory that compounds.
The Learning Loop in Plain English
Here is the exact sequence of events when a user proposes a new pricing experiment:
1. User submits a proposal (experiment name, original price, proposed price, hypothesis)
2. Backend calls `hindsight.recall()` with a semantic query built from the hypothesis and price delta
3. The top 5 most relevant past experiments are retrieved from the vector store
4. Past memories plus the new proposal are sent to the Groq LLM as context
5. The LLM returns a Hindsight Insight — either a warning or a validation
6. The new experiment is stored in Hindsight as a `"pending"` memory
7. When the experiment ends, `PATCH /update-result` enriches the memory with the real outcome — closing the loop

Why step 7 matters: updating the memory with a real outcome transforms a hypothesis into grounded evidence. The next `recall()` for a similar proposal will surface this experiment — with the actual result.
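The whole loop can be sketched end to end. This is an illustrative Python sketch with a toy in-memory store and an injected LLM callable standing in for the real Hindsight and Groq SDKs; the function names `check_experiment` and `update_result` are my own, not the codebase's:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Toy stand-in for the Hindsight memory layer."""
    memories: dict = field(default_factory=dict)
    next_id: int = 0

    def recall(self, query: str, top_k: int = 5) -> list[str]:
        # Real Hindsight does semantic vector search; the toy returns everything.
        return list(self.memories.values())[:top_k]

    def store(self, content: str) -> int:
        self.next_id += 1
        self.memories[self.next_id] = content
        return self.next_id

    def update(self, memory_id: int, content: str) -> None:
        self.memories[memory_id] = content

def check_experiment(store, llm, name, original, proposed, hypothesis):
    delta = (proposed - original) / original * 100                    # steps 1-2
    past = store.recall(f"{hypothesis}. Price change of {delta:.1f}%.")  # step 3
    insight = llm(past, hypothesis)                                   # steps 4-5
    memory_id = store.store(f"[pending] {name}: {hypothesis}")        # step 6
    return memory_id, insight

def update_result(store, memory_id, outcome):                         # step 7
    store.update(memory_id,
                 store.memories[memory_id].replace("[pending]", f"[{outcome}]"))
```

Run it twice and the compounding is visible: the first proposal gets no history, the second one recalls the first experiment's outcome.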
The Backend — FastAPI + Hindsight + Groq
The backend has two endpoints. That's it. Clean, minimal, exactly what a hackathon needs.
Endpoint 1: POST /check-experiment
This is the pre-flight check. Before anyone runs an experiment, they hit this endpoint. Here's what happens inside:
# 1. Calculate price delta for semantic context
price_delta_pct = (proposed - original) / original * 100

# 2. Build a recall query combining hypothesis + price direction
recall_query = f"{hypothesis}. Price change of {price_delta_pct:.1f}%."

# 3. Retrieve top 5 semantically similar past experiments
recalled = hindsight.recall(
    query=recall_query,
    collection='pricing_experiments',
    top_k=5
)

# 4. Send past memories + new proposal to Groq LLM
response = groq.chat.completions.create(
    model='qwen/qwen3-32b',
    messages=[system_prompt, user_prompt_with_context]
)

# 5. Store new experiment as a pending memory
hindsight.store(
    content=memory_text,
    metadata={'status': 'pending', ...}  # other experiment fields elided
)
The recall query is intentionally rich — it combines the hypothesis intent with the price direction. This means if someone proposes a "premium tier price increase to reduce support load," the system will surface past premium tier experiments and past price increase experiments. Not just one or the other.
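That query construction is simple enough to sketch on its own. The helper name `build_recall_query` is mine, and the version below also spells out the price direction as a word, a small variant on the backend's query string:

```python
def build_recall_query(hypothesis: str, original: float, proposed: float) -> str:
    """Combine the hypothesis intent with the price direction so recall
    surfaces both topically similar and directionally similar experiments."""
    delta_pct = (proposed - original) / original * 100
    direction = "increase" if delta_pct > 0 else "decrease"
    return f"{hypothesis}. Price {direction} of {abs(delta_pct):.1f}%."

# e.g. a premium-tier proposal going from $49 to $59
query = build_recall_query(
    "Premium tier price increase to reduce support load", 49.0, 59.0)
```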
Endpoint 2: PATCH /update-result
This is where the learning loop closes. When an experiment ends, the user logs the outcome:
# Retrieve the existing memory
existing = hindsight.get(memory_id=request.memory_id)

# Enrich it with the real outcome — this closes the learning loop
hindsight.update(
    memory_id=request.memory_id,
    content=updated_content_with_outcome,
    metadata={
        'status': 'completed',
        'outcome': 'Failure',
        'reason': 'Churn rose 18% in week 1',
        ...  # other fields carried over, elided
    }
)
After this call, the memory is no longer just a hypothesis. It's evidence. The next time someone proposes anything similar, the LLM will have this as context and can reason from it directly.
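The enrichment itself is just a merge of the outcome into the stored record. A stdlib-only sketch of that merge, with the helper name `close_loop` being my own shorthand for what `hindsight.update` does to the record:

```python
def close_loop(memory: dict, outcome: str, reason: str) -> dict:
    """Turn a pending hypothesis memory into grounded evidence."""
    enriched = dict(memory)  # leave the original record untouched
    enriched["metadata"] = {
        **memory.get("metadata", {}),
        "status": "completed",
        "outcome": outcome,
        "reason": reason,
    }
    enriched["content"] = f"{memory['content']} Outcome: {outcome} ({reason})"
    return enriched
```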
Error Handling
Errors are handled in three tiers (two in the backend, one in the frontend):
- Hindsight recall/store failures are non-fatal — if memory is temporarily unavailable, the system proceeds without past context and notes this in the LLM prompt
- Groq LLM failures raise HTTP 502 with a clear retry message — the insight is the core value, so we surface this error explicitly
- If a memory_id is local (API was offline when the experiment was created), the frontend falls back to localStorage gracefully
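The two backend tiers can be sketched in a few lines. The exception class below is a stand-in for FastAPI's `HTTPException(status_code=502)`, and the function names are illustrative, not the actual handlers:

```python
def safe_recall(recall_fn, query):
    """Tier 1: memory failures are non-fatal.
    Proceed without past context and flag it for the LLM prompt."""
    try:
        return recall_fn(query), True
    except Exception:
        return [], False  # prompt will note that history was unavailable

class InsightUnavailable(Exception):
    """Stand-in for raising HTTP 502 with a clear retry message."""

def get_insight(llm_fn, context, proposal):
    """Tier 2: the insight is the core value, so an LLM failure is surfaced."""
    try:
        return llm_fn(context, proposal)
    except Exception as exc:
        raise InsightUnavailable("Insight generation failed; please retry.") from exc
```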
The AI Flow — What the LLM Actually Does
The LLM prompt is where the product thinking lives. Here's the system prompt:
You are a Pricing Strategy AI Analyst with access to
a company's full history of pricing experiments.
Your job: analyze a proposed pricing change against
past outcomes and deliver a clear, actionable Hindsight Insight.
Rules:
- Reference actual past experiment names and outcomes.
- Be concise: 2-4 sentences maximum.
- ⚠ Warning if past experiments warn against this idea.
- ✅ Validate if past experiments support it.
- Always end with one concrete recommendation.
Temperature is set to 0.3 — low enough for consistent analytical output, high enough to avoid robotic phrasing. The model gets the full past context injected into the user message, so every insight is grounded in your actual data, not generic pricing advice.
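A sketch of how that context injection could be assembled. Only the message-building is shown since the exact call shape belongs to the Groq SDK; the prompt text is abridged and the helper name is mine:

```python
SYSTEM_PROMPT = "You are a Pricing Strategy AI Analyst..."  # abridged from above

def build_messages(past_memories: list[str], proposal: str) -> list[dict]:
    """Inject the recalled history into the user message so the insight
    is grounded in real data rather than generic pricing advice."""
    history = "\n".join(f"- {m}" for m in past_memories) or "(no relevant history)"
    user = f"Past experiments:\n{history}\n\nNew proposal:\n{proposal}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]

# then: groq.chat.completions.create(model='qwen/qwen3-32b',
#                                    messages=build_messages(...),
#                                    temperature=0.3)
```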
What a Good Insight Looks Like
⚠ Warning: You ran a similar 20% increase on your Starter tier in Q1 and churn rose 18% in 3 weeks. The hypothesis was almost identical. Consider a 5% incremental test first to measure elasticity before committing to a full rollout.
That's not generic. That's your history, applied to your current decision. That's the whole point.
The Frontend — Pure HTML/CSS/JS
No React. No Vue. No build step. The frontend is a single index.html file with ~400 lines of vanilla JavaScript. Hackathon judges love this — it runs anywhere, loads instantly, and there's nothing to break.
Key Frontend Decisions
- API base URL is configurable via a text input — point it at localhost or a deployed URL without touching code
- localStorage persistence — experiments survive page refresh, demo data seeds on first load
- Graceful offline fallback — if the API is unreachable and the memory_id is local, result logging saves to localStorage without breaking
- AI Insights computed locally from experiment history — patterns like repeated failures, average delta, discount trends
- Cards animate in with CSS keyframes — smooth fade-slide on every new experiment
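The locally computed insights deserve a closer look. The frontend does this in JavaScript; the Python sketch below shows the same pattern logic for illustration, with field names (`delta_pct`, `outcome`) assumed rather than taken from the codebase:

```python
def local_insights(experiments: list[dict]) -> dict:
    """Patterns derivable without the API: average price delta,
    failure rate among completed runs, and discount success count."""
    completed = [e for e in experiments if e["outcome"] in ("Success", "Failure")]
    failures = sum(e["outcome"] == "Failure" for e in completed)
    discounts = [e for e in completed if e["delta_pct"] < 0]
    return {
        "avg_delta_pct": sum(e["delta_pct"] for e in experiments) / len(experiments),
        "failure_rate": failures / len(completed) if completed else None,
        "discount_successes": sum(e["outcome"] == "Success" for e in discounts),
    }
```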
The Form to API Connection
When the user submits the form, the frontend calls POST /check-experiment and renders the Hindsight Insight inline below the form:
const res = await fetch(apiBase() + '/check-experiment', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    experiment_name: name,
    original_price: orig,
    proposed_price: prop,
    hypothesis: hypo
  })
});
const data = await res.json();
// data.memory_id saved locally for later result logging
// data.hindsight_insight rendered in the insight box
The memory_id returned from the API is stored in the local experiment object. When the user clicks "Log result" on a card, that ID is sent to PATCH /update-result to close the loop in Hindsight.
Demo Data — What Ships Out of the Box
Five sample experiments seed on first load, giving anyone an immediate picture of the product:
| Experiment | Delta | Outcome |
|---|---|---|
| Q1 Starter Tier Bump | +30% | ❌ Failure — churn jumped 18% |
| Pro Annual Discount | -20% | ✅ Success — annual conversions up 34% |
| Enterprise Add-on Fee | +17% | ✅ Success — zero churn, 2 new upsells |
| Freemium Seat Limit | new $4.99 | ❌ Failure — free users churned entirely |
| Mid-Market 5% Test | +5% | ⏳ Pending |
The mix of successes, failures, and a pending experiment demonstrates the full lifecycle in one view. The success rate widget updates immediately, the insights panel generates patterns, and the pending card shows the "Log result" button ready to demo.
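The success rate widget's logic is worth pinning down with the demo data above: pending experiments are excluded, so two successes out of four completed runs yields 50%. A sketch, assuming that exclusion rule (the widget's actual computation lives in the frontend JS):

```python
DEMO_EXPERIMENTS = [
    ("Q1 Starter Tier Bump", "Failure"),
    ("Pro Annual Discount", "Success"),
    ("Enterprise Add-on Fee", "Success"),
    ("Freemium Seat Limit", "Failure"),
    ("Mid-Market 5% Test", "Pending"),
]

def success_rate_pct(experiments):
    """Success rate over completed experiments only; pending ones are excluded."""
    done = [outcome for _, outcome in experiments if outcome != "Pending"]
    if not done:
        return None
    return round(100 * sum(o == "Success" for o in done) / len(done))
```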
What I'd Do Differently
If this weren't a hackathon build, I'd make a few changes:
- Replace Hindsight with ChromaDB for local deployment — zero signup, same semantic search, easier for teams to self-host
- Add experiment tagging (product line, market segment) so recall queries can be scoped — right now every experiment competes in one global search space
- Build a timeline view — the card grid is good but a chronological view of experiments per product would show the learning curve visually
- Add a "similar experiments" panel on the form — show the top 3 recalled memories directly in the UI alongside the LLM insight
The Core Idea, Restated Simply
Most AI tools are stateless. You ask a question, you get an answer, nothing is remembered. ExpTrack AI is different because it compounds — every experiment that completes makes the next recommendation smarter.
That's not a complicated technical idea. It's just: write things down in a place the AI can actually read them.
The Hindsight integration is the whole product, really. Groq and FastAPI are the plumbing. The memory layer is the value.
Built for a hackathon. Works in production. Ship it. 🚀