DEV Community

Sadhuram Agarwal


We built a memory-powered AI sales agent using Hindsight and cascadeflow

Every sales rep has the same problem. They jump on a call with a prospect they spoke to 3 weeks ago and remember nothing. The objection raised in Call 2. The CFO's name. The competitor mentioned in passing. It's all gone. The rep sounds generic. The prospect feels like a number. The deal dies.
We built DealMind AI to fix this.
Here's exactly how we did it.
The Problem We Targeted
Sales reps manage 30–50 active deals simultaneously. Current CRMs store data but don't think. They don't connect dots across calls. They don't tell you what matters right now before you pick up the phone.
We asked one question: what if your AI agent remembered everything?
What We Built
DealMind AI is a sales intelligence agent with persistent memory. It remembers every call, every objection, every competitor mention, every commitment — forever. When a rep comes back to a prospect after 3 weeks, the agent recalls everything relevant instantly and tells them exactly what to say.
The stack:

Memory layer: Hindsight by Vectorize — persistent semantic memory for AI agents
Runtime Intelligence: cascadeflow — cost-intelligent model routing
LLM: Groq (llama-3.3-70b-versatile) — fast and free
Backend: FastAPI (Python)
Frontend: React + Tailwind CSS
Deployment: Render + Vercel

Why Hindsight Changes Everything
Out of the box, an LLM has no memory: every conversation starts from zero. Hindsight gives agents a persistent memory bank. You store information with retain(), search it with recall(), and reason over it with reflect().
We built a dual memory architecture. Hindsight Cloud handles semantic search and knowledge graphs. A local fallback ensures the demo never breaks. Every prospect gets their own memory bank with a custom mission statement.
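Here is a minimal sketch of the cloud-plus-fallback idea, not the DealMind code itself. The MemoryBackend interface, LocalMemory, DualMemory and OfflineCloud are hypothetical names made up for illustration. The retain/recall signatures are assumptions and do not reflect Hindsight's actual client API. The local store uses plain keyword overlap as a stand-in for semantic search.

```python
from collections import defaultdict
from typing import Protocol


class MemoryBackend(Protocol):
    """Hypothetical interface; NOT Hindsight's actual client API."""

    def retain(self, bank_id: str, text: str) -> None: ...
    def recall(self, bank_id: str, query: str) -> list[str]: ...


class LocalMemory:
    """In-process fallback: keyword overlap instead of semantic search."""

    def __init__(self) -> None:
        self._banks: dict[str, list[str]] = defaultdict(list)

    def retain(self, bank_id: str, text: str) -> None:
        self._banks[bank_id].append(text)

    def recall(self, bank_id: str, query: str) -> list[str]:
        words = set(query.lower().split())
        scored = [
            (len(words & set(note.lower().split())), note)
            for note in self._banks.get(bank_id, [])
        ]
        # Keep notes sharing at least one word with the query, best match first.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [note for _, note in ranked]


class DualMemory:
    """Writes to both stores; reads from cloud, falling back to local."""

    def __init__(self, cloud: MemoryBackend, local: MemoryBackend) -> None:
        self.cloud = cloud
        self.local = local

    def retain(self, bank_id: str, text: str) -> None:
        self.local.retain(bank_id, text)  # the local copy is written first
        try:
            self.cloud.retain(bank_id, text)
        except Exception:
            pass  # cloud outage: the note is still available locally

    def recall(self, bank_id: str, query: str) -> list[str]:
        try:
            return self.cloud.recall(bank_id, query)
        except Exception:
            return self.local.recall(bank_id, query)


class OfflineCloud:
    """Stand-in for an unreachable cloud service (demonstration only)."""

    def retain(self, bank_id: str, text: str) -> None:
        raise ConnectionError("cloud unreachable")

    def recall(self, bank_id: str, query: str) -> list[str]:
        raise ConnectionError("cloud unreachable")


# One bank per prospect, keyed here by a made-up prospect id.
memory = DualMemory(cloud=OfflineCloud(), local=LocalMemory())
memory.retain("ananya-singh", "Board approval required for deals above 10L")
memory.retain("ananya-singh", "She requested a pilot program in Call 3")
print(memory.recall("ananya-singh", "pilot program"))
# prints ['She requested a pilot program in Call 3']
```

Writing to the local store first means a cloud outage degrades search quality rather than losing notes, which is the property a live demo needs.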
When a rep clicks "Prep for Call" on Ananya Singh's ₹50L deal, the agent recalls across 5 calls:

Board approval required for deals above ₹10L
CFO approval needed before Q3 ends
She requested a pilot program in Call 3
She wants to see the memory demo again

No human rep could remember all of this across 50 deals. The agent never forgets.
How cascadeflow Cut Our Costs by 95.8%
Production AI is expensive if you're not smart about it. cascadeflow is a runtime intelligence layer that routes queries to the cheapest model that can handle them — and only escalates when quality requires it.
Our audit trail shows the result: 95.8% cost savings vs sending every query to GPT-4. Every decision logged. Every rupee saved visible on the live dashboard.
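The routing idea can be sketched in a few lines: try the cheapest model first, escalate only if a quality check rejects the draft, and log every decision. This is a generic illustration of cascading, not cascadeflow's API. Model, Cascade, is_good_enough and the flat per-call costs are all hypothetical, and the numbers are invented for the example rather than taken from our dashboard.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Model:
    name: str
    cost_per_call: float            # illustrative flat cost, not real pricing
    generate: Callable[[str], str]


@dataclass
class Cascade:
    models: list[Model]                      # ordered cheapest first
    is_good_enough: Callable[[str], bool]    # quality gate for a draft answer
    audit: list[dict] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        if not self.models:
            raise ValueError("cascade needs at least one model")
        spent = 0.0
        answer = ""
        for model in self.models:
            answer = model.generate(prompt)
            spent += model.cost_per_call     # rejected drafts still cost money
            if self.is_good_enough(answer):
                break                        # stop at the cheapest acceptable model
        # Record which model produced the final answer and the total spend.
        self.audit.append({"prompt": prompt, "model": model.name, "cost": spent})
        return answer

    def savings_vs(self, baseline_cost_per_call: float) -> float:
        """Percent saved versus sending every logged query to one baseline model."""
        if not self.audit:
            return 0.0
        baseline = baseline_cost_per_call * len(self.audit)
        actual = sum(entry["cost"] for entry in self.audit)
        return 100.0 * (baseline - actual) / baseline


# Stub models: the small one gives a too-short answer for "risk" prompts.
small = Model("small", 1.0, lambda p: "short" if "risk" in p else "a detailed enough answer")
large = Model("large", 20.0, lambda p: "a long, carefully reasoned answer")
cascade = Cascade([small, large], is_good_enough=lambda a: len(a) > 10)

cascade.ask("summarize call 3")   # the small model suffices
cascade.ask("assess deal risk")   # escalates to the large model
print([e["model"] for e in cascade.audit])  # prints ['small', 'large']
print(cascade.savings_vs(20.0))             # prints 45.0
```

Note that an escalated query costs more than going straight to the large model, so the savings depend on how often the cheap model is good enough.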
The Architecture
Sales Rep → React Dashboard
                  ↓
           FastAPI Backend
             ↙         ↘
      Hindsight       Groq LLM
      (Memory)     (Intelligence)
             ↘         ↙
       DealMind Agent Response
We built 9 endpoints. The main ones:

/log-call — stores call notes in Hindsight memory
/recall/{id} — semantic search across all past calls
/prepare-for-call/{id} — AI call prep from memory
/draft-followup — personalized email referencing past calls
/deal-risk/{id} — AI deal risk score 1-10
/audit-trail — full cost and model audit log
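To make the flow behind /prepare-for-call/{id} concrete, here is a rough sketch written as plain functions so it runs without a server. In a FastAPI app the handler would sit behind a route decorator. The names build_prep_prompt and prepare_for_call, the prompt wording and the response fields are assumptions for illustration. The memory lookup and the LLM are passed in as callables and stubbed out below instead of calling Hindsight or Groq.

```python
from typing import Callable

Recall = Callable[[str, str], list[str]]   # (prospect_id, query) -> memories
Llm = Callable[[str], str]                 # prompt -> completion


def build_prep_prompt(prospect_id: str, memories: list[str]) -> str:
    """Turn recalled memories into a call-prep prompt for the LLM."""
    if memories:
        facts = "\n".join(f"- {m}" for m in memories)
    else:
        facts = "- (no prior calls on record)"
    return (
        f"You are preparing a sales rep for a call with prospect {prospect_id}.\n"
        f"Facts recalled from past calls:\n{facts}\n"
        "List the open objections, pending approvals, and what to lead with."
    )


def prepare_for_call(prospect_id: str, recall: Recall, llm: Llm) -> dict:
    """Plain-function version of a /prepare-for-call/{id} handler."""
    # Step 1: pull what the memory layer knows about this prospect.
    memories = recall(prospect_id, "objections, approvals, commitments, requests")
    # Step 2: ground the LLM in those facts so it does not answer generically.
    briefing = llm(build_prep_prompt(prospect_id, memories))
    return {
        "prospect_id": prospect_id,
        "memories_used": len(memories),
        "briefing": briefing,
    }


# Stubs standing in for the memory layer and the LLM.
def fake_recall(prospect_id: str, query: str) -> list[str]:
    return [
        "CFO approval needed before Q3 ends",
        "She requested a pilot program in Call 3",
    ]


def echo_llm(prompt: str) -> str:
    return prompt.splitlines()[-1]   # returns the instruction line, no real model


result = prepare_for_call("ananya-singh", fake_recall, echo_llm)
print(result["memories_used"])  # prints 2
```

Keeping recall and the model as injected callables makes the handler easy to test and lets the memory fallback or model routing change without touching the endpoint.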

The Demo Moment
Rep opens DealMind → clicks Ananya Singh (₹50L deal) → clicks "Prep for Call"
Agent responds:
"Board approval required for deals above ₹10L — this has come up in 4 of 5 calls. CFO approval needed before Q3 ends. She requested a pilot program in Call 3. Lead with the case study from HealthTech vertical she asked for."
That's not a chatbot. That's an agent that learned.
What We Learned

Persistent memory is not a nice-to-have — it's the difference between a toy and a product
A dual memory architecture (cloud + local fallback) keeps the agent working when the cloud service is unreachable
Scope ruthlessly — one workflow done brilliantly beats five done poorly
Ship first, polish second

Try It Live

Live Demo: https://dealmind-ai.vercel.app
API Docs: https://dealmind-ai-cdkj.onrender.com/docs
GitHub: https://github.com/sadhuram09/dealmind-ai

Built with Hindsight by Vectorize and cascadeflow.
Team VoxAid: Sadhuram, Aman, Satyam, Sattvik
