Chaitanya Gupta


How Hindsight Turned a Stateless LLM Into a Deal-Aware Agent


🚩 The Problem
Every LLM I’ve worked with has the same fundamental flaw: it forgets everything the moment a request ends.
For a chatbot answering trivia, that’s fine.
For a sales intelligence agent that needs to recall objections from three calls ago, Salesforce integration requirements, and the fact that the CTO is the real decision‑maker — it’s a non‑starter.

That’s the gap I set out to close when building the Deal Intelligence Agent: a FastAPI + React system that gives sales teams a persistent, queryable memory layer across their pipeline.

The LLM (Groq’s Llama 3.3 70B) is still stateless at inference. What makes it feel deal‑aware is Hindsight — a purpose‑built agent memory layer that stores, indexes, and semantically retrieves structured context on demand.

⚙️ What the Agent Does
The agent tracks every meaningful event in a deal’s lifecycle:

Objections raised

Competitors mentioned

Stakeholders identified

Pricing discussions

Call outcomes

When a rep asks: “What objections did this prospect raise and how have we handled them before?” → the system pulls relevant memories from Hindsight, injects them into the prompt, and the LLM responds with specifics instead of generic advice.

Architecture overview:

FastAPI backend → REST + streaming endpoints

MemoryService → wraps Hindsight SDK

LLMService → Groq completions with memory context

DealService → orchestrates memory + LLM

React frontend → chat, deal detail, competitor radar, risk heatmap, revenue forecasting

Twilio + SMTP → SMS, voice calls, personalized follow‑up emails

Every outbound action (SMS, briefing, email) writes back to Hindsight as a memory event. The agent’s context grows with every interaction.
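
The write-back loop can be sketched roughly as follows. This is a minimal illustration, not the project's code: `InMemoryMemoryService`, `send_sms_and_record`, and the `outbound_sms` entry type are hypothetical names, and the actual SMS send is left out so the sketch runs without credentials.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InMemoryMemoryService:
    """Hypothetical stand-in for MemoryService; keeps entries in a list."""
    entries: List[Dict] = field(default_factory=list)

    async def store_memory(self, deal_id: str, entry_type: str, content: str,
                           metadata: Optional[Dict] = None) -> Dict:
        entry = {"deal_id": deal_id, "type": entry_type,
                 "content": content, "metadata": metadata or {}}
        self.entries.append(entry)
        return entry


async def send_sms_and_record(memory: InMemoryMemoryService, deal_id: str,
                              to_number: str, body: str) -> Dict:
    # A real implementation would call the SMS provider here before recording.
    # The outbound action itself becomes a memory event for later retrieval.
    return await memory.store_memory(
        deal_id=deal_id,
        entry_type="outbound_sms",  # hypothetical entry type
        content=body,
        metadata={"to": to_number},
    )


memory = InMemoryMemoryService()
event = asyncio.run(
    send_sms_and_record(memory, "deal-42", "+15550100", "Sending API docs today.")
)
```

Recording the action next to the send keeps the deal history complete without a separate logging step.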

🧩 The Core Challenge: State Across Sessions
Most tutorials treat memory as “stuff the last few messages into the context window.”
That breaks down fast:

Context windows have hard limits.

Deals span months. You can’t fit six months of notes into one prompt.

The right abstraction isn’t “longer context windows.”
It’s retrieval: surface only what’s relevant to the current query. That’s exactly what Hindsight’s persistent memory layer provides.
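
To make the retrieval idea concrete, here is a toy sketch that ranks notes against a query and keeps only the top matches. It uses word-count cosine similarity as a stand-in for real semantic embeddings, so it illustrates the shape of the approach rather than how Hindsight works internally. The `top_k` helper and the sample notes are made up for the example.

```python
import math
import re
from collections import Counter
from typing import List


def _vector(text: str) -> Counter:
    # Bag-of-words vector: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(notes: List[str], query: str, k: int = 2) -> List[str]:
    # Rank every note against the query and keep only the k best.
    q = _vector(query)
    ranked = sorted(notes, key=lambda n: _cosine(_vector(n), q), reverse=True)
    return ranked[:k]


notes = [
    "Prospect raised a pricing objection on the first call",
    "CTO asked for API docs before the next meeting",
    "Kickoff call scheduled for Tuesday",
    "Second pricing objection: discount is too small",
]
relevant = top_k(notes, "pricing objection", k=2)
```

Only the two pricing notes reach the prompt; the rest of the history stays in storage until a query needs it.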

📝 How Memory Gets Written
Memory writes are first‑class events, not side effects.

python

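import asyncio
from typing import Dict, Optional

# Method of MemoryService (class body omitted in this excerpt).
async def store_memory(
    self,
    deal_id: str,
    entry_type: str,
    content: str,
    metadata: Optional[Dict] = None
) -> Dict:
    entry = {
        "id": self._generate_id(deal_id, content),
        "deal_id": deal_id,
        "type": entry_type,
        "content": content,
        # The type tag is part of the embedded text, so it can influence retrieval.
        "embedding_text": f"[{entry_type.upper()}] Deal {deal_id}: {content}"
    }

    if self.use_hindsight:
        # Run the blocking SDK call in a worker thread.
        result = await asyncio.to_thread(
            self.client.memory.store,
            user_id=deal_id,
            text=entry["embedding_text"],
            # metadata defaults to None; unpacking None would raise a TypeError.
            metadata={"deal_id": deal_id, "type": entry_type, "content": content, **(metadata or {})}
        )
    # Excerpt ends here; the rest of the method is not shown.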

The embedding text prepends a type tag such as [OBJECTION], [COMPETITOR], or [STAKEHOLDER], so the entry type is part of what gets embedded and can influence retrieval.

Async wrapper (asyncio.to_thread) prevents blocking FastAPI’s event loop.
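
Both points can be tried out without the SDK. In the sketch below, `blocking_store` simulates a slow synchronous client call, and `generate_id` is a hypothetical stand-in for `_generate_id` (the article does not show how IDs are built). Here it is assumed to be a hash of the deal ID and content. `asyncio.to_thread` needs Python 3.9 or later.

```python
import asyncio
import hashlib
import time
from typing import Dict, List

STORE: List[Dict] = []


def generate_id(deal_id: str, content: str) -> str:
    """Hypothetical stand-in for _generate_id: same input gives the same ID."""
    return hashlib.sha256(f"{deal_id}:{content}".encode("utf-8")).hexdigest()[:16]


def blocking_store(text: str, metadata: Dict) -> Dict:
    """Simulates a synchronous SDK call that takes a while to return."""
    time.sleep(0.05)
    record = {"text": text, "metadata": metadata}
    STORE.append(record)
    return record


async def store(deal_id: str, entry_type: str, content: str) -> Dict:
    text = f"[{entry_type.upper()}] Deal {deal_id}: {content}"
    metadata = {"id": generate_id(deal_id, content),
                "deal_id": deal_id, "type": entry_type}
    # to_thread runs the blocking call in a worker thread,
    # so the event loop stays free to serve other requests.
    return await asyncio.to_thread(blocking_store, text, metadata)


async def main() -> List[Dict]:
    # The two writes overlap instead of running back to back.
    return await asyncio.gather(
        store("deal-42", "objection", "Price is 40% above current vendor"),
        store("deal-42", "stakeholder", "CTO wants API docs"),
    )


records = asyncio.run(main())
```

A deterministic ID also means that storing the same note twice can be detected, if the backing store deduplicates on ID.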

🔍 How Memory Gets Read
Retrieval is where Hindsight shines.

python

@app.post("/api/chat")
async def chat(msg: ChatMessage):
    memories = []
    if msg.deal_id:
        memories = await memory_svc.get_relevant_memories(
            deal_id=msg.deal_id,
            query=msg.message,
            limit=10
        )

    response = await llm_svc.chat_with_context(
        user_message=msg.message,
        memories=memories,
        deal_id=msg.deal_id,
        extra_context=msg.context
    )

The agent runs a semantic search scoped to the deal ID, takes up to 10 of the most relevant entries, and injects them into the prompt.

Example formatted context:

Code

[MEMORY CONTEXT]
1. [OBJECTION][2024-11-03] Price is 40% above current vendor
2. [COMPETITOR][2024-11-03] Salesforce mentioned as incumbent
3. [STAKEHOLDER][2024-10-28] David Kim (CTO) — wants API docs
4. [PRICING][2024-11-10] Offered 15% discount; they want 25%

[USER QUERY]
What's the best angle for our renewal call tomorrow?

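
A context block in this layout could be assembled with a small formatter like the one below. This is a sketch under assumptions: the `format_context` name and the `type`, `date`, and `content` keys on each memory are illustrative, not the project's actual schema.

```python
from typing import Dict, List


def format_context(memories: List[Dict], user_query: str) -> str:
    # Number each memory and prefix it with its type tag and date.
    lines = ["[MEMORY CONTEXT]"]
    for i, m in enumerate(memories, start=1):
        lines.append(f"{i}. [{m['type'].upper()}][{m['date']}] {m['content']}")
    lines += ["", "[USER QUERY]", user_query]
    return "\n".join(lines)


memories = [
    {"type": "objection", "date": "2024-11-03",
     "content": "Price is 40% above current vendor"},
    {"type": "competitor", "date": "2024-11-03",
     "content": "Salesforce mentioned as incumbent"},
]
prompt = format_context(
    memories, "What's the best angle for our renewal call tomorrow?"
)
```

Keeping the formatter separate from retrieval makes it easy to change the prompt layout without touching the memory layer.
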
📊 Win/Loss Pattern Learning
Because every closed deal writes its outcome to Hindsight, the system can analyze patterns:

Objections correlated with wins vs losses

Competitor mentions tied to outcomes

This isn’t the LLM generalizing from training data — it’s your pipeline history driving insights.
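
As a rough illustration of the kind of analysis this enables, the sketch below computes a win rate per objection type from closed deals. The record shape (`outcome`, `objections`) and the sample data are invented for the example; the real system would read these from stored memories.

```python
from collections import defaultdict
from typing import Dict, List


def win_rate_by_objection(closed_deals: List[Dict]) -> Dict[str, float]:
    wins: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for deal in closed_deals:
        # set() counts each objection once per deal, even if raised repeatedly.
        for objection in set(deal["objections"]):
            total[objection] += 1
            if deal["outcome"] == "won":
                wins[objection] += 1
    return {o: wins[o] / total[o] for o in total}


closed_deals = [
    {"outcome": "won", "objections": ["pricing", "integration"]},
    {"outcome": "lost", "objections": ["pricing"]},
    {"outcome": "won", "objections": ["integration"]},
    {"outcome": "lost", "objections": ["pricing", "security"]},
]
rates = win_rate_by_objection(closed_deals)
```

With only a handful of closed deals these rates are noisy, so they are better treated as prompts for investigation than as conclusions.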

🗣️ Real Behavior Shift
Ask about a deal with no history → generic advice.
Ask after six months of interactions → grounded, specific strategy:

“Based on our history with Meridian Systems, their CTO flagged API docs as a blocker. The pricing objection raised on Nov 3 aligns with deals we closed by bundling discounts with longer contracts. Lead with the API docs, then anchor the discount to a 24‑month commit.”

💡 Lessons Learned
Treat memory writes as domain events, not logs.

Scope memory by the right identity (deal_id worked best).

Semantic retrieval raises the ceiling on LLM usefulness: answers draw on deal history instead of generic advice.

Wrapping blocking SDK calls in asyncio.to_thread keeps them from stalling the event loop.

Graceful degradation (fallback store) makes dev + onboarding easier.
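
The fallback idea from the last point might look something like this. It is a sketch only: `MemoryServiceSketch`, the `client.store` method, and the `backend` field are hypothetical, and the fallback read uses plain word matching because a local list has no embeddings.

```python
import asyncio
from typing import Any, Dict, List, Optional


class MemoryServiceSketch:
    def __init__(self, client: Optional[Any] = None) -> None:
        self.client = client
        self.use_hindsight = client is not None
        self._fallback: List[Dict] = []

    async def store_memory(self, deal_id: str, entry_type: str, content: str) -> Dict:
        entry = {"deal_id": deal_id, "type": entry_type, "content": content}
        if self.use_hindsight:
            try:
                # Hypothetical client method; the real SDK call may differ.
                await asyncio.to_thread(self.client.store, entry)
                return {**entry, "backend": "hindsight"}
            except Exception:
                pass  # Remote write failed: fall through to the local store.
        self._fallback.append(entry)
        return {**entry, "backend": "fallback"}

    async def get_relevant_memories(self, deal_id: str, query: str,
                                    limit: int = 10) -> List[Dict]:
        # Fallback read only; a real service would query the remote layer first.
        words = set(query.lower().split())
        hits = [e for e in self._fallback
                if e["deal_id"] == deal_id
                and words & set(e["content"].lower().split())]
        return hits[:limit]


class BrokenClient:
    """Simulates a memory backend that is unreachable."""
    def store(self, entry: Dict) -> None:
        raise RuntimeError("backend unavailable")


local = MemoryServiceSketch()  # no client configured, e.g. on a dev machine
stored = asyncio.run(local.store_memory("deal-42", "objection", "Pricing objection raised"))
hits = asyncio.run(local.get_relevant_memories("deal-42", "pricing"))

degraded = MemoryServiceSketch(BrokenClient())
recovered = asyncio.run(degraded.store_memory("deal-42", "pricing", "Offered 15% discount"))
```

The same calls work with or without a configured client, which is what makes local development and onboarding easier.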

🎯 Conclusion
LLMs are stateless — but agents don’t have to be.
With Hindsight, the gap between “generic chatbot” and “deal‑aware agent” closes with a handful of well‑placed store_memory and get_relevant_memories calls.

The model stays dumb.
The memory layer makes it look smart.

👉 Repo: https://github.com/chaitanya07-ai/deal-intelligence-agent
👉 Live Demo: https://deal-intelligence-agent-1.onrender.com/
