DEV Community

Shashank M

Every call, my rep started from zero. I fixed it with Hindsight.

Every Monday morning, our sales rep would open the CRM, stare at a list of 40 prospects, and have no idea where to start. Not because the data wasn't there — it was. Timestamps, outcome tags, notes buried three clicks deep. But reading through all of it to decide who needed a call today, who could wait, and who was about to go cold? That took 45 minutes. And it still felt like guesswork.

We built SalesMemory to fix that. The weekly digest feature is where it clicks into place.


What the system does

SalesMemory gives sales reps a persistent memory layer across every prospect interaction. Before a call, you type a name and get a structured brief: objections raised, what to focus on, deal health score, last contact. After the call, you log 2–3 sentences and select an outcome. Hindsight stores that note permanently as a semantic memory. Every future brief for that prospect is built from everything recalled about them.

The stack: React + Vite + Tailwind on the frontend (Vercel), Python + FastAPI on the backend (Render), Groq's llama-3.3-70b-versatile as the LLM, and Hindsight as the only persistence layer. No database.

There are four user flows: pre-call brief, post-call logger, memory timeline, and the weekly digest. The first three are straightforward. The digest is where the interesting engineering decisions live.


The digest problem

At the start of every week, a rep needs to answer one question: given everything I know about every prospect right now, what should I do today?

The naive implementation: loop through every prospect, call the LLM once per person, collect the results, sort them by priority. Five prospects, five API calls, one merge step.

That works. But it has a problem the output makes obvious. When you ask the LLM "what should the rep do about James Okafor?" in isolation, it can only answer based on James. It doesn't know that Priya Sharma just agreed to a pilot and is moving fast, making James — who hasn't responded in 7 days — comparatively more urgent. It doesn't know that Anika Patel is genuinely waiting on legal and needs nothing. Separate calls produce separate answers. They can't produce relative priority.
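For contrast, the naive per-prospect loop might look like the sketch below. It is an illustration, not SalesMemory's actual code: `ask_llm_about` is a hypothetical stand-in for one LLM call per prospect, and `fake_llm` is a stub so the example runs without a network.

```python
from typing import Callable

# Assumed bucket names for this sketch; lower number sorts first.
PRIORITY_ORDER = {"red": 0, "yellow": 1, "green": 2}


def naive_digest(
    contexts: dict[str, str],
    ask_llm_about: Callable[[str, str], dict],
) -> list[dict]:
    """One call per prospect, then a merge. Each call sees only one prospect."""
    results = []
    for name, context in contexts.items():
        verdict = ask_llm_about(name, context)  # no knowledge of other prospects
        results.append({"name": name, **verdict})
    # Merge step: sort by bucket. The buckets were assigned in isolation,
    # so this ordering is not a true relative ranking.
    return sorted(results, key=lambda r: PRIORITY_ORDER[r["bucket"]])


def fake_llm(name: str, context: str) -> dict:
    # Hypothetical stub standing in for a real per-prospect LLM call.
    bucket = "red" if "no response" in context else "yellow"
    return {"bucket": bucket, "action": f"Follow up with {name}"}


digest = naive_digest(
    {
        "Priya Sharma": "pilot agreed, moving fast",
        "James Okafor": "no response in 7 days",
    },
    fake_llm,
)
```

The sort can only order the buckets each call produced on its own. Nothing in the loop lets one prospect's state influence another's bucket.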

We built the digest differently.


One call, all prospects

The digest endpoint recalls memory for every prospect, builds one prompt from all of it, and sends that prompt to the LLM in a single request:

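import json

async def generate_digest() -> dict:
    all_prospects = get_all_prospects()
    prospect_contexts = []

    # Recall each prospect's memory; skip prospects with nothing stored yet.
    for name in all_prospects:
        recalled = recall_prospect(name)
        if recalled:
            prospect_contexts.append({"name": name, "context": recalled})

    # One prompt containing every prospect, so the model can rank them against each other.
    prompt = build_digest_prompt(prospect_contexts)
    # No await here: groq_client is assumed to be the synchronous Groq client.
    response = groq_client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system", "content": DIGEST_SYSTEM_PROMPT},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3,
        max_tokens=1200
    )
    # Expects the reply to be raw JSON; json.loads raises if it is not.
    return json.loads(response.choices[0].message.content.strip())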

The LLM sees all five prospect contexts at once and reasons across them in a single pass. The output is a prioritized action plan in three buckets:

🔴 Needs attention now — no response after 5+ days, stalled deal, at-risk signal
🟡 Follow up this week — active deal, clear next step needed
🟢 On track — waiting on prospect, no rep action needed

Each item includes a specific reason and a concrete one-sentence action. Not "follow up with James." More like: "James hasn't responded in 7 days after you sent the security report — follow up and directly address the competitor pricing concern he raised."

That level of specificity only exists because the LLM read the full history for every prospect before making any prioritization decisions.
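Because the whole digest hinges on one JSON reply, it can help to validate that reply before it reaches the UI. The sketch below is one way to do that. The bucket keys (`needs_attention`, `follow_up`, `on_track`) and item fields (`name`, `reason`, `action`) are assumptions made for illustration; the post does not show the actual schema.

```python
import json

# Assumed key names; the real digest schema is not shown in the post.
EXPECTED_BUCKETS = ("needs_attention", "follow_up", "on_track")
REQUIRED_FIELDS = ("name", "reason", "action")
FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def strip_code_fence(text: str) -> str:
    """Remove a wrapping markdown fence if the model added one anyway."""
    text = text.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    return text


def parse_digest(raw: str) -> dict:
    """Parse the LLM reply and check that it has the expected shape."""
    data = json.loads(strip_code_fence(raw))
    if not isinstance(data, dict):
        raise ValueError("digest must be a JSON object")
    missing = [b for b in EXPECTED_BUCKETS if b not in data]
    if missing:
        raise ValueError(f"digest is missing buckets: {missing}")
    for bucket in EXPECTED_BUCKETS:
        for item in data[bucket]:
            for field in REQUIRED_FIELDS:
                if field not in item:
                    raise ValueError(f"item in {bucket} is missing {field!r}")
    return data


sample_reply = FENCE + "json\n" + json.dumps({
    "needs_attention": [
        {"name": "James Okafor", "reason": "No response in 7 days",
         "action": "Follow up on the security report."}
    ],
    "follow_up": [],
    "on_track": [],
}) + "\n" + FENCE

parsed = parse_digest(sample_reply)
```

Failing loudly on a malformed reply is a design choice: a retry is usually cheaper than rendering a digest with a missing bucket.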


Why this works: agent memory as the prerequisite

The single-prompt digest only works because retrieving each prospect's full context is fast and cheap. That's what Hindsight makes possible.

Recalling a prospect's history is one function call:

def recall_prospect(prospect_name: str) -> str:
    results = client.recall(
        pipeline_id=PIPELINE_ID,
        query=f"prospect interactions with {prospect_name}",
        top_k=10
    )
    return results

Setting top_k=10 returns the 10 memories most semantically relevant to the query, not just the 10 most recent. If a budget freeze was mentioned two months ago and the current week involves a pricing conversation, that memory surfaces. A SQL ORDER BY timestamp DESC LIMIT 10 would miss it.
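The difference between recency and semantic ranking can be shown with a toy example. This is not how Hindsight works internally, which the post does not describe. The two-dimensional vectors are made up by hand to stand in for embeddings: the first axis loosely means "money and budget" and the second "scheduling".

```python
import math


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hand-made toy "embeddings": axis 0 ~ money/budget, axis 1 ~ scheduling.
memories = [
    {"day": 1, "text": "Mentioned a budget freeze until next quarter", "vec": (0.9, 0.1)},
    {"day": 40, "text": "Rescheduled the demo to Thursday", "vec": (0.1, 0.9)},
    {"day": 55, "text": "Asked to move the call an hour later", "vec": (0.0, 1.0)},
    {"day": 60, "text": "Confirmed attendees for the next meeting", "vec": (0.2, 0.8)},
]
query_vec = (1.0, 0.0)  # stands in for a query about pricing


def top_k_recent(mems: list[dict], k: int) -> list[dict]:
    """Equivalent of ORDER BY timestamp DESC LIMIT k."""
    return sorted(mems, key=lambda m: m["day"], reverse=True)[:k]


def top_k_semantic(mems: list[dict], qv: tuple[float, ...], k: int) -> list[dict]:
    """Rank by similarity to the query, ignoring age."""
    return sorted(mems, key=lambda m: cosine(m["vec"], qv), reverse=True)[:k]


recent = [m["text"] for m in top_k_recent(memories, 2)]
semantic = [m["text"] for m in top_k_semantic(memories, query_vec, 2)]
```

With k=2, the recency ranking keeps only the two newest scheduling notes, while the similarity ranking puts the two-month-old budget freeze first.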

Each recalled context feeds directly into the digest prompt. No transformation, no schema mapping. The LLM reads raw memory and reasons about it. This is persistent memory across sessions being used the way it's supposed to be — not just to answer questions about one thing, but to reason across an entire portfolio simultaneously.
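The post does not show `build_digest_prompt`, so the version below is a guess at its general shape, not the real implementation. It assumes each entry is the `{"name": ..., "context": ...}` dict built in `generate_digest` and simply labels each prospect's recalled memory before concatenating.

```python
def build_digest_prompt(prospect_contexts: list[dict]) -> str:
    """Concatenate every prospect's recalled memory into one labeled prompt.

    Hypothetical sketch: the wording and section layout are assumptions.
    """
    sections = []
    for index, item in enumerate(prospect_contexts, start=1):
        # str() because the recall result may not be a plain string.
        sections.append(f"Prospect {index}: {item['name']}\n{str(item['context'])}")
    body = "\n\n".join(sections)
    return (
        f"You are given memory for {len(prospect_contexts)} prospects.\n"
        "Rank them relative to each other and assign each to exactly one bucket.\n\n"
        f"{body}"
    )


prompt = build_digest_prompt([
    {"name": "Priya Sharma", "context": "Pilot agreed. Data migration concern emerged."},
    {"name": "James Okafor", "context": "Security report sent. No response in 7 days."},
])
```

Labeling each section with the prospect's name keeps the memories attributable once they share a prompt, which matters when the model has to compare them.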


What the digest actually outputs

Here's what the five prospects in our system look like through the digest lens:

Priya Sharma (VP Sales, Rentokil) — 4 interactions logged. Budget freeze, onboarding concern, ROI calculator loved, pilot agreed, data migration concern emerged. Deal health: 70/100, improving. Digest bucket: 🟡 Follow up this week. Action: come prepared with a data migration plan and timeline.

James Okafor (Head of Revenue, Paysend) — 3 interactions. API reliability concern, SLA well received after technical deep dive, security report sent, competitor pricing raised. No response in 7 days. Digest bucket: 🔴 Needs attention now. Action: follow up on the security report and address the pricing pressure directly.

Sarah Linden (Director of Partnerships, Fetch Rewards) — went cold during a reorg, reconnected after 6 weeks, now requesting pricing for 10 seats. Digest bucket: 🟡 Follow up this week. Action: send the pricing proposal promptly while momentum is fresh.

Marcus Webb (Commercial Director, Trainline) — budget confirmed, Q4 timeline, requested a lunch-and-learn with his 8 reps. Digest bucket: 🔴 Needs attention now. Action: schedule the lunch-and-learn — this is a rep action blocking the deal.

Anika Patel (VP Growth, GoCardless) — GDPR compliance is a hard gate, DPA must be signed before anything moves forward, legal is reviewing. Digest bucket: 🟢 On track. No action needed.

In five seconds, the rep knows exactly what Monday morning looks like. James and Marcus need action today. Priya and Sarah need action this week. Anika can wait.

That ranking is only possible because the LLM saw all five simultaneously. James's 7-day silence reads differently when you also know Priya's deal is accelerating. Marcus's pending lunch-and-learn is flagged as urgent because the rep is the bottleneck, not the prospect. Anika correctly gets left alone because the system understands she's waiting on legal, not the rep.


How memory gets stored in the first place

The digest is downstream of good logging. After every call, the rep writes a few sentences and selects an outcome:

def retain_interaction(prospect_name: str, summary: str, outcome: str, timestamp: str):
    content = f"""
Prospect: {prospect_name}
Date: {timestamp}
Outcome: {outcome}
Summary: {summary}
"""
    client.retain(
        pipeline_id=PIPELINE_ID,
        content=content,
        metadata={
            "prospect": prospect_name,
            "outcome": outcome,
            "timestamp": timestamp,
            "type": "call_log"
        }
    )

Freeform text in the content field means reps write naturally. "She seemed nervous about the migration timeline" works just as well as "onboarding is a concern" — Hindsight stores and retrieves by semantic meaning. The metadata is what keeps the timeline view reliable: filter by prospect name, sort by timestamp, and the history is always there.
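The timeline view described above comes down to a filter and a sort over the stored metadata. The sketch below runs on plain dictionaries shaped like the `metadata` passed to `retain_interaction`. How those records are fetched from Hindsight is not shown in the post, and the ISO 8601 timestamp format is an assumption.

```python
from datetime import datetime


def build_timeline(records: list[dict], prospect_name: str) -> list[dict]:
    """Return one prospect's records, oldest first.

    Assumes each record has a "metadata" dict with "prospect" and an
    ISO 8601 "timestamp", mirroring the fields written at retain time.
    """
    matching = [
        r for r in records
        if r["metadata"].get("prospect") == prospect_name
    ]
    return sorted(
        matching,
        key=lambda r: datetime.fromisoformat(r["metadata"]["timestamp"]),
    )


records = [
    {"content": "Pilot agreed.",
     "metadata": {"prospect": "Priya Sharma", "timestamp": "2025-03-10T09:00:00"}},
    {"content": "Security report sent.",
     "metadata": {"prospect": "James Okafor", "timestamp": "2025-03-05T14:30:00"}},
    {"content": "Budget freeze mentioned.",
     "metadata": {"prospect": "Priya Sharma", "timestamp": "2025-01-20T11:15:00"}},
]

timeline = build_timeline(records, "Priya Sharma")
```

Filtering on exact metadata rather than on the freeform text is what keeps this view deterministic: the same prospect name always returns the same history in the same order.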

The system prompt the LLM uses to generate briefs makes the scoring explicit:

BRIEF_SYSTEM_PROMPT = """
You are a sales intelligence assistant. You read raw memory from past prospect
interactions and return a structured JSON pre-call brief for a sales rep.

Deal health scoring guide:
* 0-20: Cold. No engagement, no signals, or long silence.
* 21-40: Warming up. Early interest but objections unresolved.
* 41-60: Engaged. Active conversations, some positive signals.
* 61-80: Hot. Strong signals, near decision stage.
* 81-100: Closing. Verbal commitment or trial agreed.

Return ONLY valid JSON. No explanation. No markdown. No code fences.
"""

Vague instructions produce vague scores. A rubric with explicit score bands, like the one above, produces scores closer to what a human sales manager would say. We spent more time on this prompt than on any single backend component.
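The same bands can be mirrored in code, for example to label a score in the UI or to reject an out-of-range value returned by the model. This helper is a hypothetical addition; only the band boundaries and labels come from the prompt above.

```python
# (upper bound, label) pairs copied from the scoring guide in the prompt.
DEAL_HEALTH_BANDS = [
    (20, "Cold"),
    (40, "Warming up"),
    (60, "Engaged"),
    (80, "Hot"),
    (100, "Closing"),
]


def deal_health_label(score: int) -> str:
    """Map a 0-100 deal health score to its band label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for upper_bound, label in DEAL_HEALTH_BANDS:
        if score <= upper_bound:
            return label
    raise AssertionError("unreachable: every score in 0-100 falls in a band")


label_for_priya = deal_health_label(70)
```

Keeping the bands in one list means the prompt text and the application can be checked against each other when either changes.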


What we learned

Comparative reasoning requires shared context. An LLM that sees one prospect gives you an answer about that prospect. An LLM that sees all prospects gives you a ranking. For a weekly digest, ranking is the entire value. Separate calls and a merge step cannot produce genuine prioritization.

One API call beat five on the metrics we cared about: lower latency, lower cost, more coherent output. The main reason to split calls is a context window too small to hold every prospect's history, which wasn't an issue here with five prospects.
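A rough guard for the context-window case could look like the sketch below. The four-characters-per-token estimate is a crude heuristic for English text, not the model's real tokenizer, and the window size in the usage lines is a placeholder, not a documented limit. The 1200-token output reserve matches `max_tokens` in the digest call.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def fits_in_one_call(
    contexts: list[str],
    context_window: int,
    reserved_for_output: int = 1200,
) -> bool:
    """Check whether all prospect contexts fit in one prompt with room to reply."""
    budget = context_window - reserved_for_output
    return sum(estimate_tokens(c) for c in contexts) <= budget


# Placeholder window size for illustration only.
five_small_contexts = ["a" * 400] * 5
one_huge_context = ["a" * 40_000]

small_fits = fits_in_one_call(five_small_contexts, context_window=8_000)
huge_fits = fits_in_one_call(one_huge_context, context_window=8_000)
```

If the check fails, splitting into batches brings back the isolation problem inside each batch boundary, so trimming each prospect's recalled context is likely the better first step.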

Semantic recall is what makes the single prompt viable. If recalling each prospect's context required a slow database query or a full-text search pass, batching all five into one prompt would have a noticeable delay. Hindsight's recall is fast enough that the loop runs in milliseconds. By the time build_digest_prompt runs, all the context is ready.

The framing "generate my week" is the right product abstraction. Reps don't want to review five individual briefs and mentally prioritize them. They want one view. The weekly digest turns five separate memory retrieval operations into one actionable output. That's the design decision that made the feature actually get used.


The reason every call started from zero wasn't that our rep was disorganized. It was that the context lived in five different places and nothing connected it. The Hindsight documentation describes a retain/recall/reflect model, and that model fits this problem well: store everything, retrieve by meaning, reason across sessions.

The weekly digest is that model applied to a practical problem: a sales rep, Monday morning, needing to know where to start. One click. Five prospects. Fifteen seconds. Done.
