DEV Community

Sandhya G K


CRMs store data. I built something that remembers context.

The first question I get when I tell people we have no database is: "What do you mean, no database?"

We built SalesMemory — a persistent memory layer for sales reps — on a stack with no PostgreSQL, no SQLite, no Redis, nothing. Every prospect interaction is stored as a memory. Every pre-call brief is generated by recalling those memories and feeding them to an LLM. The only persistence layer is Hindsight, a semantic memory system built by Vectorize.

That decision shaped everything about how the product works. It's worth explaining why we made it, what we got from it, and what we gave up.


The problem we were solving

Sales reps carry 30 to 50 active prospects at a time. Each one has history — objections raised, budget signals, competitor comparisons, things said three calls ago that still matter. That context lives in CRM notes nobody reads, or only in the rep's head.

The result: reps re-introduce themselves on calls. They miss that the prospect said budget is frozen until Q3. They forget the deal almost died over onboarding speed. Every call without memory is a wasted call.

What we built: before a call, the rep types a name and gets a structured brief pulled from every past interaction — key objections, a 2–3 sentence recommendation, deal health, last contact. After the call, they log 2–3 sentences. That note goes into permanent memory. The next brief is better because of it.

The question was how to build the persistence layer for this.


Why we didn't reach for a database

The standard move would be: PostgreSQL table, one row per interaction, columns for prospect name, timestamp, outcome, notes. Maybe a separate table for deal scores. Add an ORM, write migrations, deploy.

That approach works. But it has a core limitation: it stores structured fields. You can query "all interactions where outcome = 'Objection logged'" or "last contact date for Priya Sharma." What you can't query is "everything relevant to Priya's concerns about onboarding speed," because that requires understanding the semantic meaning of freeform notes, not matching field values.

Sales notes are not structured. Reps write "she seemed unsure about the timeline" or "he pushed back on implementation scope" — not "objection_type: onboarding, sentiment: negative." Any system that forces structured input is a system reps won't use. We've all seen the CRM graveyard.
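To make the limitation concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout follows the columns described above; the rows, dates, and the second note are invented for illustration and are not SalesMemory data.

```python
import sqlite3

# Hypothetical table layout, following the columns described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interactions (prospect TEXT, ts TEXT, outcome TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO interactions VALUES (?, ?, ?, ?)",
    [
        # Invented sample rows.
        ("Priya Sharma", "2024-03-01", "Objection logged",
         "She seemed unsure about the timeline"),
        ("Priya Sharma", "2024-03-15", "Follow-up scheduled",
         "Asked how long onboarding takes for a team of 40"),
    ],
)

# Field matching works: the outcome column holds an exact value.
by_outcome = conn.execute(
    "SELECT notes FROM interactions WHERE outcome = ?", ("Objection logged",)
).fetchall()

# Keyword matching only finds notes that contain the literal word.
by_keyword = conn.execute(
    "SELECT notes FROM interactions WHERE notes LIKE ?", ("%onboarding%",)
).fetchall()

print(by_outcome)
print(by_keyword)
```

The keyword query returns the note that literally says "onboarding" and skips "She seemed unsure about the timeline", even though a rep would read both as the same concern.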

The Hindsight documentation describes a retain/recall/reflect model where memories are stored with semantic embeddings and retrieved by meaning, not by field matching. That's exactly what we needed. The question became: can we build the entire product on this, with no other persistence?

The answer was yes, with one tradeoff we'll get to.


How the memory layer actually works

Storing an interaction after a call is a single function call:

def retain_interaction(prospect_name: str, summary: str, outcome: str, timestamp: str):
    content = f"""
Prospect: {prospect_name}
Date: {timestamp}
Outcome: {outcome}
Summary: {summary}
"""
    client.retain(
        pipeline_id=PIPELINE_ID,
        content=content,
        metadata={
            "prospect": prospect_name,
            "outcome": outcome,
            "timestamp": timestamp,
            "type": "call_log"
        }
    )

The content is freeform text. The metadata is structured. This combination is important: the metadata is what makes the memory timeline reliable (filter by prospect name, sort by timestamp). The freeform content is what makes retrieval semantic.
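As a rough illustration of the metadata half, the sketch below builds a per-prospect timeline from metadata alone. The record shape (a dict with "content" and "metadata") and the sample data are assumptions made for this example, not the format Hindsight returns, and the timestamps are assumed to be ISO 8601 strings.

```python
from datetime import datetime

# Hypothetical record shape: one dict per stored memory, invented sample data.
memories = [
    {"content": "Agreed to a pilot",
     "metadata": {"prospect": "Priya Sharma", "outcome": "Pilot agreed",
                  "timestamp": "2024-04-02T15:30:00", "type": "call_log"}},
    {"content": "He pushed back on implementation scope",
     "metadata": {"prospect": "James", "outcome": "Objection logged",
                  "timestamp": "2024-03-20T11:00:00", "type": "call_log"}},
    {"content": "She seemed unsure about the timeline",
     "metadata": {"prospect": "Priya Sharma", "outcome": "Objection logged",
                  "timestamp": "2024-03-05T10:00:00", "type": "call_log"}},
]

def timeline(memories, prospect_name):
    """Return one prospect's call logs, oldest first, using metadata only."""
    own = [
        m for m in memories
        if m["metadata"]["prospect"] == prospect_name
        and m["metadata"]["type"] == "call_log"
    ]
    # Assumes ISO 8601 timestamps, which datetime.fromisoformat can parse.
    return sorted(
        own, key=lambda m: datetime.fromisoformat(m["metadata"]["timestamp"])
    )

for m in timeline(memories, "Priya Sharma"):
    print(m["metadata"]["timestamp"], m["content"])
```

Nothing here looks at the freeform content, which is the point: exact filters and ordering come from the structured fields.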

Recalling everything before a call:

def recall_prospect(prospect_name: str) -> str:
    results = client.recall(
        pipeline_id=PIPELINE_ID,
        query=f"prospect interactions with {prospect_name}",
        top_k=10
    )
    return results

Setting top_k=10 returns the 10 memories most semantically relevant to the query, not just the 10 most recent. If a rep logged a note three months ago about a budget freeze and the current call is about pricing, that memory can still surface. A SQL ORDER BY timestamp DESC LIMIT 10 would miss it.
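The difference between "most recent" and "most relevant" can be shown with a toy ranking. This sketch uses plain word overlap (Jaccard similarity) as a crude stand-in for embedding similarity; it is not how Hindsight scores memories, and the notes and dates are invented.

```python
import re

# Invented sample notes as (ISO date, text) pairs; not real SalesMemory data.
notes = [
    ("2024-01-10", "Budget is frozen until Q3"),
    ("2024-04-01", "Intro call with the new ops lead went well"),
    ("2024-04-08", "Sent the onboarding deck"),
    ("2024-04-15", "Scheduled a demo for next week"),
]

def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(query, note):
    """Jaccard word overlap: a crude stand-in for embedding similarity."""
    q, n = tokens(query), tokens(note)
    return len(q & n) / len(q | n)

def most_recent(notes, k):
    """What ORDER BY timestamp DESC LIMIT k would return."""
    return sorted(notes, key=lambda n: n[0], reverse=True)[:k]

def most_relevant(notes, query, k):
    """Top-k by similarity to the query, ignoring recency."""
    return sorted(notes, key=lambda n: overlap(query, n[1]), reverse=True)[:k]

query = "pricing and budget"
recent = most_recent(notes, 2)
relevant = most_relevant(notes, query, 2)
print("recent:  ", recent)
print("relevant:", relevant)
```

The January budget note ranks first by relevance but falls outside the two most recent entries. Word overlap only works here because the query and the note share the word "budget"; embeddings are meant to make the same match when the wording differs.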

The recalled context goes directly into the LLM system prompt. No transformation, no schema mapping — the LLM reads raw memory and returns a structured JSON brief with objections, recommendations, deal health score, momentum, risk, and confidence level.
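Here is a minimal sketch of that hand-off, with the network call left out. The function names, the exact message layout, and the JSON key names are all assumptions for illustration; the real brief schema is not shown in this post.

```python
import json

# Hypothetical key names; the real brief schema is not shown in this post.
REQUIRED_KEYS = {"objections", "recommendation", "deal_health_score"}

def build_brief_messages(system_prompt, prospect_name, recalled_memory):
    """Put the raw recalled memory into the system message, untransformed."""
    system = f"{system_prompt}\n\nMemory for {prospect_name}:\n{recalled_memory}"
    user = f"Generate the pre-call brief for {prospect_name}."
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

def parse_brief(raw_reply):
    """Extract the JSON object from the reply and check the expected keys."""
    start, end = raw_reply.find("{"), raw_reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model reply")
    brief = json.loads(raw_reply[start:end + 1])
    missing = REQUIRED_KEYS - set(brief)
    if missing:
        raise ValueError(f"brief is missing keys: {sorted(missing)}")
    return brief

messages = build_brief_messages(
    "You are a sales intelligence assistant.",
    "Priya Sharma",
    "She seemed unsure about the timeline",
)
print(messages[0]["content"])
```

Slicing from the first "{" to the last "}" is a small guard against a model that adds a preamble or code fence despite being told not to. It is a design choice for this sketch, not something the original pipeline is known to do.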


The deal health score is never stored

This is the part that surprised people most when we showed it.

The Deal Health Score, a 0 to 100 integer with a label (Cold / Warming up / Engaged / Hot / Closing), is recomputed on every brief request. We never write it to disk. There's nothing to migrate if the scoring logic changes. There's no stale score sitting in a row from six weeks ago.

The LLM scores a deal using specific rubric logic we define in the system prompt:

BRIEF_SYSTEM_PROMPT = """
You are a sales intelligence assistant. You read raw memory from past prospect
interactions and return a structured JSON pre-call brief for a sales rep.

Deal health scoring guide:
* 0-20: Cold. No engagement, no signals, or long silence.
* 21-40: Warming up. Early interest but objections unresolved.
* 41-60: Engaged. Active conversations, some positive signals.
* 61-80: Hot. Strong signals, near decision stage.
* 81-100: Closing. Verbal commitment or trial agreed.

Return ONLY valid JSON. No explanation. No markdown. No code fences.
"""

Deductions include unresolved objections, budget uncertainty, long gaps since last contact, and competitor mentions without resolution. Additions include pilot agreed, budget confirmed, and clear next steps established.
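Since the model returns both a number and a label, one optional safeguard is to check that the two agree with the bands in the scoring guide. The sketch below encodes those bands; the function names are hypothetical and this check is not part of the pipeline described here.

```python
# Upper bound of each band, taken from the scoring guide in the system prompt.
BANDS = [
    (20, "Cold"),
    (40, "Warming up"),
    (60, "Engaged"),
    (80, "Hot"),
    (100, "Closing"),
]

def label_for_score(score: int) -> str:
    """Map a 0-100 score to the label its band defines."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise AssertionError("unreachable: bands cover 0-100")

def label_matches(score: int, label: str) -> bool:
    """True when the label the model returned agrees with the rubric."""
    return label_for_score(score) == label

print(label_for_score(55))
print(label_matches(35, "Warming up"))
```

A mismatch could be logged or used to trigger a retry, which keeps the rubric as the source of truth without storing anything.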

Real output for Priya Sharma (VP Sales, Rentokil) after four logged interactions:

Score: 70/100
Label: Engaged
Momentum: ↑ Improving
Risk: "Data migration concerns may stall the deal"
Recommended action: "Provide a detailed data migration plan and timeline"
Confidence: Medium

Because the score is a function of whatever memory is recalled at request time, it tends to get more accurate as memory accumulates. You don't have a recalculation job. You don't have stale scores. You don't have a schema that breaks when you add a new scoring dimension.


The weekly digest: one call, all prospects

The weekly digest endpoint does something non-obvious. The rep clicks "Generate my week" and gets a prioritized action plan across all their prospects — bucketed into needs attention now (no response in 5+ days, stalled deal), follow up this week (active deal, next step needed), and on track (waiting on prospect).

The entire thing runs in a single LLM call:

import json

async def generate_digest() -> dict:
    # Collect recalled memory for every prospect that has any.
    all_prospects = get_all_prospects()
    prospect_contexts = []

    for name in all_prospects:
        recalled = recall_prospect(name)
        if recalled:
            prospect_contexts.append({"name": name, "context": recalled})

    # One prompt containing every prospect, so the model can compare them.
    prompt = build_digest_prompt(prospect_contexts)
    response = groq_client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system", "content": DIGEST_SYSTEM_PROMPT},
            {"role": "user", "content": prompt}
        ],
        temperature=0.3,
        max_tokens=1200
    )
    # The reply is expected to be bare JSON; json.loads raises if it is not.
    return json.loads(response.choices[0].message.content.strip())

All prospect contexts go into one prompt. The LLM sees everyone simultaneously and can reason comparatively: "James hasn't responded in 7 days while Priya just agreed to a pilot, so James is the priority." One call per prospect plus a merge step would be slower, more expensive, and less coherent, because the LLM couldn't make that comparison.

This works specifically because agent memory lets us recall all prospect contexts quickly and cheaply, then hand them off in a single structured prompt.
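The build_digest_prompt helper is not shown in this post, so the version below is a hypothetical sketch of what such a function might look like: one labeled section per prospect, joined into a single user prompt. The section format and the instruction wording are assumptions.

```python
def build_digest_prompt(prospect_contexts):
    """Join every prospect's recalled memory into one prompt (hypothetical)."""
    header = (
        f"Below is recalled memory for {len(prospect_contexts)} prospects. "
        "Compare them and assign each one to exactly one bucket: "
        "needs attention now, follow up this week, or on track."
    )
    sections = [
        # str() because the recall result may not be a plain string.
        f"### {item['name']}\n{str(item['context']).strip()}"
        for item in prospect_contexts
    ]
    return header + "\n\n" + "\n\n".join(sections)

# Invented sample contexts for illustration.
sample = [
    {"name": "James", "context": "No reply to the last two emails"},
    {"name": "Priya Sharma", "context": "Agreed to a pilot"},
]
print(build_digest_prompt(sample))
```

Keeping each prospect under its own heading is what lets the model attribute a note to the right person while still seeing everyone at once.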


What we gave up

No database means no SQL aggregations. You can't run "count all deals by stage" or "total revenue weighted by deal health" or "average time between first contact and pilot agreement." If SalesMemory ever needs a reporting layer, we'll have to add one, probably by maintaining a lightweight metadata index in parallel with the memory layer.
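For a sense of how small such a parallel index could be, here is a sketch using Python's built-in sqlite3 module. We have not built this; the table name, columns, and function names are hypothetical, and the idea is simply to record the same metadata that already goes into each retain call.

```python
import sqlite3

def open_index(path=":memory:"):
    """Open (or create) a tiny metadata index next to the memory layer."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_log "
        "(prospect TEXT, outcome TEXT, timestamp TEXT)"
    )
    return conn

def index_interaction(conn, prospect_name, outcome, timestamp):
    """Record the structured fields only; freeform notes stay in memory."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO call_log VALUES (?, ?, ?)",
            (prospect_name, outcome, timestamp),
        )

def count_by_outcome(conn):
    """The kind of aggregation semantic recall cannot answer."""
    rows = conn.execute(
        "SELECT outcome, COUNT(*) FROM call_log GROUP BY outcome"
    )
    return dict(rows.fetchall())

# Invented sample data for illustration.
conn = open_index()
index_interaction(conn, "Priya Sharma", "Objection logged", "2024-03-05")
index_interaction(conn, "James", "Objection logged", "2024-03-20")
index_interaction(conn, "Priya Sharma", "Pilot agreed", "2024-04-02")
print(count_by_outcome(conn))
```

The index would be written at the same moment as the memory, so it only ever holds fields the rep already supplied and never asks for structured input.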

For the core use case — helping a rep prepare for a call and log what happened — this tradeoff is fine. The semantic recall is what matters, and SQL can't do that.

We also can't do bulk updates. If we change what "outcome" means, we can't write a migration that updates every stored interaction. The memories are immutable once stored. This is a real constraint if the data model needs to evolve significantly.


What we learned

Metadata is what keeps retrieval from becoming chaos. Storing {"prospect": name, "outcome": outcome, "timestamp": timestamp, "type": "call_log"} on every memory is what makes the timeline view work. Without it, you're running full semantic search on everything and hoping the right things surface. Structure the metadata even when the content is freeform.

Semantic recall is not keyword search. "Prospect interactions with Priya Sharma" returns contextually relevant memories, not exact matches. "She pushed back on timing" and "onboarding timeline is a concern" both surface for the same query. This matters because real sales notes are inconsistent by nature.

The prompt is the actual engineering work. Vague scoring instructions produce vague scores. The rubric in the system prompt — specific deduction and addition logic, five labeled score bands, explicit confidence levels — is what makes the deal health score match what a human sales manager would say. We spent more time on the scoring prompt than on any single piece of backend code.

"No CRM update required" is the real value prop. The system works with informal, unstructured notes because persistent memory across sessions stores semantic meaning. Reps don't change how they write. That's why they actually use it. Every system that required structured input got abandoned.


The no-database architecture is not the right choice for every product. It was right for this one because the core value is semantic recall, not structured queries. The moment you need SQL aggregations, you need a database alongside the memory layer.

But for a system where the rep logs "she seemed nervous about the migration timeline" and three calls later the brief correctly surfaces data migration as the top risk — that's semantic memory doing what SQL never could.

That's the gap between storing data and remembering context.

Top comments (1)

Sandhya G K

Great read! The point about structured inputs being where 'CRMs go to die' is spot on: reps just want to type freeform text. I love the semantic approach here, but I'm curious: if you eventually did need to introduce high-level reporting (like total pipeline value or aggregate win rates), would you lean toward a hybrid model? For example, using the LLM to async-extract a lean, parallel relational database layer alongside the Hindsight memory layer?