Most engineers building LLM applications think about memory as a retrieval problem. How do I get the right context back when the user asks a question? That's necessary, but it's only half the design. The other half, the half that actually determines retrieval quality, is the write path. What you store, how you structure it, and when you write it determine everything downstream.
Building the Deal Intelligence Agent taught me that the write architecture is where most teams quietly fail. They store too little too late, or they store too much in the wrong format, and then wonder why their retrieval is noisy. Here's what I learned by treating every sales event as an explicit memory write to Hindsight.
The write-everything principle
The instinct when building agent memory is to be selective. Only store the "important" stuff. The problem is that "important" is defined at query time, not at write time, and you often can't predict what will matter later.
For the Deal Intelligence Agent, the rule became: every domain event writes to memory. Objection raised? Write. Stage changed? Write. Competitor mentioned? Write. Email drafted? Write. SMS sent? Write. Pre-call briefing generated? Write. Roleplay session completed? Write. Autopilot playbook generated? Write.
from datetime import datetime, timezone

# After a briefing is generated
await memory_svc.store_memory(
    deal_id=deal_id,
    entry_type="briefing",
    content=f"Pre-call briefing generated. Key risks: {risk_summary}",
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    metadata={"generated_at": datetime.now(timezone.utc).isoformat()},
)

# After an SMS is sent via Twilio
await memory_svc.store_memory(
    deal_id=deal_id,
    entry_type="outreach",
    content=f"SMS sent to {contact_name}: {sms_preview}",
    metadata={"channel": "sms", "sent_at": datetime.now(timezone.utc).isoformat()},
)
This feels verbose. It is verbose. It pays back in retrieval quality because Hindsight's persistent memory layer now has a complete picture of the deal lifecycle, not just the parts someone decided to log.
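One way to keep the write-everything rule from turning into copy-pasted boilerplate is to route every domain event through a single helper. The sketch below is a minimal illustration, not the project's code: `record_event` and the `InMemoryService` stand-in are hypothetical names, and only the `store_memory(deal_id=..., entry_type=..., content=..., metadata=...)` call shape is taken from the snippets above.

```python
import asyncio
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class InMemoryService:
    """Hypothetical stand-in that mimics the store_memory call shape."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    async def store_memory(self, deal_id: str, entry_type: str,
                           content: str, metadata: Dict[str, Any]) -> None:
        self.entries.append({
            "deal_id": deal_id,
            "entry_type": entry_type,
            "content": content,
            "metadata": metadata,
        })


async def record_event(memory_svc: Any, deal_id: str, entry_type: str,
                       content: str,
                       metadata: Optional[Dict[str, Any]] = None) -> None:
    """Write one domain event to memory, stamping it with a UTC timestamp."""
    stamped = dict(metadata or {})
    # Only add a timestamp if the caller did not supply one.
    stamped.setdefault("recorded_at", datetime.now(timezone.utc).isoformat())
    await memory_svc.store_memory(
        deal_id=deal_id,
        entry_type=entry_type,
        content=content,
        metadata=stamped,
    )


async def demo() -> InMemoryService:
    svc = InMemoryService()
    await record_event(svc, "abc123", "objection", "CFO raised a pricing objection")
    await record_event(svc, "abc123", "stage_change", "Moved to negotiation",
                       {"from": "discovery", "to": "negotiation"})
    return svc


svc = asyncio.run(demo())
print([e["entry_type"] for e in svc.entries])
```

With a helper like this, adding a new event type is a one-line call at the place the event happens, which makes it harder to forget a write.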
Why the embedding text design matters more than the content
The single most impactful decision in the memory architecture was the embedding_text prefix format. Raw content stored as "Price is 40% above current vendor" embeds as a general statement about price. Stored as "[PRICING] Deal abc123: Price is 40% above current vendor", it embeds with type and deal context.
The effect on retrieval: queries for "pricing objections" pull [PRICING] entries much more reliably than the same queries run against raw, unprefixed content. The type prefix acts as a soft categorical index on top of the vector space.
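To get an intuition for why a type prefix can help, here is a toy illustration that uses plain token overlap as a crude stand-in for embedding similarity. It is not how Hindsight or any real embedding model scores text; `tokenize` and `overlap_score` are hypothetical helpers, and the only thing taken from the article is the prefix format.

```python
import re
from typing import Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, document: str) -> float:
    """Jaccard overlap between query tokens and document tokens."""
    q, d = tokenize(query), tokenize(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


raw = "Price is 40% above current vendor"
prefixed = f"[PRICING] Deal abc123: {raw}"
query = "pricing objections"

raw_score = overlap_score(query, raw)
prefixed_score = overlap_score(query, prefixed)
print(raw_score, prefixed_score)
```

In this toy setup the raw text shares no token with the query ("price" and "pricing" are different tokens), while the prefixed text shares "pricing". A real embedding model handles such word variants far better, so the real-world effect is softer, but the direction is the same: the prefix puts the category word directly into the text being embedded.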
# Build the entry: structured fields plus a prefixed text for embedding
entry = {
    "id": self._generate_id(deal_id, content),
    "deal_id": deal_id,
    "type": entry_type,
    "content": content,
    "embedding_text": f"[{entry_type.upper()}] Deal {deal_id}: {content}",
}

# asyncio.to_thread keeps the client call off the event loop
result = await asyncio.to_thread(
    self.client.memory.store,
    user_id=deal_id,
    text=entry["embedding_text"],  # This is what gets embedded
    metadata={"deal_id": deal_id, "type": entry_type, "content": content},
)
The metadata carries the structured fields. The embedding_text is designed for semantic search. They serve different purposes and should be designed independently.
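One way to keep the two concerns independent is to derive both from a single typed record. This is a minimal sketch under assumptions: `MemoryEntry` and `generate_id` are hypothetical names, and the article does not show how `_generate_id` works, so the SHA-256 hash here is only one plausible choice.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Dict


def generate_id(deal_id: str, content: str) -> str:
    """Assumed scheme: a stable ID derived from deal and content, so the
    same event written twice maps to the same ID."""
    digest = hashlib.sha256(f"{deal_id}:{content}".encode("utf-8")).hexdigest()
    return digest[:16]


@dataclass
class MemoryEntry:
    deal_id: str
    entry_type: str
    content: str
    extra: Dict[str, Any] = field(default_factory=dict)

    @property
    def id(self) -> str:
        return generate_id(self.deal_id, self.content)

    @property
    def embedding_text(self) -> str:
        """Text tuned for semantic search: type prefix plus deal context."""
        return f"[{self.entry_type.upper()}] Deal {self.deal_id}: {self.content}"

    @property
    def metadata(self) -> Dict[str, Any]:
        """Structured fields for exact filtering; content stays unprefixed."""
        return {"deal_id": self.deal_id, "type": self.entry_type,
                "content": self.content, **self.extra}


entry = MemoryEntry("abc123", "pricing", "Price is 40% above current vendor")
print(entry.embedding_text)
print(entry.metadata)
```

Because the embedding text is computed rather than stored by hand, the prefix format can change in one place without touching the structured fields.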
The interaction store: what Hindsight learns beyond facts
Facts are one layer of memory. Interactions — the back-and-forth of chat sessions — are another. The store_interaction method writes every chat turn to Hindsight's add_message pipeline, which enables the system to learn conversational patterns, not just factual state:
async def store_interaction(
    self,
    deal_id: str,
    role: str,
    content: str,
    metadata: Optional[Dict] = None
) -> None:
    if self.use_hindsight:
        await asyncio.to_thread(
            self.client.memory.add_message,
            user_id=deal_id,
            role=role,
            content=content,
            metadata=metadata or {}
        )
This is separate from store_memory. Facts are stable — "the CFO raised a pricing objection on November 3rd" doesn't change. Interactions are temporal — they represent the evolving relationship between the agent and the rep. Hindsight tracks both layers.
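A chat handler that writes both sides of every turn might look like the sketch below. It is an illustration under assumptions: `handle_chat_turn`, `RecordingMemory`, and `canned_reply` are hypothetical, the `"user"` and `"assistant"` role names are guesses, and only the `store_interaction(deal_id, role, content, metadata)` signature comes from the method above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional


class RecordingMemory:
    """Hypothetical stand-in exposing the same store_interaction signature."""

    def __init__(self) -> None:
        self.turns: List[Dict[str, Any]] = []

    async def store_interaction(self, deal_id: str, role: str, content: str,
                                metadata: Optional[Dict] = None) -> None:
        self.turns.append({"deal_id": deal_id, "role": role,
                           "content": content, "metadata": metadata or {}})


async def handle_chat_turn(
    memory_svc: Any,
    deal_id: str,
    user_message: str,
    generate_reply: Callable[[str], Awaitable[str]],
) -> str:
    """Persist the user turn, produce a reply, then persist the reply."""
    await memory_svc.store_interaction(deal_id, "user", user_message)
    reply = await generate_reply(user_message)
    await memory_svc.store_interaction(deal_id, "assistant", reply)
    return reply


async def canned_reply(message: str) -> str:
    # Placeholder for the real LLM call.
    return f"Noted: {message}"


memory = RecordingMemory()
reply = asyncio.run(
    handle_chat_turn(memory, "abc123", "CFO pushed back on price", canned_reply)
)
print([t["role"] for t in memory.turns])
```

Writing the user turn before generating the reply means the message is preserved even if the reply step fails.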
The fallback that kept development fast
One of the quieter architectural decisions was building a complete in-process fallback before the Hindsight integration was wired up:
def __init__(self):
    self.use_hindsight = HINDSIGHT_AVAILABLE and bool(self.api_key)
    if self.use_hindsight:
        self.client = hindsight.Client(
            api_key=self.api_key,
            pipeline_id=self.pipeline_id
        )
    # else: all operations fall through to _fallback_store dict
The fallback uses a defaultdict(list) keyed by deal_id. It's not semantic search — it's keyword matching. But it's enough to develop and test all downstream features without an API key, and it degrades gracefully if Hindsight is unavailable in production.
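A fallback along those lines can be very small. The following is a sketch rather than the project's implementation: the `FallbackStore` class, its `store` and `search` methods, and the hit-count ranking are assumptions; only the `defaultdict(list)` keyed by `deal_id`, the `_fallback_store` name, and the keyword-matching idea come from the text.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List


class FallbackStore:
    """In-process store: a dict of lists keyed by deal_id, searched by keyword."""

    def __init__(self) -> None:
        self._fallback_store: DefaultDict[str, List[Dict[str, Any]]] = defaultdict(list)

    def store(self, deal_id: str, entry_type: str, content: str) -> None:
        self._fallback_store[deal_id].append({"type": entry_type, "content": content})

    def search(self, deal_id: str, query: str, limit: int = 5) -> List[Dict[str, Any]]:
        """Rank entries by how many query words appear in their content."""
        words = query.lower().split()
        scored = []
        # .get avoids creating an empty list for unknown deals.
        for entry in self._fallback_store.get(deal_id, []):
            text = entry["content"].lower()
            hits = sum(1 for w in words if w in text)
            if hits:
                scored.append((hits, entry))
        # Stable sort keeps insertion order among entries with equal hits.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for _, entry in scored[:limit]]


store = FallbackStore()
store.store("abc123", "pricing", "Price is 40% above current vendor")
store.store("abc123", "competitor", "Prospect mentioned a competitor demo")
results = store.search("abc123", "price vendor")
print(results)
```

Substring matching like this misses paraphrases that semantic search would catch, which is acceptable for scaffolding whose job is to keep downstream features testable.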
The lesson: every external dependency should have a local fallback that makes the system feel complete, not broken. The fallback isn't the real thing — it's scaffolding that lets you build the real thing without blocking on infrastructure.
What the write architecture enables
When you write everything and design the embedding text carefully, the agent memory layer becomes a queryable audit trail of the entire deal. Not just "what happened" but "in what order, in what channel, with what outcome."
That's the substrate for the Autopilot (cross-deal pattern matching), the Roleplay simulator (stakeholder-accurate training), the pre-call briefing (comprehensive context in one query), and the risk heatmap (trending analysis from typed event frequency).
None of those features required building separate databases or pipelines. They're all queries against the same Hindsight memory store because the write architecture was designed to support them from day one.
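As an illustration of what "trending analysis from typed event frequency" could mean in practice, the sketch below buckets entries by ISO week and type. The entry shape (`type`, `recorded_at`), the sample data, and the `weekly_type_counts` helper are assumptions made for the example; the article does not show how the heatmap is actually computed.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def weekly_type_counts(entries: Iterable[Dict[str, str]]) -> Counter:
    """Count entries per (ISO year, ISO week, type) bucket."""
    counts: Counter = Counter()
    for entry in entries:
        ts = datetime.fromisoformat(entry["recorded_at"])
        iso = ts.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1], entry["type"])] += 1
    return counts


# Made-up sample entries, standing in for results fetched from memory.
entries = [
    {"type": "objection", "recorded_at": "2024-11-04T10:00:00+00:00"},
    {"type": "objection", "recorded_at": "2024-11-06T15:30:00+00:00"},
    {"type": "competitor", "recorded_at": "2024-11-07T09:00:00+00:00"},
    {"type": "objection", "recorded_at": "2024-11-12T11:00:00+00:00"},
]

counts = weekly_type_counts(entries)
for bucket, n in sorted(counts.items()):
    print(bucket, n)
```

Because every event was written with a type and a timestamp, an aggregation like this needs no extra pipeline; it is a reduction over entries that already exist.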
The key insight: memory quality is determined at write time. By the time someone asks a question, it's too late to improve the answer — the signal is either in the store or it isn't.
GitHub: github.com/chaitanya07-ai/deal-intelligence-agent | Live: deal-intelligence-agent-1.onrender.com