Rajmohan Arabhavi
Why I Stopped Using Session State and Used Hindsight Instead

I thought saving chat history to a Python dict was good enough for an accountability agent — it took exactly one browser refresh to prove me completely wrong.

What We Built

AXIOM is a discipline AI agent that holds engineering students accountable across days and weeks. It remembers your goals, tracks your progress, scores you out of 1000, and calls out repeated failures by name and date. The key word is "remembers" — not just within a session, but permanently, across every login.

The stack is Streamlit for UI, Groq with Llama 3.1 for inference, and Hindsight for persistent memory. My job on this project was figuring out how to make the memory actually work — which turned out to be much harder than I expected.

The Session State Trap

When we first built AXIOM, memory worked like this:

if "messages" not in st.session_state:
    st.session_state.messages = []

This is standard Streamlit. It stores everything in the browser session. It works great — until the user closes the tab. Or refreshes. Or comes back the next day. Then it's completely gone. The agent had no idea who you were, what you'd said, or what you'd been avoiding.

For a to-do list app, that's fine. For an accountability agent, it's useless. The entire value proposition is that the AI remembers your patterns over time.

Why a Simple Database Wasn't Enough

My first fix was saving messages to a Python list stored in a file. This at least survived browser refreshes. But the retrieval was terrible — I was doing basic keyword matching over raw text. If someone said "I skipped the gym because I had a headache after lab," searching "gym" the next day returned that blob with zero structure. The AI couldn't tell when it happened, why it mattered, or how it connected to the current conversation.
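For illustration, that naive version looked roughly like this. This is a minimal sketch, not the actual repo code; the `messages.json` filename and function names are my own stand-ins:

```python
import json
from pathlib import Path

LOG = Path("messages.json")

def save_message(text: str) -> None:
    # Append the raw message blob to a JSON file so it survives refreshes.
    history = json.loads(LOG.read_text()) if LOG.exists() else []
    history.append(text)
    LOG.write_text(json.dumps(history))

def keyword_search(query: str) -> list[str]:
    # Naive retrieval: return every blob containing the keyword, verbatim.
    history = json.loads(LOG.read_text()) if LOG.exists() else []
    return [m for m in history if query.lower() in m.lower()]
```

Searching "gym" here hands back the entire raw blob: no timestamp, no entities, no sense of which part matters.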

What I needed wasn't just storage — it was intelligent retrieval.

How Hindsight Fixed It

Hindsight's memory system works completely differently from a database or vector store. When you call retain(), an LLM processes your content and extracts structured facts — entities, relationships, timestamps, context. The raw text is never stored verbatim.

When you call recall(), four search strategies run simultaneously: semantic similarity, BM25 keyword matching, knowledge graph traversal, and temporal reasoning. This is why AXIOM can answer "what did this user say about DSP last week?" — because Hindsight understands "last week" as a real temporal reference.
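To make the temporal part concrete, here is a toy sketch of what resolving a relative reference into real dates involves. This is purely my own illustration of the concept, not Hindsight's internals:

```python
from datetime import date, timedelta

def resolve_relative(ref: str, today: date) -> tuple[date, date]:
    # Map a relative time phrase to a concrete (start, end) date range
    # that a store can then filter timestamped facts against.
    if ref == "yesterday":
        d = today - timedelta(days=1)
        return d, d
    if ref == "last week":
        # Monday through Sunday of the previous ISO week.
        start = today - timedelta(days=today.weekday() + 7)
        return start, start + timedelta(days=6)
    raise ValueError(f"unhandled reference: {ref}")
```

A query like "what did this user say about DSP last week?" only works if both halves exist: the temporal filter above plus semantic matching on "DSP".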

Here's how we store memories:

import threading
from datetime import datetime

def save_memory_bg(user_msg, ai_msg):
    def _go():
        try:
            # The [timestamp] prefix is what enables temporal search later.
            ts = datetime.now().strftime("%d %b %Y %I:%M %p")
            mem_client.retain(
                bank_id=BANK_ID,
                content=f"[{ts}] {username}: {user_msg}. AXIOM replied: {ai_msg[:200]}"
            )
        except Exception:
            pass  # memory writes are best-effort; never crash the UI
    # Run in a daemon thread so retain() never blocks the Streamlit rerun.
    threading.Thread(target=_go, daemon=True).start()

The timestamp embedded in the content string is what enables temporal search. Without it, Hindsight can't resolve relative time references like "yesterday" or "three sessions ago."

Before vs After

Before Hindsight: Every session started blank. Users could tell the same excuse repeatedly with zero consequence. The agent gave identical generic advice every day.

After Hindsight: "You mentioned on March 19th that you were too tired to study DSP. It's now March 22nd and you still haven't logged any DSP progress. What's actually blocking you?"

That response is only possible because Hindsight stored a structured fact with a timestamp and retrieved it contextually days later.
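The last step is feeding recalled facts back to the model. AXIOM's actual wiring lives in the repo; a minimal sketch of the idea, with a helper name and prompt wording that are mine, looks like:

```python
def build_system_prompt(base: str, memories: list[str]) -> str:
    # Prepend recalled facts so the model can cite specific dates back
    # to the user instead of falling back to generic advice.
    if not memories:
        return base
    facts = "\n".join(f"- {m}" for m in memories)
    return f"{base}\n\nKnown history about this user:\n{facts}"
```

With the March 19th fact in `memories`, the model has everything it needs to produce the dated call-out above.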

Lessons Learned

  • Session state is fine for UI state. It is completely wrong for agent memory.
  • Raw text storage with keyword search returns garbage for conversational agents. You need structured fact extraction.
  • Embedding timestamps in your content strings is the single highest-leverage thing you can do to improve temporal recall quality.
  • Background threads for retain() calls are essential in Streamlit — synchronous calls block the UI completely.
  • Check out agent memory patterns on Vectorize — it changed how I think about building stateful AI systems.

Full code: https://github.com/itzsam10/axiom-discipline-ai

Live demo: https://axiom-discipline-ai-wstowhyf2yr6ehevrcb9nw.streamlit.app/
