I gave an AI a memory of my interview history — here's what I learned building on Cognee
Every time I've used an AI to prep for interviews, it forgets me the second I close the tab. I paste my resume, it gives me the same advice it gives everyone, and the next session starts from zero. For something that's supposed to help me improve over time, that amnesia is the whole problem.
So for the Hangover Part AI: Where's My Context? hackathon, I built Interview Memory Coach — an AI coach that builds a permanent, structured memory of your resume, your target role, and every past interview where you struggled, then coaches you against your actual gaps instead of generic tips. It runs on Cognee, an open-source memory layer for AI agents.
This is the story of building it: what worked, the bugs that didn't, and what I learned about giving an AI a memory that actually persists.
The core idea: graph-grounded, not just retrieved
Most "AI with memory" projects are really just RAG — you embed some text, retrieve the nearest chunks, and hope the LLM stitches them into something coherent. That works until the answer depends on relationships between facts rather than the facts themselves.
Interview coaching is exactly that kind of problem. "How should I prepare for system design?" shouldn't return textbook advice. It should connect this candidate to their past failures, and those failures to the role they're targeting. That's a graph traversal, not a similarity search.
Cognee's GRAPH_COMPLETION search does this in one call: it walks a knowledge graph built from your data and synthesizes an answer grounded in connected facts. When my test candidate Jane asks about system design, the coach doesn't recite fundamentals — it says:
"You've struggled with system design in past interviews — you were asked to design a URL shortener at 10M req/day, proposed SQLite, and admitted you couldn't handle distributed scale. The senior role you're targeting requires exactly this."
That specificity is the entire point. And it only works because Cognee turned her resume and interview notes into a graph where Jane → system design → distributed databases → url shortener are connected nodes, not loose text.
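To make the traversal idea concrete, here is a minimal toy sketch in plain Python. It does not use Cognee at all: the edge list, the relation labels, and the `connected_facts` helper are hypothetical, loosely modelled on the Jane example above.

```python
from collections import deque

# Hypothetical edges, loosely modelled on the Jane example (not Cognee's real graph).
EDGES = [
    ("Jane", "struggled_with", "system design"),
    ("system design", "involves", "distributed databases"),
    ("system design", "asked_as", "url shortener"),
    ("Jane", "targets", "senior role"),
    ("senior role", "requires", "system design"),
]


def connected_facts(start, max_hops=2):
    """Breadth-first walk that collects every edge leaving nodes within max_hops of start."""
    adjacency = {}
    for source, relation, target in EDGES:
        adjacency.setdefault(source, []).append((relation, target))

    facts = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes at the hop limit
        for relation, target in adjacency.get(node, []):
            facts.append((node, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return facts
```

Calling `connected_facts("Jane")` returns the struggle, the related concepts, and the role requirement together. A nearest-chunk lookup would not assemble that connected context by itself.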
Using the full memory lifecycle
Cognee's memory isn't just store-and-retrieve. It has four operations — remember, recall, improve, forget — and I wanted to use all of them meaningfully, not just check boxes:
- remember: ingest the resume, job description, and past Q&A; cognify extracts the graph.
- recall: GRAPH_COMPLETION traverses the graph to answer a coaching question.
- improve: an enrichment pass (memify) that deepens the stored graph.
- forget: wipe stale interview history so it stops polluting coaching.
Getting each of these to behave for a fast, free-tier, single-user demo is where the real engineering happened. Here are the three problems that taught me the most.
Bug 1: the 8B model that couldn't follow a schema
To keep costs at zero, I started with a two-model setup: a small, fast model (llama-3.1-8b-instant via Groq) for graph extraction, and a larger one (llama-3.3-70b-versatile) for the actual coaching answers. Cheap extraction, quality answers — sensible split.
Except ingest kept crashing. The error trace showed the 8B model was producing KnowledgeGraph JSON that violated Cognee's schema — nodes missing their required description field. Cognee retried, the retries also came back malformed, and the whole pipeline died.
The fix taught me something about matching model size to task constraints: strict structured output is hard for small models, and no amount of retrying fixes a model that can't reliably hit a schema. Since ingest only runs once (the sample pre-loads at startup), I moved extraction to the 70B model too. The cost is trivial for a one-time call, the crashes vanished — and as a bonus, the richer model produced a denser graph (12 nodes became 20). Sometimes the "expensive" choice is the right one once you realize how rarely the call actually fires.
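A small standalone sketch of why retrying did not help, assuming a simplified node shape. Only the required `description` field comes from the error described above. The other field names, the two stub "models", and the helper functions are hypothetical, and this is not Cognee's actual validation code.

```python
# Only "description" is taken from the post; "id" and "name" are assumed for illustration.
REQUIRED_NODE_FIELDS = ("id", "name", "description")


def missing_fields(graph):
    """Return (node_index, field) pairs for every required field a node lacks."""
    problems = []
    for index, node in enumerate(graph.get("nodes", [])):
        for field in REQUIRED_NODE_FIELDS:
            if not node.get(field):
                problems.append((index, field))
    return problems


def extract_with_retries(generate, max_attempts=3):
    """Call generate() until its output validates, or give up after max_attempts."""
    last_problems = []
    for attempt in range(1, max_attempts + 1):
        graph = generate()
        last_problems = missing_fields(graph)
        if not last_problems:
            return graph, attempt
    raise ValueError(
        f"schema still violated after {max_attempts} attempts: {last_problems}"
    )


def small_model_stub():
    # Stand-in for a model that consistently drops "description".
    return {"nodes": [{"id": "n1", "name": "system design"}]}


def large_model_stub():
    # Stand-in for a model that fills in every required field.
    return {
        "nodes": [
            {
                "id": "n1",
                "name": "system design",
                "description": "Designing services that scale",
            }
        ]
    }
```

With `small_model_stub`, every attempt fails the same way and the loop ends in a `ValueError`, while `large_model_stub` passes on the first attempt. When the failure is systematic rather than random, a retry budget only delays the crash.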
Bug 2: the coach that wouldn't answer twice
Early on, asking the same question twice produced a bizarre result: the second time, instead of coaching, the app replied "I'll wait for further clarification." It had decided it already answered.
This one took some digging. Cognee's search() has a session_id parameter that defaults to 'default_session' — so every question I asked landed in the same conversation session. Cognee, reasonably, treated my repeated question as a continuation of an ongoing chat and short-circuited rather than re-answering.
The fix was to pass a fresh uuid.uuid4() as the session_id on every coaching call and to set CACHING=false. Now each question is an independent, stateless query, which is exactly what a coaching tool wants. The lesson: a default that's perfect for a multi-turn agent (session continuity) is wrong for a stateless Q&A tool, and knowing why the default exists made the fix obvious instead of a guess.
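A minimal sketch of the per-call session idea. The `coaching_query` helper is hypothetical, and the keyword names are taken from the description above, so they should be checked against the installed Cognee version. Converting the UUID to a string is also an assumption.

```python
import uuid

DEFAULT_SESSION = "default_session"  # the default reported above for search()


def coaching_query(question, stateless=True):
    """Build keyword arguments for one coaching question.

    With stateless=True every call gets its own session id, so a repeated
    question cannot be mistaken for a continuation of an earlier chat.
    """
    session_id = str(uuid.uuid4()) if stateless else DEFAULT_SESSION
    return {"query_text": question, "session_id": session_id}
```

Two calls with the same question produce identical `query_text` values but different `session_id` values. That difference is what keeps the second call from being treated as a follow-up to the first.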
Bug 3: the forget() that didn't forget
This was the sneakiest. I'd call cognee.forget(dataset="session"), it would return success, and then recall would still answer with the supposedly forgotten data. A silent failure: the kind that looks fine in a demo right up until a judge clicks the button and watches it lie.
The root cause was a genuinely subtle interaction. For single-user local mode, you disable access control (ENABLE_BACKEND_ACCESS_CONTROL=false). But with access control off, GRAPH_COMPLETION becomes a global graph traversal — it doesn't filter by dataset. So deleting one dataset's nodes left residual nodes that the global search still happily found.
For a single-dataset demo, the reliable answer was a full prune_system wipe — which, since the app only ever holds one session, is functionally equivalent to forgetting that session. (True multi-dataset surgical forget would mean enabling access control and scoping searches per user — the production path, which I noted but didn't need.) What I appreciated here is that Cognee gave me enough visibility into its own internals to actually find this, rather than just shrugging at a black box.
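The general failure mode (a scoped delete paired with an unscoped read) can be shown with a toy in-memory store. This is not Cognee's implementation. `ToyMemory` and `leftovers_after_forget` are hypothetical, and the way the toy produces leftovers (the same fact stored under two dataset names) is an assumption chosen for simplicity, not a claim about how Cognee ends up with residual nodes.

```python
class ToyMemory:
    """In-memory stand-in: each fact is stored under a dataset name."""

    def __init__(self):
        self.facts = []  # list of (dataset, text) pairs

    def remember(self, dataset, text):
        self.facts.append((dataset, text))

    def forget(self, dataset):
        """Scoped delete: removes only facts stored under this dataset."""
        self.facts = [(d, t) for d, t in self.facts if d != dataset]

    def wipe(self):
        """Full wipe: removes everything, regardless of dataset."""
        self.facts = []

    def recall(self, keyword, dataset=None):
        """dataset=None models an unscoped, global search."""
        return [
            t
            for d, t in self.facts
            if keyword in t and (dataset is None or d == dataset)
        ]


def leftovers_after_forget(memory, dataset, probe):
    """Forget a dataset, then run an unscoped recall to see what survived."""
    memory.forget(dataset)
    return memory.recall(probe)
```

A check like `leftovers_after_forget` is worth running before a demo: a forget call that returns success proves little until a recall afterwards comes back empty.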
What I think about Cognee after building on it
Honestly? It's a genuinely powerful piece of infrastructure, and the graph-vector hybrid is the real deal — being able to traverse a knowledge graph and fall back to semantic search, in one call, is exactly what "AI memory" should mean. The lifecycle API (remember/recall/improve/forget) is a clean mental model for something that's usually a tangle.
The rough edges I hit — the schema strictness, the session defaults, the dataset-scoping behavior — weren't bugs so much as behaviors I had to understand. And the fact that I could understand them (open source, readable internals, real error messages) is the difference between a tool you fight and a tool you learn. I'd reach for it again.
The takeaway
The build is fully self-hosted — SQLite, LanceDB, an embedded graph store, local CPU embeddings, and Groq's free tier for the LLM. No data leaves the machine, and it costs nothing to run.
But the part I'm keeping isn't the stack — it's the reframe. "AI memory" isn't a vector store with extra steps. It's a graph of what the system knows about you, that can grow, deepen, and be deliberately forgotten. Once interview prep remembers your actual history, it stops giving you everyone's advice and starts giving you yours.
Code's here: github.com/VignanNallani/interview-memory-coach
Built for the Hangover Part AI hackathon. This project was built with AI coding assistants for implementation; all architecture and debugging decisions were mine, and I can defend every one.