My agent remembered a rejected application and adjusted strategy

Spandana S R
"Wait — it already knows this company rejected you." I hadn't told the agent that. It had pulled the rejection from a previous session, cross-referenced the student's updated skill set, and quietly suggested a different role at the same firm.

That moment is when I realised we'd built something genuinely different from the usual career chatbot. Not smarter — just less amnesiac.

What we built

The project is a React + Vite chat app that acts as an AI career advisor for students. A student tells it what skills they have, what they've built, and where they've applied. The advisor gives resume feedback, flags skill gaps, and recommends internships.

The hard part is that internship season plays out over weeks. A student talks to the advisor on Monday, gets rejected Thursday, learns a new framework over the weekend, and comes back the following Monday. A stateless agent treats that second conversation as a blank slate. Ours doesn't.

The stack is deliberately lean:

  • React + Vite for the frontend — fast dev loop, clean env variable management
  • Groq (llama-3.3-70b-versatile) as the LLM — fast inference, generous free tier
  • Hindsight for persistent cloud memory, keyed per student via VITE_HINDSIGHT_MEMORY_BANK
  • localStorage as a zero-cost fallback when Hindsight keys aren't set

All the interesting logic lives in one file: src/services/advisorService.js. The rest of the app — App.jsx, ChatWindow, ProfilePanel — is mostly UI plumbing around sendMessage() and getProfile(). The full source is on GitHub.

The memory architecture

Every time a student sends a message, sendMessage() in advisorService.js does three things in sequence:

  1. Recall relevant memories from Hindsight using the student's message as the search query
  2. Call Groq with those memories prepended to the system prompt
  3. Extract and save any new career facts from the exchange back to Hindsight's agent memory
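The three steps above can be sketched as a single orchestration function. This is a runnable sketch, not the project's actual code: `deps.getMemory`, `deps.callGroq`, and `deps.extractAndSave` are injected stand-ins for the real implementations in advisorService.js.

```javascript
// Sketch of the recall -> respond -> retain flow. The three deps are
// hypothetical stubs standing in for the real network-backed functions.
async function sendMessageSketch(userId, text, deps) {
  // 1. Recall: use the student's message as the search query
  const memories = await deps.getMemory(userId, text);
  // 2. Respond: the LLM call sees the recalled memories plus the new message
  const reply = await deps.callGroq(memories, text);
  // 3. Retain: pull new career facts out of the exchange and save them
  await deps.extractAndSave(userId, text, reply);
  return reply;
}
```

Keeping the steps strictly sequential matters: the save in step 3 needs the reply from step 2, and the reply needs the memories from step 1.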

Here's what Hindsight's graph view looks like after a real student session — three memory nodes connected by semantic and temporal links, tracking identity, skills, and an application in one graph:

![Hindsight memory graph showing 3 nodes: student identity, skills in C/C++/Java, and Microsoft internship application]

The recall step is what makes the agent feel like it remembers:

// From sendMessage() in advisorService.js
const memories = await getMemory(userId, text);
const memCtx = memories.length > 0
  ? `\nSTUDENT PROFILE FROM MEMORY:\n${memories
      .map(m => m.content || m.text || JSON.stringify(m))
      .join("\n")}\n`
  : "\nNo previous profile found for this student.";

// Memory gets injected at the top of the system prompt,
// before the behavioral instructions
body: JSON.stringify({
  model: MODEL,
  messages: [
    { role: "system", content: memCtx + SYSTEM_PROMPT },
    ...history,
    { role: "user", content: text },
  ],
})

The memory block sits at the top of the system prompt, before the behavioral instructions. We tried appending it at the end first — the agent acknowledged the memories but underweighted them. Moving it to the top made the agent treat recalled facts as ground truth rather than a footnote.

How extraction works

After every exchange, extractAndSave() runs a second Groq call specifically to pull structured career facts out of what just happened:

// From extractAndSave() in advisorService.js
{
  role: "system",
  content: `Extract student career info from the conversation.
Return ONLY valid JSON with these fields (omit if not found):
{
  "skills": ["list of technical skills mentioned"],
  "projects": ["list of projects mentioned"],
  "applications": [{"company":"","role":"","status":"applied/interviewing/rejected/offered"}],
  "targetRoles": ["roles the student wants"]
}`
},
{
  role: "user",
  content: `Student said: "${userMsg}"\nAdvisor replied: "${aiReply.substring(0, 200)}..."`,
}

The extracted JSON gets serialised into natural language before being saved to Hindsight:

const memoryText = `
Student ID: ${userId}
Skills: ${(extracted.skills || []).join(", ")}
Projects: ${(extracted.projects || []).join(", ")}
Applications: ${(extracted.applications || [])
  .map(a => `${a.company} (${a.role}) - ${a.status}`)
  .join(", ")}
Target Roles: ${(extracted.targetRoles || []).join(", ")}
`.trim();

return await saveMemory(userId, memoryText);

The natural language format is intentional. Hindsight does its own fact extraction on ingestion — giving it structured prose lets its embedding model do semantic matching properly at recall time. Storing raw JSON would work for retrieval but makes the recalled context harder for the LLM to read when it lands in the system prompt.

Here's the Hindsight table view after one real session with a student named Spandana — three clean memories stored under the career_profile context, each tagged with the relevant entities:

![Hindsight table view showing 3 stored memories: Microsoft internship application, C/C++/Java skills, and student identity for student_spandana]

The application status field — applied / interviewing / rejected / offered — is the piece that made the rejection-memory scenario work. When a student says "Goldman passed on me," the extractor tags it as status: "rejected". Next session, that fact gets recalled when the student asks where to apply, and the agent skips Goldman without being told again.
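The prose memory format is regular enough that the status is also machine-recoverable. The project itself leaves this to the LLM reading the recalled text; the helper below is purely hypothetical, showing that "Company (Role) - status" entries parse back deterministically.

```javascript
// Hypothetical: parse "Company (Role) - status" entries out of a saved
// memory block. Not something the project does; the LLM reads the prose
// directly. This just shows the status field survives the round trip.
function parseApplications(memoryText) {
  const line = memoryText
    .split("\n")
    .find(l => l.trim().startsWith("Applications:"));
  if (!line) return [];
  return line
    .replace("Applications:", "")
    .split(",")
    .map(s => s.trim())
    .filter(Boolean)
    .map(entry => {
      const m = entry.match(/^(.+?) \((.+?)\) - (\w+)$/);
      return m ? { company: m[1], role: m[2], status: m[3] } : null;
    })
    .filter(Boolean);
}
```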

The localStorage fallback

One decision I'm glad we made early: the app degrades gracefully when Hindsight keys aren't set. shouldUseHindsight() checks at runtime:

function shouldUseHindsight() {
  return Boolean(HINDSIGHT_KEY && MEMORY_BANK);
}

Every memory operation — saveMemory() and getMemory() — calls this first. If Hindsight is unavailable or throws, the code falls back to a local store in localStorage, keyed by userId. The local store even does lightweight keyword scoring to approximate semantic recall:

function getLocalMemory(userId, query) {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  // `list` is the array of entries previously saved for this userId,
  // loaded from localStorage earlier in the function (elided here)
  const scored = list
    .map(m => {
      const text = (m.content || "").toLowerCase();
      const score = tokens.reduce(
        (acc, t) => acc + (text.includes(t) ? 1 : 0), 0
      );
      return { ...m, score };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, 10);
  return scored.map(({ score, ...m }) => m);
}

It's token overlap, not embeddings — but for a career advisor where queries tend to be domain-specific ("backend role", "Goldman rejection", "Python skills"), it's surprisingly effective. The store caps at 80 entries per user so localStorage doesn't grow unbounded.
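The cap itself is only a few lines. In this sketch a Map stands in for localStorage so it runs outside a browser, and the key format is illustrative, not the project's actual key.

```javascript
// Sketch of the capped local save. A Map stands in for localStorage here;
// the real code would use localStorage.getItem/setItem with JSON strings.
const store = new Map();
const MAX_LOCAL_MEMORIES = 80; // the per-user cap mentioned above

function saveLocalMemory(userId, content) {
  const key = `advisor_mem_${userId}`; // illustrative key, not the real one
  const list = JSON.parse(store.get(key) || "[]");
  list.unshift({ content, ts: Date.now() }); // newest entries first
  store.set(key, JSON.stringify(list.slice(0, MAX_LOCAL_MEMORIES)));
}
```

Prepending and slicing means the oldest memories fall off first, which is usually the right trade for a profile that changes over a semester.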

This meant the whole team could develop and demo without burning Hindsight credits. Only when we needed memory to persist across devices did we flip on the cloud keys.

What actually changed

Here's the same scenario before and after adding memory.

Without memory:

Student: I applied to Goldman last week, no response yet. Where else should I apply?

Agent: Based on your Python background, you might consider Goldman Sachs, Two Sigma, or Citadel for quant roles...

The agent recommended Goldman — the company the student had just told it they'd applied to.

With Hindsight memory:

Student: I applied to Goldman last week, no response yet. Where else should I apply?

Agent: Since Goldman is already in play, I'll skip that one. Based on the ML project you mentioned last session and your Python + scikit-learn skills, Stripe's data engineering internship looks like a strong fit — their stack maps closely to what you've built...

Same model. Same system prompt. The only difference was the recalled context injected before the Groq call. The agent knew about the Goldman application because extractAndSave() had tagged it with status: "applied" in a previous session and getMemory() had surfaced it when the student asked where to apply.

The rejection case was more striking. When a student came back after being turned down, the recalled memory included the rejection status. If their skills had grown since — say they'd added a new project the extractor had saved — the agent would suggest a different role at the same company. No logic change. Just better input.

What we got wrong

The extraction prompt needed application status from the start. Our first version stored skills and projects but treated all applications the same. We couldn't distinguish a rejection from a pending application. Adding the status enum (applied / interviewing / rejected / offered) to the extraction prompt was a one-line change that made the rejection-memory behaviour possible.

getProfile() is still keyword matching, not proper recall. The profile panel in the UI calls getProfile(), which does string matching over recalled memories (if text includes "python") rather than anything structured. It works for the demo but would break on any edge case — "I don't know Python" would still add Python to the profile. This needs a proper structured extraction pass, not substring checks.
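The failure mode is easy to reproduce. Here is a minimal, hypothetical version of the substring check (the real getProfile lives in advisorService.js), showing why negation breaks it:

```javascript
// Hypothetical minimal version of the substring-based skill check,
// illustrating why negated statements slip through.
function naiveHasSkill(text, skill) {
  return text.toLowerCase().includes(skill.toLowerCase());
}

naiveHasSkill("I've built two Python projects", "Python"); // true — correct
naiveHasSkill("I don't know Python yet", "Python");        // true — wrong
```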

The extraction call adds latency. Every sendMessage() triggers two Groq calls: one for the reply, one for extraction. On Groq's free tier this is fast enough not to notice, but it's worth being aware of if you're on a slower model or a paid tier with rate limits.
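One possible mitigation (an option, not what the project does as written) is to stop awaiting the extraction call and let it run in the background. In this sketch, `callGroq` and `extractAndSave` are injected stubs, not the real implementations:

```javascript
// Fire-and-forget extraction: the user sees the reply without waiting on
// the second Groq call. callGroq and extractAndSave are stand-in stubs.
async function sendMessageNonBlocking(text, callGroq, extractAndSave) {
  const reply = await callGroq(text);
  // Not awaited: extraction runs in the background; failures are only logged.
  extractAndSave(text, reply).catch(err =>
    console.error("extraction failed", err)
  );
  return reply;
}
```

The trade-off is that a failed extraction is silent apart from the log, and a memory saved mid-session might not yet be visible to an immediately following recall.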

Lessons that carry over

Memory position in the prompt matters more than you'd expect. Putting recalled context at the top of the system prompt — before behavioral instructions — made the agent treat it as ground truth. At the bottom it felt like a suggestion.

Two LLM calls per turn is a reasonable pattern. One call for the user-facing response, one narrow extraction call for structured fact parsing. The extraction prompt is tight and cheap. Trying to combine both into a single call makes the main prompt unwieldy and the extraction unreliable.

The fallback made development faster. Designing the localStorage degradation path first meant we never blocked on API keys during development. The Hindsight documentation covers the memory bank setup well — but having a local fallback meant we could ship a working demo before we'd finished configuring the cloud side.

Natural language beats raw JSON for memory storage. Hindsight's agent memory does semantic indexing on ingestion. Storing "Goldman Sachs (SWE Intern) - rejected" as part of a prose block gives the embedding model something to work with. Storing {"company":"Goldman","status":"rejected"} would retrieve correctly but read awkwardly when injected back into the system prompt.

Same agent, better input. We didn't touch the recommendation logic between the stateless version and the memory-backed version. The improvement came entirely from what the agent received before it started talking. That's the cleaner mental model: memory isn't a feature of the agent — it's better input.


The memory integration is built on Hindsight. If you're building an agent that talks to the same user more than once, persistent memory is worth the integration cost. The alternative is an advisor that asks students to re-introduce themselves every single session — and recommends the company that just rejected them. You can browse the full project source here.

Resources

  • Hindsight GitHub repository — the open-source memory layer we used. Start here if you want to add persistent agent memory to your own project.
  • Hindsight documentation — covers memory bank setup, the retain/recall API, metadata filtering, and how semantic indexing works under the hood.
  • Agent memory on Vectorize — overview of how Hindsight handles long-term agent memory, useful if you're evaluating it against other approaches.
  • Project repo — the full source for this career advisor app, including advisorService.js and the localStorage fallback.
