<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aditya Goyal</title>
    <description>The latest articles on DEV Community by Aditya Goyal (@aditya_goyal_1).</description>
    <link>https://dev.to/aditya_goyal_1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838353%2F84535556-a95b-417b-bc6a-6cd05ff69fcc.jpg</url>
      <title>DEV Community: Aditya Goyal</title>
      <link>https://dev.to/aditya_goyal_1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aditya_goyal_1"/>
    <language>en</language>
    <item>
      <title>Hindsight Made My Study Agent Learn</title>
      <dc:creator>Aditya Goyal</dc:creator>
      <pubDate>Sun, 22 Mar 2026 13:11:35 +0000</pubDate>
      <link>https://dev.to/aditya_goyal_1/hindsight-made-my-study-agent-learn-3bl2</link>
      <guid>https://dev.to/aditya_goyal_1/hindsight-made-my-study-agent-learn-3bl2</guid>
      <description>&lt;p&gt;“Why are you giving me recursion problems again?” I checked the logs and realized the agent wasn’t repeating randomly — it had remembered my past mistakes and was deliberately making me practice my weak topics.&lt;/p&gt;

&lt;p&gt;That was the moment this project stopped being a chatbot and started behaving more like a tutor.&lt;/p&gt;

&lt;p&gt;I originally set out to build a simple AI chatbot that could help with studying and coding practice. The idea was straightforward: chat with an AI, generate quizzes, solve coding problems, and track progress. But very quickly I ran into a problem — the AI forgot everything between sessions. It didn’t remember what I struggled with, what mistakes I made, or what I was trying to improve.&lt;/p&gt;

&lt;p&gt;That’s when I realized the interesting problem wasn’t building another chatbot. The interesting problem was building an agent that could learn from a user over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;The system I ended up building is a multi-user AI study assistant that includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI chat mentor&lt;/li&gt;
&lt;li&gt;Coding practice and mistake tracking&lt;/li&gt;
&lt;li&gt;Progress tracking&lt;/li&gt;
&lt;li&gt;Study plan generation&lt;/li&gt;
&lt;li&gt;A structured memory system&lt;/li&gt;
&lt;li&gt;A dashboard showing weak topics and progress&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can try it here: &lt;a href="https://edupilot-three.vercel.app/" rel="noopener noreferrer"&gt;EduPilot&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5oe1djixjez9iw9rvj5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5oe1djixjez9iw9rvj5.png" alt="Dashboard Page" width="800" height="396"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmz7ltlaxnzcaequxj0fs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmz7ltlaxnzcaequxj0fs.png" alt="Insight Page" width="800" height="375"&gt;&lt;/a&gt;&lt;br&gt;
Technically, the system is built with a Next.js frontend hosted on Vercel, Supabase for authentication and database storage, and a language model accessed through the Groq API. The interesting part is the memory layer, where I used Hindsight to extract structured memory from interactions and feed it back into the model later.&lt;/p&gt;

&lt;p&gt;Instead of treating each message independently, the system stores structured learning signals like weak topics, mistakes, goals, and progress, and uses them to influence future responses.&lt;/p&gt;
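
&lt;p&gt;To make that concrete, here is a sketch of what one extracted record might look like. The field names below are illustrative, not the project's exact schema:&lt;/p&gt;

```javascript
// A hypothetical extraction record produced after one chat turn.
// Field names are illustrative; the real schema may differ.
const extraction = {
  userId: "user_123",
  weakTopics: ["recursion", "dynamic programming"],
  mistakes: [
    {
      topic: "recursion",
      type: "missing_base_case",
      description: "Forgot the base case in a factorial implementation",
    },
  ],
  goals: ["Pass the upcoming DSA exam"],
  progress: [{ topic: "arrays", status: "improving" }],
};

// Because the signals are structured, later steps can reason over
// them directly, e.g. find weak topics with recorded mistakes:
const focusTopics = extraction.weakTopics.filter((t) =>
  extraction.mistakes.some((m) => m.topic === t)
);
```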
&lt;h2&gt;
  
  
  The Problem With Stateless Chatbots
&lt;/h2&gt;

&lt;p&gt;Most AI assistants today are stateless. They respond very well within a single conversation, but they don’t build long-term understanding of a user. They don’t know what you struggled with last week, which topics you are weak in, or whether you are improving over time.&lt;/p&gt;

&lt;p&gt;For a learning system, this is a big limitation. A real teacher remembers your mistakes, tracks your progress, and adjusts what they teach you. I wanted to see what would happen if an AI assistant could do something similar.&lt;/p&gt;

&lt;p&gt;At first, I tried just storing chat history and sending it back as context, but that quickly became messy and inefficient. Raw chat logs are not really memory — they are just transcripts. What I actually needed was structured memory that the system could reason about.&lt;/p&gt;

&lt;p&gt;This is where Hindsight became useful. Instead of only reacting in the moment, the agent periodically looks back at interactions and extracts structured information that can be used later.&lt;/p&gt;

&lt;p&gt;If you’re interested in the concept, here are useful resources:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight GitHub repository&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight documentation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://vectorize.io/features/agent-memory" rel="noopener noreferrer"&gt;Agent memory page on Vectorize&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These explain why structured memory is important for agent systems.&lt;/p&gt;
&lt;h2&gt;
  
  
  Memory Extraction Instead of Raw Chat Logs
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faqgnw2fdivo1a6etsft4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faqgnw2fdivo1a6etsft4.jpg" alt="Dataflow" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After each chat or coding session, the system runs a second step that extracts structured memory from the interaction. Instead of storing the entire conversation, it stores things like weak topics, mistakes, goals, and progress updates.&lt;/p&gt;

&lt;p&gt;Conceptually, it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;extraction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;extractTurnMemory&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="nx"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;assistantReply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;insertMemoryExtraction&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;weakTopics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;extraction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;weakTopics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;mistakes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;extraction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mistakes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;goals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;extraction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;goals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;progress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;extraction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;progress&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structured memory is stored in tables like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;memory_extractions&lt;/li&gt;
&lt;li&gt;progress_events&lt;/li&gt;
&lt;li&gt;coding_mistakes&lt;/li&gt;
&lt;li&gt;chat_history&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes it much easier to query and use memory later compared to searching through long chat logs.&lt;/p&gt;
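
&lt;p&gt;As a rough sketch of what retrieval over these records can look like, here is an in-memory version of the idea. The real retrieveRelevantExtractions presumably queries Supabase; this stand-in just scores stored extractions by overlap with the current message:&lt;/p&gt;

```javascript
// In-memory sketch of memory retrieval: score stored extractions by
// keyword overlap with the current message, return the best matches.
// This stands in for a real database query; names are illustrative.
function retrieveRelevantExtractions(extractions, userMessage, limit = 3) {
  const words = new Set(userMessage.toLowerCase().split(/\W+/));
  return extractions
    .map((e) => ({
      record: e,
      score: e.weakTopics.filter((t) => words.has(t)).length,
    }))
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((s) => s.record);
}

const stored = [
  { weakTopics: ["recursion"], mistakes: ["missing base case"] },
  { weakTopics: ["sql"], mistakes: ["wrong join type"] },
];
const relevant = retrieveRelevantExtractions(stored, "Help me with recursion");
```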
&lt;h2&gt;
  
  
  Retrieval Before Every Response
&lt;/h2&gt;

&lt;p&gt;Storing memory is only half the system. The other half is retrieval.&lt;/p&gt;

&lt;p&gt;Before generating a response, the system retrieves relevant past memory and includes it in the model prompt. The flow looks roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;extractions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;retrieveRelevantExtractions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mistakes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getRecentCodingMistakes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;memoryContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;formatMemoryForPrompt&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="nx"&gt;extractions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;mistakes&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateLLMResponse&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;memoryContext&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because of this, the model can respond with awareness of the user’s history. For example, it might say something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You struggled with recursion base cases earlier. Let’s review that before trying this problem.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That behavior is not hardcoded — it emerges from feeding structured memory back into the model.&lt;/p&gt;
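
&lt;p&gt;One plausible way formatMemoryForPrompt could turn records into prompt text (a guess at the approach, not the project's actual implementation):&lt;/p&gt;

```javascript
// Sketch of turning structured memory into a prompt preamble.
// The real formatMemoryForPrompt may differ; this shows the idea:
// collapse extractions into a few plain-text lines the LLM can use.
function formatMemoryForPrompt({ extractions, mistakes }) {
  const lines = [];
  const weak = [...new Set(extractions.flatMap((e) => e.weakTopics))];
  if (weak.length > 0) {
    lines.push(`Weak topics: ${weak.join(", ")}`);
  }
  for (const m of mistakes) {
    lines.push(`Recent mistake (${m.topic}): ${m.description}`);
  }
  return lines.join("\n");
}

const memoryContext = formatMemoryForPrompt({
  extractions: [
    { weakTopics: ["recursion"] },
    { weakTopics: ["recursion", "graphs"] },
  ],
  mistakes: [{ topic: "recursion", description: "missing base case" }],
});
```

Prepending this context to the system prompt is what lets a reply like the one above emerge without any hardcoded rules.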

&lt;h2&gt;
  
  
  Coding Practice and Mistake Tracking
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4tr0g93qy477bfcej9y9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4tr0g93qy477bfcej9y9.png" alt="Code Editor" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffewf8911uh49y1k79god.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffewf8911uh49y1k79god.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
One part of the system that turned out to be very useful was coding mistake tracking. When a user submits a solution, the system records the problem, topic, mistake type, and description.&lt;/p&gt;

&lt;p&gt;Conceptually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;insertCodingMistake&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;problemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;mistakeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;off_by_one&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Loop boundary error in array traversal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Over time, the system builds a profile of common mistakes and weak topics, which are then used to generate study plans and recommendations.&lt;/p&gt;

&lt;p&gt;This turned the system from a chatbot into something closer to a learning tracker.&lt;/p&gt;
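
&lt;p&gt;The "profile of common mistakes" can be as simple as counting mistake records per topic. A minimal sketch (the helper name is mine, not from the project):&lt;/p&gt;

```javascript
// Count recorded mistakes per topic to surface weak areas.
// A real version would aggregate over the coding_mistakes table;
// this one takes an in-memory array of records.
function buildMistakeProfile(mistakes) {
  const counts = {};
  for (const m of mistakes) {
    counts[m.topic] = (counts[m.topic] ?? 0) + 1;
  }
  // Return [topic, count] pairs, most frequent mistakes first.
  return Object.entries(counts).sort((a, b) => b[1] - a[1]);
}

const profile = buildMistakeProfile([
  { topic: "recursion", mistakeType: "missing_base_case" },
  { topic: "recursion", mistakeType: "off_by_one" },
  { topic: "arrays", mistakeType: "off_by_one" },
]);
// profile[0] is now the weakest topic with its mistake count,
// ready to feed into study plan generation.
```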

&lt;h2&gt;
  
  
  System Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxb85i9roq1i83n5dd2s.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxb85i9roq1i83n5dd2s.jpg" alt="System Architecture" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The architecture is built around a memory loop rather than a simple request-response system.&lt;/p&gt;

&lt;p&gt;The flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User sends a message or submits code.&lt;/li&gt;
&lt;li&gt;API retrieves relevant past memory from the database.&lt;/li&gt;
&lt;li&gt;Message + memory are sent to the LLM.&lt;/li&gt;
&lt;li&gt;LLM generates a response.&lt;/li&gt;
&lt;li&gt;Memory extraction step runs using Hindsight.&lt;/li&gt;
&lt;li&gt;Structured memory is stored in Supabase.&lt;/li&gt;
&lt;li&gt;Dashboard and study plan are updated.&lt;/li&gt;
&lt;li&gt;Future interactions use stored memory.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This loop allows the agent to gradually learn about the user over time instead of treating every interaction as stateless.&lt;/p&gt;
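
&lt;p&gt;The whole loop can be sketched in a few lines, with every external step stubbed out. In the real system these stubs would be Supabase queries, a Groq completion call, and a Hindsight extraction step:&lt;/p&gt;

```javascript
// The memory loop with every external call replaced by an
// in-memory stub, so the shape of the loop is visible.
function handleMessage(userId, userMessage, db) {
  // Retrieve this user's stored memory (stands in for a DB query).
  const memory = db.extractions.filter((e) => e.userId === userId);
  // Stand-in for the LLM call, which would see the memory context.
  const reply = `(reply aware of ${memory.length} past extractions)`;
  // Stand-in for Hindsight extraction plus the Supabase insert.
  db.extractions.push({ userId, weakTopics: ["recursion"] });
  // The dashboard reads db.extractions; the next call sees more memory.
  return reply;
}

const db = { extractions: [] };
const first = handleMessage("u1", "explain recursion", db);
const second = handleMessage("u1", "quiz me", db);
// Each turn leaves behind memory that the next turn retrieves,
// which is the difference from a stateless request-response loop.
```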

&lt;h2&gt;
  
  
  Structured Memory Storage
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczq7xqzu21r6oppfyo4f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fczq7xqzu21r6oppfyo4f.png" alt="Supabase Schema" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead of storing raw chat logs, the system stores structured learning signals in the database. These include weak topics, coding mistakes, progress events, and goals. The main tables are memory_extractions, progress_events, coding_mistakes, and chat_history.&lt;/p&gt;

&lt;p&gt;This structure makes it easier to retrieve relevant memory later and use it to influence the model’s responses. Instead of searching through long chat histories, the system can directly retrieve things like weak topics or recent mistakes and include them in the prompt.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdg5qo9l6ksui7fdngmak.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdg5qo9l6ksui7fdngmak.png" alt=" " width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges I Ran Into
&lt;/h2&gt;

&lt;p&gt;One of the biggest challenges was deployment. Initially, I used SQLite and local file storage, which worked perfectly in development but failed in the serverless environment on Vercel because the filesystem is read-only. I had to migrate everything to Supabase PostgreSQL and rewrite several parts of the backend to remove local file usage.&lt;/p&gt;

&lt;p&gt;Another challenge was extracting structured memory reliably from LLM responses. Getting consistent JSON output and validating it properly took more time than expected. I also had to design the memory schema carefully so that it was useful for retrieval and not just a dump of data.&lt;/p&gt;
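
&lt;p&gt;The kind of defensive parsing this involves looks roughly like the following. This is a hedged sketch of the general technique, not the project's actual code; models often wrap JSON in prose, so it slices out the outermost braces before parsing and then validates field types:&lt;/p&gt;

```javascript
// Defensively parse an LLM reply that should contain a JSON object.
// Slice from the first "{" to the last "}" to drop surrounding prose,
// then validate that expected fields are arrays before using them.
function parseExtraction(raw) {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end === -1) return null;
  let data;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  // Coerce expected fields to arrays so downstream code stays safe
  // even when the model omits or mistypes a field.
  return {
    weakTopics: Array.isArray(data.weakTopics) ? data.weakTopics : [],
    mistakes: Array.isArray(data.mistakes) ? data.mistakes : [],
  };
}

const messy = 'Sure! {"weakTopics": ["recursion"]} Hope that helps.';
const parsed = parseExtraction(messy);
```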

&lt;p&gt;Authentication and multi-user data separation were also tricky at first. Every table had to include a user_id so that memory and progress were stored separately for each user.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;A few things I learned from building this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory should be structured, not raw chat logs.&lt;/li&gt;
&lt;li&gt;Retrieval is more important than storage.&lt;/li&gt;
&lt;li&gt;Tracking user mistakes is very useful for personalization.&lt;/li&gt;
&lt;li&gt;Serverless environments change how you design backend systems.&lt;/li&gt;
&lt;li&gt;Agents become much more interesting when they learn over time.&lt;/li&gt;
&lt;li&gt;A memory loop is more important than a bigger context window.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;This project started as a simple chatbot for studying, but the interesting part ended up being the memory system and the Hindsight loop.&lt;/p&gt;

&lt;p&gt;The main takeaway for me is this: the most useful agents are not the ones that respond best once, but the ones that learn and improve over time.&lt;/p&gt;

&lt;p&gt;Adding memory — especially structured memory extracted using Hindsight — turns a stateless chatbot into a system that evolves with the user. And once you see that behavior in practice, it’s very hard to go back to building stateless chatbots.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>webdev</category>
      <category>supabase</category>
    </item>
  </channel>
</rss>
