“Did it seriously just call me out for that… again?”
I was staring at my screen while my own app pointed out that I had repeated the same bug for the third time. That’s when it clicked—this wasn’t just another code reviewer anymore. It was starting to behave like a real mentor.
What This System Actually Does
I built CodeMentor with a simple goal: create a coding assistant that doesn’t forget you.
Most AI tools are stateless. You ask a question, get an answer, and everything resets. That works for quick help, but it completely breaks the learning loop. Real improvement comes from recognizing patterns—especially the mistakes you keep repeating.
So instead of building another chatbot, I focused on one idea: memory.
The system is a React app powered by:
- Groq (LLaMA-3.3-70B) for fast, streaming responses
- Hindsight for persistent memory
Every interaction follows:
Recall → Analyze → Retain
Before reviewing your code, it remembers your past. Then it analyzes your current solution. Finally, it stores what just happened.
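The loop above can be sketched as a single function. The names here (`recall`, `analyze`, `retain`, `reviewWithMemory`) are stand-ins for the real Hindsight and Groq calls, not the app's actual API:

```javascript
// A minimal sketch of the Recall → Analyze → Retain loop.
// The three dependencies are passed in so the shape of the loop is clear;
// in the real app they would wrap Hindsight and Groq calls.
async function reviewWithMemory({ recall, analyze, retain }, code) {
  const memories = await recall(code);            // 1. pull relevant history
  const feedback = await analyze(code, memories); // 2. review with that context
  await retain(feedback);                         // 3. store what just happened
  return feedback;
}
```

Keeping the loop this small is the point: everything else in the app is just a different choice of `analyze` prompt.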
The Core Idea: Memory Changes the Feedback
The biggest shift is how feedback is generated.
Instead of reviewing code in isolation, the system first asks:
“What has this person struggled with before?”
It retrieves past experiences like:
- Off-by-one errors
- Missing edge cases
- Inefficient loops
That history is injected into the model before analysis.
Here’s the core idea in code:
```javascript
// `hs` is the Hindsight client; `bankId` identifies this student's memory bank.
const q = `Student coding in ${lang}${problem ? ". Problem: " + problem : ""}.`;
const r = await hs.recall(bankId, q);
// The response shape can vary, so fall back gracefully.
const memories = r.results || r.memories || [];
```
This pulls relevant past mistakes before every review.
Now instead of:
“Handle edge cases.”
You get:
“You’ve missed edge cases before—this looks like the same pattern.”
That difference makes feedback feel personal—and much harder to ignore.
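One way the recalled history might be folded into the reviewer's system prompt. The function and wording here are my own illustration, not the app's exact prompt:

```javascript
// Illustrative sketch: inject recalled mistakes into the system prompt
// so the model can connect the current code to past patterns.
function buildSystemPrompt(memories, lang) {
  const history = memories.length
    ? `Known past mistakes:\n${memories.map((m) => `- ${m}`).join("\n")}\n` +
      `If the new code repeats one of these, say so explicitly.`
    : "No prior history for this student yet.";
  return `You are a coding mentor reviewing ${lang} code.\n${history}`;
}
```

The empty-history branch matters: without it, a first-time user gets a prompt that talks about mistakes that do not exist.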
Turning Code Reviews Into a Learning Loop
The real value comes after the response.
Every review generates a short summary:
- What you attempted
- What went wrong
- A quick explanation
That gets stored as memory:
```javascript
// Persist a compact summary so the next recall can surface it.
await hs.retain(
  bankId,
  `Mistakes: ${mistakes.join(", ")}. Summary: ${summary}`
);
```
This turns every interaction into training data for the next one.
After a few sessions, the system starts to:
- Recognize recurring mistakes automatically
- Call out patterns without prompting
- Adjust feedback based on your history
At that point, it stops feeling like a tool and starts feeling like a mentor.
Letting Weaknesses Drive Practice
Once memory worked, the next step was obvious:
Stop giving random problems. Start giving targeted ones.
The Challenge system uses your past mistakes to generate personalized exercises.
If you:
- Miss edge cases → you get edge-case-heavy problems
- Write inefficient code → you get optimization tasks
- Struggle with logic → you get focused drills
Instead of practicing everything, you practice what you actually need.
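A minimal sketch of how targeting might work, assuming mistakes are stored as short labels (the function name and tie-breaking rule are my own choices):

```javascript
// Illustrative sketch: tally mistake labels from stored memories and
// pick the most frequent one as the next challenge's focus.
function topWeakness(mistakes) {
  const counts = {};
  for (const m of mistakes) counts[m] = (counts[m] || 0) + 1;
  let best = null;
  for (const [mistake, count] of Object.entries(counts)) {
    if (!best || count > best.count) best = { mistake, count };
  }
  return best; // e.g. { mistake: "missing edge cases", count: 2 }
}
```

This is why structured, short mistake labels pay off later: raw conversation logs would be much harder to count.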
Building a Learning Path From History
The same idea extends to long-term learning.
Instead of generic roadmaps, the app generates a 6-week plan based on your history.
It looks at:
- Mistakes you repeat
- Weak concepts
- Progress over time
Then builds weekly plans with:
- Topics
- Focus areas
- Goals
- Resources
It’s not perfect—but it’s far more relevant than static guides.
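In the app the plan itself comes from the LLM, but the structured input a planner might work from can be sketched as follows (field names and the rotation scheme are my own illustration):

```javascript
// Illustrative sketch: turn ranked weaknesses into a 6-week outline
// by rotating through them, weakest-first.
function outlinePlan(rankedWeaknesses, weeks = 6) {
  return Array.from({ length: weeks }, (_, i) => {
    const focus = rankedWeaknesses[i % rankedWeaknesses.length];
    return {
      week: i + 1,
      focus,
      goal: `Solve 3 problems targeting: ${focus}`,
    };
  });
}
```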
The Memory Browser (Unexpectedly Useful)
One feature I didn’t expect to matter much was the Memory Browser.
It lets you:
- Search past mistakes
- Reflect on your weaknesses
- Manually store insights
The reflection mode is especially interesting.
You can ask:
“What are my biggest weaknesses?”
And it synthesizes an answer from your history.
It feels less like an AI tool—and more like a learning journal.
What I Learned Building This
A few things became clear:
Memory should be structured, not raw logs
Short summaries work better than long conversations.
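For example, a structured entry might look like this (the field names are my own illustration, not Hindsight's schema):

```javascript
// Illustrative sketch: a structured memory entry instead of a raw chat log.
function makeMemoryEntry({ mistakes, summary, lang }) {
  return {
    kind: "code-review",
    language: lang,
    mistakes,                     // short labels, easy to count later
    summary,                      // one or two sentences, not the transcript
    at: new Date().toISOString(), // timestamp enables decay/weighting later
  };
}
```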
Prompt design matters more than storage
If memory isn’t injected properly, it’s useless.
You don’t need a complex stack
This is basically:
- React frontend
- Memory API
- LLM calls
Personalization happens fast
After a few interactions, the system noticeably adapts.
Limitations (And What I’d Improve)
It’s not perfect.
- Memory retrieval isn’t always precise
- Some patterns get overemphasized
- No prioritization of important mistakes
- API keys are exposed client-side
If I keep working on it, I'd:
- Add memory weighting
- Introduce decay for old patterns
- Move everything to a backend
- Improve retrieval relevance
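Decay could be as simple as an exponential weight on each memory's age, assuming entries carry a timestamp. The half-life value here is an arbitrary knob, not something from the app:

```javascript
// Illustrative sketch: exponential decay so old mistakes count less.
// halfLifeDays is a tuning knob; 30 days is an arbitrary example value.
function decayWeight(entryDateMs, nowMs, halfLifeDays = 30) {
  const ageDays = (nowMs - entryDateMs) / (1000 * 60 * 60 * 24);
  return Math.pow(0.5, ageDays / halfLifeDays);
}
```

Multiplying each memory's retrieval score by this weight would let recurring recent mistakes dominate over ones you fixed months ago.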
Final Thought
Most AI tools today are stateless.
They respond—but they don’t learn.
Adding memory changes that.
Once your system can:
- Remember
- Recognize patterns
- Adapt
…it stops being a chatbot.
It becomes a mentor.
And once you experience that, going back feels incomplete.