<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sameera Lala</title>
    <description>The latest articles on DEV Community by Sameera Lala (@sameera_).</description>
    <link>https://dev.to/sameera_</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3874794%2F22062f8d-6eba-411f-831b-eb6ab1e3c227.png</url>
      <title>DEV Community: Sameera Lala</title>
      <link>https://dev.to/sameera_</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sameera_"/>
    <language>en</language>
    <item>
      <title>My AI Agent Has Amnesia — And It’s Ruining My Business</title>
      <dc:creator>Sameera Lala</dc:creator>
      <pubDate>Sun, 12 Apr 2026 11:32:56 +0000</pubDate>
      <link>https://dev.to/sameera_/my-ai-agent-has-amnesia-and-its-ruining-my-business-377j</link>
      <guid>https://dev.to/sameera_/my-ai-agent-has-amnesia-and-its-ruining-my-business-377j</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xdqcei3ygvm7bumupxg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xdqcei3ygvm7bumupxg.jpeg" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66o4nuciaolmpvr5g1cp.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66o4nuciaolmpvr5g1cp.jpeg" alt=" " width="800" height="387"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3cmhnd3igxpxf0pibrj.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3cmhnd3igxpxf0pibrj.jpeg" alt=" " width="800" height="376"&gt;&lt;/a&gt;&lt;br&gt;
The first time our system recommended a pricing experiment that had already failed, I assumed something was broken.&lt;/p&gt;

&lt;p&gt;We had the data. We had the logs. We even had a post-mortem explaining exactly why the experiment didn’t work.&lt;/p&gt;

&lt;p&gt;And yet, when a similar idea came in, the AI evaluated it as if it had never seen anything like it before.&lt;/p&gt;

&lt;p&gt;That’s when it clicked.&lt;/p&gt;

&lt;p&gt;Nothing was broken.&lt;/p&gt;

&lt;p&gt;The system just couldn’t remember.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;I’ve been working on ExpTracker.AI, a system designed to help businesses track experiments — mostly around pricing, growth strategies, and product decisions — and use that history to guide future decisions.&lt;/p&gt;

&lt;p&gt;The idea was straightforward: teams shouldn’t repeat failed experiments just because the insight got buried somewhere.&lt;/p&gt;

&lt;p&gt;In practice, that happens all the time.&lt;/p&gt;

&lt;p&gt;A pricing test runs. It fails. Someone writes up a summary in Notion or Slack. Maybe there’s a meeting about it. Then everyone moves on.&lt;/p&gt;

&lt;p&gt;Six months later, someone proposes the same idea again. Not intentionally — it just sounds reasonable. The context is gone.&lt;/p&gt;

&lt;p&gt;We built ExpTracker to solve that.&lt;/p&gt;

&lt;p&gt;At least, that’s what we thought.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;The system itself wasn’t complicated.&lt;/p&gt;

&lt;p&gt;We had a way to log experiments — hypotheses, pricing changes, outcomes. We stored everything in a structured format. Then we used an LLM to evaluate new proposals.&lt;/p&gt;

&lt;p&gt;The flow looked something like this:&lt;/p&gt;

&lt;p&gt;A new experiment comes in.&lt;br&gt;
We fetch relevant past experiments.&lt;br&gt;
We pass everything into the model.&lt;br&gt;
The model gives a recommendation.&lt;/p&gt;
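&lt;p&gt;The original flow can be sketched in a few lines. This is an illustrative stand-in, not our actual implementation: the fetch function and model call are hypothetical placeholders passed in as arguments.&lt;/p&gt;

```python
# Minimal sketch of the original flow. `fetch_relevant` and `ask_model`
# are hypothetical stand-ins for our retrieval and LLM calls.

def build_prompt(proposal, past_experiments):
    """Concatenate past experiments and the new proposal into one prompt."""
    history = "\n".join(f"- {e}" for e in past_experiments)
    return (
        f"Past experiments:\n{history}\n\n"
        f"New proposal: {proposal}\n\nRecommendation:"
    )

def evaluate_proposal(proposal, fetch_relevant, ask_model):
    past = fetch_relevant(proposal)        # 1-2: new experiment comes in, fetch history
    prompt = build_prompt(proposal, past)  # 3: pass everything into the model
    return ask_model(prompt)               # 4: the model gives a recommendation
```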

&lt;p&gt;On paper, this should have worked.&lt;/p&gt;

&lt;p&gt;In reality, it didn’t.&lt;/p&gt;

&lt;p&gt;The model kept missing obvious context. It would recommend ideas that had already failed, even when the data existed in our system.&lt;/p&gt;

&lt;p&gt;At first, I thought retrieval was the issue.&lt;/p&gt;

&lt;p&gt;It wasn’t exactly that.&lt;/p&gt;

&lt;p&gt;The deeper problem was that we were treating logs as memory.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Our early data looked like this:&lt;/p&gt;

&lt;p&gt;Experiment: Q1 Pricing Test&lt;br&gt;
Old Price: 49.99&lt;br&gt;
New Price: 59.99&lt;br&gt;
Hypothesis: Users are price insensitive&lt;br&gt;
Outcome: Failure&lt;/p&gt;

&lt;p&gt;This is fine for humans reading a document.&lt;/p&gt;

&lt;p&gt;It’s terrible for a system trying to reason across past decisions.&lt;/p&gt;

&lt;p&gt;The moment a new proposal was phrased slightly differently — “premium users won’t react to a 20% increase” — the connection broke.&lt;/p&gt;

&lt;p&gt;Keyword matching failed. Even basic filtering failed.&lt;/p&gt;

&lt;p&gt;From the model’s perspective, it was seeing a brand new idea.&lt;/p&gt;
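&lt;p&gt;You can see the failure in miniature with a toy overlap metric. The failed experiment's hypothesis and the rephrased proposal share almost no tokens, so any keyword-based match scores close to zero:&lt;/p&gt;

```python
# Toy illustration of why keyword matching broke: Jaccard overlap
# between token sets of the old hypothesis and the new phrasing.

def keyword_overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

old = "Users are price insensitive"
new = "premium users won't react to a 20% increase"

# Only "users" overlaps, so the score is close to zero even though
# the two statements express the same idea.
score = keyword_overlap(old, new)
```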

&lt;p&gt;That’s when I stopped thinking in terms of storage and started thinking in terms of memory.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;What we needed was a way for the system to recall past experiments based on meaning, not wording.&lt;/p&gt;

&lt;p&gt;I started looking into different approaches and ended up using Hindsight as a memory layer. It’s essentially built for this exact problem — giving AI systems a way to store and retrieve context semantically.&lt;/p&gt;

&lt;p&gt;More importantly, it forced us to rethink how we structured data.&lt;/p&gt;

&lt;p&gt;Instead of just logging outcomes, we started capturing intent.&lt;/p&gt;

&lt;p&gt;Each experiment now includes:&lt;/p&gt;

&lt;p&gt;What was being tested&lt;br&gt;
Why we thought it would work&lt;br&gt;
How big the change was&lt;br&gt;
What actually happened&lt;br&gt;
Why it succeeded or failed&lt;/p&gt;
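&lt;p&gt;A record capturing those five fields might look like the sketch below. The field names are illustrative, not our exact schema; the point is that intent and reasoning live alongside the outcome instead of being buried in a doc somewhere.&lt;/p&gt;

```python
# Illustrative experiment record: intent is captured, not just outcome.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Experiment:
    what: str        # what was being tested
    why: str         # why we thought it would work
    magnitude: str   # how big the change was
    outcome: Optional[str] = None  # what actually happened (filled in later)
    reason: Optional[str] = None   # why it succeeded or failed
```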

&lt;p&gt;That last part — the “why” — turned out to matter more than anything else.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Here’s roughly how we store an experiment now:&lt;/p&gt;

&lt;p&gt;We don’t just say “price increased and churn went up.”&lt;/p&gt;

&lt;p&gt;We store something closer to:&lt;/p&gt;

&lt;p&gt;“Starter tier price increased by 30% based on the assumption that users were not sensitive to price changes. Within three weeks, churn increased by 18%, indicating high sensitivity in this segment.”&lt;/p&gt;

&lt;p&gt;That difference is subtle, but it completely changes how retrieval works.&lt;/p&gt;

&lt;p&gt;Now when a new proposal comes in, the system can match on intent.&lt;/p&gt;
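&lt;p&gt;One way to produce that richer form is to render the structured fields into a single narrative sentence at write time, and store that text for retrieval. The function below is a sketch with made-up parameter names, not our production code:&lt;/p&gt;

```python
# Sketch: render structured experiment fields into the narrative text
# we store for semantic matching. Parameter names are illustrative.

def to_memory_text(tier, change_pct, assumption, weeks, churn_pct):
    return (
        f"{tier} tier price increased by {change_pct}% based on the "
        f"assumption that {assumption}. Within {weeks} weeks, churn "
        f"increased by {churn_pct}%, indicating high sensitivity in "
        f"this segment."
    )
```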

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;The most important design decision we made came out of this shift:&lt;/p&gt;

&lt;p&gt;The system must look backward before it looks forward.&lt;/p&gt;

&lt;p&gt;Every time a new experiment is proposed, we run a recall step first.&lt;/p&gt;

&lt;p&gt;We take the hypothesis, turn it into a semantic query, and retrieve a small set of similar past experiments — usually the top 3 to 5.&lt;/p&gt;

&lt;p&gt;Those results get passed into the model along with the new proposal.&lt;/p&gt;

&lt;p&gt;That’s it.&lt;/p&gt;

&lt;p&gt;No fancy orchestration. No complex pipelines.&lt;/p&gt;

&lt;p&gt;Just: recall first, then reason.&lt;/p&gt;
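&lt;p&gt;The recall step itself is just embed, rank, truncate. The sketch below uses a bag-of-words placeholder for &lt;code&gt;embed()&lt;/code&gt; so it runs standalone; in the real system that's a semantic embedding model (via the memory layer), which is what lets paraphrased hypotheses land near each other.&lt;/p&gt;

```python
# Recall first, then reason: rank stored experiments by similarity to
# the new hypothesis and keep a small top-k. The embed() here is a
# bag-of-words placeholder standing in for a real embedding model.
from math import sqrt

def embed(text):
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def cosine(u, v):
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recall(hypothesis, memories, k=5):
    """Return the top-k most similar past experiments (usually 3 to 5)."""
    q = embed(hypothesis)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]
```

&lt;p&gt;Those top-k results, and nothing else, go into the prompt alongside the new proposal.&lt;/p&gt;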

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;The difference in behavior was immediate.&lt;/p&gt;

&lt;p&gt;Before, the model would respond with something like:&lt;/p&gt;

&lt;p&gt;“This pricing change seems reasonable given typical user behavior.”&lt;/p&gt;

&lt;p&gt;After adding memory, the response changed to something more like:&lt;/p&gt;

&lt;p&gt;“Similar past experiments involving large price increases resulted in significant churn within a short period. A 20% increase may carry similar risks. Consider testing a smaller increment.”&lt;/p&gt;

&lt;p&gt;Same model.&lt;/p&gt;

&lt;p&gt;Completely different answer.&lt;/p&gt;

&lt;p&gt;The only difference was context.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;One thing that surprised me was how little memory we actually needed.&lt;/p&gt;

&lt;p&gt;Initially, I assumed more data would lead to better decisions.&lt;/p&gt;

&lt;p&gt;It didn’t.&lt;/p&gt;

&lt;p&gt;When we passed too many past experiments into the model, the output got worse. It became vague and less decisive.&lt;/p&gt;

&lt;p&gt;The sweet spot turned out to be a small set of highly relevant examples.&lt;/p&gt;

&lt;p&gt;Usually five or fewer.&lt;/p&gt;

&lt;p&gt;Anything beyond that just added noise.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Another mistake we made early on was only storing experiments after they were completed.&lt;/p&gt;

&lt;p&gt;That seemed logical at first — why store incomplete data?&lt;/p&gt;

&lt;p&gt;But it turns out even proposed experiments are valuable.&lt;/p&gt;

&lt;p&gt;If two teams independently propose the same idea, that’s already useful context.&lt;/p&gt;

&lt;p&gt;So we started storing experiments as soon as they’re defined, and then updating them later with outcomes.&lt;/p&gt;

&lt;p&gt;That created a continuous loop:&lt;/p&gt;

&lt;p&gt;Store the idea&lt;br&gt;
Recall it during future decisions&lt;br&gt;
Update it with results&lt;br&gt;
Use it again&lt;/p&gt;
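&lt;p&gt;That loop implies a store that accepts a record before any results exist and lets you mutate it later. A minimal sketch (class and method names are illustrative):&lt;/p&gt;

```python
# Sketch of the store-early / update-later loop. Names are illustrative.

class ExperimentStore:
    def __init__(self):
        self._records = {}

    def propose(self, exp_id, hypothesis):
        # Store the idea as soon as it's defined, before any outcome exists.
        # Even a bare proposal is useful context for future recall.
        self._records[exp_id] = {"hypothesis": hypothesis, "outcome": None, "reason": None}

    def record_outcome(self, exp_id, outcome, reason):
        # Update the same record later with what happened and why.
        self._records[exp_id].update(outcome=outcome, reason=reason)

    def all(self):
        return list(self._records.values())
```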

&lt;p&gt;Over time, the system starts to build something that actually resembles learning.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;There’s also an interesting side effect.&lt;/p&gt;

&lt;p&gt;As the memory grows, the system becomes more opinionated.&lt;/p&gt;

&lt;p&gt;Not because the model changes, but because the context does.&lt;/p&gt;

&lt;p&gt;It starts to push back on risky ideas. It highlights patterns. It reinforces what works and flags what doesn’t.&lt;/p&gt;

&lt;p&gt;It feels less like a generic assistant and more like someone who has been in every experiment review meeting for the past year.&lt;/p&gt;

&lt;p&gt;Which, in a way, it has.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;One of the more subtle lessons from building this was that most teams don’t have a thinking problem.&lt;/p&gt;

&lt;p&gt;They have a recall problem.&lt;/p&gt;

&lt;p&gt;The insights exist. The data exists. The conclusions are often correct.&lt;/p&gt;

&lt;p&gt;But they’re not available at the moment decisions are made.&lt;/p&gt;

&lt;p&gt;And that’s what really matters.&lt;/p&gt;

&lt;p&gt;Timing, not storage.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Another takeaway is that memory quality directly affects decision quality.&lt;/p&gt;

&lt;p&gt;If your stored data is vague, your retrieval will be weak.&lt;/p&gt;

&lt;p&gt;If your retrieval is weak, your model will fall back to generic reasoning.&lt;/p&gt;

&lt;p&gt;And that’s exactly how you end up repeating mistakes.&lt;/p&gt;

&lt;p&gt;Adding more data doesn’t fix that.&lt;/p&gt;

&lt;p&gt;Adding better-structured data does.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;The last thing that became clear is that AI systems don’t improve just because models get better.&lt;/p&gt;

&lt;p&gt;You can swap in a stronger model, increase context size, tweak prompts — none of that solves the core issue if the system still doesn’t remember.&lt;/p&gt;

&lt;p&gt;What actually improves performance over time is accumulated, accessible experience.&lt;/p&gt;

&lt;p&gt;In other words, memory.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;Looking back, the biggest shift wasn’t technical.&lt;/p&gt;

&lt;p&gt;It was conceptual.&lt;/p&gt;

&lt;p&gt;We stopped treating past experiments as something to archive and started treating them as something to actively use.&lt;/p&gt;

&lt;p&gt;That changed how we designed the system.&lt;/p&gt;

&lt;p&gt;It also changed how it behaved.&lt;/p&gt;

&lt;p&gt;⸻&lt;/p&gt;

&lt;p&gt;AI doesn’t fail because it can’t think.&lt;/p&gt;

&lt;p&gt;It fails because it can’t remember.&lt;/p&gt;

&lt;p&gt;And once you fix that, everything else starts to make a lot more sense.&lt;/p&gt;

&lt;p&gt;GitHub repo: &lt;a href="https://github.com/sameeralala/Exptracker.AI" rel="noopener noreferrer"&gt;https://github.com/sameeralala/Exptracker.AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
