<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amudhan M</title>
    <description>The latest articles on DEV Community by Amudhan M (@amudhan_1603).</description>
    <link>https://dev.to/amudhan_1603</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3877023%2F2b1a35c2-db53-4805-b4ad-9fdd4c968cae.jpg</url>
      <title>DEV Community: Amudhan M</title>
      <link>https://dev.to/amudhan_1603</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amudhan_1603"/>
    <language>en</language>
    <item>
      <title>How Hindsight caught a vendor bug logs missed</title>
      <dc:creator>Amudhan M</dc:creator>
      <pubDate>Mon, 13 Apr 2026 16:30:32 +0000</pubDate>
      <link>https://dev.to/amudhan_1603/how-hindsight-caught-a-vendor-bug-logs-missed-10ne</link>
      <guid>https://dev.to/amudhan_1603/how-hindsight-caught-a-vendor-bug-logs-missed-10ne</guid>
      <description>&lt;h1&gt;
  
  
  I built Kairo, then realized memory was the hard part
&lt;/h1&gt;

&lt;p&gt;The first version of Kairo worked on day one.&lt;/p&gt;

&lt;p&gt;It could send messages, call tools, search the web, even control Spotify. From the outside, it looked like a complete system.&lt;/p&gt;

&lt;p&gt;By day three, it was already breaking in ways that were hard to explain.&lt;/p&gt;

&lt;p&gt;Not crashing. Not throwing errors. Just... getting worse over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Kairo actually is
&lt;/h2&gt;

&lt;p&gt;Kairo is a Telegram-based agent that connects to real tools.&lt;/p&gt;

&lt;p&gt;It's not just chat. It can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read and send emails&lt;/li&gt;
&lt;li&gt;Control Spotify&lt;/li&gt;
&lt;li&gt;Query Notion&lt;/li&gt;
&lt;li&gt;Search the web&lt;/li&gt;
&lt;li&gt;Set reminders&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of that is wired through tool modules, so the agent can call functions instead of just generating text.&lt;/p&gt;

&lt;p&gt;The structure is pretty clean:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;src/
├── index.ts
├── conversation/
├── gmail/
├── spotify/
├── notion/
├── productivity/
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each module exposes capabilities. The agent decides what to call.&lt;/p&gt;
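&lt;p&gt;To make "each module exposes capabilities" concrete, here is a minimal sketch of a tool registry. It is not Kairo's actual code: the Tool type, the register and callTool helpers, and the stand-in tool bodies are all hypothetical, and real tools would most likely be asynchronous.&lt;/p&gt;

```typescript
// Hypothetical shape of a tool; Kairo's real module interfaces are not shown in the post.
type Tool = {
  name: string;        // e.g. "gmail.send"
  description: string; // what the agent reads when choosing a tool
  run: (args: { [key: string]: string }) => string;
};

// Registry keyed by tool name. Each module (gmail/, spotify/, ...) would register its tools here.
// Object.create(null) avoids accidental hits on inherited keys such as "toString".
const registry: { [name: string]: Tool } = Object.create(null);

function register(tool: Tool): void {
  registry[tool.name] = tool;
}

// Dispatch a call chosen by the agent. Unknown names throw instead of failing silently.
function callTool(name: string, args: { [key: string]: string }): string {
  const tool = registry[name];
  if (tool === undefined) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return tool.run(args);
}

// Stand-in implementations so the sketch runs without any real API.
register({
  name: "gmail.send",
  description: "Send an email",
  run: (args) => `sent to ${args.to}`,
});
register({
  name: "spotify.play",
  description: "Play a track",
  run: (args) => `playing ${args.track}`,
});

console.log(callTool("gmail.send", { to: "team@example.com" })); // prints "sent to team@example.com"
```

&lt;p&gt;Throwing on an unknown tool name is a deliberate choice in this sketch: it turns one class of quiet failure into a loud one.&lt;/p&gt;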

&lt;p&gt;On paper, it's straightforward.&lt;/p&gt;

&lt;p&gt;In practice, everything depends on how you handle context.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem I didn't expect
&lt;/h2&gt;

&lt;p&gt;The agent didn't fail loudly.&lt;/p&gt;

&lt;p&gt;It failed subtly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It forgot what the user asked 5 minutes ago&lt;/li&gt;
&lt;li&gt;It repeated actions&lt;/li&gt;
&lt;li&gt;It called the wrong tool&lt;/li&gt;
&lt;li&gt;It lost track of conversations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nothing "broke." It just stopped being reliable.&lt;/p&gt;

&lt;p&gt;At first, I thought this was a prompting issue.&lt;/p&gt;

&lt;p&gt;It wasn't.&lt;/p&gt;

&lt;p&gt;The real issue was simple:&lt;/p&gt;

&lt;p&gt;The system had no real memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why transcripts don't scale
&lt;/h2&gt;

&lt;p&gt;Kairo stores conversation history with summarization.&lt;/p&gt;

&lt;p&gt;That sounds fine until you run it long enough.&lt;/p&gt;

&lt;p&gt;Summarization introduces drift.&lt;/p&gt;

&lt;p&gt;After a while:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Important details disappear&lt;/li&gt;
&lt;li&gt;Context gets distorted&lt;/li&gt;
&lt;li&gt;The agent starts guessing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You end up with something that looks like memory, but behaves like compression.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bringing in Hindsight
&lt;/h2&gt;

&lt;p&gt;I needed a way to give my agent memory that survives beyond a single conversation.&lt;/p&gt;

&lt;p&gt;I used Hindsight to store structured events instead of raw chat.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I changed
&lt;/h2&gt;

&lt;p&gt;Instead of storing messages, I started storing events.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;intent: send_email
tool_used: gmail.send
result: success
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now the system remembers behavior, not just text.&lt;/p&gt;
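&lt;p&gt;As a rough illustration, an event like the one above could be typed and recorded as follows. This is a sketch under assumptions: the MemoryEvent type, the recordEvent helper, and the timestamp field are hypothetical, and the plain array is an in-memory stand-in, not Hindsight's API.&lt;/p&gt;

```typescript
// Hypothetical event record mirroring the fields in the example above.
type MemoryEvent = {
  intent: string;                  // what the user wanted, e.g. "send_email"
  toolUsed: string;                // which tool handled it, e.g. "gmail.send"
  result: "success" | "failure";
  timestamp: number;               // epoch milliseconds (added for this sketch)
};

// In-memory stand-in for the memory store; this is NOT Hindsight's API.
const events: MemoryEvent[] = [];

function recordEvent(
  intent: string,
  toolUsed: string,
  result: "success" | "failure",
  timestamp: number
): MemoryEvent {
  const event: MemoryEvent = { intent, toolUsed, result, timestamp };
  events.push(event);
  return event;
}

recordEvent("send_email", "gmail.send", "success", Date.UTC(2026, 3, 12, 9, 0));
console.log(events.length); // prints 1
```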

&lt;h2&gt;
  
  
  How this changed Kairo
&lt;/h2&gt;

&lt;p&gt;When a new request comes in:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Process request&lt;/li&gt;
&lt;li&gt;Search memory for similar actions&lt;/li&gt;
&lt;li&gt;Adapt behavior&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This made responses more consistent.&lt;/p&gt;
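&lt;p&gt;The three steps above can be sketched roughly like this. Everything here is an assumption for illustration: the PastAction type, the findSimilar and chooseTool helpers, and the sample history are hypothetical, and "similar" is reduced to an exact intent match where a real memory layer would likely rank by similarity.&lt;/p&gt;

```typescript
type PastAction = {
  intent: string;
  toolUsed: string;
  result: "success" | "failure";
};

// Hypothetical stored history, oldest first; in Kairo this would come from the memory layer.
const history: PastAction[] = [
  { intent: "send_email", toolUsed: "gmail.send", result: "success" },
  { intent: "play_music", toolUsed: "spotify.play", result: "failure" },
  { intent: "play_music", toolUsed: "spotify.play", result: "success" },
];

// Step 2: search memory. Exact intent match only, as a simplification.
function findSimilar(intent: string): PastAction[] {
  return history.filter((e) => e.intent === intent);
}

// Step 3: adapt. Prefer the tool that most recently succeeded for this intent,
// otherwise fall back to whatever the caller passes in.
function chooseTool(intent: string, fallback: string): string {
  const successes = findSimilar(intent).filter((e) => e.result === "success");
  if (successes.length === 0) {
    return fallback;
  }
  return successes[successes.length - 1].toolUsed;
}

console.log(chooseTool("send_email", "ask_user"));  // prints "gmail.send"
console.log(chooseTool("book_flight", "ask_user")); // prints "ask_user"
```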

&lt;h2&gt;
  
  
  A concrete example
&lt;/h2&gt;

&lt;p&gt;Before:&lt;/p&gt;

&lt;p&gt;User: "Send the same update I sent yesterday"&lt;/p&gt;

&lt;p&gt;Agent: doesn't remember, asks again.&lt;/p&gt;

&lt;p&gt;After:&lt;/p&gt;

&lt;p&gt;Agent: finds the stored event for yesterday's email and reuses it.&lt;/p&gt;
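&lt;p&gt;One way the "after" case might work is a lookup over stored events limited to the previous day. This is a sketch, not the author's implementation: the SentEmail type and findYesterdaysEmail helper are hypothetical, "yesterday" is computed in UTC for simplicity, and it assumes the event also stores the recipient and body, which the earlier three-field example does not show.&lt;/p&gt;

```typescript
// Hypothetical event that also keeps the payload needed to resend the message.
type SentEmail = {
  intent: "send_email";
  to: string;
  body: string;
  timestamp: number; // epoch milliseconds
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Return the most recent email sent during the UTC calendar day before `now`,
// or undefined when nothing was sent that day.
function findYesterdaysEmail(log: SentEmail[], now: number): SentEmail | undefined {
  const startOfToday = Math.floor(now / DAY_MS) * DAY_MS;
  const startOfYesterday = startOfToday - DAY_MS;
  const matches = log
    .filter((e) => e.timestamp >= startOfYesterday)
    .filter((e) => startOfToday > e.timestamp);
  // Newest first, then take the head of the list.
  return matches.sort((a, b) => b.timestamp - a.timestamp)[0];
}

const now = Date.UTC(2026, 3, 13, 16, 0);
const log: SentEmail[] = [
  { intent: "send_email", to: "team@example.com", body: "Weekly update", timestamp: Date.UTC(2026, 3, 12, 9, 0) },
  { intent: "send_email", to: "boss@example.com", body: "Older note", timestamp: Date.UTC(2026, 3, 10, 9, 0) },
];

const previous = findYesterdaysEmail(log, now);
console.log(previous ? previous.body : "not found, ask the user"); // prints "Weekly update"
```

&lt;p&gt;When the lookup returns nothing, the agent falls back to asking the user, which matches the "before" behavior.&lt;/p&gt;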

&lt;h2&gt;
  
  
  Lessons learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;More context doesn't fix bad memory&lt;/li&gt;
&lt;li&gt;Summarization is lossy&lt;/li&gt;
&lt;li&gt;Tools increase the need for memory&lt;/li&gt;
&lt;li&gt;Consistency is harder than intelligence&lt;/li&gt;
&lt;li&gt;Memory &amp;gt; prompt engineering&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;The hard part wasn't building the agent.&lt;/p&gt;

&lt;p&gt;It was making it remember.&lt;/p&gt;

&lt;p&gt;Once I fixed that, everything else became simpler.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>microsoft</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
