<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ranjith Sagar</title>
    <description>The latest articles on DEV Community by Ranjith Sagar (@ranjith_sagar_eebde47865f).</description>
    <link>https://dev.to/ranjith_sagar_eebde47865f</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3878158%2Ffbe1f944-07a4-4b56-98bf-cc064dd0447e.jpeg</url>
      <title>DEV Community: Ranjith Sagar</title>
      <link>https://dev.to/ranjith_sagar_eebde47865f</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ranjith_sagar_eebde47865f"/>
    <language>en</language>
    <item>
      <title>Support sage agent</title>
      <dc:creator>Ranjith Sagar</dc:creator>
      <pubDate>Tue, 14 Apr 2026 08:23:21 +0000</pubDate>
      <link>https://dev.to/ranjith_sagar_eebde47865f/support-sage-agent-1pb4</link>
      <guid>https://dev.to/ranjith_sagar_eebde47865f/support-sage-agent-1pb4</guid>
      <description>&lt;h1&gt;Meet Support Sage: The AI Agent That Remembers Every Ticket Your Team Ever Closed&lt;/h1&gt;

&lt;p&gt;It's 2:47 PM on a Tuesday. A new support engineer — three weeks into the job — pings the team Slack channel: &lt;em&gt;"Hey, has anyone seen this Zapier integration error before? Customer says it stopped syncing overnight."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Two people react with the 👀 emoji. One senior engineer thinks, &lt;em&gt;"We fixed this exact thing in January."&lt;/em&gt; But nobody can find the ticket. It was closed, marked resolved, and its resolution lived exclusively in the memory of someone who left the company in March. The new hire spends two hours triaging a problem that had already been solved. The customer waits. The wheel reinvents itself.&lt;/p&gt;

&lt;p&gt;This is the support knowledge problem — and it's not a tooling problem, it's a memory problem. Ticketing systems are graveyards of institutional knowledge. Wikis go stale. Runbooks get skipped. And every time someone experienced leaves, they take a piece of your resolution history with them.&lt;/p&gt;

&lt;p&gt;That's what I built Support Sage to solve.&lt;/p&gt;




&lt;h2&gt;What Support Sage Does&lt;/h2&gt;

&lt;p&gt;Support Sage is an AI support agent that remembers. When a new ticket arrives, it doesn't just pattern-match against static documentation — it recalls actual resolved tickets your team has closed, identifies which past resolutions are most relevant, and generates two things: a polished, ready-to-send customer reply and a concise set of internal resolution steps for whoever picks up the ticket.&lt;/p&gt;

&lt;p&gt;The "sage" name is intentional. A sage is a wise advisor — but it's also an herb. And like the herb, Sage grows with use. Every ticket your team resolves and stores makes the next response sharper.&lt;/p&gt;

&lt;h3&gt;Architecture&lt;/h3&gt;

&lt;p&gt;The system has four moving parts:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Vanilla HTML Frontend]
        |
        v
[FastAPI Backend]  ──── retain endpoint ────&amp;gt;  [Hindsight Cloud]
        |                                              |
        v                                              |
[Groq API (llama3-70b)]  &amp;lt;── recalled context ────────┘
        |
        v
[Customer Reply + Internal Steps]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a support engineer closes a ticket, the frontend POSTs the resolution to &lt;code&gt;/retain&lt;/code&gt;. That resolution is embedded and stored in &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt;, an open-source agent memory layer from Vectorize. When a new ticket arrives at &lt;code&gt;/resolve&lt;/code&gt;, Sage queries Hindsight for the most semantically similar past resolutions, injects them into a prompt, and sends the enriched prompt to Llama 3 70B (&lt;code&gt;llama3-70b-8192&lt;/code&gt;) running on Groq for generation.&lt;/p&gt;

&lt;p&gt;The entire backend is about 180 lines of Python. The intelligence isn't in the code — it's in the memory layer.&lt;/p&gt;




&lt;h2&gt;The Core Pattern: Retain and Recall&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;Agent memory&lt;/a&gt; is what separates a stateless LLM call from an agent that compounds value over time. Hindsight gives you that memory as a managed service, with a clean API that abstracts away vector storage, embedding, and retrieval. Here's how the retain/recall pattern looks in practice.&lt;/p&gt;

&lt;h3&gt;Retaining a Resolved Ticket&lt;/h3&gt;

&lt;p&gt;When a ticket is closed, the support engineer submits the issue description, root cause, and resolution steps. The &lt;code&gt;/retain&lt;/code&gt; endpoint packages this and stores it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/retain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retain_ticket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResolvedTicket&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Store a resolved ticket in Hindsight memory.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;memory_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
ISSUE: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;issue_description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
ROOT CAUSE: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;root_cause&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
RESOLUTION: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resolution_steps&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
TAGS: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pipeline_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HINDSIGHT_PIPELINE_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;memory_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ticket_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ticket_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resolved_at&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resolved_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HINDSIGHT_BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/retain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HINDSIGHT_API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stored&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ticket_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ticket_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
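&lt;p&gt;The endpoint above relies on a &lt;code&gt;ResolvedTicket&lt;/code&gt; model that isn't shown. Here is a minimal sketch of what it might look like, together with the text formatting pulled out into a function. The field types are assumptions, a plain dataclass stands in for the Pydantic model a FastAPI app would normally declare, &lt;code&gt;build_memory_text&lt;/code&gt; is a hypothetical helper name, and the sample ticket values are made up for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class ResolvedTicket:
    """Hypothetical stand-in for the request model used by /retain."""
    ticket_id: str
    issue_description: str
    root_cause: str
    resolution_steps: str
    resolved_at: str  # assumed to be an ISO 8601 string so it serializes to JSON as-is
    tags: list = field(default_factory=list)


def build_memory_text(ticket: ResolvedTicket) -> str:
    """Render the labeled text block that is stored as the memory."""
    return (
        f"ISSUE: {ticket.issue_description}\n"
        f"ROOT CAUSE: {ticket.root_cause}\n"
        f"RESOLUTION: {ticket.resolution_steps}\n"
        f"TAGS: {', '.join(ticket.tags)}"
    )


# Illustrative ticket; the ID and timestamp are invented for this example.
example = ResolvedTicket(
    ticket_id="T-1042",
    issue_description="Zapier integration stopped syncing overnight",
    root_cause="OAuth token expired after 90 days",
    resolution_steps="Settings, Integrations, Zapier, Reconnect",
    resolved_at="2026-01-15T10:30:00Z",
    tags=["zapier", "oauth"],
)
print(build_memory_text(example))
```

&lt;p&gt;Keeping the formatting in its own function makes the stored text easy to unit-test without touching the network.&lt;/p&gt;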



&lt;h3&gt;Recalling Relevant Context&lt;/h3&gt;

&lt;p&gt;When a new ticket arrives, Sage queries for the three most semantically similar past resolutions (the default &lt;code&gt;top_k&lt;/code&gt;) before calling the LLM:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;recall_similar_tickets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Retrieve the most relevant resolved tickets from Hindsight.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pipeline_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HINDSIGHT_PIPELINE_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;issue_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HINDSIGHT_BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/recall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HINDSIGHT_API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
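&lt;p&gt;The list comprehension at the end of &lt;code&gt;recall_similar_tickets&lt;/code&gt; raises a &lt;code&gt;KeyError&lt;/code&gt; if any result lacks a &lt;code&gt;content&lt;/code&gt; field. A more defensive version could look like the sketch below. It assumes the response shape used above (a &lt;code&gt;results&lt;/code&gt; list of objects with a &lt;code&gt;content&lt;/code&gt; string); &lt;code&gt;extract_contents&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;score&lt;/code&gt; key in the sample is invented.&lt;/p&gt;

```python
def extract_contents(response_body: dict, top_k: int = 3) -> list:
    """Pull memory text out of a recall response, skipping malformed entries.

    Assumes the shape used above: {"results": [{"content": "..."}]}.
    """
    results = response_body.get("results") or []
    contents = []
    for item in results:
        content = item.get("content") if isinstance(item, dict) else None
        # Keep only non-empty strings so the prompt never receives blanks.
        if isinstance(content, str) and content.strip():
            contents.append(content.strip())
    return contents[:top_k]


sample = {
    "results": [
        {"content": "ISSUE: Zapier sync stopped\nROOT CAUSE: OAuth token expired"},
        {"score": 0.41},     # hypothetical entry with no content field
        {"content": "   "},  # blank content
    ]
}
print(extract_contents(sample))
```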



&lt;h3&gt;Generating the Response&lt;/h3&gt;

&lt;p&gt;The recalled context is injected into the Groq prompt as institutional memory:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/resolve&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;resolve_ticket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;NewTicket&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;recalled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;recall_similar_tickets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;issue_description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;memory_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;recalled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;recalled&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No similar tickets found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are Support Sage, an expert support agent with access to
your team&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s historical resolution database.

PAST RESOLVED TICKETS (most relevant):
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;memory_context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

NEW TICKET:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;issue_description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Generate:
1. CUSTOMER REPLY: A professional, empathetic message to send directly to the customer.
2. INTERNAL STEPS: Concise resolution steps for the engineer picking this up.

Base your response on the historical patterns above where applicable.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;groq_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3-70b-8192&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;parse_sage_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
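&lt;p&gt;The endpoint hands the raw completion to &lt;code&gt;parse_sage_response&lt;/code&gt;, which isn't shown. One way it could work is sketched below. This is a hypothetical implementation, not the original: it assumes the model echoes the &lt;code&gt;CUSTOMER REPLY&lt;/code&gt; and &lt;code&gt;INTERNAL STEPS&lt;/code&gt; labels from the prompt, and the returned key names are my own choice.&lt;/p&gt;

```python
import re

REPLY_LABEL = re.compile(r"CUSTOMER REPLY", re.IGNORECASE)
STEPS_LABEL = re.compile(r"INTERNAL STEPS", re.IGNORECASE)
# Matches a trailing line that only holds the list number or markdown
# markers belonging to the next label, e.g. "\n\n2. " or "\n\n**".
TRAILING_MARKER = re.compile(r"\n[\s*#]*(\d+\.)?[\s*#]*$")


def parse_sage_response(text: str) -> dict:
    """Split model output into the two sections the prompt asks for."""
    reply_match = REPLY_LABEL.search(text)
    steps_match = STEPS_LABEL.search(text, reply_match.end()) if reply_match else None
    if steps_match is None:
        # Labels missing or out of order: return everything as the reply.
        return {"customer_reply": text.strip(), "internal_steps": ""}

    reply_raw = text[reply_match.end():steps_match.start()]
    steps_raw = text[steps_match.end():]
    reply = TRAILING_MARKER.sub("", reply_raw).lstrip(" :*\n").strip()
    steps = steps_raw.lstrip(" :*\n").strip()
    return {"customer_reply": reply, "internal_steps": steps}


sample = (
    "1. CUSTOMER REPLY: Thanks for reaching out!\n\n"
    "2. INTERNAL STEPS:\n1. Check token age\n2. Reconnect"
)
print(parse_sage_response(sample))
```

&lt;p&gt;The fallback branch matters in practice: when the model ignores the requested format, the engineer still sees the full output rather than an error.&lt;/p&gt;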



&lt;p&gt;The full &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight docs&lt;/a&gt; cover additional configuration options — including pipeline setup, embedding model selection, and metadata filtering for multi-tenant deployments.&lt;/p&gt;




&lt;h2&gt;Before and After: The OAuth Token Moment&lt;/h2&gt;

&lt;p&gt;This is where the rubber meets the road. Here's what Sage returns on a fresh deployment with zero stored tickets:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New Ticket:&lt;/strong&gt; &lt;em&gt;"Our Zapier integration stopped syncing data overnight. No error messages in Zapier, but records aren't flowing through."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sage (no memory):&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"It sounds like there may be a connectivity issue with your Zapier integration. I'd recommend trying the following: 1) Check that your API credentials are still valid, 2) Try disconnecting and reconnecting the integration, 3) Verify that your Zapier zap is still enabled. If the issue persists, please contact Zapier support."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Generic. Useless. The kind of reply that makes customers feel unseen.&lt;/p&gt;

&lt;p&gt;Now here's Sage after five resolved tickets have been retained — including one from January where the root cause was: &lt;em&gt;"Zapier OAuth tokens expire after 90 days; customer had connected the integration in October. Resolution: navigate to Settings &amp;gt; Integrations &amp;gt; Zapier &amp;gt; Reconnect, then re-authenticate via OAuth. Token silently expires without error notification."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sage (with memory):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customer Reply:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Thanks for reaching out! Based on the symptoms you're describing — especially the silent failure with no error messages — this looks like an OAuth token expiry issue. Our Zapier integration tokens expire after 90 days, and unfortunately Zapier doesn't surface an error when this happens, so it can appear to stop working without warning. To fix this: navigate to &lt;strong&gt;Settings → Integrations → Zapier → Reconnect&lt;/strong&gt; and complete the OAuth re-authentication flow. Your zaps should resume syncing within a few minutes. Let us know if you run into anything!"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Internal Steps:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;Confirm token age: check integration connected_at timestamp in admin panel&lt;/li&gt;
&lt;li&gt;Direct customer to Settings → Integrations → Zapier → Reconnect&lt;/li&gt;
&lt;li&gt;After reconnect, verify sync in Zapier task history (allow 5 min)&lt;/li&gt;
&lt;li&gt;If token was &amp;lt; 90 days old, escalate — may indicate a revocation issue&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;p&gt;The difference is not prompt engineering. It's memory. Sage recalled the exact issue, the exact mechanism (90-day expiry), and the exact path to resolution. In seconds. Without paging anyone.&lt;/p&gt;
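&lt;p&gt;The first and last internal steps boil down to a date comparison against the 90-day window. A minimal sketch, assuming &lt;code&gt;connected_at&lt;/code&gt; is available as a timezone-aware datetime; &lt;code&gt;token_expired&lt;/code&gt; is a hypothetical helper and the dates are illustrative only.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(days=90)  # expiry window quoted in the retained ticket


def token_expired(connected_at: datetime, now: datetime) -> bool:
    """Return True when the integration was connected at least 90 days ago."""
    return now - connected_at >= TOKEN_LIFETIME


# Illustrative dates: connected in October, checked in mid-January.
connected = datetime(2025, 10, 1, tzinfo=timezone.utc)
checked = datetime(2026, 1, 15, tzinfo=timezone.utc)
print(token_expired(connected, checked))
```

&lt;p&gt;If this returns &lt;code&gt;False&lt;/code&gt; for a customer with the same symptoms, that is the escalation case from step 4.&lt;/p&gt;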




&lt;h2&gt;Lessons Learned&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. The memory layer earns its keep immediately.&lt;/strong&gt; I initially considered rolling my own retrieval with a local vector store, but Hindsight's managed pipeline (chunking, embedding, and similarity search) saved several days of infrastructure work. For a support agent specifically, where retrieval precision matters more than latency, having a dedicated memory service rather than bolting on vector search as an afterthought made the architecture cleaner and the results noticeably better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Resolution quality at write-time determines everything at read-time.&lt;/strong&gt; The biggest performance variable isn't the LLM — it's how well engineers document resolutions when they close tickets. A resolution that says "fixed the config issue" gives Sage nothing to work with. One that says "updated the SMTP relay port from 465 to 587 after confirming TLS was required by the customer's mail provider" gives it everything. I added a structured resolution form — issue description, root cause, exact steps, tags — specifically to enforce quality at the input stage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Groq's speed changes the UX equation.&lt;/strong&gt; Running llama3-70b through Groq means end-to-end response time (recall + generation) typically lands under two seconds. This matters because the frontend isn't a background job — it's a synchronous assistant that a support engineer is actively waiting on. Slow generation would break the flow. Fast generation makes it feel like asking a colleague.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Semantic recall surfaces non-obvious connections.&lt;/strong&gt; One thing that surprised me: Hindsight occasionally surfaces tickets that share no obvious keywords with the new one but are semantically close. A ticket about "dashboard not loading after password change" recalled a previous resolution about "session tokens invalidated after SSO configuration update." Different symptoms, same root mechanism. A keyword search would have missed it entirely.&lt;/p&gt;
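&lt;p&gt;The structured form from lesson 2 can be backed by a small check that runs before anything is retained. The sketch below is one possible shape for that check, not the form's actual logic: the phrase list, the word threshold, and the function name &lt;code&gt;resolution_problems&lt;/code&gt; are all assumptions chosen for illustration.&lt;/p&gt;

```python
# Hypothetical stock phrases and threshold; tune both for your own team.
VAGUE_PHRASES = ("fixed the issue", "fixed the config issue", "resolved", "works now")
MIN_WORDS = 8


def resolution_problems(root_cause: str, resolution_steps: str, tags: list) -> list:
    """Return human-readable problems; an empty list means the note passes."""
    problems = []
    if not root_cause.strip():
        problems.append("root cause is empty")
    steps = resolution_steps.strip()
    if MIN_WORDS > len(steps.split()):
        problems.append(f"resolution has fewer than {MIN_WORDS} words")
    if steps.lower().rstrip(".") in VAGUE_PHRASES:
        problems.append("resolution is a stock phrase with no specifics")
    if not tags:
        problems.append("no tags supplied")
    return problems


print(resolution_problems("", "fixed the config issue", []))
print(resolution_problems(
    "Customer's mail provider requires TLS",
    "updated the SMTP relay port from 465 to 587 after confirming TLS "
    "was required by the customer's mail provider",
    ["smtp"],
))
```

&lt;p&gt;Returning a list of problems instead of a boolean lets the form tell the engineer exactly what to add before the ticket can be stored.&lt;/p&gt;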




&lt;h2&gt;The Honest Limitation&lt;/h2&gt;

&lt;p&gt;Sage is only as wise as the resolutions your team writes.&lt;/p&gt;

&lt;p&gt;If your engineers close tickets with one-line notes, Sage will generate one-line advice. If your historical resolutions are vague, incomplete, or flat-out wrong, Sage will confidently reproduce that vagueness with perfect grammar. The model doesn't know what it doesn't know — it just retrieves and synthesizes.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;garbage in, garbage out&lt;/strong&gt; problem applied to institutional memory, and it's worth being direct about. Sage amplifies whatever resolution culture your team already has. If that culture is weak, the first thing to fix isn't the AI — it's the closing hygiene. A structured resolution template, a peer review step on complex tickets, even a weekly "resolution quality" spot-check — these compound over time into a memory store that actually earns the name &lt;em&gt;sage&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The system is also bounded by the domain of what's been seen before. Novel failure modes — new integrations, infrastructure changes, edge-case bugs — will fall back to generic LLM reasoning until similar tickets accumulate. That's expected behavior, not a bug. Memory-augmented agents are a complement to expertise, not a replacement for it.&lt;/p&gt;




&lt;p&gt;Support Sage is a small system that solves a real problem: institutional knowledge that walks out the door every time someone does. The retain/recall pattern is deceptively simple, but the compounding value is real. Every ticket your team resolves and stores is a piece of expertise that never leaves — and the next time a new hire hits that 2:47 PM wall, Sage already knows the answer.&lt;/p&gt;

&lt;p&gt;The herb grows every time you water it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>support</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
