<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: JunYoungMoon</title>
    <description>The latest articles on DEV Community by JunYoungMoon (@junyoungmoon).</description>
    <link>https://dev.to/junyoungmoon</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3661229%2F0ef274d1-4bb0-4b14-b3de-17fdfc86d80d.png</url>
      <title>DEV Community: JunYoungMoon</title>
      <link>https://dev.to/junyoungmoon</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/junyoungmoon"/>
    <language>en</language>
    <item>
      <title>Are We in an AI Bubble? A Tech Professional's Perspective</title>
      <dc:creator>JunYoungMoon</dc:creator>
      <pubDate>Thu, 18 Dec 2025 05:17:15 +0000</pubDate>
      <link>https://dev.to/junyoungmoon/are-we-in-an-ai-bubble-a-tech-professionals-perspective-3m3b</link>
      <guid>https://dev.to/junyoungmoon/are-we-in-an-ai-bubble-a-tech-professionals-perspective-3m3b</guid>
      <description>&lt;p&gt;&lt;strong&gt;TLDR&lt;/strong&gt;: We're witnessing unprecedented AI investment and hype. Historical patterns suggest this looks like a bubble, but that doesn't mean AI isn't revolutionary. Smart engineers should stay engaged but cautious, avoiding career decisions based purely on FOMO.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Current Landscape
&lt;/h2&gt;

&lt;p&gt;As software engineers, we're living through what might be the most significant technological shift of our careers. Companies are pouring unprecedented capital into AI infrastructure, with spending levels that rival entire national budgets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Numbers Are Staggering:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI accounts for ~90% of S&amp;amp;P 500 capex growth&lt;/li&gt;
&lt;li&gt;Nvidia went from $600M market cap (1999 IPO) to briefly touching $5 trillion&lt;/li&gt;
&lt;li&gt;Multiple AI startups raising $1B+ seed rounds with no product&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Makes This Time Different (And What Doesn't)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The "This Time It's Different" Arguments
&lt;/h3&gt;

&lt;p&gt;As developers, we hear these daily:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real products with real users&lt;/strong&gt;: ChatGPT hit 1 billion users faster than any technology in history&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immediate revenue&lt;/strong&gt;: Companies like Anthropic seeing 10x YoY revenue growth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proven leaders&lt;/strong&gt;: Microsoft, Google, Meta aren't dot-com startups&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasonable valuations&lt;/strong&gt;: Current AI leaders' P/E ratios are lower than 1999 tech stocks&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Historical Echoes
&lt;/h3&gt;

&lt;p&gt;But experienced engineers recognize familiar patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1860s: Railroads
1920s: Radio &amp;amp; Aviation  
1990s: Dot-com
2000s: Subprime housing
2020s: AI?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each revolution followed the same arc:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Revolutionary technology emerges&lt;/li&gt;
&lt;li&gt;Early adopters make fortunes&lt;/li&gt;
&lt;li&gt;FOMO drives everyone else in&lt;/li&gt;
&lt;li&gt;Overbuilding and speculation&lt;/li&gt;
&lt;li&gt;Crash (but technology persists)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Technical Reality Check
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What We Know
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;LLMs are genuinely transformative&lt;/li&gt;
&lt;li&gt;Coding assistants (Claude Code, Cursor) are already replacing significant developer hours&lt;/li&gt;
&lt;li&gt;GPU demand is real and growing&lt;/li&gt;
&lt;li&gt;The technology works and is improving rapidly&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What We Don't Know
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Who will be the ultimate winners&lt;/li&gt;
&lt;li&gt;Whether first-movers can maintain advantages&lt;/li&gt;
&lt;li&gt;If AI services become commoditized (price competition → thin margins)&lt;/li&gt;
&lt;li&gt;Economic lifespan of AI infrastructure&lt;/li&gt;
&lt;li&gt;Whether AGI is achievable (and on what timeline)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Circular Money Problem
&lt;/h2&gt;

&lt;p&gt;As engineers, we should be skeptical of "circular deals":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Simplified illustration of circular revenue problem&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CircularDeal&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Company A invests $1B in Company B&lt;/span&gt;
    &lt;span class="n"&gt;companyA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;invest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;companyB&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1_000_000_000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Company B uses that $1B to buy services from Company A&lt;/span&gt;
    &lt;span class="n"&gt;companyB&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;purchaseServices&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;companyA&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1_000_000_000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Both companies report revenue&lt;/span&gt;
    &lt;span class="c1"&gt;// But where's the real value creation?&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Historical parallel: 1990s telecom companies buying fiber from each other, funded by vendor financing. It inflated revenues without real demand.&lt;/p&gt;

&lt;p&gt;Current concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI receiving billions from tech companies&lt;/li&gt;
&lt;li&gt;OpenAI paying billions back to those same companies for compute&lt;/li&gt;
&lt;li&gt;Nvidia investing in customers who buy Nvidia chips&lt;/li&gt;
&lt;li&gt;Goldman estimates 15% of Nvidia's revenue comes from such arrangements&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Debt Factor: A Developer's Perspective
&lt;/h2&gt;

&lt;p&gt;Tech companies traditionally funded growth through equity or cash flow. Now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Microsoft, Google, Meta issuing 30-year bonds for AI&lt;/li&gt;
&lt;li&gt;Startups borrowing billions for data centers&lt;/li&gt;
&lt;li&gt;SPVs (Special Purpose Vehicles) hiding debt off balance sheets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Risk&lt;/strong&gt;: In software engineering, we understand that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;InvestmentRisk&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="cm"&gt;/**
     * Risk Amplification Formula
     * When outcomes are uncertain, debt amplifies both gains AND losses
     */&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;calculateRisk&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;uncertainty&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;debtLeverage&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;uncertainty&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;debtLeverage&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Exponentially higher risk&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI outcomes are highly uncertain. Adding leverage amplifies both potential gains AND losses.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Your Career
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Jobs That Are Already Changing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Junior coding positions&lt;/strong&gt;: AI can now write production-quality code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation writing&lt;/strong&gt;: Automated by LLMs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Basic code review&lt;/strong&gt;: Increasingly automated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Junior legal/analyst work&lt;/strong&gt;: Pattern matching that AI excels at&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Jobs That Remain Valuable
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System architecture&lt;/strong&gt;: Requiring holistic judgment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product decisions&lt;/strong&gt;: Requiring user empathy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team leadership&lt;/strong&gt;: Requiring emotional intelligence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging complex systems&lt;/strong&gt;: Requiring deep contextual understanding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Physical infrastructure&lt;/strong&gt;: Can't be automated (yet)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The "Skill Obsolescence" Risk
&lt;/h3&gt;

&lt;p&gt;One concerning pattern: If we eliminate junior positions, where do senior experts come from? You don't become a 10x engineer without years of practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Advice for Engineers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  DO:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Learn AI tools&lt;/strong&gt;: They're productivity multipliers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stay informed&lt;/strong&gt;: Technology is real, even if valuations aren't&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintain fundamentals&lt;/strong&gt;: Core CS skills remain valuable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diversify&lt;/strong&gt;: Don't bet entire career on AI hype&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch for warning signs&lt;/strong&gt;: Vague promises, circular revenue, unrealistic timelines&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  DON'T:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Join companies purely for AI hype&lt;/strong&gt;: Fundamentals still matter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ignore traditional skills&lt;/strong&gt;: Algorithms, systems design, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Panic about job loss&lt;/strong&gt;: Transitions take time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assume current leaders will win&lt;/strong&gt;: Remember Netscape, Yahoo, MySpace&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Believe "$1T valuation" stories&lt;/strong&gt;: These often end badly&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Infrastructure Perspective
&lt;/h2&gt;

&lt;p&gt;As someone who's built systems at scale, I'm watching the data center buildout with concern:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Current trajectory:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JP Morgan estimates $5T needed for AI infrastructure&lt;/li&gt;
&lt;li&gt;Companies' cash reserves: ~$350B&lt;/li&gt;
&lt;li&gt;Gap must be filled with... debt&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Historical parallel:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1990s: Overbuilt fiber optic networks&lt;/li&gt;
&lt;li&gt;Early 2000s: Massive telecom bankruptcies&lt;/li&gt;
&lt;li&gt;Long-term: Infrastructure was used, but investors lost billions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The technology was real. The overinvestment was also real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions to Ask Your Employer
&lt;/h2&gt;

&lt;p&gt;If you're evaluating AI companies or roles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Revenue model&lt;/strong&gt;: "How do you make money?" (Be skeptical of vague answers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Burn rate&lt;/strong&gt;: "How long is your runway?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Differentiation&lt;/strong&gt;: "What happens when OpenAI/Google does this?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debt load&lt;/strong&gt;: "How much is borrowed vs. equity?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customer concentration&lt;/strong&gt;: "Is revenue real or circular?"&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Optimistic Case
&lt;/h2&gt;

&lt;p&gt;Despite concerns, reasons for hope:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Baby boomers retiring&lt;/strong&gt;: 16M retiring by 2035, AI fills gaps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Productivity gains&lt;/strong&gt;: Real time savings for developers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New problem spaces&lt;/strong&gt;: AI might create more jobs than it destroys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pattern breaks&lt;/strong&gt;: 20% of the time, "this time IS different"&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion: The Middle Path
&lt;/h2&gt;

&lt;p&gt;Neither blind optimism nor pessimism serves engineers well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My recommendation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="cm"&gt;/**
 * AI Career Strategy Pattern
 */&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AICareerStrategy&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;ENGAGEMENT_LEVEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"MODERATE"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Not zero, not all-in&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Stay informed and skilled&lt;/span&gt;
        &lt;span class="n"&gt;learnAITools&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;maintainFundamentals&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="c1"&gt;// Make decisions based on fundamentals&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hasRealRevenue&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;hasReasonableValuation&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;considerOpportunity&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// But don't bet everything&lt;/span&gt;
        &lt;span class="n"&gt;avoidAllInBets&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"STAY_FLEXIBLE_AND_INFORMED"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;learnAITools&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Stay current with AI developments&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;maintainFundamentals&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Keep core CS skills sharp&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;hasRealRevenue&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Verify actual revenue model&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;hasReasonableValuation&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Check P/E ratios, fundamentals&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're witnessing either the most important technological revolution of our lifetimes, or the most spectacular bubble, or (most likely) both simultaneously.&lt;/p&gt;

&lt;p&gt;The technology is real. The hype is also real. Navigate accordingly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Discussion Questions
&lt;/h2&gt;

&lt;p&gt;What's your experience with AI in your daily work? Are you seeing real productivity gains or just hype? How is your company approaching AI investment?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; #ai #career #bubble #machinelearning #investing #technology #opinion&lt;/p&gt;

</description>
      <category>ai</category>
      <category>career</category>
      <category>machinelearning</category>
      <category>programming</category>
    </item>
    <item>
      <title>Solving Access Denied Issues After SSE Implementation in Spring Security</title>
      <dc:creator>JunYoungMoon</dc:creator>
      <pubDate>Wed, 17 Dec 2025 06:41:14 +0000</pubDate>
      <link>https://dev.to/junyoungmoon/solving-access-denied-issues-after-sse-implementation-in-spring-security-3nd4</link>
      <guid>https://dev.to/junyoungmoon/solving-access-denied-issues-after-sse-implementation-in-spring-security-3nd4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; Implemented SSE streaming for LLM responses, but second event onwards hit "Access Denied". Root cause: SecurityContext not propagating to ASYNC dispatches. &lt;strong&gt;Fix: One line of Spring Security config.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Background: Why SSE?
&lt;/h2&gt;

&lt;p&gt;Recently I improved our LLM response mechanism by introducing SSE (Server-Sent Events). Previously, we returned complete API responses in one shot. Now we deliver LLM responses via &lt;strong&gt;Flux-based streaming&lt;/strong&gt; to the client.&lt;/p&gt;

&lt;p&gt;Quick definitions:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fframylv9wf74gkpvde30.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fframylv9wf74gkpvde30.webp" alt=" " width="667" height="107"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flux-based streaming&lt;/strong&gt;: Reactive Streams API from Spring WebFlux that delivers data streams asynchronously in multiple chunks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSE (Server-Sent Events)&lt;/strong&gt;: HTTP-based unidirectional streaming where server continuously pushes data to client&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, when delivering LLM responses, we're not using a traditional single HTTP request/response structure. Instead, we maintain one connection and deliver multiple events in real-time using &lt;strong&gt;SSE&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7lnxnpz1rwnshnb83myu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7lnxnpz1rwnshnb83myu.gif" alt=" " width="800" height="641"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first API request worked perfectly. The &lt;strong&gt;Dispatcher Type&lt;/strong&gt; was &lt;code&gt;REQUEST&lt;/code&gt;, and &lt;code&gt;JwtAuthenticationFilter&lt;/code&gt; validated the cookie properly, storing authentication info in &lt;code&gt;SecurityContextHolder&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But starting from the second SSE event, problems emerged. The second request came in with Dispatcher Type &lt;code&gt;ASYNC&lt;/code&gt;, bypassing the existing &lt;code&gt;JwtAuthenticationFilter&lt;/code&gt;. As a result, &lt;code&gt;SecurityContextHolder&lt;/code&gt; had no authentication info. Despite being the same URL, the security context was empty, causing an &lt;strong&gt;Access Denied&lt;/strong&gt; error.&lt;/p&gt;


&lt;h2&gt;
  
  
  Root Cause Analysis
&lt;/h2&gt;

&lt;p&gt;Spring Security manages &lt;code&gt;SecurityContext&lt;/code&gt; throughout the request lifecycle. In normal requests (&lt;code&gt;REQUEST&lt;/code&gt;), authentication info is stored properly through the filter chain. However, for asynchronous requests (&lt;code&gt;ASYNC&lt;/code&gt;) or re-dispatches (e.g., &lt;code&gt;FORWARD&lt;/code&gt;, &lt;code&gt;INCLUDE&lt;/code&gt;), the security context may not propagate automatically.&lt;/p&gt;

&lt;p&gt;In SSE and other async response environments, basic security configuration alone doesn't guarantee context propagation, leading to Access Denied issues.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;The fix is configuring &lt;code&gt;SecurityContextRepository&lt;/code&gt; to propagate the session (technically the context within request scope). We can use this option provided in Spring Security 5.8+:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;securityContext&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requireExplicitSave&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40986cazmhytkbmjlnkp.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40986cazmhytkbmjlnkp.gif" alt=" " width="760" height="609"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this option, &lt;code&gt;SecurityContext&lt;/code&gt; is automatically saved to &lt;code&gt;SecurityContextRepository&lt;/code&gt; when changed, and the context propagates across re-dispatches (&lt;code&gt;REQUEST&lt;/code&gt;, &lt;code&gt;ASYNC&lt;/code&gt;, &lt;code&gt;FORWARD&lt;/code&gt;, &lt;code&gt;INCLUDE&lt;/code&gt;) within the same request.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is SecurityContextRepository?
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;SecurityContextRepository&lt;/code&gt; is &lt;strong&gt;the strategy interface for storing/restoring SecurityContext&lt;/strong&gt; in Spring Security. Behavior varies by implementation, but in SSE environments, it typically stores context in HttpServletRequest attributes (request scope) to maintain authentication info across async processing.&lt;/p&gt;

&lt;p&gt;This approach maintains stateless operation (no sessions) while safely sharing security context across re-dispatches within a single HTTP request.&lt;/p&gt;




&lt;h2&gt;
  
  
  Understanding requireExplicitSave
&lt;/h2&gt;

&lt;p&gt;Since Spring Security 5.8, you can explicitly control how &lt;code&gt;SecurityContext&lt;/code&gt; is saved:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;requireExplicitSave(false)&lt;/code&gt; (Auto-save mode, default)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Automatically saves to &lt;code&gt;SecurityContextRepository&lt;/code&gt; when &lt;code&gt;SecurityContext&lt;/code&gt; changes&lt;/li&gt;
&lt;li&gt;No manual save code needed&lt;/li&gt;
&lt;li&gt;Same as legacy Spring Security behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;requireExplicitSave(true)&lt;/code&gt; (Explicit save mode)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Must explicitly call &lt;code&gt;SecurityContextRepository.saveContext()&lt;/code&gt; after context changes&lt;/li&gt;
&lt;li&gt;No automatic saving&lt;/li&gt;
&lt;li&gt;Performance benefits, but developers must manage save timing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since our system isn't complex yet, we decided auto-save mode (&lt;code&gt;requireExplicitSave(false)&lt;/code&gt;) is more appropriate.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Re-dispatch Happens in SSE
&lt;/h2&gt;

&lt;p&gt;Re-dispatching in SSE stems from Servlet 3.0 async processing spec's core design principle.&lt;/p&gt;

&lt;p&gt;When async work completes, the result must go through the existing filter/servlet chain for consistent processing—hence the mandatory re-dispatch.&lt;/p&gt;

&lt;p&gt;Without re-dispatch, the async request lifecycle wouldn't complete properly, causing resource leaks, connection management issues, security context mismatches, and other problems.&lt;/p&gt;

&lt;p&gt;Therefore, Servlet 3.0 spec mandates re-dispatch after all async work completes, classifying such requests as &lt;code&gt;DispatcherType.ASYNC&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.spring.io/spring-framework/reference/web/webmvc/mvc-ann-async.html#mvc-ann-async-processing" rel="noopener noreferrer"&gt;https://docs.spring.io/spring-framework/reference/web/webmvc/mvc-ann-async.html#mvc-ann-async-processing&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Verifying the Call Order
&lt;/h2&gt;

&lt;p&gt;To confirm when data is sent and when threads separate, I wrote a simple example.&lt;/p&gt;

&lt;p&gt;In Spring MVC with SSE, controllers return &lt;strong&gt;Flux streams&lt;/strong&gt; instead of regular &lt;code&gt;ResponseEntity&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/gpt4/v3/test"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;produces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MediaType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;TEXT_EVENT_STREAM_VALUE&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Flux&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Flux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;just&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"data1"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"data2"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"data3"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;delayElements&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;doOnComplete&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"All data sent"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;})&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;doFinally&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signalType&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Stream cleanup started: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;signalType&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;});&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code maintains the connection with the client while sending data sequentially at 1-second intervals.&lt;/p&gt;

&lt;p&gt;Instead of terminating the request immediately, it delivers responses asynchronously using &lt;strong&gt;Servlet 3.0's AsyncContext&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tracking Async Transitions
&lt;/h2&gt;

&lt;p&gt;I created &lt;code&gt;AsyncDebugFilter&lt;/code&gt; to precisely identify the async transition point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Slf4j&lt;/span&gt;
&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AsyncDebugFilter&lt;/span&gt; &lt;span class="kd"&gt;implements&lt;/span&gt; &lt;span class="nc"&gt;Filter&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

  &lt;span class="nd"&gt;@Override&lt;/span&gt;
  &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;doFilter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ServletRequest&lt;/span&gt; &lt;span class="n"&gt;servletRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ServletResponse&lt;/span&gt; &lt;span class="n"&gt;servletResponse&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;FilterChain&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
          &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ServletException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="o"&gt;...&lt;/span&gt;

      &lt;span class="nc"&gt;HttpServletRequestWrapper&lt;/span&gt; &lt;span class="n"&gt;requestWrapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpServletRequestWrapper&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="o"&gt;...&lt;/span&gt;
          &lt;span class="nd"&gt;@Override&lt;/span&gt;
          &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;AsyncContext&lt;/span&gt; &lt;span class="nf"&gt;startAsync&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ServletRequest&lt;/span&gt; &lt;span class="n"&gt;servletRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ServletResponse&lt;/span&gt; &lt;span class="n"&gt;servletResponse&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="nc"&gt;AsyncContext&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;super&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startAsync&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;servletRequest&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;servletResponse&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
              &lt;span class="n"&gt;attachListener&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
              &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[{}] startAsync(req,res) called. DispatcherType={}, Thread={}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;super&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getDispatcherType&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
              &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
          &lt;span class="o"&gt;}&lt;/span&gt;

          &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;attachListener&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AsyncContext&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addListener&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AsyncListener&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nd"&gt;@Override&lt;/span&gt;
                &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;onComplete&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AsyncEvent&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[{}] AsyncListener.onComplete - Thread={}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
                &lt;span class="o"&gt;...&lt;/span&gt;
            &lt;span class="o"&gt;});&lt;/span&gt;
          &lt;span class="o"&gt;}&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;

      &lt;span class="nc"&gt;HttpServletResponseWrapper&lt;/span&gt; &lt;span class="n"&gt;responseWrapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpServletResponseWrapper&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ServletOutputStream&lt;/span&gt; &lt;span class="n"&gt;loggingStream&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
          &lt;span class="nd"&gt;@Override&lt;/span&gt;
          &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ServletOutputStream&lt;/span&gt; &lt;span class="nf"&gt;getOutputStream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;loggingStream&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ServletOutputStream&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;super&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getOutputStream&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                  &lt;span class="n"&gt;loggingStream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ServletOutputStream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                      &lt;span class="o"&gt;...&lt;/span&gt;
                      &lt;span class="nd"&gt;@Override&lt;/span&gt;
                      &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;off&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                          &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[{}] output.write(byte[{}]) thread={} len={}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                          &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;write&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;off&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                      &lt;span class="o"&gt;}&lt;/span&gt;
                      &lt;span class="nd"&gt;@Override&lt;/span&gt;
                      &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                          &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[{}] output.flush thread={}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requestId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                          &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
                      &lt;span class="o"&gt;}&lt;/span&gt;
                      &lt;span class="o"&gt;...&lt;/span&gt;
                  &lt;span class="o"&gt;};&lt;/span&gt;
              &lt;span class="o"&gt;}&lt;/span&gt;
              &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;loggingStream&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
          &lt;span class="o"&gt;}&lt;/span&gt;
          &lt;span class="o"&gt;...&lt;/span&gt;
      &lt;span class="o"&gt;};&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;  
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This filter intercepts &lt;code&gt;request.startAsync()&lt;/code&gt; calls and registers an &lt;code&gt;AsyncListener&lt;/code&gt;, allowing us to track exactly when async processing begins.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Execution Flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvk6ll2rfjwb9b0we9di.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwvk6ll2rfjwb9b0we9di.webp" alt=" " width="800" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First, let's examine the call sequence. When a request arrives, it passes through security filters into business logic.&lt;br&gt;
When it hits async logic, &lt;code&gt;startAsync&lt;/code&gt; is called to separate the thread. &lt;strong&gt;(→ This is the async transition point)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4b3fuvdigcld0pgxu1m.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr4b3fuvdigcld0pgxu1m.webp" alt=" " width="800" height="517"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After data transmission completes, &lt;code&gt;doOnComplete&lt;/code&gt; and &lt;code&gt;doFinally&lt;/code&gt; are called sequentially, then the servlet container performs ASYNC re-dispatch.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfn3jnrnsrwxhozmdyle.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfn3jnrnsrwxhozmdyle.gif" alt=" " width="760" height="397"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Without ASYNC re-dispatch, the request would remain open.&lt;br&gt;
Therefore, the container must perform ASYNC dispatch to properly terminate the request, ensuring connections are cleaned up correctly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Access Denied issue after implementing SSE was caused by Spring Security's behavior of not propagating &lt;code&gt;SecurityContext&lt;/code&gt; in async requests. &lt;/p&gt;

&lt;p&gt;By applying &lt;code&gt;http.securityContext(context -&amp;gt; context.requireExplicitSave(false))&lt;/code&gt; to utilize &lt;code&gt;SecurityContextRepository&lt;/code&gt;, we can reliably maintain authentication info even in SSE environments.&lt;/p&gt;

&lt;p&gt;If the system becomes more complex and performance optimization becomes necessary, we could consider explicit save mode (&lt;code&gt;requireExplicitSave(true)&lt;/code&gt;). But for now, auto-save mode provides stable operation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SSE uses Servlet 3.0 async processing&lt;/strong&gt;, which requires ASYNC re-dispatch after completion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SecurityContext doesn't auto-propagate&lt;/strong&gt; across dispatcher types by default&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One config line fixes it&lt;/strong&gt;: &lt;code&gt;requireExplicitSave(false)&lt;/code&gt; enables automatic context propagation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No sessions needed&lt;/strong&gt;: Uses request-scoped attributes for stateless context sharing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AsyncDebugFilter is your friend&lt;/strong&gt;: Helps visualize the complete async lifecycle&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; #spring-security #sse #async #spring-boot #webflux #security-context #servlet&lt;/p&gt;




&lt;p&gt;💭 Questions or feedback? Join the discussion in the comments!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>springboot</category>
      <category>security</category>
    </item>
    <item>
      <title>Our RAG system still failed on hierarchical metrics — Part 2</title>
      <dc:creator>JunYoungMoon</dc:creator>
      <pubDate>Mon, 15 Dec 2025 02:43:23 +0000</pubDate>
      <link>https://dev.to/junyoungmoon/how-i-built-a-rag-system-that-actually-understands-business-metrics-part-2-hierarchical-search-13mk</link>
      <guid>https://dev.to/junyoungmoon/how-i-built-a-rag-system-that-actually-understands-business-metrics-part-2-hierarchical-search-13mk</guid>
      <description>&lt;h1&gt;
  
  
  How I Built a RAG System That Actually Understands Business Metrics (Part 2: Hierarchical Search)
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; Part 1 could find "total_revenue", but failed at "Naver traffic". This article shows how I fixed that with a tree-based metric system and 2-stage GPT filtering. &lt;strong&gt;Result: 95% → 98% accuracy.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Real Problem
&lt;/h2&gt;

&lt;p&gt;Remember Part 1? We built this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "What's the conversion rate?"
System: Found "conversion_rate"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But then this happened:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "How's my Naver traffic?"
System: Found "traffic_by_channel"
        But which channel? Naver? Google? Facebook?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The gap:&lt;/strong&gt; Our system understood categories but couldn't navigate their subcategories.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 The Solution in 3 Moves
&lt;/h2&gt;

&lt;p&gt;Think of it like a chess game. We need three moves to checkmate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Move 1: SEARCH      → Cast a wide net (50 candidates)
Move 2: CLASSIFY    → Filter intelligently (3 names → 12 nodes)
Move 3: REFINE      → Validate relationships (final 2 metrics)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's see each move in action.&lt;/p&gt;




&lt;h2&gt;
  
  
  Move 1: The Setup (Vector Search)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Challenge:&lt;/strong&gt; Metrics have hierarchy&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;traffic_by_channel/           ← Parent (1-depth)
├── naver                     ← Child (2-depth)
├── google
├── facebook
└── direct_traffic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Strategy:&lt;/strong&gt; Use TWO search indices&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Index&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Top N&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1-depth&lt;/td&gt;
&lt;td&gt;Find categories&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2-depth&lt;/td&gt;
&lt;td&gt;Find specific values&lt;/td&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why 40 vs 10?&lt;/strong&gt; 2-depth metrics are like finding "John" in a phone book with 1000 Johns. We need more candidates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query: "How's my Naver traffic?"

1-depth finds: traffic_by_channel, site_duration, bounce_rate...
2-depth finds: naver, google, facebook, traffic_count...

Total: 50 candidates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt; Don't search for "traffic_by_channel|naver" directly. Search for each piece separately, then connect them.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Move 2: The Play (Smart Classification)
&lt;/h2&gt;

&lt;p&gt;This move has 3 sub-plays. Watch closely.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sub-Play 2.1: Remove the Noise 🧹
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; 50 candidates include junk&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;For "Naver traffic":
naver          ← Relevant
traffic_count  ← Maybe relevant?
instagram      ← Wrong channel
bounce_rate    ← Not about traffic sources
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Ask GPT (Structured Outputs)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Force GPT to return clean JSON array&lt;/span&gt;
&lt;span class="s"&gt;"response_format"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"json_schema"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
  &lt;span class="s"&gt;"schema"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"object"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"properties"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="s"&gt;"metrics"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"array"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"items"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"naver"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"direct_traffic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"traffic_count"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 50 → 3 ✂️&lt;/p&gt;




&lt;h3&gt;
  
  
  Sub-Play 2.2: From Names to Nodes
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; "naver" is just a string. We need context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we need to know:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this advertiser's data or industry average?&lt;/li&gt;
&lt;li&gt;What are the parent metrics?&lt;/li&gt;
&lt;li&gt;What other "naver" nodes exist?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; MetricForest lookup&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"naver" in nameIndex → [
  "traffic_by_channel|naver|advertiser",
  "traffic_by_channel|naver|industry",
  "channel_revenue|naver|advertiser",
  "channel_revenue|naver|industry",
  "first_purchase|naver|advertiser",
  "repeat_purchase|naver|advertiser"
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Magic:&lt;/strong&gt; O(1) lookup using HashMap&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Without nameIndex: O(N) scan through all nodes&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;allNodes&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"naver"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  &lt;span class="c1"&gt;// Slow!&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// With nameIndex: O(1) direct access&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;MetricNode&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nameIndex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"naver"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// Fast!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 3 names → 12 nodes (each name maps to multiple nodes) &lt;/p&gt;




&lt;h3&gt;
  
  
  Sub-Play 2.3: Pick Your Side
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Same metric, two meanings&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;traffic_by_channel|naver|advertiser  → "Our Naver traffic"
traffic_by_channel|naver|industry    → "Industry average"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; Which one does the user want?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Let GPT decide&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query Analysis:
"How's MY Naver traffic?"     → Advertiser
"Industry average?"           → Industry  
"Compare to industry"         → Both
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Implementation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifyScope&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"common"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"advertiser"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"industry"&lt;/span&gt;&lt;span class="o"&gt;];&lt;/span&gt;  &lt;span class="c1"&gt;// Return both&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="o"&gt;];&lt;/span&gt;  &lt;span class="c1"&gt;// Return one&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 12 nodes → Filter to advertiser domain only&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Move 2 Summary:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Started with: 50 candidates&lt;/li&gt;
&lt;li&gt;Ended with: 12 nodes, advertiser domain confirmed&lt;/li&gt;
&lt;li&gt;But we're not done yet...&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Move 3: Checkmate (Relationship Validation)
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting. We have 12 nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✓ traffic_by_channel|naver|advertiser
✓ channel_revenue|naver|advertiser        ← Wrong parent!
✓ traffic_by_channel|direct_traffic|advertiser
✓ new_members|traffic_count|advertiser   ← Wrong parent!
... 8 more
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Not all are correct. "channel_revenue|naver" isn't about traffic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; 3-step validation&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 3.1: Collect the Neighbors
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rule:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If 1-depth node → Get children&lt;/li&gt;
&lt;li&gt;If 2-depth node → Get parents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; To understand relationships.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Our 12 nodes are all 2-depth
→ Collect their parents
→ Get 6 unique parent names:

1. traffic_by_channel    ← Traffic related ✓
2. channel_revenue       ← Revenue, not traffic ✗
3. first_purchase        ← Purchase, not traffic ✗
4. repeat_purchase       ← Purchase, not traffic ✗
5. new_members          ← Not about channels ✗
6. existing_members     ← Not about channels ✗
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 3.2: Filter Parents with GPT
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Ask GPT:&lt;/strong&gt; "Which parents are relevant to 'Naver traffic'?"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: 6 parent names
GPT thinking:
  - traffic_by_channel? YES → About traffic AND channels
  - channel_revenue? NO → About revenue, not traffic
  - first_purchase? NO → About purchases
  - new_members? NO → Not related to Naver

Output: ["traffic_by_channel"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This is CRITICAL:&lt;/strong&gt; Only 1 parent passed! This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep: traffic_by_channel|naver&lt;/li&gt;
&lt;li&gt;Remove: channel_revenue|naver&lt;/li&gt;
&lt;li&gt;Remove: new_members|traffic_count&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Step 3.3: Final Validation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Now apply 3 rules:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 1: Leaf nodes (no children)&lt;/strong&gt; → Accept immediately&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;children&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Already final&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rule 2: Parent nodes (has children)&lt;/strong&gt; → Return matching children&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;children&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;children&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;secondResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rule 3: Child nodes&lt;/strong&gt; → Validate parent exists in secondResult&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parents&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;anyMatch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;secondResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Applying to our 12 nodes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Check: traffic_by_channel|naver|advertiser
  → Parent "traffic_by_channel" in secondResult? 
  → ACCEPT

Check: channel_revenue|naver|advertiser
  → Parent "channel_revenue" in secondResult? 
  → REJECT

Check: new_members|traffic_count|advertiser
  → Parent "new_members" in secondResult? 
  → REJECT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Final Result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"traffic_by_channel|naver|advertiser"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"traffic_by_channel|direct_traffic|advertiser"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Checkmate!&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Complete Game Replay
&lt;/h2&gt;

&lt;p&gt;Let's watch the entire sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query: "How's my Naver traffic?"

   Move 1: SEARCH
   ├─ 1-depth index → 10 results
   ├─ 2-depth index → 40 results
   └─ Total: 50 candidates

   Move 2: CLASSIFY
   ├─ 2.1: GPT Filter → 3 names
   ├─ 2.2: Node Lookup → 12 nodes
   └─ 2.3: Domain Filter → ["advertiser"]

   Move 3: REFINE
   ├─ 3.1: Collect neighbors → 6 parent names
   ├─ 3.2: GPT Filter → 1 parent name
   └─ 3.3: Validate → 2 final metrics

   RESULT:
   ✓ traffic_by_channel|naver|advertiser
   ✓ traffic_by_channel|direct_traffic|advertiser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Time taken:&lt;/strong&gt; &amp;lt; 1 second&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Cost per query:&lt;/strong&gt; ~$0.002&lt;/p&gt;


&lt;h2&gt;
  
  
  Before vs After
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Part 1 (Basic Search)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; "What's the conversion rate?"
 → conversion_rate

 "How's my Naver traffic?"
 → Found "traffic_by_channel" but couldn't find "naver"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Accuracy:&lt;/strong&gt; 95% for flat metrics only&lt;/p&gt;


&lt;h3&gt;
  
  
  Part 2 (Hierarchical Search)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   "What's the conversion rate?"
   → conversion_rate

   "How's my Naver traffic?"
   → traffic_by_channel|naver|advertiser
   → traffic_by_channel|direct_traffic|advertiser

   "Compare my Google traffic to industry"
   → traffic_by_channel|google|advertiser
   → traffic_by_channel|google|industry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Accuracy:&lt;/strong&gt; 98% for all queries including hierarchical&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Use AI Where It Shines
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Good:&lt;/strong&gt; Semantic understanding&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Is "revenue" related to "sales"? → Ask GPT&lt;/span&gt;
&lt;span class="c1"&gt;// Is "naver" relevant to "traffic"? → Ask GPT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bad:&lt;/strong&gt; Deterministic logic&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Is this node depth 1 or 2? → Just check node.depth&lt;/span&gt;
&lt;span class="c1"&gt;// Does parent exist? → Just check hashmap&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Two-Stage Filtering is Magic
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;One GPT call on 50 candidates → 70% accuracy
Two GPT calls (50 → 6 → final) → 98% accuracy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second filter on a &lt;strong&gt;small, focused set&lt;/strong&gt; is what makes it work.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Data Structure = Foundation
&lt;/h3&gt;

&lt;p&gt;Without MetricForest:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can't collect neighbors&lt;/li&gt;
&lt;li&gt;Can't validate relationships
&lt;/li&gt;
&lt;li&gt;Can't distinguish contexts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tree structure makes everything else possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Filter Domain LATE, Not Early
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Wrong approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Filter domain first&lt;/span&gt;
&lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filterByDomain&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"advertiser"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Now we can't see that both domains existed!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Right approach:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Let GPT see both domains&lt;/span&gt;
&lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;getAllNodes&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;askGPT&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Which domain does user want?"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Now filter&lt;/span&gt;
&lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;filterByDomain&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables "compare" queries that need both domains.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Code
&lt;/h2&gt;

&lt;p&gt;Here's the orchestration service that ties it all together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="nd"&gt;@RequiredArgsConstructor&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MetricSearchService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;VectorSearchService&lt;/span&gt; &lt;span class="n"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;StructuredOutputsService&lt;/span&gt; &lt;span class="n"&gt;gptFilter&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;MetricForest&lt;/span&gt; &lt;span class="n"&gt;metricForest&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;DomainClassifier&lt;/span&gt; &lt;span class="n"&gt;domainClassifier&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Move 1: Search&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vectorSearch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;search&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Move 2.1: First filter&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gptFilter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Move 2.2: Resolve to nodes&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;nodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metricForest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByNames&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Move 2.3: Decide domain&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;domainClassifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;classify&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Move 3.1-3.2: Collect and filter neighbors&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;neighbors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collectNeighbors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;validParents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gptFilter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;neighbors&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Move 3.3: Final validation&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nodes&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;validParents&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, simple, effective.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try These Queries
&lt;/h2&gt;

&lt;p&gt;The system now handles:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flat metrics&lt;/strong&gt; (from Part 1)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"What's the conversion rate?"
"Show order count"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Hierarchical metrics&lt;/strong&gt; (NEW!)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"My Naver traffic?"
"Google revenue?"
"Facebook first purchase rate?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Comparisons&lt;/strong&gt; (NEW!)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Compare my Naver traffic to industry average"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Category queries&lt;/strong&gt; (NEW!)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Show all channel traffic"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Current system handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flat metrics&lt;/li&gt;
&lt;li&gt;Hierarchical metrics
&lt;/li&gt;
&lt;li&gt;Domain disambiguation&lt;/li&gt;
&lt;li&gt;Comparison queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coming soon:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Time ranges ("last month's Naver traffic")&lt;/li&gt;
&lt;li&gt;Aggregations ("total revenue by channel")&lt;/li&gt;
&lt;li&gt;Conversational follow-ups&lt;/li&gt;
&lt;li&gt;Multi-metric queries&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Need Help Implementing RAG?
&lt;/h2&gt;

&lt;p&gt;I help companies integrate AI systems with their existing Spring Boot infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specializing in:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring Boot + OpenAI integration&lt;/li&gt;
&lt;li&gt;Custom RAG pipelines&lt;/li&gt;
&lt;li&gt;E-commerce analytics systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;💭 Questions or feedback? Join the discussion in the comments!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; #rag #hierarchical-search #spring-boot #openai #system-design #ai-engineering&lt;/p&gt;

</description>
      <category>rag</category>
      <category>springboot</category>
      <category>openai</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Our RAG system failed to understand KPIs — Part 1: Metric retrieval design</title>
      <dc:creator>JunYoungMoon</dc:creator>
      <pubDate>Sun, 14 Dec 2025 12:42:10 +0000</pubDate>
      <link>https://dev.to/junyoungmoon/how-i-built-a-rag-system-that-actually-understands-business-metrics-part-1-foundation-3ao9</link>
      <guid>https://dev.to/junyoungmoon/how-i-built-a-rag-system-that-actually-understands-business-metrics-part-1-foundation-3ao9</guid>
      <description>&lt;h1&gt;
  
  
  How I Built a RAG System That Actually Understands Business Metrics (Part 1: Foundation)
&lt;/h1&gt;

&lt;h2&gt;
  
  
  The Problem: When AI Can't Find Your Data
&lt;/h2&gt;

&lt;p&gt;Picture this: Your marketing team asks your AI assistant, "How's this month's revenue?"&lt;/p&gt;

&lt;p&gt;The AI responds: "I don't have access to that data."&lt;/p&gt;

&lt;p&gt;Frustrating, right? But here's the real problem - &lt;strong&gt;you DO have the data&lt;/strong&gt;. It's sitting in your database, perfectly organized, just waiting to be used. The AI just doesn't know which metrics to look for.&lt;/p&gt;

&lt;p&gt;This is the story of how I solved that problem by building a Retrieval-Augmented Generation (RAG) system that bridges the gap between natural language questions and structured business metrics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why RAG? Understanding LLM Limitations
&lt;/h2&gt;

&lt;p&gt;Large Language Models like GPT have three fundamental limitations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Time Constraint&lt;/strong&gt;: Trained only up to a specific cutoff date&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain Constraint&lt;/strong&gt;: No knowledge of your company's internal data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Constraint&lt;/strong&gt;: Can't access continuously changing data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;RAG solves this by combining the AI's language understanding with your real-time data retrieval system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Here's what we're building:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Question 
    ↓
[1] Text Embedding (Vector Search)
    ↓
[2] Structured Outputs (GPT Filtering)
    ↓
[3] Database Query
    ↓
Final Answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This article covers steps 1-2, which form the foundation of intelligent metric retrieval.&lt;/p&gt;

&lt;h2&gt;
  
  
  Starting Point: Understanding Our Data Structure
&lt;/h2&gt;

&lt;p&gt;Our Spring Batch process generates metrics in this format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;order_count|advertiser
conversion_rate|advertiser
total_revenue|advertiser
campaign_roas|ad_impressions|advertiser
traffic_by_channel|google|industry
traffic_by_channel|facebook|industry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The problem? This flat structure makes semantic search difficult.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution: Restructure into hierarchical maps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Step 1 - Separate by domain (Advertiser vs Industry):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"order_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_revenue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"campaign_roas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"ad_impressions"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"ad_clicks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"ad_spend"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"traffic_by_channel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"google"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"facebook"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"direct"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Step 2 - Create domain classification:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"order_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"advertiser"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"conversion_rate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"common"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_revenue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"common"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"avg_purchase_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"industry"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 1: Text Embedding for Semantic Search
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Core Concept&lt;/strong&gt;: Convert text into numerical vectors that capture semantic meaning.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;When a user asks "How's the revenue?", we need to find that this relates to the "total_revenue" metric - even though the words don't match exactly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Convert question → 1536-dimensional vector&lt;/li&gt;
&lt;li&gt;Convert all metrics → 1536-dimensional vectors
&lt;/li&gt;
&lt;li&gt;Calculate cosine similarity&lt;/li&gt;
&lt;li&gt;Extract top candidates&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  OpenAI Embedding API
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${openai.api.key}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="nf"&gt;getEmbedding&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;requestBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"model"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"input"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;HttpHeaders&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setBearerAuth&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setContentType&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MediaType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APPLICATION_JSON&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;HttpEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;requestBody&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;restTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;postForEntity&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"https://api.openai.com/v1/embeddings"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Extract embedding array from response&lt;/span&gt;
        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; 
            &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBody&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"data"&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Double&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embedding"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;mapToDouble&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;Double:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;doubleValue&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toArray&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This service generates embeddings using OpenAI's API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cosine Similarity Calculation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;VectorMath&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="nf"&gt;cosineSimilarity&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;dotProduct&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;norm1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;norm2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;dotProduct&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;];&lt;/span&gt;
            &lt;span class="n"&gt;norm1&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;];&lt;/span&gt;
            &lt;span class="n"&gt;norm2&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;];&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dotProduct&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sqrt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norm1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nc"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sqrt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norm2&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This calculates semantic similarity between two vectors.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Question: "How's this month's revenue?"

Top Matches:
- "total_revenue": 0.89 similarity ✓
- "order_count": 0.72 similarity
- "conversion_rate": 0.45 similarity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Question: "Show me sales performance"

Vector Comparison:
[0.4, 0.1, 0.6, ...] (question)
    vs
[0.3, 0.2, 0.7, ...] (total_revenue) → 89% match
[0.1, 0.5, -0.2, ...] (order_count) → 72% match
[0.2, 0.6, -0.4, ...] (conversion_rate) → 45% match

Result: Extract top 10 candidates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: 10 candidates is still too many for accurate answers. We need more precise filtering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Structured Outputs for Precise Classification
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Structured Outputs?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Without Structured Outputs:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AI Response: "You should check total_revenue and maybe order_count as well..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;↓ Unpredictable format, hard to parse&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With Structured Outputs:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"total_revenue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"order_count"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;↓ Clean array, ready to use&lt;/p&gt;

&lt;h3&gt;
  
  
  The JSON Schema
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"object"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"properties"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"metrics"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"array"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"items"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"string"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
            &lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"List of relevant metric names"&lt;/span&gt;
        &lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;),&lt;/span&gt;
    &lt;span class="s"&gt;"required"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"metrics"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
    &lt;span class="s"&gt;"additionalProperties"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This schema forces GPT to return exactly the format we need.&lt;/p&gt;

&lt;h3&gt;
  
  
  System Prompt Design
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
    You are an analytics assistant for e-commerce advertisers.

    ### Critical Rules
    1) If you don't understand the question → return []
    2) If unrelated to advertising data → return []
    3) Extract ONLY metric names, no explanations
    4) Use exact names from the list below

    ### Available Metrics
    %s
    """&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;join&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;candidateMetrics&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clear instructions ensure consistent results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Java Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MetricClassificationService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${openai.api.key}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;RestTemplate&lt;/span&gt; &lt;span class="n"&gt;restTemplate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;classifyMetrics&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userQuestion&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;candidateMetrics&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buildSystemPrompt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidateMetrics&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;requestBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"model"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"messages"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"role"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"system"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;systemPrompt&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"role"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"user"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userQuestion&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;),&lt;/span&gt;
            &lt;span class="s"&gt;"response_format"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"json_schema"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"json_schema"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"map_metrics"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="s"&gt;"strict"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="s"&gt;"schema"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buildSchema&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;HttpHeaders&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setBearerAuth&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;apiKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setContentType&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MediaType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;APPLICATION_JSON&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;HttpEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;entity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HttpEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;requestBody&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;restTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;postForEntity&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"https://api.openai.com/v1/chat/completions"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;entity&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;parseResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBody&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;parseResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; 
                &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"choices"&lt;/span&gt;&lt;span class="o"&gt;)).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
                &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;choice&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"message"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObjectMapper&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mapper&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"metrics"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to parse structured output"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;emptyList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This service handles the complete Structured Outputs workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input: "How's this month's revenue?" + 50 candidates

GPT Processing:
- "total_revenue" : revenue mentioned (selected)
- "order_count" : related to revenue (selected)
- "page_views" : not about revenue (rejected)
- "cart_abandonment" : not about revenue (rejected)

Output: ["total_revenue", "order_count"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Complete Flow Example
&lt;/h2&gt;

&lt;p&gt;Let's see the entire system in action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Vector Search
User: "How's this month's revenue?"
↓
Top 10 candidates:
- total_revenue (0.89)
- order_count (0.78)
- gross_profit (0.76)
- net_revenue (0.75)
- sales_amount (0.74)
- purchase_count (0.71)
- transaction_value (0.69)
- page_views (0.45)
- bounce_rate (0.32)
- conversion_rate (0.28)

Step 2: GPT Structured Filtering
Input: 10 candidates
↓
GPT Analysis:
"The question asks about 'revenue', so:
- total_revenue ✓ (direct match)
- net_revenue ✓ (revenue metric)
- order_count ✗ (count, not revenue)
- page_views ✗ (engagement, not revenue)"
↓
Output: ["total_revenue", "net_revenue"]

Step 3: Database Query
SELECT total_revenue, net_revenue 
FROM metrics 
WHERE period = 'current_month'
↓
Final Answer: "This month's total revenue is $125,000..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results So Far
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Before RAG:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "How's this month's revenue?"
AI: "I don't have access to that data."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;After Part 1:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "How's this month's revenue?"
System: Successfully identified ["total_revenue", "net_revenue"]
→ Can now query the correct metrics
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Success Rate&lt;/strong&gt;: 95% accuracy for flat metrics like "total_revenue", "order_count"&lt;/p&gt;

&lt;h2&gt;
  
  
  Current Limitation
&lt;/h2&gt;

&lt;p&gt;This system works well for simple metrics but fails on hierarchical ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "How's my Google traffic?"

Current System:
- Finds "traffic_by_channel" ✓
- Can't identify the "google" sub-metric ✗

We need: "traffic_by_channel|google|advertiser"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Problem&lt;/strong&gt;: We need to search not just for parent metrics, but also navigate their hierarchical structure to find specific child metrics like "google" under "traffic_by_channel".&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;In &lt;strong&gt;Part 2&lt;/strong&gt;, I'll show you how I solved this using:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;MetricForest Data Structure&lt;/strong&gt;: A tree-based system for hierarchical metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neighbor Node Exploration&lt;/strong&gt;: Finding related parent-child metric pairs
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two-Stage GPT Filtering&lt;/strong&gt;: Precise classification at each hierarchy level&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example of what Part 2 solves:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "How's my Google traffic?"
System: "traffic_by_channel|google|advertiser"

User: "Compare my Facebook traffic to industry average"
System: 
- "traffic_by_channel|facebook|advertiser"
- "traffic_by_channel|facebook|industry"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: 98% accuracy even for complex hierarchical queries.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help Implementing RAG?
&lt;/h2&gt;

&lt;p&gt;I help companies integrate AI systems with their existing Spring Boot infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specializing in:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring Boot + OpenAI integration&lt;/li&gt;
&lt;li&gt;Custom RAG pipelines&lt;/li&gt;
&lt;li&gt;E-commerce analytics systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;💭 Questions or feedback? Join the discussion in the comments!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; #rag #system-design #vectordatabase #backend&lt;/p&gt;

</description>
      <category>rag</category>
      <category>systemdesign</category>
      <category>vectordatabase</category>
      <category>backend</category>
    </item>
  </channel>
</rss>
