<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pallavi Mudkhede</title>
    <description>The latest articles on DEV Community by Pallavi Mudkhede (@pallavi_mudkhede).</description>
    <link>https://dev.to/pallavi_mudkhede</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4010539%2F907b355a-327f-4dca-a918-a64b43cfffdd.jpg</url>
      <title>DEV Community: Pallavi Mudkhede</title>
      <link>https://dev.to/pallavi_mudkhede</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pallavi_mudkhede"/>
    <language>en</language>
    <item>
      <title>How I Built Sherlog: an AI Log Analyzer with RAG, Spring AI, Groq &amp; pgvector</title>
      <dc:creator>Pallavi Mudkhede</dc:creator>
      <pubDate>Wed, 01 Jul 2026 08:22:11 +0000</pubDate>
      <link>https://dev.to/pallavi_mudkhede/how-i-built-sherlog-an-ai-log-analyzer-with-rag-spring-ai-groq-pgvector-5foc</link>
      <guid>https://dev.to/pallavi_mudkhede/how-i-built-sherlog-an-ai-log-analyzer-with-rag-spring-ai-groq-pgvector-5foc</guid>
      <description>&lt;p&gt;Every developer knows the feeling: production throws an error, and you're staring at a&lt;br&gt;
wall of stack-trace text trying to find the one line that matters. So I built &lt;strong&gt;Sherlog&lt;/strong&gt; —&lt;br&gt;
an AI "log detective" that reads an application log, figures out the root cause, and hands&lt;br&gt;
you a step-by-step fix as clean JSON. The twist: it doesn't just ask an LLM blindly. It uses&lt;br&gt;
&lt;strong&gt;RAG (Retrieval-Augmented Generation)&lt;/strong&gt; to ground every answer in a knowledge base of past&lt;br&gt;
incidents.&lt;/p&gt;

&lt;p&gt;In this post I'll walk through how it works and the real lessons I learned building it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;Repo:&lt;/strong&gt; github.com/pallavimudkhede21/Sherlog&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Java 21 + Spring Boot 4.1&lt;/strong&gt; — the backend&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spring AI 2.0&lt;/strong&gt; — the LLM framework (ChatClient, structured output, RAG advisors)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Groq&lt;/strong&gt; (&lt;code&gt;llama-3.1-8b-instant&lt;/code&gt;) — fast, OpenAI-compatible LLM inference&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local ONNX embeddings&lt;/strong&gt; (&lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;) — text → vectors, in-process, free&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL + pgvector&lt;/strong&gt; — the vector database (in Docker)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  The idea: from a chatbot to a RAG system
&lt;/h2&gt;

&lt;p&gt;A naive version just sends the log to an LLM and prints the answer. It works, but the advice&lt;br&gt;
is generic — the model only knows its training data and the single log you pasted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RAG changes that.&lt;/strong&gt; Before asking the LLM, we &lt;em&gt;retrieve&lt;/em&gt; relevant knowledge you own — past&lt;br&gt;
incidents and their proven fixes — and &lt;em&gt;augment&lt;/em&gt; the prompt with them. Now the model answers&lt;br&gt;
grounded in &lt;em&gt;your&lt;/em&gt; reality.&lt;/p&gt;

&lt;p&gt;Here's the whole pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;log → embed (local MiniLM) → search pgvector (top-3 incidents)
    → inject into prompt → Groq (JSON mode) → typed response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Part 1 — Structured output with Spring AI
&lt;/h2&gt;

&lt;p&gt;The first win is getting &lt;strong&gt;typed JSON back from the LLM&lt;/strong&gt; instead of parsing free text. Spring&lt;br&gt;
AI's &lt;code&gt;ChatClient&lt;/code&gt; does this with &lt;code&gt;.entity()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Analyze these logs:\n\n{logs}"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;param&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"logs"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getLogs&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAiChatOptions&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;responseFormat&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ResponseFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OpenAiChatModel&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ResponseFormat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;JSON_OBJECT&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;call&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;entity&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;LogAnalysisResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// ← schema + parsing, automatic&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;.entity(LogAnalysisResponse.class)&lt;/code&gt; derives a JSON schema from the POJO, adds format instructions to the prompt asking the model to follow it, and maps the reply onto the object. Groq's &lt;strong&gt;JSON mode&lt;/strong&gt; (&lt;code&gt;response_format: json_object&lt;/code&gt;) guarantees the reply is syntactically valid JSON, so the mapping no longer fails on stray prose. It does not guarantee that every field in the schema is present, so the result is still worth validating.&lt;/p&gt;
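&lt;p&gt;JSON mode constrains the syntax of the reply, not its shape, so a small check after mapping can catch a reply that parsed cleanly but left fields empty. Here is a minimal sketch, assuming a hypothetical response type: the class name &lt;code&gt;AnalysisResult&lt;/code&gt; and the fields &lt;code&gt;rootCause&lt;/code&gt; and &lt;code&gt;fixSteps&lt;/code&gt; are made up for illustration and may not match the real &lt;code&gt;LogAnalysisResponse&lt;/code&gt; in the repo.&lt;/p&gt;

```java
/** Hypothetical shape of a mapped analysis result (field names are assumptions). */
final class AnalysisResult {
    final String rootCause;
    final String[] fixSteps;

    AnalysisResult(String rootCause, String[] fixSteps) {
        this.rootCause = rootCause;
        this.fixSteps = fixSteps;
    }

    /** True only when the model filled in a root cause and at least one fix step. */
    boolean isComplete() {
        if (rootCause == null || rootCause.trim().isEmpty()) {
            return false;
        }
        if (fixSteps == null || fixSteps.length == 0) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // "{}" is valid JSON, yet it would map to an object with nothing filled in.
        AnalysisResult empty = new AnalysisResult(null, null);
        AnalysisResult filled = new AnalysisResult(
                "Connection pool exhausted",
                new String[] {"Raise the pool size", "Close leaked connections"});
        System.out.println(empty.isComplete());
        System.out.println(filled.isComplete());
    }
}
```

&lt;p&gt;A caller could run a check like this right after &lt;code&gt;.entity(...)&lt;/code&gt; returns and retry or fall back when it fails.&lt;/p&gt;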
&lt;h2&gt;
  
  
  Part 2 — Embeddings + pgvector
&lt;/h2&gt;

&lt;p&gt;To find "similar past incidents," we don't keyword-match — we compare &lt;strong&gt;meaning&lt;/strong&gt;. An embedding&lt;br&gt;
model turns text into a vector (a list of numbers); similar meaning → nearby vectors.&lt;/p&gt;
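&lt;p&gt;To make "nearby vectors" concrete, cosine similarity scores two vectors by the angle between them: close to 1 for similar direction, close to 0 for unrelated. The sketch below is plain Java with made-up 3-dimensional toy vectors standing in for real embeddings; the class name &lt;code&gt;CosineSimilarity&lt;/code&gt; and the sample values are assumptions for illustration, not code from Sherlog.&lt;/p&gt;

```java
/** Cosine similarity between two embedding vectors (illustrative sketch). */
final class CosineSimilarity {

    /** Returns dot(a, b) / (|a| * |b|); both arrays must have the same length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i != a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen here: a zero vector is similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors invented for this example; a real MiniLM embedding has 384 dimensions.
        double[] timeout = {0.9, 0.1, 0.0};
        double[] poolExhausted = {0.8, 0.2, 0.1};
        double[] nullPointer = {0.0, 0.1, 0.9};
        System.out.println(cosine(timeout, poolExhausted)); // high: similar direction
        System.out.println(cosine(timeout, nullPointer));   // low: nearly orthogonal
    }
}
```

&lt;p&gt;In the real system this comparison happens inside Postgres, where the vector index avoids scoring every stored row.&lt;/p&gt;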

&lt;p&gt;I run the embedding model &lt;strong&gt;locally&lt;/strong&gt; with Spring AI's Transformers starter (ONNX&lt;br&gt;
&lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;, 384 dimensions) — no embedding API, no cost. The vectors live in&lt;br&gt;
&lt;strong&gt;pgvector&lt;/strong&gt;, a Postgres extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; pgvector-db &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;postgres &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;lograg &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 5432:5432 pgvector/pgvector:pg17
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



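&lt;p&gt;Spring AI can create the &lt;code&gt;vector_store&lt;/code&gt; table (with an HNSW cosine index) on startup. In the 1.x line this required setting &lt;code&gt;spring.ai.vectorstore.pgvector.initialize-schema=true&lt;/code&gt;, so check that property if the table does not appear. A loader then seeds the table from a small &lt;code&gt;incidents.json&lt;/code&gt;:&lt;/p&gt;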

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// embeds each text and stores the vector&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Part 3 — Wiring RAG in one line
&lt;/h2&gt;

&lt;p&gt;This is the magic. Spring AI has a purpose-built &lt;code&gt;QuestionAnswerAdvisor&lt;/code&gt; that does the&lt;br&gt;
retrieve-and-augment step automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;advisors&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;QuestionAnswerAdvisor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectorStore&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;searchRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SearchRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;topK&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add that to the &lt;code&gt;ChatClient&lt;/code&gt; call and every request retrieves the 3 most similar past incidents, using the user message as the search query, and injects them into the prompt before Groq answers. That's the whole "R" and "A" of RAG in a single advisor call.&lt;/p&gt;
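&lt;p&gt;Conceptually, the retrieve-and-augment step boils down to "keep the best-scoring matches and put them in front of the question". The sketch below shows that idea in plain Java. It is not Spring AI's implementation: the &lt;code&gt;RagSketch&lt;/code&gt; and &lt;code&gt;Hit&lt;/code&gt; names, the prompt wording, and the sample incidents and scores are all assumptions made up for illustration.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.Comparator;

/** Conceptual retrieve-and-augment step; NOT Spring AI's actual implementation. */
final class RagSketch {

    /** A stored incident plus its similarity to the incoming log (hypothetical type). */
    static final class Hit {
        final String text;
        final double score;

        Hit(String text, double score) {
            this.text = text;
            this.score = score;
        }

        double score() {
            return score;
        }
    }

    /** Keeps the topK highest-scoring hits and places them ahead of the user message. */
    static String augment(String logs, Hit[] hits, int topK) {
        Hit[] sorted = hits.clone();
        // Ascending sort by score, so the best matches end up at the tail.
        Arrays.sort(sorted, Comparator.comparingDouble(Hit::score));
        int k = Math.max(0, Math.min(topK, sorted.length));

        StringBuilder prompt = new StringBuilder("Relevant past incidents:\n");
        for (int i = 0; i != k; i++) {
            Hit hit = sorted[sorted.length - 1 - i]; // walk backwards from the best match
            prompt.append("- ").append(hit.text).append('\n');
        }
        prompt.append("\nAnalyze these logs:\n\n").append(logs);
        return prompt.toString();
    }

    public static void main(String[] args) {
        // Sample incidents and scores are invented for this example.
        Hit[] hits = {
            new Hit("Pool exhausted under load; raised maximum-pool-size", 0.91),
            new Hit("Disk full on log volume; rotated logs", 0.12),
            new Hit("Leaked connections in batch job; closed them in finally", 0.84),
            new Hit("Long-running transaction held connections; split the job", 0.77)
        };
        System.out.println(augment("Connection is not available, request timed out", hits, 3));
    }
}
```

&lt;p&gt;The advisor adds the vector search itself and a prompt template on top of this, which is why wiring it in takes so little code.&lt;/p&gt;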

&lt;h2&gt;
  
  
  The payoff: does RAG actually help?
&lt;/h2&gt;

&lt;p&gt;I added a toggle (&lt;code&gt;?rag=true|false&lt;/code&gt;) to compare the two modes. Here is the same connection-timeout log, both ways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RAG OFF:&lt;/strong&gt; &lt;em&gt;"Increase the maximum pool size or adjust the connection timeout."&lt;/em&gt; (vague)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RAG ON:&lt;/strong&gt; &lt;em&gt;"Increase &lt;code&gt;spring.datasource.hikari.maximum-pool-size&lt;/code&gt;, ensure connections are&lt;br&gt;
closed, and investigate long-running transactions."&lt;/em&gt; (specific — pulled from the knowledge base)&lt;/p&gt;

&lt;p&gt;Same model, same log. The only difference is retrieval, and in this example the grounded answer names the exact property to change, which makes it more actionable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hard-won lessons
&lt;/h2&gt;

&lt;p&gt;The walkthrough above makes it look smooth. It wasn't. The real lessons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check library ↔ framework versions before you adopt.&lt;/strong&gt; Spring Boot 4 uses &lt;strong&gt;Jackson 3&lt;/strong&gt;
(&lt;code&gt;tools.jackson&lt;/code&gt;), not Jackson 2, and needs &lt;strong&gt;Spring AI 2.0&lt;/strong&gt;. Mixing versions cost me hours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read the error literally.&lt;/strong&gt; A &lt;code&gt;404 Unknown request URL&lt;/code&gt; was just a missing &lt;code&gt;/v1&lt;/code&gt; in the
base URL — Spring AI 2.0 changed the convention from the 1.x docs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured output isn't magic.&lt;/strong&gt; A small model returns prose unless you &lt;em&gt;force&lt;/em&gt; JSON mode.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trust the jar, not the blog.&lt;/strong&gt; When an import failed, running &lt;code&gt;javap&lt;/code&gt; against the actual jar showed the class had moved to a nested type between milestone and GA. Inspecting the shipped classes beats guessing.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The full source is on GitHub with a README that walks through setup:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;github.com/pallavimudkhede21/Sherlog&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you're learning &lt;strong&gt;Spring AI&lt;/strong&gt;, &lt;strong&gt;RAG&lt;/strong&gt;, or &lt;strong&gt;pgvector&lt;/strong&gt;, clone it and poke around — the&lt;br&gt;
&lt;code&gt;?rag=true|false&lt;/code&gt; toggle is a fun way to &lt;em&gt;see&lt;/em&gt; what retrieval actually buys you.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built with Spring Boot 4, Spring AI 2.0, Groq, and pgvector. Questions welcome in the comments!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>springboot</category>
      <category>tutorial</category>
      <category>java</category>
    </item>
    <item>
      <title>How I Built Sherlog: an AI Log Analyzer with RAG, Spring AI, Groq &amp; pgvector</title>
      <dc:creator>Pallavi Mudkhede</dc:creator>
      <pubDate>Wed, 01 Jul 2026 08:17:07 +0000</pubDate>
      <link>https://dev.to/pallavi_mudkhede/title-how-i-built-sherlog-an-ai-log-analyzer-with-rag-spring-ai-groq-pgvector-published-4b5n</link>
      <guid>https://dev.to/pallavi_mudkhede/title-how-i-built-sherlog-an-ai-log-analyzer-with-rag-spring-ai-groq-pgvector-published-4b5n</guid>
      <description></description>
      <category>ai</category>
      <category>rag</category>
      <category>showdev</category>
      <category>springboot</category>
    </item>
  </channel>
</rss>
