<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Davide Mibelli</title>
    <description>The latest articles on DEV Community by Davide Mibelli (@kharonte).</description>
    <link>https://dev.to/kharonte</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2481326%2F673f4cd1-cfb7-4029-b743-ae0fa7f5dc76.png</url>
      <title>DEV Community: Davide Mibelli</title>
      <link>https://dev.to/kharonte</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kharonte"/>
    <language>en</language>
    <item>
      <title>RAG in Production: What the Tutorials Don't Tell You</title>
      <dc:creator>Davide Mibelli</dc:creator>
      <pubDate>Thu, 14 May 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/kharonte/rag-in-production-what-the-tutorials-dont-tell-you-30cj</link>
      <guid>https://dev.to/kharonte/rag-in-production-what-the-tutorials-dont-tell-you-30cj</guid>
      <description>&lt;p&gt;I built a RAG system that scored 91% on our internal eval suite. It retrieved the right chunks four out of five times in every benchmark we ran. We shipped it. Users thought it was broken.&lt;/p&gt;

&lt;p&gt;The gap between "works in evaluation" and "works in production" is the thing every RAG tutorial skips. This article is what I learned closing that gap across three different production deployments — a customer support bot, an internal knowledge base, and a document Q&amp;amp;A tool for a legal team.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why your evals lie to you
&lt;/h2&gt;

&lt;p&gt;The typical RAG eval flow: take 50 question-answer pairs, run retrieval, score chunk relevance, measure answer quality. The benchmark looks good. Production does not.&lt;/p&gt;

&lt;p&gt;The problem is evaluation datasets are clean. Real user questions are not. Users ask ambiguous things, reference context from earlier in the conversation, use company-specific jargon that is not in your embeddings vocabulary, and ask questions that span multiple documents. Your 50-pair eval dataset does not cover any of this.&lt;/p&gt;

&lt;p&gt;The more subtle problem: retrieval correctness is not the same as answer usefulness. A chunk can be semantically relevant to the query but contain outdated information, contradict another retrieved chunk, or be missing the specific number the user actually needs. Cosine similarity does not catch any of this.&lt;/p&gt;

&lt;p&gt;Before you optimize retrieval metrics, instrument what users actually do. In my customer support deployment, the clearest signal was not retrieval recall — it was how often users rephrased their question immediately after getting an answer. That rephrasing rate was the real quality metric.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chunking is where it actually breaks
&lt;/h2&gt;

&lt;p&gt;The default chunking strategy in most tutorials: split every 512 tokens with 50-token overlap. This is almost always wrong.&lt;/p&gt;

&lt;p&gt;The problem is that 512 tokens is an arbitrary number based on older embedding model limits, not on the structure of your documents. A 512-token chunk cut out of the middle of a legal clause or a technical procedure is often meaningless without the surrounding context.&lt;/p&gt;

&lt;p&gt;What actually works depends on your document type:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For structured documents&lt;/strong&gt; (FAQs, product docs, knowledge base articles): chunk by logical unit — one question-answer pair, one procedure step, one concept section. Use your document's own structure as the chunking boundary. If your docs use consistent heading patterns, split on those.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For long-form prose&lt;/strong&gt; (contracts, reports, research papers): hierarchical chunking. Keep a parent chunk of 1000–2000 tokens for context retrieval, and child chunks of 150–300 tokens for precise matching. At retrieval time, return the child chunk for relevance scoring but pass the parent chunk to the LLM as context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For code documentation or READMEs&lt;/strong&gt;: file-level or function-level chunks, never mid-function splits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcb90ddpsg1zb1mhlqs9d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcb90ddpsg1zb1mhlqs9d.png" alt="Three chunking strategies side by side — Fixed 512-token (showing a chunk cutting mid-sentence), Logical/heading-based (showing clean boundaries at section breaks), Hierarchical parent-child (showing small retrieval chunks nested inside larger context chunks). Label the failure mode of the fixed approach." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The overlap parameter also matters more than people expect. Overlap exists to avoid losing information at chunk boundaries, but it inflates your vector store size and retrieves duplicate context. I found that semantic chunking — splitting at sentence boundaries that mark topic shifts — eliminated the need for overlap almost entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Retrieval is rarely the problem you think it is
&lt;/h2&gt;

&lt;p&gt;When RAG produces wrong answers, the instinct is to improve retrieval. Better embeddings, more chunks, higher top-k. This is usually the wrong lever.&lt;/p&gt;

&lt;p&gt;In my experience, retrieval is failing only about 30% of the time when users report bad answers. The other 70% is one of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The right chunk was retrieved but the LLM ignored it&lt;/strong&gt; — this is a prompt engineering problem, not a retrieval problem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The answer requires synthesizing across multiple chunks&lt;/strong&gt; — retrieval returned individually correct chunks but the LLM could not connect them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The question is genuinely unanswerable from the knowledge base&lt;/strong&gt; — the document does not exist or is outdated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To separate these, add logging at both retrieval and generation time. Log the top-k chunks and the final answer separately. A human spot-check of 20 failure cases per week will show you very quickly which category you are actually in.&lt;/p&gt;

&lt;p&gt;When retrieval genuinely is the problem, the highest-leverage fix is hybrid search: dense retrieval (embeddings + cosine similarity) combined with sparse retrieval (BM25 keyword matching). Dense retrieval handles semantic similarity. BM25 handles exact matches — product names, error codes, version numbers, any term where "sounds like" is the wrong answer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.retrievers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;EnsembleRetriever&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.retrievers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BM25Retriever&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_chroma&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;

&lt;span class="n"&gt;dense_retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;(...).&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;sparse_retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BM25Retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ensemble&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EnsembleRetriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;retrievers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dense_retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sparse_retriever&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# tune based on your query distribution
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The weight split between dense and sparse depends on your documents. Technical documentation with lots of product names and version numbers benefits from higher BM25 weight. Conversational knowledge bases lean toward dense.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reranking changes the answer quality more than anything else
&lt;/h2&gt;

&lt;p&gt;If there is one thing to add to a RAG pipeline that makes the biggest difference in production answer quality, it is a cross-encoder reranker between retrieval and generation.&lt;/p&gt;

&lt;p&gt;The retrieval step uses bi-encoder embeddings — query and document are embedded independently, similarity is a dot product. This is fast but imprecise. A cross-encoder takes the query and a candidate chunk together as a single input and scores their relevance jointly. Much more accurate, but too slow to run over your entire corpus.&lt;/p&gt;

&lt;p&gt;The standard pattern: retrieve top-20 with the fast bi-encoder, rerank with the cross-encoder, pass top-3 or top-5 to the LLM.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CrossEncoder&lt;/span&gt;

&lt;span class="n"&gt;reranker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CrossEncoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cross-encoder/ms-marco-MiniLM-L-6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;top_n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;pairs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reranker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;top_n&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the legal document system, adding reranking reduced hallucinations in answers by roughly 40% — not because retrieval improved, but because the LLM stopped having to sort through marginally relevant context and could focus on the actually relevant chunks.&lt;/p&gt;

&lt;h2&gt;
  
  
  The infrastructure issues no tutorial shows
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Context window budget.&lt;/strong&gt; Top-5 chunks at 500 tokens each is 2500 tokens before you add the system prompt and the user message. With GPT-4 this is fine. With smaller models or high-volume APIs where you want to minimize token cost, you need explicit context budgeting. Know your LLM's context window, subtract your fixed prompt overhead, and set your chunk count and size to fit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stale chunks.&lt;/strong&gt; Documents in production change. Your chunk embeddings do not update automatically. You need a pipeline that detects document changes (by checksum or last-modified timestamp) and re-embeds only the changed documents. I've seen teams manually re-index their entire corpus monthly as a workaround. That is not a solution for any corpus over a few thousand documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conflicting information.&lt;/strong&gt; If the same concept is documented in multiple places with different details, retrieval will return both. The LLM will either pick one arbitrarily or produce a contradictory answer. The fix is upstream — deduplicate your knowledge base and establish a single source of truth. Retrieval cannot save you from bad source data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpfsdu6f9hqzsl2re9w3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpfsdu6f9hqzsl2re9w3.png" alt="RAG pipeline with annotated failure points — Ingestion stage: stale chunk failure point. Retrieval stage: bi-encoder imprecision failure point. Reranking stage: this is where you recover. Generation stage: context budget exceeded failure point, conflicting chunks failure point." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Context window management
&lt;/h2&gt;

&lt;p&gt;One problem that grows slowly and then all at once: as your knowledge base expands and you increase top-k to improve recall, your context window fills up. You pass 10 chunks at 500 tokens each, plus system prompt, plus conversation history, and suddenly you are at 7000 tokens per request on a model with an 8000-token limit.&lt;/p&gt;

&lt;p&gt;The naive fix is to increase top-k and hope the model attends to the right parts. The correct fix is explicit context budgeting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MAX_CONTEXT_TOKENS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;  &lt;span class="c1"&gt;# reserve headroom for prompt + answer
&lt;/span&gt;&lt;span class="n"&gt;TOKENS_PER_CHUNK&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;     &lt;span class="c1"&gt;# approximate after chunking
&lt;/span&gt;
&lt;span class="n"&gt;max_chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MAX_CONTEXT_TOKENS&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;TOKENS_PER_CHUNK&lt;/span&gt;  &lt;span class="c1"&gt;# = 10
&lt;/span&gt;
&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;top_n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Calculate the budget before retrieval, not after. If you are hitting the limit regularly, reduce chunk size rather than reducing top-k — smaller chunks at the same top-k gives the model more coverage with the same token budget.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to actually monitor
&lt;/h2&gt;

&lt;p&gt;Stop measuring retrieval precision in isolation. In production, instrument:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rephrasing rate&lt;/strong&gt;: how often a user asks a follow-up that is essentially the same question reworded. High rate means the answer was not useful, regardless of retrieval metrics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Answer rejection rate&lt;/strong&gt;: if your UI has thumbs-down feedback, track it. Correlate failures with retrieved chunk IDs to identify which documents produce bad answers consistently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency by pipeline stage&lt;/strong&gt;: retrieval, reranking, and generation each have different latency profiles and different optimization paths. Aggregate P95 latency tells you nothing useful about where to look.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unanswerable rate&lt;/strong&gt;: how often the LLM says "I don't have information on this." Below 5% and your system is probably hallucinating answers it should refuse. Above 30% and your knowledge base has coverage gaps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chunk age&lt;/strong&gt;: track when each chunk was last re-indexed. Any chunk older than your document update frequency is potentially stale. This one takes ten minutes to add to your ingestion pipeline and saves hours of debugging mysterious wrong answers three months from now.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The RAG tutorial teaches you to build a pipeline. Production teaches you to instrument one. The teams I've seen succeed were instrumenting before they had users, not after.&lt;/p&gt;

&lt;p&gt;What is the failure mode you hit first in your own RAG deployment?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://medium.com/@davide.mib/rag-in-production-what-the-tutorials-dont-tell-you-56470d9785ec" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Spring Boot JWT Authentication: The Complete Setup Most Tutorials Get Wrong</title>
      <dc:creator>Davide Mibelli</dc:creator>
      <pubDate>Tue, 12 May 2026 05:18:42 +0000</pubDate>
      <link>https://dev.to/kharonte/spring-boot-jwt-authentication-the-complete-setup-most-tutorials-get-wrong-2f8d</link>
      <guid>https://dev.to/kharonte/spring-boot-jwt-authentication-the-complete-setup-most-tutorials-get-wrong-2f8d</guid>
      <description>&lt;p&gt;I've read probably forty Spring Boot JWT tutorials over the years. They all show you the same thing: how to generate a token on login, how to validate it on each request, and how to wire up &lt;code&gt;SecurityFilterChain&lt;/code&gt;. And they all stop right there.&lt;/p&gt;

&lt;p&gt;What they skip is everything that matters in production: refresh token rotation, token revocation, and not sending tokens in JavaScript-readable headers when you don't have to. I've inherited two codebases where a "working JWT implementation" turned out to be a security hole you could drive a truck through. This article is the setup I now use by default.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the typical tutorial gives you
&lt;/h2&gt;

&lt;p&gt;The standard walkthrough produces something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;SecurityFilterChain&lt;/span&gt; &lt;span class="nf"&gt;filterChain&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpSecurity&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;csrf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;AbstractHttpConfigurer:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;disable&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sessionManagement&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sessionCreationPolicy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SessionCreationPolicy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STATELESS&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;authorizeHttpRequests&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requestMatchers&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/**"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;permitAll&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;anyRequest&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;authenticated&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addFilterBefore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jwtAuthFilter&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;UsernamePasswordAuthenticationFilter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is fine. The filter reads the &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt; header, validates the signature, sets the &lt;code&gt;SecurityContext&lt;/code&gt;. You can deploy this. The problem is what comes next.&lt;/p&gt;

&lt;p&gt;The access token has an expiry — typically 15 to 60 minutes. What happens when it expires? In most tutorials: the user logs in again. In production: users complain, sessions die mid-work, and someone "fixes" it by setting expiry to 7 days. Now you have a long-lived token you can't revoke.&lt;/p&gt;

&lt;h2&gt;
  
  
  The right structure: short-lived access + rotating refresh
&lt;/h2&gt;

&lt;p&gt;The pattern I use in every Spring Boot project now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Access token&lt;/strong&gt;: 15 minutes, stored in memory (JS variable or React state), never in &lt;code&gt;localStorage&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refresh token&lt;/strong&gt;: 7 days, stored in an &lt;code&gt;HttpOnly; Secure; SameSite=Strict&lt;/code&gt; cookie, not accessible to JavaScript&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rotation&lt;/strong&gt;: every refresh request issues a new refresh token and invalidates the old one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revocation store&lt;/strong&gt;: a small table (or Redis set) of invalidated refresh token IDs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft697svepu72exsk8yp3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft697svepu72exsk8yp3e.png" alt="Sequence diagram — login issues access token (15min) + HttpOnly refresh cookie (7d); expired access token triggers POST /auth/refresh; server rotates refresh token and issues new access token" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's the refresh token entity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="nd"&gt;@Table&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"refresh_tokens"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// UUID, stored in the cookie value&lt;/span&gt;

    &lt;span class="nd"&gt;@ManyToOne&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fetch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FetchType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;LAZY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="n"&gt;expiresAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;revoked&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@CreationTimestamp&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the service that handles rotation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="nd"&gt;@Transactional&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RefreshTokenService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;RefreshTokenRepository&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt; &lt;span class="n"&gt;refreshExpiry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RefreshToken&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setExpiresAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;plus&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;refreshExpiry&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setRevoked&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="nf"&gt;rotate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;oldTokenId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="n"&gt;old&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oldTokenId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;orElseThrow&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InvalidTokenException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Refresh token not found"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isRevoked&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getExpiresAt&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;isBefore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Possible token reuse attack — revoke entire family&lt;/span&gt;
            &lt;span class="n"&gt;revokeAllForUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUser&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;InvalidTokenException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Refresh token expired or revoked"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setRevoked&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;old&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUser&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;revokeAllForUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;revokeAllByUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reuse detection is the part most tutorials omit entirely. If an attacker steals a refresh token and uses it, you now have two parties trying to use it. When the legitimate client tries to rotate and finds its token already consumed, you revoke everything and force a new login. This is the refresh token family pattern from the OAuth 2.0 Security Best Current Practice spec.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cookie setup
&lt;/h2&gt;

&lt;p&gt;The refresh token goes out in the response as a cookie, not in the JSON body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/login"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AccessTokenResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="nd"&gt;@RequestBody&lt;/span&gt; &lt;span class="nc"&gt;LoginRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;HttpServletResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;authenticate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;accessToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jwtService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateAccessToken&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="n"&gt;refreshToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;refreshTokenService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="nc"&gt;ResponseCookie&lt;/span&gt; &lt;span class="n"&gt;cookie&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResponseCookie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"refresh_token"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;refreshToken&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;httpOnly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;secure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sameSite&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Strict"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/refresh"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addHeader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SET_COOKIE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cookie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AccessTokenResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;accessToken&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note &lt;code&gt;.path("/auth/refresh")&lt;/code&gt; — the cookie is scoped to the refresh endpoint only. The browser won't send it on any other request. This reduces the attack surface on CSRF (though &lt;code&gt;SameSite=Strict&lt;/code&gt; already handles most of that).&lt;/p&gt;

&lt;p&gt;The refresh endpoint reads the cookie:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/refresh"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AccessTokenResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;refresh&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="nd"&gt;@CookieValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"refresh_token"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;refreshTokenId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;HttpServletResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;refreshTokenId&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpStatus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;UNAUTHORIZED&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nc"&gt;RefreshToken&lt;/span&gt; &lt;span class="n"&gt;newRefreshToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;refreshTokenService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;rotate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;refreshTokenId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;newAccessToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jwtService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateAccessToken&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newRefreshToken&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getUser&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="nc"&gt;ResponseCookie&lt;/span&gt; &lt;span class="n"&gt;cookie&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResponseCookie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"refresh_token"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newRefreshToken&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;httpOnly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;secure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sameSite&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Strict"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/refresh"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofDays&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addHeader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SET_COOKIE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cookie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AccessTokenResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newAccessToken&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz1wnjzlodlz8eti3bjgm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz1wnjzlodlz8eti3bjgm.png" alt="Browser DevTools Application &amp;gt; Cookies panel showing the refresh_token cookie with HttpOnly checked, Secure checked, SameSite=Strict, Path=/auth/refresh, and a 7-day expiry date" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The JWT service itself
&lt;/h2&gt;

&lt;p&gt;Nothing exotic here, but I'll include it for completeness. I use &lt;code&gt;io.jsonwebtoken:jjwt-api&lt;/code&gt; (JJWT 0.12.x) with Spring Boot 3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;JwtService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Value&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${app.jwt.secret}"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt; &lt;span class="no"&gt;ACCESS_EXPIRY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofMinutes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;SecretKey&lt;/span&gt; &lt;span class="nf"&gt;key&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Keys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hmacShaKeyFor&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Decoders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BASE64&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decode&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;generateAccessToken&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Jwts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getId&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;claim&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"email"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getEmail&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;claim&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"roles"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRoles&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;issuedAt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expiration&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;plus&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ACCESS_EXPIRY&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;signWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;compact&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Claims&lt;/span&gt; &lt;span class="nf"&gt;validateAndParse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Jwts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;verifyWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parseSignedClaims&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getPayload&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The secret must be at least 256 bits (32 bytes) for HS256. Generate it once and store it in your secrets manager, not in &lt;code&gt;application.yml&lt;/code&gt; committed to git:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl rand &lt;span class="nt"&gt;-base64&lt;/span&gt; 32
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The filter
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="nd"&gt;@RequiredArgsConstructor&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;JwtAuthFilter&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;OncePerRequestFilter&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;JwtService&lt;/span&gt; &lt;span class="n"&gt;jwtService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UserDetailsService&lt;/span&gt; &lt;span class="n"&gt;userDetailsService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@Override&lt;/span&gt;
    &lt;span class="kd"&gt;protected&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;doFilterInternal&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;HttpServletRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;HttpServletResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;FilterChain&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;ServletException&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;IOException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getHeader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;AUTHORIZATION&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;header&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Bearer "&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;doFilter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Claims&lt;/span&gt; &lt;span class="n"&gt;claims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jwtService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;validateAndParse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSubject&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nc"&gt;SecurityContextHolder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContext&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getAuthentication&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;UserDetails&lt;/span&gt; &lt;span class="n"&gt;userDetails&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userDetailsService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;loadUserByUsername&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="nc"&gt;UsernamePasswordAuthenticationToken&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UsernamePasswordAuthenticationToken&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;userDetails&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userDetails&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getAuthorities&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
                &lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setDetails&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WebAuthenticationDetailsSource&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;buildDetails&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
                &lt;span class="nc"&gt;SecurityContextHolder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContext&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;setAuthentication&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JwtException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Token invalid or expired — let the request proceed unauthenticated&lt;/span&gt;
            &lt;span class="c1"&gt;// The security config will reject it at the endpoint level&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;doFilter&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I catch &lt;code&gt;JwtException&lt;/code&gt; broadly here rather than differentiating expired vs. tampered. From the filter's perspective the right behavior is the same: don't authenticate. The client should catch the 401 and attempt a refresh.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logout
&lt;/h2&gt;

&lt;p&gt;Logout needs to revoke the refresh token and clear the cookie:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/logout"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;logout&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="nd"&gt;@CookieValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"refresh_token"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;refreshTokenId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;HttpServletResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;refreshTokenId&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;refreshTokenService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;revoke&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;refreshTokenId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="nc"&gt;ResponseCookie&lt;/span&gt; &lt;span class="n"&gt;cleared&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResponseCookie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"refresh_token"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;httpOnly&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;secure&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sameSite&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Strict"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/auth/refresh"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;addHeader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpHeaders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SET_COOKIE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cleared&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;noContent&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting &lt;code&gt;maxAge(0)&lt;/code&gt; tells the browser to delete the cookie immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I skip in this setup
&lt;/h2&gt;

&lt;p&gt;A few things I deliberately leave out of a standard deployment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JWT blocklist for access tokens.&lt;/strong&gt; Once you have 15-minute expiry and a working refresh flow, blocking access tokens before expiry is usually not worth the latency of a blocklist check on every request. If you need immediate revocation (e.g. "terminate all sessions now"), revoke all refresh tokens — the access tokens will expire naturally within 15 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RS256 asymmetric signing.&lt;/strong&gt; Useful if multiple services need to verify tokens without sharing the secret. In a monolith or a small microservices setup where the auth service also issues tokens to itself, HMAC-SHA256 is simpler and faster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remember me / device management.&lt;/strong&gt; Out of scope here, but the refresh token table already gives you the foundation: add a &lt;code&gt;deviceName&lt;/code&gt; column, show the user their active sessions, let them revoke specific ones.&lt;/p&gt;




&lt;p&gt;The only thing preventing most teams from shipping this instead of the tutorial version is the extra thirty minutes it takes to add the refresh token table and rotation logic. It's worth it.&lt;/p&gt;

&lt;p&gt;What does your current JWT setup look like — are you doing refresh rotation, or relying on long-lived tokens?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://medium.com/@davide.mib/spring-boot-jwt-authentication-the-complete-setup-most-tutorials-get-wrong-6379bb364040" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>java</category>
      <category>springboot</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>I Tested Claude Code, OpenCode, and Codex for 30 Days — Here's My Verdict</title>
      <dc:creator>Davide Mibelli</dc:creator>
      <pubDate>Wed, 06 May 2026 10:02:19 +0000</pubDate>
      <link>https://dev.to/kharonte/i-tested-claude-code-opencode-and-codex-for-30-days-heres-my-verdict-4nf4</link>
      <guid>https://dev.to/kharonte/i-tested-claude-code-opencode-and-codex-for-30-days-heres-my-verdict-4nf4</guid>
      <description>&lt;p&gt;In March 2026, I made a rule: every piece of non-trivial code I write has to go through at least one AI coding agent before I commit it. Not for autocomplete — for the whole thing. Architecture decisions, test generation, refactoring, debugging. The full workflow.&lt;/p&gt;

&lt;p&gt;I used three tools in rotation: Claude Code, OpenCode, and OpenAI Codex CLI. All on the same projects — a Spring Boot microservice, a Flutter mobile app, and a personal side project in Python. Thirty days, 100+ hours of active coding across all three.&lt;/p&gt;

&lt;p&gt;A note on scope: I deliberately chose terminal-first agents, not IDE-integrated tools like Cursor or GitHub Copilot. My workflow already lives in the terminal, and I wanted agents that reason about entire codebases rather than just the file I have open. If you're evaluating Cursor, this comparison won't map directly.&lt;/p&gt;

&lt;p&gt;Here's what actually happened.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is a preview. Read the full article on Medium: &lt;a href="https://medium.com/@davide.mib/i-tested-claude-code-opencode-and-codex-for-30-days-heres-my-verdict-89a3fc72076f" rel="noopener noreferrer"&gt;I Tested Claude Code, OpenCode, and Codex for 30 Days — Here's My Verdict&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>programming</category>
      <category>claude</category>
    </item>
    <item>
      <title>Spring Boot 4.0 Migration: What Nobody Tells You About the Breaking Changes</title>
      <dc:creator>Davide Mibelli</dc:creator>
      <pubDate>Wed, 29 Apr 2026 13:14:24 +0000</pubDate>
      <link>https://dev.to/kharonte/spring-boot-40-migration-what-nobody-tells-you-about-the-breaking-changes-1ln2</link>
      <guid>https://dev.to/kharonte/spring-boot-40-migration-what-nobody-tells-you-about-the-breaking-changes-1ln2</guid>
      <description>&lt;p&gt;I upgraded two production applications to Spring Boot 4.0 the week it went GA. I read the migration guide, skimmed the release notes, and thought the hardest part would be the Jackson 3 change everyone was talking about.&lt;/p&gt;

&lt;p&gt;The Jackson 3 change was not the hardest part.&lt;/p&gt;

&lt;p&gt;Two applications, a few days of debugging, and one very confusing test suite later, I have a clear picture of what actually breaks — and what the official guide glosses over.&lt;/p&gt;

&lt;p&gt;This is not a walkthrough of the migration guide. This is the stuff that costs you real time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is a preview. Read the full article on Medium: &lt;a href="https://medium.com/@davide.mib/spring-boot-4-0-migration-what-nobody-tells-you-about-the-breaking-changes-94482a2f886f" rel="noopener noreferrer"&gt;Spring Boot 4.0 Migration: What Nobody Tells You About the Breaking Changes&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>java</category>
      <category>springboot</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Stop Fighting CSS: The Google Stitch + Antigravity Stack for Solo Developers</title>
      <dc:creator>Davide Mibelli</dc:creator>
      <pubDate>Fri, 24 Apr 2026 09:07:34 +0000</pubDate>
      <link>https://dev.to/kharonte/stop-fighting-css-the-google-stitch-antigravity-stack-for-solo-developers-2l2k</link>
      <guid>https://dev.to/kharonte/stop-fighting-css-the-google-stitch-antigravity-stack-for-solo-developers-2l2k</guid>
      <description>&lt;p&gt;Most developers I know have the same problem: the logic is solid, but the UI looks like it was built in a hurry — because it was.&lt;br&gt;
I spent two days on a login flow for a side project before I gave up and tried a different approach: Google Stitch for the UI, Antigravity as the IDE, and the MCP bridge between them. Here's what the workflow actually looks like.&lt;/p&gt;
&lt;h2&gt;
  
  
  The core idea
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpsp2ahy88das4x58tf9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpsp2ahy88das4x58tf9.png" alt=" " width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google Stitch gives you pre-configured UI patterns — "Stitches" — that are already accessible and mathematically sound. You pick a functional category (Hero, Card, List), set one primary color and one font, and it handles the rest. The MCP connection to Antigravity means you never manually export manifests: change a button style in the Stitch web UI, and it reflects in your running simulator instantly.&lt;/p&gt;
&lt;h2&gt;
  
  
  The implementation
&lt;/h2&gt;

&lt;p&gt;Once you've connected Stitch to Antigravity via MCP: Configure Server in the command palette, your components become requestable by name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useStitch&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@antigravity/react-hooks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;DashboardCard&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Component&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useStitch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;UserCard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Placeholder&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Component&lt;/span&gt;
      &lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;onAction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{(&lt;/span&gt;&lt;span class="nx"&gt;ev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User tapped:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ev&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
    &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your React component has no idea what UserCard looks like. It just requests it. If you decide a List works better than a Card, you change it in Stitch — the code stays the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I didn't expect
&lt;/h2&gt;

&lt;p&gt;The MCP connection is bidirectional. You can send performance feedback from Antigravity back to Stitch, so the design tool knows which components are causing frame drops on real devices. The documentation barely mentions this — it took me a while to find it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The full guide
&lt;/h2&gt;

&lt;p&gt;I wrote a more detailed walkthrough on Medium covering the full setup, the Gravity Fields fetch pattern, and the telemetry configuration: &lt;a href="https://medium.com/p/02aa7c97e131" rel="noopener noreferrer"&gt;https://medium.com/p/02aa7c97e131&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What's your approach for UI as a solo dev — do you start from the design or the logic?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>From DALL-E to gpt-image-2: The Architectural Bet That Finally Fixed AI Text</title>
      <dc:creator>Davide Mibelli</dc:creator>
      <pubDate>Thu, 23 Apr 2026 21:04:44 +0000</pubDate>
      <link>https://dev.to/kharonte/from-dall-e-to-gpt-image-2-the-architectural-bet-that-finally-fixed-ai-text-5347</link>
      <guid>https://dev.to/kharonte/from-dall-e-to-gpt-image-2-the-architectural-bet-that-finally-fixed-ai-text-5347</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally published on Medium.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Two years ago, if you asked an AI to design a menu for a Mexican restaurant, you’d get a beautiful layout of “enchuita” and “churiros.” It looked like food, and the font looked like letters, but it was essentially a visual fever dream. The “burrto” became a classic meme in dev circles — a reminder that while AI could paint like Caravaggio, it had the literacy of a toddler.&lt;/p&gt;

&lt;p&gt;Yesterday, OpenAI launched ChatGPT Images 2.0 (gpt-image-2). I ran the same test. The menu was perfect. Not just the spelling, but the hierarchy, the prices, and the specialized diacritics. It is no longer just “generating pixels.” It is communicating.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph1qz3v3md8n9mjm3pet.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fph1qz3v3md8n9mjm3pet.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This isn’t a minor version bump or a better training set. It’s a total architectural pivot that signals the end of an era. If you’ve spent the last three years building workflows around diffusion models, it’s time to rethink your pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why text was broken (and how they fixed it)
&lt;/h2&gt;

&lt;p&gt;To understand why gpt-image-2 works, you have to understand why DALL-E 3 failed at spelling. Diffusion models — the tech behind almost every major generator until now — work by denoising. They start with static and try to “find” an image. Because text pixels make up a tiny fraction of a training image, the model learned the texture of text rather than the logic of characters. To a diffusion model, an “A” is just a specific arrangement of lines, not a semantic unit.&lt;/p&gt;

&lt;p&gt;OpenAI has quietly abandoned diffusion. While they won’t officially confirm the guts of the system, the PNG metadata and the model’s behavior tell the story: this is an autoregressive model.&lt;/p&gt;

&lt;p&gt;It generates images the same way GPT-4 generates code — by predicting the next token. By integrating image generation directly into the language model pipeline, the model isn’t “drawing” a word; it’s “writing” an image. When the architecture treats a pixel and a letter as parts of the same conceptual stream, the “enchuita” problem simply vanishes.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The end of the CSS overlay hack
&lt;/h2&gt;

&lt;p&gt;For those of us in agency work or product dev, AI images have always been a “background only” tool. If a client wanted a marketing banner with a specific CTA, we’d generate the art, then use a graphics library or CSS to overlay the text. It was the only way to ensure the brand name wasn’t spelled “Gooogle.”&lt;/p&gt;

&lt;p&gt;Gpt-image-2 changes that calculus. With near-perfect rendering of Latin, Kanji, and Hindi scripts, the “post-processing” stage of the workflow is suddenly on the chopping block. You can now generate multi-paneled assets or social media posts where the text is baked into the composition with proper lighting and perspective.&lt;/p&gt;

&lt;p&gt;But there’s a catch for your budget. At approximately $0.21 per high-quality 1024x1024 render, this is roughly 60% more expensive than the previous generation. If you’re at a high-volume startup, that’s a significant line item.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1dtwi9rh2x4cnvmw4on.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1dtwi9rh2x4cnvmw4on.png" alt=" " width="800" height="1101"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Thinking before rendering
&lt;/h2&gt;

&lt;p&gt;The most impressive part of the new model isn’t the resolution — it’s the “thinking mode.” Borrowed from reasoning models like o3, the generator now spends compute time planning the layout before it touches a single pixel.&lt;/p&gt;

&lt;p&gt;I watched it handle a prompt for “a grid of six distinct objects, each with a label in a different language.” Previous models would lose count by object four and turn the labels into Sanskrit-flavored gibberish. Gpt-image-2 paused, “thought” (generating reasoning tokens), and then executed. It can count. It can follow layout constraints.&lt;/p&gt;

&lt;p&gt;This moves AI generation from “creative toy” to “reliable infrastructure.” Reliability is what we actually need in production. I’d much rather pay more for a single correct image than spend credits on ten “cheap” re-rolls.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. The DALL-E eulogy
&lt;/h2&gt;

&lt;p&gt;OpenAI is shutting down DALL-E 2 and 3 on May 12, 2026. Not moving them to a legacy tier — shutting them down.&lt;/p&gt;

&lt;p&gt;This is a massive signal. It’s an admission that the diffusion approach hit a ceiling that no amount of fine-tuning could break. By retiring the DALL-E brand in favor of a unified ChatGPT Image model, OpenAI is betting that the future of Multimodality is a single, unified architecture.&lt;/p&gt;

&lt;p&gt;The wall between “thinking” and “seeing” is being torn down. We used to have a brain (LLM) that sent instructions to a hand (Diffusion model). Now, the brain is doing the drawing itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. What I’m still worried about
&lt;/h2&gt;

&lt;p&gt;Despite the polish, there are gaps. The knowledge cutoff is December 2025. If you need a render involving a trend or news event from early 2026, you’re reliant on the web search tool, which adds latency and even more cost.&lt;/p&gt;

&lt;p&gt;Furthermore, the pricing model is now “tokenized” for images. Thinking mode adds a variable cost based on how many reasoning tokens the model uses to plan the composition. This makes it incredibly hard to predict API costs for complex apps. You aren’t just paying for an image; you’re paying for the “brain power” required to frame it.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. The 2026 reality check
&lt;/h2&gt;

&lt;p&gt;If you are building a simple placeholder tool, stick to cheaper, older models. But for any workflow where the image is the content — marketing, UI prototyping, or localized assets — the shift to autoregressive generation is a one-way door.&lt;/p&gt;

&lt;p&gt;We’re entering a phase where the term “image model” feels dated. We just have models. They happen to output pixels sometimes and Python code others. The fact that it can finally spell “Burrito” is just the first sign that the gap between human intent and machine execution has finally closed.&lt;/p&gt;

</description>
      <category>chatgpt</category>
      <category>ai</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
