<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Piyush Kumar Singh</title>
    <description>The latest articles on DEV Community by Piyush Kumar Singh (@piyushsingh_dev).</description>
    <link>https://dev.to/piyushsingh_dev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3909333%2Fe0081e0c-5341-42fe-b711-92dafd3e0fdb.jpg</url>
      <title>DEV Community: Piyush Kumar Singh</title>
      <link>https://dev.to/piyushsingh_dev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/piyushsingh_dev"/>
    <language>en</language>
    <item>
      <title>Spring AI Explained — ChatClient, RAG, Advisors, and Every Core Component</title>
      <dc:creator>Piyush Kumar Singh</dc:creator>
      <pubDate>Mon, 18 May 2026 04:30:23 +0000</pubDate>
      <link>https://dev.to/piyushsingh_dev/spring-ai-explained-chatclient-rag-advisors-and-every-core-component-14dl</link>
      <guid>https://dev.to/piyushsingh_dev/spring-ai-explained-chatclient-rag-advisors-and-every-core-component-14dl</guid>
      <description>&lt;p&gt;Most Spring AI tutorials jump straight to code. You copy the dependency, paste the config, call ChatClient, and something works. But when you need to actually build something, such as a chatbot that remembers conversations or an API that answers questions from your own documents, you hit a wall, because you don't know what's actually doing what.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Spring AI actually is — in one sentence
&lt;/h2&gt;

&lt;p&gt;Spring AI is an abstraction layer that lets you wire LLMs into your Spring Boot app without hardcoding any particular AI provider.&lt;/p&gt;

&lt;p&gt;That last part matters. OpenAI, Google Gemini, Anthropic Claude, or Ollama running locally on your machine: Spring AI talks to all of them through the same API. Swap providers without touching your business logic. That’s the entire value proposition, and everything else is built on top of it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7k5ulejjqctedmvvu7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7k5ulejjqctedmvvu7w.png" alt=" " width="680" height="580"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Spring AI Components
&lt;/h2&gt;

&lt;p&gt;ChatClient — the front door&lt;br&gt;
ChatClient is the component you'll interact with the most. It's the fluent API that sits at the top of the stack and handles the actual request-response cycle with the LLM.&lt;/p&gt;

&lt;p&gt;Think of it like a RestTemplate or a WebClient, but instead of calling a REST endpoint, you're sending a prompt and getting a response back. It handles all the low-level connection details, request formatting, and response parsing so you don't have to.&lt;/p&gt;

&lt;p&gt;What makes ChatClient genuinely well-designed is its fluent builder style. You don't configure it once globally and hope for the best. Each call is composable — you can set the system prompt, attach advisors, pass user input, and control the output format all in one readable chain.&lt;/p&gt;

&lt;p&gt;It also separates two things that often get conflated: the default configuration you set at startup (your system prompt, default advisors, model parameters) and the per-request configuration you apply at call time. That separation matters in production, where different endpoints need different behaviours from the same underlying client.&lt;/p&gt;

&lt;h2&gt;
  
  
  PromptTemplate — how you talk to the LLM properly
&lt;/h2&gt;

&lt;p&gt;A raw string shoved into an LLM is not a prompt. A prompt is a structured piece of text with placeholders, context, and instructions, and PromptTemplate is how Spring AI handles that.&lt;/p&gt;

&lt;p&gt;The idea is simple: you define a template with variables, and at runtime, you fill those variables in. Instead of building prompt strings with Java string concatenation — which gets messy fast — you define the shape of the prompt separately from the data that goes into it.&lt;/p&gt;
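
&lt;p&gt;To make the template idea concrete, here is a minimal plain-Java sketch of placeholder substitution. It is not the Spring AI PromptTemplate class; the class name PromptTemplateSketch, the render method, and the {name} placeholder syntax are assumptions made for illustration.&lt;/p&gt;

```java
// Hypothetical stand-in for the idea behind a prompt template, not the Spring AI class.
class PromptTemplateSketch {

    // Replaces each {name} placeholder with the value that follows that name in the argument list.
    static String render(String template, String... namesAndValues) {
        if (namesAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected name/value pairs");
        }
        String result = template;
        for (int i = 0; i != namesAndValues.length; i += 2) {
            result = result.replace("{" + namesAndValues[i] + "}", namesAndValues[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // The shape of the prompt lives in one place; the data is supplied at call time.
        String template = "You are a support assistant for {product}. Answer this question: {question}";
        System.out.println(render(template,
                "product", "Acme Store",
                "question", "How do I get a refund?"));
    }
}
```

&lt;p&gt;The template string can be edited or versioned on its own, while the calling code only supplies values.&lt;/p&gt;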

&lt;p&gt;This matters for three reasons. First, it keeps prompts readable and maintainable. Second, it separates the “what to ask” from the “what data to inject”, which is the same separation of concerns you apply everywhere else in your codebase. Third, it makes prompt versioning possible. When your prompt needs tweaking, you’re editing a template, not hunting through business logic.&lt;/p&gt;

&lt;p&gt;PromptTemplate also gives you a proper Prompt object that carries both the human message and the system message. That distinction — system prompt (the instructions) vs user prompt (the question) — is one of the most important things to understand when working with LLMs, and Spring AI models it explicitly.&lt;/p&gt;

&lt;h2&gt;
  
  
  EmbeddingModel — the piece that makes search smart
&lt;/h2&gt;

&lt;p&gt;An EmbeddingModel takes text and converts it into a vector — a list of floating point numbers that represents the meaning of that text in multi-dimensional space.&lt;/p&gt;

&lt;p&gt;That sounds abstract. Here’s the concrete thing to grasp: two pieces of text that mean similar things will produce vectors that are close to each other mathematically. “What’s your return policy?” and “How do I get a refund?” are different strings, but their vectors will be very close — because semantically, they’re the same question.&lt;/p&gt;

&lt;p&gt;This is what makes semantic search possible. Traditional search matches keywords. Embedding-based search matches meaning. A user asking “how do I cancel my order” will find a document titled “Order cancellation policy” even if the words don’t overlap, because the meanings are geometrically close in vector space.&lt;/p&gt;
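
&lt;p&gt;"Close to each other mathematically" is usually measured with cosine similarity. The sketch below computes it for made-up three-dimensional vectors; real embeddings have hundreds or thousands of dimensions, and the numbers here are invented purely to show the calculation.&lt;/p&gt;

```java
// Cosine similarity between two vectors: close to 1.0 means same direction, close to 0.0 means unrelated.
class CosineSimilaritySketch {

    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i != a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Invented vectors standing in for the output of an embedding model.
        double[] returnPolicy = {0.90, 0.10, 0.30};
        double[] refund = {0.85, 0.15, 0.35};
        double[] weather = {0.05, 0.95, 0.10};
        System.out.printf("return policy vs refund:  %.3f%n", cosine(returnPolicy, refund));
        System.out.printf("return policy vs weather: %.3f%n", cosine(returnPolicy, weather));
    }
}
```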

&lt;p&gt;In Spring AI, EmbeddingModel is the interface that abstracts over whatever embedding service you're using: OpenAI's text-embedding-ada-002, Gemini's embedding API, or a local model via Ollama. The abstraction is consistent regardless of provider, which means your RAG pipeline code doesn't change if you switch models. Vectors from different models are not comparable, though, so switching means re-embedding the documents you have already stored.&lt;/p&gt;

&lt;h2&gt;
  
  
  VectorStore — where embeddings live
&lt;/h2&gt;

&lt;p&gt;VectorStore is the database for embeddings. You put vectors in, and you query them by similarity — "give me the top 5 stored vectors that are closest to this query vector."&lt;/p&gt;

&lt;p&gt;It’s worth understanding that this is a fundamentally different kind of database from what you’re used to. You don’t query it with SQL. You don’t look things up by ID. You ask: which stored content is most semantically similar to this input? And it returns the matches ranked by similarity score.&lt;/p&gt;
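
&lt;p&gt;A similarity query can be sketched as "score every stored vector against the query, then return the best k". The class below is a toy in-memory version under that assumption; the names VectorSearchSketch, Entry, and topK are hypothetical, the vectors are invented, and production stores use approximate indexes rather than a full scan.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.Comparator;

// Toy similarity search: rank stored entries by cosine similarity to a query vector.
class VectorSearchSketch {

    // One stored chunk: its text plus its (made-up) embedding.
    static final class Entry {
        final String text;
        final double[] vector;

        Entry(String text, double[] vector) {
            this.text = text;
            this.vector = vector;
        }
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i != a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts of the k entries most similar to the query, best match first.
    static String[] topK(Entry[] store, double[] query, int k) {
        Entry[] ranked = store.clone();
        // Negate the score so that the highest similarity sorts first.
        Arrays.sort(ranked, Comparator.comparingDouble((Entry e) -> -cosine(e.vector, query)));
        int n = Math.min(k, ranked.length);
        String[] result = new String[n];
        for (int i = 0; i != n; i++) {
            result[i] = ranked[i].text;
        }
        return result;
    }

    public static void main(String[] args) {
        Entry[] store = {
            new Entry("Order cancellation policy", new double[] {0.9, 0.1, 0.2}),
            new Entry("Shipping times", new double[] {0.1, 0.9, 0.1}),
            new Entry("Refund rules", new double[] {0.7, 0.2, 0.6})
        };
        double[] query = {0.88, 0.12, 0.25};  // pretend embedding of "how do I cancel my order"
        System.out.println(Arrays.toString(topK(store, query, 2)));
    }
}
```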

&lt;p&gt;Spring AI’s VectorStore interface abstracts over the actual storage engine underneath. In development, you might use SimpleVectorStore, an in-memory implementation. In production, you'd swap to Pinecone, Weaviate, pgvector on top of Postgres, or Elasticsearch. The interface stays identical. Your code doesn't change.&lt;/p&gt;

&lt;p&gt;The VectorStore is also responsible for handling the metadata that travels alongside each vector — the document title, page number, source URL, whatever you stored at ingestion time. When it returns matching chunks, that metadata comes with it, so your prompt builder knows where the information came from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advisors — the middleware nobody talks about enough
&lt;/h2&gt;

&lt;p&gt;This is the component most tutorials skip, and it’s arguably the most powerful part of the whole framework.&lt;/p&gt;

&lt;p&gt;An Advisor in Spring AI is a piece of middleware that wraps around every ChatClient request. Before the request goes to the LLM, advisors can intercept it and modify it — add context, inject memory, apply safety rules, log the conversation, filter the input. After the response comes back, they can post-process it too.&lt;/p&gt;

&lt;p&gt;The important thing to understand is that advisors form a chain. Each one wraps the next, like servlet filters in a web application. You configure which advisors run in which order, and each one has a defined responsibility.&lt;/p&gt;

&lt;p&gt;QuestionAnswerAdvisor is the one you'll use for RAG. Before your question reaches the LLM, this advisor takes that question, queries VectorStore for the most relevant chunks, and injects them into the prompt automatically. From ChatClient's perspective, you just asked a question. Internally, your question has been enriched with your own data before the LLM ever sees it.&lt;/p&gt;

&lt;p&gt;MessageChatMemoryAdvisor is what makes conversations persistent. Without it, every call to ChatClient starts fresh — no memory of what was said before. With it, previous turns from ChatMemory are injected into each new request so the LLM has context.&lt;/p&gt;

&lt;p&gt;You can write your own advisors too. Any cross-cutting concern that applies to every LLM call — rate limiting, PII detection, response caching, A/B testing between prompts — belongs in an advisor, not in your business logic.&lt;/p&gt;
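
&lt;p&gt;The "each advisor wraps the next" idea can be sketched in plain Java. The Advisor and Chain interfaces below are hypothetical and much simpler than Spring AI's real advisor API; they only illustrate how a chain is assembled and how a request passes through it on the way in and on the way out.&lt;/p&gt;

```java
// Chain-of-responsibility sketch: each advisor can change the prompt, call the next link, and post-process the reply.
class AdvisorChainSketch {

    // Hypothetical interfaces for illustration; Spring AI's own types differ.
    interface Chain {
        String proceed(String prompt);
    }

    interface Advisor {
        String advise(String prompt, Chain next);
    }

    // Wraps the model call in the advisors so that the first advisor in the array runs outermost.
    static Chain build(Advisor[] advisors, Chain model) {
        Chain chain = model;
        for (int i = advisors.length - 1; i >= 0; i--) {
            Advisor advisor = advisors[i];
            Chain next = chain;
            chain = prompt -> advisor.advise(prompt, next);
        }
        return chain;
    }

    public static void main(String[] args) {
        // Enriches the prompt before it reaches the model, similar in spirit to a RAG advisor.
        Advisor addContext = (prompt, next) ->
                next.proceed("Context: returns are accepted within 30 days.\n" + prompt);
        // Observes both the outgoing request and the returning response.
        Advisor logging = (prompt, next) -> {
            System.out.println("request:  " + prompt);
            String response = next.proceed(prompt);
            System.out.println("response: " + response);
            return response;
        };
        Chain fakeModel = prompt -> "echo[" + prompt + "]";  // stands in for the LLM call
        Chain chain = build(new Advisor[] {logging, addContext}, fakeModel);
        chain.proceed("What is the return policy?");
    }
}
```

&lt;p&gt;Because logging is listed first, it sees the original question on the way in and the final answer on the way out, while the model only ever sees the enriched prompt.&lt;/p&gt;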

&lt;h2&gt;
  
  
  ChatMemory — giving the LLM a memory
&lt;/h2&gt;

&lt;p&gt;LLMs are stateless. Every API call is completely independent. Ask an LLM “what’s the capital of France,” then ask “what did I just ask you,” and it has no idea — because, from its perspective, that second request is the first thing you’ve ever said.&lt;/p&gt;

&lt;p&gt;ChatMemory is how Spring AI solves this. It's a storage abstraction for conversation history. After each exchange, both messages (the user's question and the LLM's response) get saved. On the next request, that history gets loaded and injected into the prompt so the LLM has context.&lt;/p&gt;

&lt;p&gt;InMemoryChatMemory is the default: history lives in your application's heap and disappears on restart. That's fine for development and short-lived sessions. For production chatbots that need to remember users across sessions, you'd implement a persistent ChatMemory backed by Redis or a database.&lt;/p&gt;

&lt;p&gt;There’s a real constraint here worth knowing upfront: every message you inject into the conversation history costs tokens. LLMs have a context window limit — usually somewhere between 8K and 128K tokens, depending on the model. If a conversation goes on long enough, the accumulated history will either exceed the limit and fail, or you’ll need to implement a summarisation strategy to compress older messages.&lt;/p&gt;

&lt;p&gt;This is not a Spring AI problem — it’s a fundamental LLM constraint. But ChatMemory is where you manage it.&lt;/p&gt;
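
&lt;p&gt;One simple way to stay inside the context window is to keep only the most recent messages that fit a token budget. The sketch below assumes a crude estimate of one token per word, which real tokenizers do not follow; HistoryWindowSketch and its methods are hypothetical names, not Spring AI classes.&lt;/p&gt;

```java
import java.util.Arrays;

// Sliding-window trimming: drop the oldest messages until the rest fits the budget.
class HistoryWindowSketch {

    // Crude estimate: one token per whitespace-separated word (an assumption for illustration).
    static int estimateTokens(String message) {
        String trimmed = message.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Walks backwards from the newest message and keeps messages while the budget allows.
    static String[] trim(String[] history, int maxTokens) {
        int used = 0;
        int start = history.length;
        for (int i = history.length - 1; i >= 0; i--) {
            int cost = estimateTokens(history[i]);
            if (used + cost > maxTokens) {
                break;
            }
            used += cost;
            start = i;
        }
        return Arrays.copyOfRange(history, start, history.length);
    }

    public static void main(String[] args) {
        String[] history = {
            "what is the capital of France",
            "Paris",
            "what did I just ask you"
        };
        System.out.println(Arrays.toString(trim(history, 7)));
    }
}
```

&lt;p&gt;A summarisation strategy would replace the dropped messages with a short summary instead of discarding them outright.&lt;/p&gt;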

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo7bqasro35b8lns9oo5w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo7bqasro35b8lns9oo5w.png" alt=" " width="696" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  RAG Flow
&lt;/h2&gt;

&lt;p&gt;How RAG brings it all together&lt;br&gt;
RAG — Retrieval-Augmented Generation — is the pattern that makes Spring AI genuinely production-useful. The diagram above shows both phases. Here’s the thinking behind it.&lt;/p&gt;

&lt;p&gt;The core problem: your LLM knows nothing about your company. It doesn’t know your product documentation, your internal policies, your customer data. Fine-tuning a model on your data is expensive, slow, and goes stale every time the data changes.&lt;/p&gt;

&lt;p&gt;RAG is the pragmatic answer. Instead of teaching the model your data, you just hand it the relevant pages at the moment it needs them. Like giving a contractor a specific clause from the contract rather than asking them to memorise the whole thing.&lt;/p&gt;

&lt;p&gt;The ingestion phase runs once, or whenever your data changes. Your documents are loaded, split into manageable chunks, embedded into vectors, and stored in a VectorStore. This is how your data gets indexed for semantic retrieval.&lt;/p&gt;

&lt;p&gt;The query phase runs on every request. The user’s question is embedded into a vector. That vector is used to query the VectorStore for the closest matching chunks. Those chunks — plus the original question — get injected into the prompt. The LLM reads them as context and answers based on what it finds there.&lt;/p&gt;

&lt;p&gt;The LLM never “learned” your data. It reads it fresh on each request, like an open-book exam. That framing matters because it sets the right expectations: if the relevant information isn’t in the retrieved chunks, the model will still try to answer — and that’s when hallucinations happen. RAG reduces hallucinations by providing grounding. It doesn’t eliminate them.&lt;/p&gt;

&lt;p&gt;The part that controls retrieval quality isn’t the LLM and isn’t the vector database — it’s the chunking strategy. How you split your documents determines what gets retrieved. A chunk that’s too large buries the relevant detail in noise. A chunk too small loses the surrounding context that makes it meaningful. Getting chunking right is usually where the real tuning work happens.&lt;/p&gt;
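
&lt;p&gt;As a rough illustration of the size trade-off, here is a character-based splitter with overlap between neighbouring chunks, so context near a boundary appears in both. Real pipelines usually split on tokens, sentences, or headings; the class name, the chunk method, and its parameters are assumptions for this sketch.&lt;/p&gt;

```java
import java.util.Arrays;

// Fixed-size chunking with overlap, measured in characters for simplicity.
class ChunkingSketch {

    // Splits text into chunks of at most size characters; each chunk repeats the last overlap characters of the previous one.
    static String[] chunk(String text, int size, int overlap) {
        if (0 >= size || 0 > overlap || overlap >= size) {
            throw new IllegalArgumentException("need size above 0 and overlap from 0 up to size - 1");
        }
        int step = size - overlap;
        int length = text.length();
        // One chunk covers short text; otherwise count how many steps are needed to reach the end.
        int count = length > size ? 1 + (length - size + step - 1) / step : 1;
        String[] chunks = new String[count];
        for (int i = 0; i != count; i++) {
            int start = i * step;
            int end = Math.min(start + size, length);
            chunks[i] = text.substring(start, end);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "Orders can be cancelled within 24 hours. Refunds are issued to the original payment method.";
        for (String c : chunk(text, 40, 10)) {
            System.out.println("[" + c + "]");
        }
    }
}
```

&lt;p&gt;Raising size gives each chunk more context but more noise; raising overlap reduces the chance that a relevant sentence is cut in half, at the cost of storing more text.&lt;/p&gt;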

&lt;h2&gt;
  
  
  The one-line mental model for each component
&lt;/h2&gt;

&lt;p&gt;ChatClient: you talk to the LLM through this. PromptTemplate: you structure what you say. EmbeddingModel: converts meaning into math. VectorStore: stores and searches that math. Advisors: middleware that enriches every request automatically. ChatMemory: gives the conversation a past. Together, they’re the full stack for building LLM features that actually behave like software: predictable, configurable, and debuggable.&lt;/p&gt;

</description>
      <category>java</category>
      <category>springboot</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>https://dev.to/piyush_kumarsingh_da3833/how-redis-actually-works-ram-single-thread-and-the-expiry-behavior-nobody-explains-2j4n</title>
      <dc:creator>Piyush Kumar Singh</dc:creator>
      <pubDate>Sun, 03 May 2026 16:48:10 +0000</pubDate>
      <link>https://dev.to/piyushsingh_dev/-h21</link>
      <guid>https://dev.to/piyushsingh_dev/-h21</guid>
      <description>&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://dev.to/piyushsingh_dev/how-redis-actually-works-ram-single-thread-and-the-expiry-behavior-nobody-explains-2j4n" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59plhuurbjomf48y12e1.png" height="533" class="m-0" width="800"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://dev.to/piyushsingh_dev/how-redis-actually-works-ram-single-thread-and-the-expiry-behavior-nobody-explains-2j4n" rel="noopener noreferrer" class="c-link"&gt;
            How Redis Actually Works — RAM, Single Thread, and the Expiry Behavior Nobody Explains - DEV Community
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
            A RAM read takes about 100 nanoseconds. A disk read — even on a modern SSD — takes around 100,000...
          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8j7kvp660rqzt99zui8e.png" width="300" height="299"&gt;
          dev.to
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
    </item>
    <item>
      <title>How Redis Actually Works — RAM, Single Thread, and the Expiry Behavior Nobody Explains</title>
      <dc:creator>Piyush Kumar Singh</dc:creator>
      <pubDate>Sun, 03 May 2026 16:45:01 +0000</pubDate>
      <link>https://dev.to/piyushsingh_dev/how-redis-actually-works-ram-single-thread-and-the-expiry-behavior-nobody-explains-2j4n</link>
      <guid>https://dev.to/piyushsingh_dev/how-redis-actually-works-ram-single-thread-and-the-expiry-behavior-nobody-explains-2j4n</guid>
      <description>&lt;p&gt;A RAM read takes about 100 nanoseconds. A disk read, even on a modern SSD, takes around 100,000 nanoseconds. That single gap explains most of Redis’s speed, before it does anything clever.&lt;/p&gt;

&lt;p&gt;But RAM alone isn’t the full story. The other half is a design decision that looks like a limitation on paper — and turns out to be one of the smartest choices in the codebase. More on that in a moment. Here’s what’s actually happening inside Redis when your app talks to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why is Redis so fast?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0kih88g61ntm1pqmq58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0kih88g61ntm1pqmq58.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first reason is obvious once you hear it: Redis keeps everything in RAM. Your PostgreSQL instance, however well-tuned, writes to disk. Redis doesn’t. Every key lives in memory, which is why a GET on a Redis key can return in under a millisecond even under load. There’s no disk seek, no page cache miss, no I/O wait. But here’s where most explanations stop — and they shouldn’t.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Single-threaded — and that’s the point&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Redis executes commands one at a time on a single thread. No parallel command execution, no locks. That sounds like a bottleneck. It’s actually a feature.&lt;/p&gt;

&lt;p&gt;In a multi-threaded system, shared state requires locks. Locks mean threads waiting on each other. Waiting introduces latency spikes that are hard to reproduce and harder to debug. Redis avoids the entire problem by never having two threads compete for the same data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# These three clients connect simultaneously
Client 1: SET counter 100     ← executes fully first
Client 2: INCR counter        ← executes next, sees 100, returns 101
Client 3: GET counter         ← executes last, returns 101
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



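&lt;p&gt;Commands run strictly one after another, in the order Redis receives them, and each command sees the full effect of the one before it. You can reason about that. With threads and locks, you can’t, not without careful synchronization, which adds complexity and latency.&lt;/p&gt;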
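
&lt;p&gt;The same idea can be sketched in Java: many client threads hand commands to one worker thread, and the shared counter needs no lock because only that worker ever touches it. This is an analogy rather than how Redis is built (Redis is written in C around an event loop), and SingleThreadLoopSketch is a hypothetical name.&lt;/p&gt;

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Many producers, one consumer: commands are applied one at a time, so no increment is ever lost.
class SingleThreadLoopSketch {

    static long run(int clients, int commandsPerClient) {
        long[] counter = {0};  // modified only by the single worker thread
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Thread[] threads = new Thread[clients];
        for (int c = 0; c != clients; c++) {
            threads[c] = new Thread(() -> {
                for (int i = 0; i != commandsPerClient; i++) {
                    worker.execute(() -> counter[0]++);  // an INCR-style command queued for the worker
                }
            });
            threads[c].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            worker.shutdown();
            if (!worker.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

&lt;p&gt;With four threads incrementing a plain variable directly, some updates could be lost; routing every update through one thread removes that race without a single lock.&lt;/p&gt;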

&lt;blockquote&gt;
&lt;p&gt;Redis is fast, not just because of RAM, but because it never waits on itself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The six data structures — with the tradeoffs that actually matter&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Redis isn’t just strings. Each data structure solves a specific problem, and knowing when to pick one over another is more useful than knowing the commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;String&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET user:1001:name "John"
GET user:1001:name        # "John"
SET page:views 0
INCR page:views           # atomic - safe under concurrent load
GET page:views            # "1"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default choice for simple values, flags, and counters. INCR is atomic — a thousand clients calling it simultaneously will never produce a wrong count.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hash&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HSET user:1001 name "John" email "j@example.com" role "admin"
HGET user:1001 name       # "John"
HGETALL user:1001         # all fields
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A Hash is better than a String when you have a structured object with multiple fields you’ll update independently. If you stored this as a JSON string, updating a single field means deserializing the whole blob, changing one value, and reserializing. A Hash lets you update one field with one command.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;List&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RPUSH notifications:1001 "Order shipped"
RPUSH notifications:1001 "Payment received"
LRANGE notifications:1001 0 -1   # all items in order
LPOP notifications:1001           # "Order shipped"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lists maintain insertion order. Use them for notification feeds, activity timelines, and simple job queues where you’re okay with at-most-once delivery. If you need guaranteed delivery, a List isn’t enough — use Kafka or RabbitMQ.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sorted Set&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ZADD leaderboard 9500 "alice"
ZADD leaderboard 11200 "John"
ZREVRANGE leaderboard 0 2 WITHSCORES
# John    11200
# alice   9500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every member has a score. Redis keeps them sorted automatically. Real-time leaderboards, priority queues, and rate limiting windows — sorted sets handle all three. The reason to reach for this over a List is when rank or score matters, not just insertion order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Set&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SADD online:users "user:1001" "user:1002"
SISMEMBER online:users "user:1002"   # 1 (true)
SCARD online:users                   # 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No duplicates, O(1) membership check. Good for tracking online users, visited pages, or anything where “is X in this group” is the question. Use a Set over a List when you need uniqueness and don’t care about order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HyperLogLog&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PFADD page:views:home "ip1" "ip2" "ip3" "ip1"
PFCOUNT page:views:home   # 3 (deduplicated)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;HyperLogLog gives you approximate unique counts using a fixed 12KB of memory — regardless of whether you have 1,000 or 100 million unique values. A plain Set would work too, but each unique member consumes memory. For a site with 50 million daily visitors, the Set version could eat gigabytes. HyperLogLog stays at 12KB with a ~0.81% error margin. That tradeoff is almost always worth it for analytics.&lt;/p&gt;
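
&lt;p&gt;Both numbers above fall out of simple arithmetic, assuming the dense HyperLogLog layout of 16384 registers at 6 bits each and the usual standard-error formula 1.04 / sqrt(m). The sketch below only reproduces that calculation; it is not an implementation of the algorithm.&lt;/p&gt;

```java
// Where "12KB" and "~0.81%" come from, under the register layout stated in the comments.
class HyperLogLogMathSketch {

    static final int REGISTERS = 16384;       // m = 2^14 registers
    static final int BITS_PER_REGISTER = 6;   // each register stores a small counter

    static int denseSizeBytes() {
        return REGISTERS * BITS_PER_REGISTER / 8;
    }

    static double standardError() {
        return 1.04 / Math.sqrt(REGISTERS);
    }

    public static void main(String[] args) {
        System.out.println("register bytes: " + denseSizeBytes());
        System.out.printf("standard error: %.4f%%%n", standardError() * 100);
    }
}
```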

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferwpwba7a76prrnfvyu8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ferwpwba7a76prrnfvyu8.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  TTL and the expiry behavior nobody explains
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET session:abc123 "user_data" EX 3600   # expires in 1 hour
TTL session:abc123                        # 3597
# one hour later
GET session:abc123                        # (nil)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most developers assume Redis runs a background job that scans for expired keys and deletes them at the exact moment of expiry. It doesn’t. That would be expensive — imagine scanning millions of keys every second. Instead, Redis uses two strategies in parallel:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lazy deletion&lt;/strong&gt;: When you read a key, Redis checks its expiry first. If it’s expired, Redis deletes it right then and returns nil. Memory is reclaimed at access time, not expiry time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Active sampling&lt;/strong&gt;: Every 100ms, Redis randomly samples 20 keys that have TTLs set and deletes the ones that have expired. If more than 25% of the sample had expired, it runs the loop again immediately, and keeps looping until the expired ratio drops below 25%.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The consequence&lt;/strong&gt;: if you have 10 million keys expiring at 3 am and nothing reads them, the active sampler will gradually clean them up over the following minutes. Your memory won’t drop instantly. If you’re sizing Redis memory around key expiry, that lag is real, and you need to account for it.&lt;/p&gt;
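
&lt;p&gt;The two strategies can be sketched with a toy model in which keys are just array indexes. It follows the description above (sample 20, repeat while more than 25% of the sample was expired) and is not Redis's actual code. One simplification: the sampler may land on slots that were already deleted, which simply count as not expired.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.Random;

// Toy model of lazy deletion plus active sampling; key ids are array indexes.
class ExpirySketch {

    static final int SAMPLE_SIZE = 20;

    final long[] expiresAt;    // expiry time per key id, in milliseconds
    final boolean[] present;   // whether the key still occupies memory
    final Random random = new Random(42);

    ExpirySketch(long[] expiresAt) {
        this.expiresAt = expiresAt.clone();
        this.present = new boolean[expiresAt.length];
        Arrays.fill(present, true);
    }

    // Lazy deletion: an expired key is removed at the moment something tries to read it.
    boolean exists(int id, long now) {
        if (present[id]) {
            if (now >= expiresAt[id]) {
                present[id] = false;
            }
        }
        return present[id];
    }

    // Active sampling: one background cycle; returns how many keys it deleted.
    int activeCycle(long now) {
        int deleted = 0;
        int expiredInSample;
        do {
            expiredInSample = 0;
            for (int i = 0; i != SAMPLE_SIZE; i++) {
                int id = random.nextInt(present.length);
                if (present[id]) {
                    if (now >= expiresAt[id]) {
                        present[id] = false;
                        expiredInSample++;
                        deleted++;
                    }
                }
            }
            // Repeat only while more than 25% of the sample was expired.
        } while (expiredInSample * 4 > SAMPLE_SIZE);
        return deleted;
    }

    int liveCount() {
        int live = 0;
        for (boolean p : present) {
            if (p) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        long[] ttl = new long[100_000];
        Arrays.fill(ttl, 1_000L);  // every key expires at t = 1000 ms
        ExpirySketch store = new ExpirySketch(ttl);
        int deleted = store.activeCycle(2_000L);
        System.out.println("deleted in one cycle: " + deleted + ", still in memory: " + store.liveCount());
    }
}
```

&lt;p&gt;Running main shows the lag described above: one cycle stops once its samples are mostly clean, so some expired keys stay in memory until a later cycle or a read removes them.&lt;/p&gt;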

&lt;h2&gt;
  
  
  Persistence — what survives a restart
&lt;/h2&gt;

&lt;p&gt;Redis lives in RAM. Restart the process, lose everything — unless you’ve configured persistence.&lt;/p&gt;

&lt;h2&gt;
  
  
  RDB snapshots
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# redis.conf (comments must sit on their own line; trailing comments are not supported)
# snapshot if 1+ keys changed in 15 minutes
&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt; &lt;span class="m"&gt;900&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="c"&gt;# snapshot if 10+ keys changed in 5 minutes
&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
&lt;span class="c"&gt;# snapshot if 10000+ keys changed in 1 minute
&lt;/span&gt;&lt;span class="n"&gt;save&lt;/span&gt; &lt;span class="m"&gt;60&lt;/span&gt; &lt;span class="m"&gt;10000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Redis forks a child process and writes everything to dump.rdb. Fast to recover from. The risk: if your server crashes between snapshots, you lose whatever happened in that window. Fine for cache. Not fine for anything where losing recent writes matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  AOF — Append Only File
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# redis.conf
&lt;/span&gt;&lt;span class="n"&gt;appendonly&lt;/span&gt; &lt;span class="n"&gt;yes&lt;/span&gt;
&lt;span class="c"&gt;# flush to disk every second
&lt;/span&gt;&lt;span class="n"&gt;appendfsync&lt;/span&gt; &lt;span class="n"&gt;everysec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every write command gets appended to a log. On restart, Redis replays the log. With everysec, you lose at most about one second of data. With always, you lose nothing, but your write throughput drops noticeably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production setup — use both
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;save&lt;/span&gt; &lt;span class="m"&gt;900&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;appendonly&lt;/span&gt; &lt;span class="n"&gt;yes&lt;/span&gt;
&lt;span class="n"&gt;appendfsync&lt;/span&gt; &lt;span class="n"&gt;everysec&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;RDB handles fast restarts. AOF handles durability. Together, they cover both failure modes without adding much overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Spring Boot integration — with the why behind the config&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Dependencies and connection&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- pom.xml --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.springframework.boot&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spring-boot-starter-data-redis&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# application.yml
# Spring Boot 2.x property prefix; Spring Boot 3.x uses spring.data.redis instead
spring:
  redis:
    host: localhost
    port: 6379
    timeout: 2000ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;RedisTemplate — why you need a custom serializer&lt;/strong&gt;&lt;br&gt;
By default, RedisTemplate uses Java serialization for both keys and values. That works, but it writes binary payloads that embed class names, which makes the data unreadable in redis-cli and ties it to your class structure. Switch to a String serializer for keys and JSON for values so your data is readable outside Spring too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;RedisTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RedisConnectionFactory&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;RedisTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RedisTemplate&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
    &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setConnectionFactory&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;factory&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setKeySerializer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StringRedisSerializer&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="c1"&gt;// Jackson JSON instead of Java serialization&lt;/span&gt;
    &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setValueSerializer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GenericJackson2JsonRedisSerializer&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;@Cacheable: the two-line cache layer&lt;/strong&gt;&lt;br&gt;
Spring’s caching abstraction lets you add Redis caching without touching repository logic. The first call hits the database; every subsequent call with the same ID returns from Redis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Cacheable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"users"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"#id"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="nf"&gt;getUserById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;userRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;orElseThrow&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RuntimeException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"User not found"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@CacheEvict&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"users"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"#user.id"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="nf"&gt;updateUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;userRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// evicts stale cache on update&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="nd"&gt;@SpringBootApplication&lt;/span&gt;
&lt;span class="nd"&gt;@EnableCaching&lt;/span&gt;   &lt;span class="c1"&gt;// don't forget this&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Application&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rate limiting with sorted sets&lt;/strong&gt;&lt;br&gt;
A sliding window rate limiter is one of Redis’s cleanest use cases. The sorted set score is the timestamp — so you can count requests within a time window with a range query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;isAllowed&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;maxRequests&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;windowSeconds&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ratelimit:"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentTimeMillis&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;windowStart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;windowSeconds&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;ZSetOperations&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;opsForZSet&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;removeRangeByScore&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;windowStart&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// drop old requests&lt;/span&gt;
    &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;zCard&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;maxRequests&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;// unique member (needs java.util.UUID) so two requests in the same millisecond are both counted&lt;/span&gt;
    &lt;span class="n"&gt;ops&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;":"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="no"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;expire&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;windowSeconds&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;TimeUnit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;SECONDS&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
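
&lt;p&gt;If you want to poke at the sliding-window rule without a Redis instance, here is a rough in-memory sketch. It is not the Redis approach itself: a fixed-size ring buffer stands in for the sorted set, and the class name SlidingWindowSketch and its constructor arguments are made up for this illustration. The clock is passed in as a parameter so the behaviour is easy to check by hand.&lt;/p&gt;

```java
// In-memory sketch of the sliding-window rule; a ring buffer stands in for the Redis sorted set.
final class SlidingWindowSketch {
    private final long[] recent;      // timestamps (ms) of the most recent allowed requests
    private final long windowMillis;
    private int next = 0;             // next slot to write; once full, it also holds the oldest timestamp
    private int used = 0;             // how many slots hold a real timestamp

    SlidingWindowSketch(int maxRequests, long windowMillis) {
        if (1 > maxRequests) {
            throw new IllegalArgumentException("maxRequests must be positive");
        }
        this.recent = new long[maxRequests];
        this.windowMillis = windowMillis;
    }

    /** Allows the request unless maxRequests earlier ones are still inside the window. */
    synchronized boolean isAllowed(long nowMillis) {
        long windowStart = nowMillis - windowMillis;
        boolean full = used == recent.length;
        if (full) {
            if (recent[next] > windowStart) {
                return false; // the oldest of the last maxRequests calls is still in the window
            }
        }
        // Record this request, overwriting the oldest slot when the buffer is full.
        recent[next] = nowMillis;
        next = (next + 1) % recent.length;
        if (!full) {
            used++;
        }
        return true;
    }

    public static void main(String[] args) {
        SlidingWindowSketch limiter = new SlidingWindowSketch(2, 1000);
        System.out.println(limiter.isAllowed(0));     // true
        System.out.println(limiter.isAllowed(100));   // true
        System.out.println(limiter.isAllowed(200));   // false
        System.out.println(limiter.isAllowed(1000));  // true
    }
}
```

&lt;p&gt;As in the Redis version, rejected requests are not recorded, so a client that keeps hammering the endpoint is let back in as soon as its oldest counted request leaves the window.&lt;/p&gt;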




&lt;h2&gt;
  
  
  When your tech lead says “add Redis,” ask this first
&lt;/h2&gt;

&lt;p&gt;There’s a version of this story everyone knows: the tech lead says “add Redis,” you add Redis, and something gets faster. Nobody questions it. But Redis has real constraints, and using it wrong is a common way to create problems that look like infrastructure issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t use it as your primary database&lt;/strong&gt;. Redis has no foreign keys, no joins, no complex queries. If your data has relationships, use a relational database. Redis is the layer on top, not the foundation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t store large values&lt;/strong&gt;. Redis works best with small, hot data. A 5MB JSON blob in Redis is possible but wasteful: you burn expensive RAM, stall the single-threaded event loop for every other client while the value is read or written, and make serialization your bottleneck.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t use pub/sub for anything you can’t afford to lose&lt;/strong&gt;. Redis pub/sub has no persistence. If a subscriber goes offline for 30 seconds, those messages are gone. Use Kafka or RabbitMQ when reliability matters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Set a memory limit and eviction policy&lt;/strong&gt;, always. With no limit, Redis keeps growing until the host itself runs out of memory; with a limit but the default noeviction policy, it rejects writes once the limit is reached. Either failure mode is jarring in production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# redis.conf
maxmemory 2gb
# evict least recently used keys when full (comments need their own line in redis.conf)
maxmemory-policy allkeys-lru
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  The one line that ties it all together
&lt;/h2&gt;

&lt;p&gt;Redis is fast because it stays in RAM and never waits for itself. Use it for caching, sessions, leaderboards, rate limiting, and lightweight pub/sub where dropped messages are acceptable. Don’t ask it to be your source of truth. Understand those two constraints, and Redis stops being magic; it becomes a predictable tool that does exactly what you’d expect. That’s a good thing.&lt;/p&gt;

</description>
      <category>redis</category>
      <category>softwareengineering</category>
      <category>database</category>
      <category>backend</category>
    </item>
    <item>
      <title>How Spring Security Works Internally (Filters, Authentication &amp; Authorization Explained)</title>
      <dc:creator>Piyush Kumar Singh</dc:creator>
      <pubDate>Sat, 02 May 2026 16:57:49 +0000</pubDate>
      <link>https://dev.to/piyushsingh_dev/how-spring-security-works-internally-filters-authentication-authorization-explained-2686</link>
      <guid>https://dev.to/piyushsingh_dev/how-spring-security-works-internally-filters-authentication-authorization-explained-2686</guid>
      <description>&lt;p&gt;If you have worked with Spring Boot for a while, you have used Spring Security without fully tracing what happens inside it.&lt;/p&gt;

&lt;p&gt;You add a dependency, configure a SecurityFilterChain, and wire a UserDetailsService, and your APIs are suddenly protected. It works. But under the hood, there is a very disciplined flow that decides who the user is, whether the password is valid, and whether the request should even reach your controller.&lt;/p&gt;

&lt;p&gt;Once that internal flow clicks, Spring Security stops feeling magical and starts feeling predictable.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Big Picture&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Every incoming HTTP request does not go straight to your controller. Before that request reaches DispatcherServlet, it passes through the servlet filter chain. Spring Security plugs itself into that chain and intercepts the request early.&lt;/p&gt;

&lt;p&gt;That matters because security decisions should happen before business logic runs. The flow looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kvvss3j3my60eq7hips.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kvvss3j3my60eq7hips.webp" alt=" " width="800" height="767"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is the full security journey in one line: intercept, authenticate, authorize, then continue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: The Request Enters the Servlet Filter Chain&lt;/strong&gt;&lt;br&gt;
When a client sends a request, Tomcat receives it first. Tomcat then passes it through a chain of servlet filters.&lt;/p&gt;

&lt;p&gt;These filters are not specific to Spring Security. They are part of the servlet infrastructure. Any framework can register filters here. Spring Security registers one important filter called FilterChainProxy.&lt;/p&gt;

&lt;p&gt;You can think of FilterChainProxy as the front desk for all Spring Security logic. It does not do all the security work itself. Instead, it decides which internal security filters should handle the request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: FilterChainProxy Picks the Right SecurityFilterChain&lt;/strong&gt;&lt;br&gt;
This is a key part that many developers miss. Spring Security does not always use one universal chain for every request. It can maintain multiple SecurityFilterChain configurations, each tied to different URL patterns or request matchers.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;/api/** may use JWT authentication&lt;/li&gt;
&lt;li&gt;/admin/** may require stricter role checks&lt;/li&gt;
&lt;li&gt;/login may use a form login&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;FilterChainProxy checks the request and selects the correct chain using RequestMatcher. That means Spring Security is not just a collection of filters. It is a smart router for security filters.&lt;/p&gt;
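
&lt;p&gt;To make the routing idea concrete, here is a small plain-Java sketch of first-match chain selection. It deliberately avoids Spring’s own classes: PathMatcher, Chain, the chain names, and selectChain are all hypothetical stand-ins, and real request matchers can look at far more than the path.&lt;/p&gt;

```java
// Plain-Java sketch of first-match chain selection; these are not Spring Security's real classes.
final class ChainSelectionSketch {

    /** Hypothetical stand-in for a request matcher: decides whether a chain applies to a path. */
    interface PathMatcher {
        boolean matches(String path);
    }

    /** Hypothetical stand-in for one security filter chain: a label plus its matcher. */
    static final class Chain {
        final String name;
        final PathMatcher matcher;

        Chain(String name, PathMatcher matcher) {
            this.name = name;
            this.matcher = matcher;
        }
    }

    // Order matters: the first chain whose matcher accepts the path wins.
    static final Chain[] CHAINS = {
        new Chain("jwt-api", path -> path.startsWith("/api/")),
        new Chain("admin-strict", path -> path.startsWith("/admin/")),
        new Chain("form-login", path -> path.equals("/login")),
        new Chain("default", path -> true) // catch-all, must stay last
    };

    static String selectChain(String path) {
        for (Chain chain : CHAINS) {
            if (chain.matcher.matches(path)) {
                return chain.name;
            }
        }
        throw new IllegalStateException("no chain matched " + path);
    }

    public static void main(String[] args) {
        System.out.println(selectChain("/api/orders"));   // jwt-api
        System.out.println(selectChain("/admin/users"));  // admin-strict
        System.out.println(selectChain("/health"));       // default
    }
}
```

&lt;p&gt;The ordering is the part worth remembering: a broad catch-all placed before a specific chain would swallow every request, so the specific matchers go first.&lt;/p&gt;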

&lt;p&gt;&lt;strong&gt;Step 3: Authentication Filter Extracts Credentials&lt;/strong&gt;&lt;br&gt;
Once the correct security chain is selected, one of the authentication filters takes over. In the classic username-password login flow, that filter is usually UsernamePasswordAuthenticationFilter.&lt;/p&gt;

&lt;p&gt;Its job is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read the username and password from the request&lt;/li&gt;
&lt;li&gt;Create an unauthenticated Authentication object&lt;/li&gt;
&lt;li&gt;Pass that object to AuthenticationManager&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this point, the user is not yet trusted. The filter has only collected credentials. Verification still has to happen. This distinction is important. Extracting credentials and validating credentials are two separate responsibilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: AuthenticationManager Coordinates Authentication&lt;/strong&gt;&lt;br&gt;
AuthenticationManager is the entry point for authentication logic. In most applications, the default implementation is ProviderManager.&lt;/p&gt;

&lt;p&gt;ProviderManager does not usually authenticate the user directly. Instead, it delegates to one of the configured AuthenticationProvider implementations. That design makes Spring Security flexible. Different providers can handle different authentication mechanisms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;username and password&lt;/li&gt;
&lt;li&gt;JWT token&lt;/li&gt;
&lt;li&gt;OAuth2 login&lt;/li&gt;
&lt;li&gt;LDAP&lt;/li&gt;
&lt;li&gt;custom authentication rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When ProviderManager receives an authentication object, it loops through the registered providers and calls supports() on each one.&lt;/p&gt;

&lt;p&gt;The first provider that says, “Yes, I know how to handle this type of authentication,” gets the job. Then authenticate() is called on that provider.&lt;/p&gt;
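
&lt;p&gt;The delegation loop is easier to see in code. This is a simplified plain-Java sketch, not Spring Security’s implementation: AuthRequest, Provider, PasswordProvider, TokenProvider, and the hard-coded credentials are all invented for the example, and the real ProviderManager has more fallback and error handling than this.&lt;/p&gt;

```java
// Plain-Java sketch of provider delegation; the names echo Spring Security's but these are not its classes.
final class ProviderDelegationSketch {

    /** Hypothetical authentication request: a kind ("password", "jwt") plus the credential it carries. */
    static final class AuthRequest {
        final String kind;
        final String credential;

        AuthRequest(String kind, String credential) {
            this.kind = kind;
            this.credential = credential;
        }
    }

    interface Provider {
        boolean supports(AuthRequest request);

        /** Returns the authenticated principal name, or throws if the credential is rejected. */
        String authenticate(AuthRequest request);
    }

    static final class PasswordProvider implements Provider {
        public boolean supports(AuthRequest request) {
            return "password".equals(request.kind);
        }

        public String authenticate(AuthRequest request) {
            // Hard-coded credential, for illustration only.
            if ("alice:secret".equals(request.credential)) {
                return "alice";
            }
            throw new SecurityException("bad credentials");
        }
    }

    static final class TokenProvider implements Provider {
        private static final String PREFIX = "token-for-";

        public boolean supports(AuthRequest request) {
            return "jwt".equals(request.kind);
        }

        public String authenticate(AuthRequest request) {
            // A fake token format; a real provider would verify a signature.
            if (request.credential.startsWith(PREFIX)) {
                return request.credential.substring(PREFIX.length());
            }
            throw new SecurityException("bad token");
        }
    }

    /** Asks each provider in order whether it supports the request; the first that does handles it. */
    static String authenticate(Provider[] providers, AuthRequest request) {
        for (Provider provider : providers) {
            if (provider.supports(request)) {
                return provider.authenticate(request);
            }
        }
        throw new SecurityException("no provider supports " + request.kind);
    }

    public static void main(String[] args) {
        Provider[] providers = { new PasswordProvider(), new TokenProvider() };
        System.out.println(authenticate(providers, new AuthRequest("password", "alice:secret"))); // alice
        System.out.println(authenticate(providers, new AuthRequest("jwt", "token-for-bob")));     // bob
    }
}
```

&lt;p&gt;Adding a new login mechanism in this model means adding one more provider to the array; the loop and the callers stay untouched, which is the flexibility the design is after.&lt;/p&gt;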

&lt;p&gt;&lt;strong&gt;Step 5: AuthenticationProvider Verifies the User&lt;/strong&gt;&lt;br&gt;
This is where the real authentication work happens. For username-password login, the provider is often DaoAuthenticationProvider.&lt;/p&gt;

&lt;p&gt;Its job usually includes two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load the user from a data source&lt;/li&gt;
&lt;li&gt;Validate the submitted password&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To load the user, it calls UserDetailsService.&lt;/p&gt;

&lt;p&gt;To validate the password, it uses PasswordEncoder.&lt;/p&gt;

&lt;p&gt;This split is one of the reasons Spring Security is so clean internally. Fetching user data and checking password hashing are handled by dedicated components, not mixed into one giant class.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 6: UserDetailsService Loads the User From the Database&lt;/strong&gt;&lt;br&gt;
UserDetailsService is a very small but important contract.&lt;/p&gt;

&lt;p&gt;Its core method is:&lt;/p&gt;

&lt;p&gt;loadUserByUsername(String username)&lt;/p&gt;

&lt;p&gt;This method is responsible for fetching the user from your database, external system, or custom source. It returns a UserDetails object that contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;username&lt;/li&gt;
&lt;li&gt;password&lt;/li&gt;
&lt;li&gt;roles or authorities&lt;/li&gt;
&lt;li&gt;account status flags, such as locked or disabled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the user is not found, the method throws UsernameNotFoundException, and authentication fails. At this stage, the system knows what the stored user record looks like. Next, it needs to compare the password.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 7: PasswordEncoder Validates the Password&lt;/strong&gt;&lt;br&gt;
Spring Security does not compare raw passwords directly. Instead, it uses a PasswordEncoder such as BCryptPasswordEncoder.&lt;/p&gt;

&lt;p&gt;Here is the idea:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The password submitted by the client is plain text&lt;/li&gt;
&lt;li&gt;The stored password in the database is hashed&lt;/li&gt;
&lt;li&gt;PasswordEncoder.matches(rawPassword, encodedPassword) checks if they match safely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the password is wrong, authentication fails.&lt;/p&gt;

&lt;p&gt;If it matches, Spring Security creates a fully authenticated Authentication object containing the user’s identity and authorities. That object is now trusted.&lt;/p&gt;
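
&lt;p&gt;Here is roughly what “hash on save, compare on login” looks like using only the JDK. BCrypt is not part of the standard library, so this sketch swaps in PBKDF2 (PBKDF2WithHmacSHA256) as a stand-in; the class name, the salt:hash storage format, and the iteration count are assumptions made for the example, not what BCryptPasswordEncoder does internally.&lt;/p&gt;

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// JDK-only sketch of the encode/matches idea; PBKDF2 stands in for BCrypt here.
final class PasswordMatchSketch {
    private static final int ITERATIONS = 100_000; // illustrative work factor
    private static final int KEY_BITS = 256;

    private static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is unavailable on this JVM", e);
        }
    }

    /** Produces "salt:hash" in Base64; this string is what would be stored in the database. */
    static String encode(String rawPassword) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt); // fresh random salt per password
        byte[] hashed = hash(rawPassword.toCharArray(), salt);
        Base64.Encoder base64 = Base64.getEncoder();
        return base64.encodeToString(salt) + ":" + base64.encodeToString(hashed);
    }

    /** Re-hashes the submitted password with the stored salt and compares the results. */
    static boolean matches(String rawPassword, String encoded) {
        String[] parts = encoded.split(":");
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        byte[] actual = hash(rawPassword.toCharArray(), salt);
        return MessageDigest.isEqual(expected, actual); // constant-time comparison
    }

    public static void main(String[] args) {
        String stored = encode("s3cret");
        System.out.println(matches("s3cret", stored)); // true
        System.out.println(matches("wrong", stored));  // false
    }
}
```

&lt;p&gt;Notice that the stored value is never decoded back into the password. Verification works by repeating the one-way hash with the same salt, which is the same shape as PasswordEncoder.matches(rawPassword, encodedPassword).&lt;/p&gt;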

&lt;p&gt;&lt;strong&gt;Step 8: SecurityContextHolder Stores the Authenticated User&lt;/strong&gt;&lt;br&gt;
Once authentication succeeds, Spring Security stores the authenticated user in SecurityContextHolder.&lt;/p&gt;

&lt;p&gt;This is what makes the user available for the rest of the request lifecycle.&lt;/p&gt;

&lt;p&gt;From here, other parts of the application can access the logged-in user through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SecurityContextHolder.getContext().getAuthentication()&lt;/li&gt;
&lt;li&gt;@AuthenticationPrincipal&lt;/li&gt;
&lt;li&gt;Principal in controllers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a regular servlet application, this security context is held in a ThreadLocal by default, so it is bound to the thread handling the request. That is why the controller can later know who the current user is without manually passing user details around.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 9: Authorization Happens Before the Controller&lt;/strong&gt;&lt;br&gt;
Authentication answers this question: Who are you?&lt;/p&gt;

&lt;p&gt;Authorization answers this one: What are you allowed to do?&lt;/p&gt;

&lt;p&gt;After the user is authenticated, Spring Security moves to authorization filters such as AuthorizationFilter.&lt;/p&gt;

&lt;p&gt;This stage checks whether the current user has the required role, authority, or permission for the requested resource.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hasRole("ADMIN")&lt;/li&gt;
&lt;li&gt;hasAuthority("PAYMENT_READ")&lt;/li&gt;
&lt;li&gt;Request matchers that restrict endpoints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If authorization fails, Spring Security stops the request and returns an error such as 403 Forbidden. If authorization succeeds, the request continues.&lt;/p&gt;
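
&lt;p&gt;The difference between a role check and an authority check trips people up, so here is a tiny plain-Java sketch of the idea. It assumes the default convention that a role is an authority carrying the ROLE_ prefix; the class AuthorizationSketch and its methods are hypothetical helpers, not Spring Security APIs.&lt;/p&gt;

```java
import java.util.Arrays;

// Plain-Java sketch of role and authority checks; not Spring Security's real authorization code.
final class AuthorizationSketch {

    /** Mirrors the hasAuthority idea: exact match on a granted authority string. */
    static boolean hasAuthority(String[] granted, String authority) {
        return Arrays.asList(granted).contains(authority);
    }

    /** Mirrors the hasRole idea: a role is an authority with the "ROLE_" prefix added. */
    static boolean hasRole(String[] granted, String role) {
        return hasAuthority(granted, "ROLE_" + role);
    }

    /** HTTP status an authenticated caller would see: 200 if the role is present, 403 otherwise. */
    static int statusFor(String[] granted, String requiredRole) {
        return hasRole(granted, requiredRole) ? 200 : 403;
    }

    public static void main(String[] args) {
        String[] alice = { "ROLE_ADMIN", "PAYMENT_READ" };
        String[] bob = { "ROLE_USER" };
        System.out.println(statusFor(alice, "ADMIN")); // 200
        System.out.println(statusFor(bob, "ADMIN"));   // 403
    }
}
```

&lt;p&gt;So a user stored with the authority ADMIN (no prefix) would fail hasRole("ADMIN") in this model, which is a common source of unexpected 403 responses.&lt;/p&gt;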

&lt;p&gt;&lt;strong&gt;Step 10: The Request Reaches DispatcherServlet and Then the Controller&lt;/strong&gt;&lt;br&gt;
Only after authentication and authorization are complete does the request proceed to Spring MVC. Now, DispatcherServlet can route the request to the correct controller.&lt;/p&gt;

&lt;p&gt;At this point, your controller can safely assume one of two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The endpoint is public, or&lt;/li&gt;
&lt;li&gt;The user has already been authenticated and authorized&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That separation is why controllers stay cleaner. Security is handled earlier in the pipeline instead of being scattered across business logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Happens If Authentication Fails?&lt;/strong&gt;&lt;br&gt;
If authentication fails anywhere in the chain, Spring Security throws an authentication-related exception.&lt;/p&gt;

&lt;p&gt;Common outcomes include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;401 Unauthorized for unauthenticated access&lt;/li&gt;
&lt;li&gt;Redirect to the login page in form-based login&lt;/li&gt;
&lt;li&gt;Custom error response in REST APIs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The controller is never called.&lt;/p&gt;

&lt;p&gt;This is a useful mental model: failed authentication stops the request before business logic begins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Makes Spring Security Feel Complex?&lt;/strong&gt;&lt;br&gt;
Usually, it is not the concepts. It is the number of moving parts.&lt;/p&gt;

&lt;p&gt;There are many classes involved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FilterChainProxy&lt;/li&gt;
&lt;li&gt;SecurityFilterChain&lt;/li&gt;
&lt;li&gt;UsernamePasswordAuthenticationFilter&lt;/li&gt;
&lt;li&gt;AuthenticationManager&lt;/li&gt;
&lt;li&gt;ProviderManager&lt;/li&gt;
&lt;li&gt;AuthenticationProvider&lt;/li&gt;
&lt;li&gt;UserDetailsService&lt;/li&gt;
&lt;li&gt;PasswordEncoder&lt;/li&gt;
&lt;li&gt;SecurityContextHolder&lt;/li&gt;
&lt;li&gt;AuthorizationFilter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At first glance, that looks like a lot.&lt;/p&gt;

&lt;p&gt;But if you group them by responsibility, it becomes manageable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Filters&lt;/em&gt; handle request interception&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Manager&lt;/em&gt; and providers handle authentication delegation&lt;/li&gt;
&lt;li&gt;&lt;em&gt;UserDetailsService&lt;/em&gt; and PasswordEncoder validate identity&lt;/li&gt;
&lt;li&gt;&lt;em&gt;SecurityContextHolder&lt;/em&gt; stores the authenticated user&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Authorization&lt;/em&gt; filters enforce access rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is really the whole story.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Simple Way to Remember the Flow&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use this line:&lt;/p&gt;

&lt;p&gt;Request comes in -&amp;gt; filter intercepts -&amp;gt; credentials extracted -&amp;gt; manager delegates -&amp;gt; provider authenticates -&amp;gt; context stores user -&amp;gt; authorization checks access -&amp;gt; controller runs&lt;/p&gt;

&lt;p&gt;If you remember that sentence, you already understand the internals better than many developers who use Spring Security every day.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Takeaway
&lt;/h2&gt;

&lt;p&gt;Spring Security works like a layered checkpoint system. It intercepts the request before your application code sees it, verifies identity using providers and encoders, stores the authenticated user in a security context, checks permissions, and only then allows the request to hit the controller. Once you understand that flow, the framework feels a lot less intimidating.&lt;/p&gt;

</description>
      <category>java</category>
      <category>spring</category>
      <category>springsecurity</category>
      <category>backend</category>
    </item>
  </channel>
</rss>
