<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Redis</title>
    <description>The latest articles on DEV Community by Redis (@redis).</description>
    <link>https://dev.to/redis</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F1284%2Fe3860d38-3900-46c6-a051-1fb66704157c.png</url>
      <title>DEV Community: Redis</title>
      <link>https://dev.to/redis</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/redis"/>
    <language>en</language>
    <item>
      <title>Building Reliable Agents with the Transactional Outbox Pattern and Redis Streams</title>
      <dc:creator>Ricardo Ferreira</dc:creator>
      <pubDate>Fri, 27 Mar 2026 15:06:01 +0000</pubDate>
      <link>https://dev.to/redis/building-reliable-agents-with-the-transactional-outbox-pattern-and-redis-streams-45e6</link>
      <guid>https://dev.to/redis/building-reliable-agents-with-the-transactional-outbox-pattern-and-redis-streams-45e6</guid>
      <description>&lt;p&gt;AI agents are pretty good at deciding what should happen next, given a well-defined business workflow. In the case of a customer support agent, for example, they can read a conversation, apply a policy, and return a response like "approve the refund" or "escalate this case." That part is exciting, and it is usually what gets demoed first. But the hard part starts right after the decision is made.&lt;/p&gt;

&lt;p&gt;In a real system, a decision only matters if the rest of the platform can trust it. If an agent decides a customer should get a refund, that decision still has to turn into real work across the rest of the application. The support case needs to be updated, and billing needs to issue the refund. The customer may need an email, and the CRM probably needs the updated status, too.&lt;/p&gt;

&lt;p&gt;If the app updates the case and then crashes before billing gets the event, you now have a case that says "refund approved" and a customer who never actually got refunded. That is the kind of bug that makes a system feel flaky even when the model made the right call. But the worst part is the damage to customer experience. I would be really mad at the company if this happened to me.&lt;/p&gt;

&lt;p&gt;For scenarios like this, the &lt;a href="https://microservices.io/patterns/data/transactional-outbox.html" rel="noopener noreferrer"&gt;Transactional Outbox&lt;/a&gt; pattern exists. Instead of treating "update the case" and "tell the rest of the system" as two separate operations, we commit them together and let the rest of the platform react asynchronously afterward. This pattern became fairly famous in the context of microservices, as they often need a reliable way to hand off tasks. I think the pattern is also useful for agents, because the fundamental problem is the same.&lt;/p&gt;

&lt;p&gt;In this post, I will discuss the Transactional Outbox pattern in the context of agents and provide an opinionated view of why I believe it is a best practice for agentic applications. I will discuss the pattern around the following question: once an agent makes a business decision, how can you ensure the rest of the system can rely on it?&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem is the handoff
&lt;/h2&gt;

&lt;p&gt;Developers often stress about designing systems that can survive unimaginable production incidents. In reality, you don't need a major outage or an exotic distributed-systems failure to witness the worst. Sometimes it is as simple as one service trying to do two related things in two separate steps: first it updates the business state, then it publishes an event for downstream systems.&lt;/p&gt;

&lt;p&gt;That looks harmless until something fails in between. If the state update succeeds and the event publish does not, the source of truth has moved forward, but the rest of the workflow has not.&lt;/p&gt;

&lt;p&gt;Here is that failure in one picture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiixo880tbxc2humoyx6u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiixo880tbxc2humoyx6u.png" alt="Diagram 1" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What makes this annoying is that nothing looks obviously broken at first. During an incident investigation, if someone checks the case record, it looks correct. The problem only shows up later when billing never acts, the customer complains, or support has to manually reconcile what happened. That is why I think this is not an AI problem. It is a handoff problem.&lt;/p&gt;
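&lt;p&gt;To make that failure window concrete, here is a small, self-contained Java sketch. Everything in it is simulated (the in-memory collections stand in for a database and a broker; none of these names come from a real system), but it shows how step one can land while step two never happens:&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.HashMap;

// Simulates the fragile "save state, then publish" sequence.
// The in-memory collections stand in for the database and the broker.
public final class DualWriteDemo {
    static final HashMap caseStore = new HashMap();    // business state
    static final ArrayList eventLog = new ArrayList(); // downstream events

    static void approveRefund(String caseId, boolean crashBeforePublish) {
        caseStore.put(caseId, "refund_approved");  // step 1: state update succeeds
        if (crashBeforePublish) {
            throw new IllegalStateException("process crashed");
        }
        eventLog.add("RefundApproved:" + caseId);  // step 2: never runs
    }

    public static void main(String[] args) {
        try {
            approveRefund("case-123", true);
        } catch (IllegalStateException e) {
            // the "crash": the process dies between the two writes
        }
        // The case now says "refund_approved", but no event was ever recorded.
        System.out.println("state=" + caseStore.get("case-123") + " events=" + eventLog.size());
    }
}
```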

&lt;h2&gt;
  
  
  Motivation for the Transactional Outbox pattern
&lt;/h2&gt;

&lt;p&gt;The Transactional Outbox pattern exists because "save state, then publish the event" is fragile by design. The pattern gives you a cleaner contract: when business state changes, the application also writes an outbox event &lt;strong&gt;in the same atomic operation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That one change removes the worst failure mode. You no longer end up in the state where the case changed, but the event silently disappeared. It also keeps the request path honest. The service does not have to directly coordinate billing, notifications, CRM sync, and everything else just to be correct.&lt;/p&gt;

&lt;p&gt;Instead, the request path only needs to guarantee one thing: the decision and the outbox event are committed together. Once that happens, everything else becomes recoverable instead of fragile.&lt;/p&gt;
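&lt;p&gt;In Redis terms, that commit boundary is just a transaction. As an illustration only (the key names and fields here are invented for this example), the whole handoff can be a single &lt;code&gt;MULTI&lt;/code&gt;/&lt;code&gt;EXEC&lt;/code&gt; block in &lt;code&gt;redis-cli&lt;/code&gt;:&lt;/p&gt;

```text
MULTI
HSET support:{tenant-acme}:case:case-123 status refund_approved
XADD support:{tenant-acme}:outbox * event_type RefundApproved case_id case-123
EXEC
```

&lt;p&gt;Both writes either happen together or not at all, which is exactly the contract the request path needs to guarantee.&lt;/p&gt;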

&lt;p&gt;That is why this pattern fits agentic systems so well. Agents make decisions that trigger follow-up work, but those decisions need a durable "this happened" moment before the rest of the system can safely react.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is "Just Retry the Publish" not enough?
&lt;/h2&gt;

&lt;p&gt;Whenever I bring up the Transactional Outbox pattern with developers, I often hear, "Why not just retry if the publish fails?" I think that happens because implementing the pattern correctly requires certain design decisions and technology choices, so the instinct is to look for a simpler alternative.&lt;/p&gt;

&lt;p&gt;For this reason, I like to stress the following: retries are reasonable until you look closely at where the failure occurs. Retries only help if the application still knows it has something to retry. If the process crashes after the state update but before the event is durably recorded anywhere, there is nothing left to retry.&lt;/p&gt;

&lt;p&gt;That is the key difference between retries and an outbox. Retries help you deliver an event that already exists, while the outbox ensures the event exists in the first place. Once you look at it that way, the pattern feels less like a ceremony and more like basic design principles. If the business state changes, the system needs a durable record of the event that describes that change.&lt;/p&gt;
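&lt;p&gt;The distinction is easier to see in code. The following Java sketch uses in-memory stand-ins (nothing here is a real client API) to show why retries work once the event durably exists: the relay only removes an event from the outbox after a successful publish, so a transient failure just means another pass:&lt;/p&gt;

```java
import java.util.ArrayDeque;
import java.util.ArrayList;

// Sketch of an outbox relay: events already exist durably (the queue),
// so a failed publish can simply be retried later. Names are illustrative.
public final class OutboxRelay {
    final ArrayDeque outbox = new ArrayDeque();   // stand-in for the durable outbox
    final ArrayList delivered = new ArrayList();  // stand-in for the broker
    int failuresLeft;                             // simulates transient publish failures

    OutboxRelay(int failuresLeft) { this.failuresLeft = failuresLeft; }

    void publish(Object event) {
        if (failuresLeft > 0) {
            failuresLeft = failuresLeft - 1;
            throw new IllegalStateException("broker unavailable");
        }
        delivered.add(event);
    }

    // Drain loop: an event is only removed after a successful publish.
    void drain() {
        while (!outbox.isEmpty()) {
            Object event = outbox.peek();
            try {
                publish(event);
                outbox.poll(); // remove only once delivery succeeded
            } catch (IllegalStateException e) {
                // leave the event in place; it will be retried on the next pass
            }
        }
    }
}
```

&lt;p&gt;If the event had never been written to the outbox in the first place, there would be nothing for this loop to retry. That is the whole argument.&lt;/p&gt;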

&lt;h2&gt;
  
  
  Redis Streams is great for this pattern
&lt;/h2&gt;

&lt;p&gt;Redis Streams are a good fit for this kind of outbox because they already behave like the commit log we want. You can append events to them, consume them in order, track what is pending, and let different consumer groups process the same stream independently. That matters because the outbox is not really a queue in the narrow sense. It is a commit log for business events.&lt;/p&gt;

&lt;p&gt;Admittedly, the Transactional Outbox pattern is often implemented with Apache Kafka and change-data-capture tools such as Debezium, and that is where it became best known. I have helped many developers implement the pattern with Kafka, and it works. But having spent years with Kafka, I can also say that the implementation effort sometimes exceeds the problem it was meant to solve: you end up spending more time dealing with Kafka's inherent complexity than with the actual problem.&lt;/p&gt;

&lt;p&gt;Redis Streams, on the other hand, makes that pretty natural. A single event can be appended once and then processed independently by several downstream concerns. The other reason Streams fit well is that they sit comfortably inside Redis. If your support case state also lives in Redis, the state change and the outbox append can share one commit boundary.&lt;/p&gt;

&lt;p&gt;That part is important. The pattern is strongest when the business state and the outbox live in the same datastore, because that gives you a single atomic write instead of a dual-write problem wearing different clothes. With Kafka, you would need to handle two different distributed systems: the commit log itself and the data store where the update must occur.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diving deep into the architecture
&lt;/h2&gt;

&lt;p&gt;For this example, the support case state and the outbox both live in Redis. The current case state is stored in a hash, and the outbox is stored in a Redis Stream.&lt;/p&gt;

&lt;p&gt;A case key might look like &lt;code&gt;support:{tenant-acme}:case:case-123&lt;/code&gt;, while the outbox stream might be &lt;code&gt;support:{tenant-acme}:outbox&lt;/code&gt;. The hash tags here matter because you must be intentional about where the data is stored in Redis. During development, you may work against a single server, which is effectively a single shard, so the data naturally lives in the same place. In production, however, you may run a clustered Redis environment with multiple shards.&lt;/p&gt;

&lt;p&gt;The shared hash tag keeps both keys in the same slot in clustered Redis, &lt;strong&gt;which is what lets them participate in the same transaction&lt;/strong&gt;. That gives us a clean split of responsibilities: the case record tells us what is true now, and the outbox stream tells us what happened and what the rest of the platform still needs to process. A careless choice of key prefixes can quietly break this guarantee, so it deserves deliberate thought.&lt;/p&gt;
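&lt;p&gt;The routing rule itself is simple enough to sketch. Redis Cluster hashes only the substring between the first &lt;code&gt;{&lt;/code&gt; and the next &lt;code&gt;}&lt;/code&gt; (when that substring is non-empty) to pick a slot; this illustrative helper shows why the case key and the outbox key land together:&lt;/p&gt;

```java
// Redis Cluster hashes only the substring between the first "{" and the
// next "}" (when non-empty) to pick a slot. Two keys that share that
// hash tag are guaranteed to land on the same shard.
public final class HashTags {
    static String routingPart(String key) {
        int open = key.indexOf('{');
        if (open == -1) {
            return key;                    // no hash tag: the whole key is hashed
        }
        int close = key.indexOf('}', open + 1);
        if (close == -1) {
            return key;                    // no closing brace: the whole key is hashed
        }
        if (close == open + 1) {
            return key;                    // empty tag "{}": the whole key is hashed
        }
        return key.substring(open + 1, close);
    }
}
```

&lt;p&gt;Both &lt;code&gt;support:{tenant-acme}:case:case-123&lt;/code&gt; and &lt;code&gt;support:{tenant-acme}:outbox&lt;/code&gt; reduce to &lt;code&gt;tenant-acme&lt;/code&gt;, so they map to the same slot.&lt;/p&gt;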

&lt;p&gt;From there, downstream concerns consume the stream through their own consumer groups. Billing can issue the refund, notifications can contact the customer, and CRM sync can update external systems, all without forcing the support service to orchestrate them directly in the request path.&lt;/p&gt;
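&lt;p&gt;For a flavor of what that looks like on the wire, here is an illustrative &lt;code&gt;redis-cli&lt;/code&gt; session for the billing consumer group (the entry ID in the &lt;code&gt;XACK&lt;/code&gt; call is made up):&lt;/p&gt;

```text
XGROUP CREATE support:{tenant-acme}:outbox billing-cg 0-0
XREADGROUP GROUP billing-cg worker-1 COUNT 10 BLOCK 5000 STREAMS support:{tenant-acme}:outbox >
XACK support:{tenant-acme}:outbox billing-cg 1526569495631-0
```

&lt;p&gt;Notifications and CRM sync would create their own groups on the same stream and consume the same events independently, each tracking its own pending entries.&lt;/p&gt;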

&lt;p&gt;That flow looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97a07t8o3e8h4p53zy4s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97a07t8o3e8h4p53zy4s.png" alt="Diagram 2" width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The thing I like most about the Transactional Outbox pattern is that it keeps responsibilities clear. The support service is responsible for making the decision durable, and the rest of the platform is responsible for responding to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trade-offs worth considering
&lt;/h2&gt;

&lt;p&gt;The basic implementation of the pattern is simple. The design choices around it are where things get interesting. One of the first questions you must ask is where your source of truth lives. If the support case is also in Redis, the case update and the outbox append can share one transaction. If the case lives somewhere else and Redis only holds the stream, you are back in dual-write territory.&lt;/p&gt;

&lt;p&gt;Another big choice is partitioning. It is tempting to imagine a single global outbox stream for the whole application, but that often becomes awkward in a clustered Redis setup. A per-tenant stream is often a better balance. It keeps related events together, provides useful ordering, and avoids making every transactional write depend on a single global key. It also makes querying and data retrieval a bit easier during investigation scenarios.&lt;/p&gt;

&lt;p&gt;Consumer isolation is another trade-off worth saying out loud. One consumer group per downstream concern is a very nice model operationally, because billing, notifications, and CRM sync can all move at their own pace. The flip side is that you now own several background workflows, each with its own lag, retries, health, and recovery behavior to think about. This is where the world of microservices crosses paths again with agentic systems. An agent is not just code and resources; it also brings operational complexity that someone must own.&lt;/p&gt;

&lt;p&gt;Retention matters too. An outbox is a log, and logs grow. If you trim too aggressively, you lose the replay window and the investigation history. If you never trim at all, the stream just keeps growing and eventually becomes an operational problem in its own right. Deciding how large the stream is allowed to grow must be a discussion that takes place before the app even goes to production. Not an afterthought.&lt;/p&gt;
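&lt;p&gt;Redis gives you a few levers here. For example, the stream can be capped approximately at write time or trimmed explicitly; the length below is just a placeholder, so pick one that matches the replay window you need:&lt;/p&gt;

```text
XADD support:{tenant-acme}:outbox MAXLEN ~ 1000000 * event_type RefundApproved case_id case-123
XTRIM support:{tenant-acme}:outbox MAXLEN ~ 1000000
```

&lt;p&gt;The &lt;code&gt;~&lt;/code&gt; asks for approximate trimming, which is cheaper than enforcing an exact length on every write.&lt;/p&gt;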

&lt;p&gt;Durability is another place where the architecture gets real fast. If the outbox carries important business decisions like refunds, escalations, or account changes, Redis is no longer "just a cache" in this design. It is part of the system's correctness model. You must treat Redis as a single source of truth, and as such, think carefully about how to handle details like replication, failover, and geographic disasters.&lt;/p&gt;

&lt;p&gt;Finally, there is idempotency. The outbox makes the handoff reliable, but it does not magically give downstream effects exactly-once behavior in the business sense. If a worker crashes after reading but before acknowledging, another worker may retry the same event later, so the side effect needs to be safe to run more than once. The usual instinct is to write the worker as a function that hooks into the stream, pulls the latest records, and processes them as if the data were simply mutable. It is not: stream entries must be treated as immutable records of what happened, and deduplicated by event ID when they are redelivered.&lt;/p&gt;
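&lt;p&gt;One common way to get that safety is a durable set of processed event IDs checked before the side effect runs. This Java sketch simulates the idea with in-memory stand-ins (the names and storage are illustrative, not a real billing API):&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.HashSet;

// Sketch of an idempotent worker: a durable set of processed event IDs
// makes the side effect safe to run more than once. Names are illustrative.
public final class IdempotentWorker {
    final HashSet processedEventIds = new HashSet(); // stand-in for durable dedup storage
    final ArrayList refundsIssued = new ArrayList(); // stand-in for the real side effect

    void handle(String eventId, String refundId) {
        if (!processedEventIds.add(eventId)) {
            return; // already processed: this is a redelivery, not a new refund
        }
        refundsIssued.add(refundId); // the side effect runs once per event ID
    }
}
```

&lt;p&gt;A redelivered event hits the dedup check and becomes a no-op, so the customer is refunded once no matter how many times the stream entry is read.&lt;/p&gt;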

&lt;h2&gt;
  
  
  Okay, let's see some code
&lt;/h2&gt;

&lt;p&gt;This post is not meant to be a complete implementation reference, but I know that, as a developer, looking at code helps make the concepts concrete. I will keep the example light on details so the design principles stand out; I'm sure your coding agent can help with your actual final code. I will use Java because it comes naturally to me, but feel free to ask your coding agent to translate it into another language.&lt;/p&gt;

&lt;p&gt;Let's start by looking at a runtime helper class that instantiates Jedis, a Redis client for Java:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.RedisClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.UnifiedJedis&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RuntimeSupport&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;UnifiedJedis&lt;/span&gt; &lt;span class="nf"&gt;createJedisFromEnv&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;redisHost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getOrDefault&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"REDIS_HOST"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"localhost"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;redisPort&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Integer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parseInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getOrDefault&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"REDIS_PORT"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"6379"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;RedisClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hostAndPort&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redisHost&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;redisPort&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let's look at how key and consumer group names are handled in a small constants class instead of scattering strings through the code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SupportConstants&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;STREAM_GROUP_START_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0-0"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;BILLING_GROUP_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"billing-cg"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;NOTIFICATIONS_GROUP_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"notifications-cg"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;CRM_SYNC_GROUP_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"crm-sync-cg"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nf"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the Redis keys themselves, a small helper record keeps the slotting decision obvious:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;SupportKeys&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;caseKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;SupportKeys&lt;/span&gt; &lt;span class="nf"&gt;forCase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;hashTag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"{"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"}"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;SupportKeys&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"support:"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hashTag&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;":case:"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"support:"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hashTag&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;":outbox"&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The core write path is where the architectural idea becomes real. When the support service accepts the agent's decision, it updates the case state and appends a &lt;code&gt;RefundApproved&lt;/code&gt; event in a single Redis transaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.AbstractTransaction&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.Response&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.UnifiedJedis&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.Instant&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Objects&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.UUID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RefundApprovalService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UnifiedJedis&lt;/span&gt; &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;RefundApprovalService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;UnifiedJedis&lt;/span&gt; &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;jedis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Objects&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requireNonNull&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"jedis must not be null"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;RefundCommitted&lt;/span&gt; &lt;span class="nf"&gt;approveRefund&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RefundDecision&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;SupportKeys&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SupportKeys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forCase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;caseFields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
        &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"case_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"customer_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;customerId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"refund_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;refundId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"status"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"refund_approved"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"decision_source"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"support-agent"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"updated_at"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decidedAt&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

        &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;outboxFields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LinkedHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"event_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eventId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"event_type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"RefundApproved"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"case_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"customer_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;customerId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"refund_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;refundId&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"decision_source"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"support-agent"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"occurred_at"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decidedAt&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AbstractTransaction&lt;/span&gt; &lt;span class="n"&gt;redisTx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;redisTx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hset&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;caseKey&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;caseFields&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;streamEntryId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
                    &lt;span class="n"&gt;redisTx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;xadd&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="nc"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;NEW_ENTRY&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outboxFields&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;execResults&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redisTx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;exec&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;execResults&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;IllegalStateException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Refund approval transaction aborted"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RefundCommitted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
                    &lt;span class="n"&gt;decision&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;eventId&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
                    &lt;span class="n"&gt;streamEntryId&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;RefundDecision&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;customerId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;refundId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;eventId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;Instant&lt;/span&gt; &lt;span class="n"&gt;decidedAt&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;RefundDecision&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;customerId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;refundId&lt;/span&gt;
        &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;RefundDecision&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;customerId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;refundId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="no"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
                    &lt;span class="nc"&gt;Instant&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;RefundCommitted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;caseId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;eventId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;streamEntryId&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one method is the whole architectural point made concrete. If the transaction does not complete, neither the case update nor the outbox event exists. If it does complete, both exist. That is the durability boundary that the rest of the workflow can rely on.&lt;/p&gt;

&lt;p&gt;Here is the same moment as a diagram:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mhvhxgfdv7dwlfa79ht.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mhvhxgfdv7dwlfa79ht.png" alt="Diagram 3" width="800" height="575"&gt;&lt;/a&gt;&lt;/p&gt;
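&lt;p&gt;That durability boundary covers the write side only. Delivery out of the stream is at-least-once: a worker can be handed the same entry again if it crashes between doing the work and acknowledging it. This is one reason the producer stamps every outbox event with an &lt;code&gt;event_id&lt;/code&gt;; consumers can deduplicate on it. Below is a minimal, hypothetical in-memory sketch of such a guard (the class name and method are illustrative, and a real deployment would persist the seen ids, for example in a Redis set, so deduplication survives restarts):&lt;/p&gt;

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical dedup guard for the billing side. Because stream delivery
// is at-least-once, the same RefundApproved event may be redelivered if a
// worker crashes after issueRefund but before xack. Tracking the event_id
// that the producer wrote into the outbox lets billing refund only once.
public final class ProcessedEvents {
    private final Set seen = new HashSet(); // holds event-id strings

    // Set.add returns false when the id was already recorded, so the
    // caller should skip the refund on a repeated delivery.
    public synchronized boolean firstDelivery(String eventId) {
        return seen.add(eventId);
    }
}
```

&lt;p&gt;A consumer would call &lt;code&gt;firstDelivery&lt;/code&gt; with the &lt;code&gt;event_id&lt;/code&gt; field before invoking the billing gateway, and still acknowledge the entry either way.&lt;/p&gt;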

&lt;p&gt;On the consumer side, the worker will act on the message written to the stream.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.UnifiedJedis&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.exceptions.JedisDataException&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.params.XReadGroupParams&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;redis.clients.jedis.resps.StreamEntry&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.ArrayList&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Collections&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.List&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Objects&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;static&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;clients&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;XREADGROUP_UNDELIVERED_ENTRY&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BillingConsumer&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;StreamEntryID&lt;/span&gt; &lt;span class="no"&gt;PENDING_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STREAM_GROUP_START_ID&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;StreamEntryID&lt;/span&gt; &lt;span class="no"&gt;NEW_ENTRY_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;XREADGROUP_UNDELIVERED_ENTRY&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UnifiedJedis&lt;/span&gt; &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;BillingGateway&lt;/span&gt; &lt;span class="n"&gt;billingGateway&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;consumerName&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;BillingConsumer&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;UnifiedJedis&lt;/span&gt; &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;BillingGateway&lt;/span&gt; &lt;span class="n"&gt;billingGateway&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;consumerName&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;jedis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Objects&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requireNonNull&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"jedis must not be null"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;billingGateway&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Objects&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requireNonNull&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;billingGateway&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"billingGateway must not be null"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;consumerName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Objects&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;requireNonNull&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;consumerName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"consumerName must not be null"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;InterruptedException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outboxKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SupportKeys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;forCase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenantId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"unused"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;createConsumerGroup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentThread&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;isInterrupted&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pendingEntries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;readGroup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;PENDING_ID&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;pendingEntries&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;processEntries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pendingEntries&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;newEntries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;readGroup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;NEW_ENTRY_ID&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;newEntries&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;processEntries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newEntries&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200L&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;createConsumerGroup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;xgroupCreate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="nc"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BILLING_GROUP_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;STREAM_GROUP_START_ID&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                    &lt;span class="kc"&gt;true&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JedisDataException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMessage&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"BUSYGROUP"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;readGroup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;StreamEntryID&lt;/span&gt; &lt;span class="n"&gt;streamEntryID&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;XReadGroupParams&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;XReadGroupParams&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;xReadGroupParams&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Entry&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamEntry&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rawEntries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;xreadGroup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BILLING_GROUP_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;consumerName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;streamEntryID&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;parseEntries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawEntries&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;processEntries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="s"&gt;"RefundApproved"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"event_type"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;xack&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="nc"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BILLING_GROUP_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
                &lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;

            &lt;span class="n"&gt;billingGateway&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;issueRefund&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"refund_id"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"customer_id"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"event_id"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;

            &lt;span class="n"&gt;jedis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;xack&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;outboxKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="nc"&gt;SupportConstants&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BILLING_GROUP_NAME&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;StreamEntryID&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;parseEntries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Entry&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamEntry&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rawEntries&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawEntries&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;rawEntries&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;emptyList&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ArrayList&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Entry&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StreamEntry&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;streamData&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rawEntries&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StreamEntry&lt;/span&gt; &lt;span class="n"&gt;streamEntry&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;streamData&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getValue&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                        &lt;span class="n"&gt;streamEntry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt;
                        &lt;span class="n"&gt;streamEntry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getFields&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;));&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;StreamMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;BillingGateway&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;issueRefund&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;refundId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;customerId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;What I like most about the Transactional Outbox pattern is that it respects the actual shape of agentic systems. Agents are good at deciding what should happen next in a workflow, but the platform is still responsible for turning that decision into durable state and letting the rest of the workflow react safely. The pattern gives you a clean handoff for that.&lt;/p&gt;

&lt;p&gt;Redis Streams makes the pattern practical when your application state and the outbox both live in Redis. That doesn't make the design free of trade-offs: you still need to think about partitioning, retention, durability, lag, and idempotency. What it does eliminate is the dual-write problem. It gives you a system where an agent's decision becomes a durable fact before the rest of the platform starts depending on it.&lt;/p&gt;
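&lt;p&gt;To make the consumer side of that handoff concrete, here is what reading the outbox through a consumer group looks like from &lt;code&gt;redis-cli&lt;/code&gt;. The stream name &lt;code&gt;outbox&lt;/code&gt;, the group name &lt;code&gt;workers&lt;/code&gt;, and the entry ID below are illustrative, not taken from the article's code:&lt;/p&gt;

```shell
# Create a consumer group on the outbox stream (MKSTREAM creates the
# stream if it does not exist yet)
redis-cli XGROUP CREATE outbox workers '$' MKSTREAM

# Read up to 10 new entries as consumer "worker-1"; entries that are
# never acknowledged stay pending and get redelivered, which is exactly
# why the handler must be idempotent
redis-cli XREADGROUP GROUP workers worker-1 COUNT 10 BLOCK 2000 STREAMS outbox '>'

# Acknowledge an entry only after its side effects are durably applied
redis-cli XACK outbox workers 1726000000000-0
```

&lt;p&gt;These commands assume a reachable Redis server; in an application you would issue the same calls through your client library of choice.&lt;/p&gt;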

&lt;p&gt;Applying the Transactional Outbox pattern to your agents can be the difference between an agent that looks clever in a demo and a system you can actually trust.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>redis</category>
      <category>ai</category>
      <category>eventdriven</category>
    </item>
    <item>
      <title>Syncing Data from Amazon DynamoDB to Redis with Apache SeaTunnel</title>
      <dc:creator>Ricardo Ferreira</dc:creator>
      <pubDate>Fri, 09 Jan 2026 17:28:17 +0000</pubDate>
      <link>https://dev.to/redis/syncing-data-from-amazon-dynamodb-to-redis-with-apache-seatunnel-4njn</link>
      <guid>https://dev.to/redis/syncing-data-from-amazon-dynamodb-to-redis-with-apache-seatunnel-4njn</guid>
      <description>&lt;p&gt;Let's say you have customer data stored in Amazon DynamoDB, which serves as the single source of truth for your application. However, now you need that same data in Redis for lightning-fast caching, real-time analytics, or perhaps to power a recommendation engine with vector search capabilities. As an example, consider this table named &lt;code&gt;mySourceTable&lt;/code&gt; stored at DynamoDB, which currently contains two items.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr58u1hxxy47v87yj9emi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr58u1hxxy47v87yj9emi.png" alt="DynamoDB Source Table" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The challenge? How do you keep them in sync without writing a massive pile of code?&lt;/p&gt;

&lt;p&gt;Personally, my go-to solution is &lt;a href="https://redis.io/data-integration/" rel="noopener noreferrer"&gt;Redis Data Integration (RDI)&lt;/a&gt;, as I have demonstrated and explained why in &lt;a href="https://dev.to/redis/from-postgresql-to-redis-accelerating-your-applications-with-redis-data-integration-3dn2"&gt;this other post&lt;/a&gt;. Unfortunately, at the time of writing, RDI does not support Amazon DynamoDB as a source. That leaves developers with the option to write complex data integration pipelines themselves or to invest in costly (and often overpriced) ETL solutions. Neither option is pleasant.&lt;/p&gt;

&lt;p&gt;But there is another way.&lt;/p&gt;

&lt;p&gt;Enter &lt;a href="https://seatunnel.apache.org/" rel="noopener noreferrer"&gt;Apache SeaTunnel&lt;/a&gt;, an open source data integration project that makes this sync as easy as writing a config file. In this blog post, I will dig into the project and explain, step by step, how to synchronize data from Amazon DynamoDB to Redis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Apache SeaTunnel: Your Open Source Data Pipeline Buddy
&lt;/h2&gt;

&lt;p&gt;Apache SeaTunnel is a multimodal, high-performance, distributed data integration tool built for massive datasets, supporting both batch and streaming modes. Think of it as the universal translator for your data systems: it speaks DynamoDB, Redis, and over 100 other data systems fluently.&lt;/p&gt;

&lt;p&gt;What makes SeaTunnel special for this use case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero Code&lt;/strong&gt;: Define your pipeline in simple configuration files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Distributed&lt;/strong&gt;: Scale syncing horizontally when your data grows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fault Tolerant&lt;/strong&gt;: Built-in checkpointing ensures no data loss&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can run SeaTunnel in two distinct configurations: standalone mode for one-off data movement, or cluster mode, which keeps a data pipeline infrastructure up and running, always ready to execute new jobs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: The Quick Hit (Standalone Mode)
&lt;/h3&gt;

&lt;p&gt;This option is perfect for one-time migrations, local development, or scheduled batch jobs. One command, and you're done.&lt;/p&gt;

&lt;p&gt;Your first step is to create your pipeline configuration (&lt;code&gt;jobs/my.config&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hocon"&gt;&lt;code&gt;&lt;span class="nl"&gt;env&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;parallelism&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;job.mode&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BATCH"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="nl"&gt;source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;Amazondynamodb&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="k"&gt;url&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://dynamodb.us-east-1.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;region&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;access_key_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YOUR_ACCESS_KEY"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;secret_access_key&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YOUR_SECRET_KEY"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;table&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mySourceTable"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;schema&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;fields&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;customerId&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;int&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;customerName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;string&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;address&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;string&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;  
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="nl"&gt;sink&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;Redis&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;host&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;host.docker.internal&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# or your Redis host&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;port&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;support_custom_key&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;key&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"customer:{customerId}"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;data_type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;hash&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
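&lt;p&gt;Two sink options worth knowing about if your Redis deployment is not wide open: &lt;code&gt;auth&lt;/code&gt; for password authentication and &lt;code&gt;expire&lt;/code&gt; for giving synced keys a TTL. Both come from the Redis sink connector's option list, but verify them against the docs for your SeaTunnel version; the values below are placeholders:&lt;/p&gt;

```hocon
sink {
  Redis {
    host = host.docker.internal
    port = 6379
    auth = "YOUR_REDIS_PASSWORD"  # only needed when Redis requires a password
    support_custom_key = true
    key = "customer:{customerId}"
    data_type = hash
    expire = 86400                # optional TTL (seconds) for each synced key
  }
}
```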



&lt;p&gt;Run it with a single Docker command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;/jobs:/config &lt;span class="se"&gt;\&lt;/span&gt;
  apache/seatunnel:2.3.12 &lt;span class="se"&gt;\&lt;/span&gt;
  ./bin/seatunnel.sh &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="nb"&gt;local&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; /config/my.config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command starts a new container with a standalone instance of SeaTunnel, which executes the job described in the file &lt;code&gt;my.config&lt;/code&gt;. You should see an output like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;2026-01-09 16:57:50,786 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Loading configuration &lt;span class="s1"&gt;'/opt/seatunnel/config/seatunnel.yaml'&lt;/span&gt; from System property &lt;span class="s1"&gt;'seatunnel.config'&lt;/span&gt;
2026-01-09 16:57:50,788 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Using configuration file at /opt/seatunnel/config/seatunnel.yaml
2026-01-09 16:57:50,789 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.c.c.SeaTunnelConfig   &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - seatunnel.home is /opt/seatunnel
2026-01-09 16:57:50,830 INFO  &lt;span class="o"&gt;[&lt;/span&gt;amlSeaTunnelDomConfigProcessor] &lt;span class="o"&gt;[&lt;/span&gt;main] - Dynamic slot is enabled, the schedule strategy is &lt;span class="nb"&gt;set &lt;/span&gt;to REJECT
2026-01-09 16:57:50,830 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Loading configuration &lt;span class="s1"&gt;'/opt/seatunnel/config/hazelcast.yaml'&lt;/span&gt; from System property &lt;span class="s1"&gt;'hazelcast.config'&lt;/span&gt;
2026-01-09 16:57:50,830 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Using configuration file at /opt/seatunnel/config/hazelcast.yaml
2026-01-09 16:57:50,962 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Loading configuration &lt;span class="s1"&gt;'/opt/seatunnel/config/hazelcast-client.yaml'&lt;/span&gt; from System property &lt;span class="s1"&gt;'hazelcast.client.config'&lt;/span&gt;
2026-01-09 16:57:50,962 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Using configuration file at /opt/seatunnel/config/hazelcast-client.yaml
2026-01-09 16:57:50,985 WARN  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.AddressPicker           &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;LOCAL] &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] You configured your member address as host name. Please be aware of that your dns can be spoofed. Make sure that your dns configurations are correct.
2026-01-09 16:57:50,985 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.AddressPicker           &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;LOCAL] &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Resolving domain name &lt;span class="s1"&gt;'localhost'&lt;/span&gt; to address&lt;span class="o"&gt;(&lt;/span&gt;es&lt;span class="o"&gt;)&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;127.0.0.1, 0:0:0:0:0:0:0:1]
2026-01-09 16:57:50,986 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.AddressPicker           &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;LOCAL] &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Interfaces is disabled, trying to pick one address from TCP-IP config addresses: &lt;span class="o"&gt;[&lt;/span&gt;localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1]
SLF4J: Failed to load class &lt;span class="s2"&gt;"org.slf4j.impl.StaticLoggerBinder"&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
SLF4J: Defaulting to no-operation &lt;span class="o"&gt;(&lt;/span&gt;NOP&lt;span class="o"&gt;)&lt;/span&gt; logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder &lt;span class="k"&gt;for &lt;/span&gt;further details.
2026-01-09 16:57:51,189 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.SeaTunnelServer     &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - SeaTunnel server start...
2026-01-09 16:57:51,191 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.system                    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Based on Hazelcast IMDG version: 5.1.0 &lt;span class="o"&gt;(&lt;/span&gt;20220228 - 21f20e7&lt;span class="o"&gt;)&lt;/span&gt;
2026-01-09 16:57:51,191 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.system                    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Cluster name: seatunnel-420804
2026-01-09 16:57:51,191 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.system                    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] 

 _____               _____                             _ 
/  ___|             |_   _|                           | |
&lt;span class="se"&gt;\ &lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;   ___   __ _   | |   _   _  _ __   _ __    ___ | |
 &lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nt"&gt;--&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="se"&gt;\ &lt;/span&gt;/ _ &lt;span class="se"&gt;\ &lt;/span&gt;/ _&lt;span class="sb"&gt;`&lt;/span&gt; |  | |  | | | &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;'_ \ | '&lt;/span&gt;_ &lt;span class="se"&gt;\ &lt;/span&gt; / _ &lt;span class="se"&gt;\|&lt;/span&gt; |
/&lt;span class="se"&gt;\_&lt;/span&gt;_/ /|  __/| &lt;span class="o"&gt;(&lt;/span&gt;_| |  | |  | |_| &lt;span class="o"&gt;||&lt;/span&gt; | | &lt;span class="o"&gt;||&lt;/span&gt; | | &lt;span class="o"&gt;||&lt;/span&gt;  __/| |
&lt;span class="se"&gt;\_&lt;/span&gt;___/  &lt;span class="se"&gt;\_&lt;/span&gt;__| &lt;span class="se"&gt;\_&lt;/span&gt;_,_|  &lt;span class="se"&gt;\_&lt;/span&gt;/   &lt;span class="se"&gt;\_&lt;/span&gt;_,_||_| |_||_| |_| &lt;span class="se"&gt;\_&lt;/span&gt;__||_|


2026-01-09 16:57:51,191 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.system                    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Copyright © 2021-2024 The Apache Software Foundation. Apache SeaTunnel, SeaTunnel, and its feather logo are trademarks of The Apache Software Foundation.
2026-01-09 16:57:51,191 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.system                    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Integrity Checker is disabled. Fail-fast on corrupted executables will not be performed.
To &lt;span class="nb"&gt;enable &lt;/span&gt;integrity checker &lt;span class="k"&gt;do &lt;/span&gt;one of the following: 
  - Change member config using Java API: config.setIntegrityCheckerEnabled&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  - Change XML/YAML configuration property: Set hazelcast.integrity-checker.enabled to &lt;span class="nb"&gt;true&lt;/span&gt;
  - Add system property: &lt;span class="nt"&gt;-Dhz&lt;/span&gt;.integritychecker.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;Hazelcast embedded, works only when loading config via Config.load&lt;span class="o"&gt;)&lt;/span&gt;
  - Add environment variable: &lt;span class="nv"&gt;HZ_INTEGRITYCHECKER_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;recommended when running container image. For Hazelcast embedded, works only when loading config via Config.load&lt;span class="o"&gt;)&lt;/span&gt;
2026-01-09 16:57:51,193 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.system                    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] The Jet engine is disabled.
To &lt;span class="nb"&gt;enable &lt;/span&gt;the Jet engine on the members, &lt;span class="k"&gt;do &lt;/span&gt;one of the following:
  - Change member config using Java API: config.getJetConfig&lt;span class="o"&gt;()&lt;/span&gt;.setEnabled&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  - Change XML/YAML configuration property: Set hazelcast.jet.enabled to &lt;span class="nb"&gt;true&lt;/span&gt;
  - Add system property: &lt;span class="nt"&gt;-Dhz&lt;/span&gt;.jet.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;Hazelcast embedded, works only when loading config via Config.load&lt;span class="o"&gt;)&lt;/span&gt;
  - Add environment variable: &lt;span class="nv"&gt;HZ_JET_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;recommended when running container image. For Hazelcast embedded, works only when loading config via Config.load&lt;span class="o"&gt;)&lt;/span&gt;
2026-01-09 16:57:51,331 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.s.security                &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Enable DEBUG/FINE log level &lt;span class="k"&gt;for &lt;/span&gt;log category com.hazelcast.system.security  or use &lt;span class="nt"&gt;-Dhazelcast&lt;/span&gt;.security.recommendations system property to see 🔒 security recommendations and the status of current config.
2026-01-09 16:57:51,392 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.SeaTunnelNodeContext] &lt;span class="o"&gt;[&lt;/span&gt;main] - Using LiteNodeDropOutTcpIpJoiner TCP/IP discovery
2026-01-09 16:57:51,393 WARN  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.CPSubsystem             &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] CP Subsystem is not enabled. CP data structures will operate &lt;span class="k"&gt;in &lt;/span&gt;UNSAFE mode! Please note that UNSAFE mode will not provide strong consistency guarantees.
2026-01-09 16:57:51,462 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.c.DefaultClassLoaderService] &lt;span class="o"&gt;[&lt;/span&gt;main] - start classloader service with cache mode
2026-01-09 16:57:51,464 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Loading configuration &lt;span class="s1"&gt;'/opt/seatunnel/config/seatunnel.yaml'&lt;/span&gt; from System property &lt;span class="s1"&gt;'seatunnel.config'&lt;/span&gt;
2026-01-09 16:57:51,464 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Using configuration file at /opt/seatunnel/config/seatunnel.yaml
2026-01-09 16:57:51,466 INFO  &lt;span class="o"&gt;[&lt;/span&gt;amlSeaTunnelDomConfigProcessor] &lt;span class="o"&gt;[&lt;/span&gt;main] - Dynamic slot is enabled, the schedule strategy is &lt;span class="nb"&gt;set &lt;/span&gt;to REJECT
2026-01-09 16:57:51,466 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Loading configuration &lt;span class="s1"&gt;'/opt/seatunnel/config/hazelcast.yaml'&lt;/span&gt; from System property &lt;span class="s1"&gt;'hazelcast.config'&lt;/span&gt;
2026-01-09 16:57:51,466 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.c.AbstractConfigLocator &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Using configuration file at /opt/seatunnel/config/hazelcast.yaml
2026-01-09 16:57:51,471 WARN  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.TaskExecutionService] &lt;span class="o"&gt;[&lt;/span&gt;pool-3-thread-1] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] The Node is not ready yet, Node state STARTING,looking forward to the next scheduling
2026-01-09 16:57:51,471 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.TaskExecutionService] &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Created new BusWork : 1257526338
2026-01-09 16:57:51,474 WARN  &lt;span class="o"&gt;[&lt;/span&gt;a.s.e.s.s.s.DefaultSlotService] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.seaTunnel.slotService.thread] - failed send heartbeat to resource manager, will retry later. this address: &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801
2026-01-09 16:57:51,475 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.CoordinatorService  &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Start pending job schedule thread
2026-01-09 16:57:51,554 WARN  &lt;span class="o"&gt;[&lt;/span&gt;o.a.h.u.NativeCodeLoader      &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Unable to load native-hadoop library &lt;span class="k"&gt;for &lt;/span&gt;your platform... using builtin-java classes where applicable
2026-01-09 16:57:51,607 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.CoordinatorService  &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;pool-7-thread-1] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] 
&lt;span class="k"&gt;***********************************************&lt;/span&gt;
     CoordinatorService Thread Pool Status
&lt;span class="k"&gt;***********************************************&lt;/span&gt;
activeCount               :                   1
corePoolSize              :                  10
maximumPoolSize           :          2147483647
poolSize                  :                   1
completedTaskCount        :                   0
taskCount                 :                   1
&lt;span class="k"&gt;***********************************************&lt;/span&gt;

2026-01-09 16:57:51,612 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.JettyService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - SeaTunnel REST service will start on port 8080
2026-01-09 16:57:51,617 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.u.log           &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Logging initialized @1069ms to org.apache.seatunnel.shade.org.eclipse.jetty.util.log.Slf4jLog
2026-01-09 16:57:51,643 WARN  &lt;span class="o"&gt;[&lt;/span&gt;a.s.s.o.e.j.s.h.ContextHandler] &lt;span class="o"&gt;[&lt;/span&gt;main] - Empty contextPath
2026-01-09 16:57:51,652 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.s.Server        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - jetty-9.4.56.v20240826&lt;span class="p"&gt;;&lt;/span&gt; built: 2024-08-26T17:15:05.868Z&lt;span class="p"&gt;;&lt;/span&gt; git: ec6782ff5ead824dabdcf47fa98f90a4aedff401&lt;span class="p"&gt;;&lt;/span&gt; jvm 1.8.0_342-b07
2026-01-09 16:57:51,665 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.s.session       &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - DefaultSessionIdManager &lt;span class="nv"&gt;workerName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;node0
2026-01-09 16:57:51,665 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.s.session       &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - No SessionScavenger &lt;span class="nb"&gt;set&lt;/span&gt;, using defaults
2026-01-09 16:57:51,666 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.s.session       &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - node0 Scavenging every 660000ms
2026-01-09 16:57:51,727 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.s.o.e.j.s.h.ContextHandler] &lt;span class="o"&gt;[&lt;/span&gt;main] - Started o.a.s.s.o.e.j.s.ServletContextHandler@7069f076&lt;span class="o"&gt;{&lt;/span&gt;/,null,AVAILABLE&lt;span class="o"&gt;}&lt;/span&gt;
2026-01-09 16:57:51,731 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.s.o.e.j.s.AbstractConnector] &lt;span class="o"&gt;[&lt;/span&gt;main] - Started ServerConnector@6e31d989&lt;span class="o"&gt;{&lt;/span&gt;HTTP/1.1, &lt;span class="o"&gt;(&lt;/span&gt;http/1.1&lt;span class="o"&gt;)}{&lt;/span&gt;0.0.0.0:8080&lt;span class="o"&gt;}&lt;/span&gt;
2026-01-09 16:57:51,732 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.s.Server        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Started @1184ms
2026-01-09 16:57:51,747 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.d.Diagnostics           &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Diagnostics disabled. To &lt;span class="nb"&gt;enable &lt;/span&gt;add &lt;span class="nt"&gt;-Dhazelcast&lt;/span&gt;.diagnostics.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true &lt;/span&gt;to the JVM arguments.
2026-01-09 16:57:51,749 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.LifecycleService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 is STARTING
Members &lt;span class="o"&gt;{&lt;/span&gt;size:1, ver:1&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;
        Member &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 - 579e9374-7d47-4d91-80d5-c7a9155d42d0 &lt;span class="o"&gt;[&lt;/span&gt;master node] &lt;span class="o"&gt;[&lt;/span&gt;active master] this
&lt;span class="o"&gt;]&lt;/span&gt;

2026-01-09 16:57:52,775 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.CoordinatorService  &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;pool-5-thread-1] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] This node become a new active master node, begin init coordinator service
2026-01-09 16:57:52,779 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.LifecycleService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 is STARTED
2026-01-09 16:57:52,797 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.CoordinatorService  &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;pool-5-thread-1] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Loaded event handlers: &lt;span class="o"&gt;[&lt;/span&gt;org.apache.seatunnel.api.event.LoggingEventHandler@1ebbcf65]
2026-01-09 16:57:52,803 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.i.s.ClientInvocationService] &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Running with 2 response threads, &lt;span class="nv"&gt;dynamic&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true
&lt;/span&gt;2026-01-09 16:57:52,807 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.h.i.p.i.PartitionStateManager] &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-coordinator-service-1] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Initializing cluster partition table arrangement...
2026-01-09 16:57:52,811 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.LifecycleService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] HazelcastClient 5.1 &lt;span class="o"&gt;(&lt;/span&gt;20220228 - 21f20e7&lt;span class="o"&gt;)&lt;/span&gt; is STARTING
2026-01-09 16:57:52,811 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.LifecycleService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] HazelcastClient 5.1 &lt;span class="o"&gt;(&lt;/span&gt;20220228 - 21f20e7&lt;span class="o"&gt;)&lt;/span&gt; is STARTED
2026-01-09 16:57:52,814 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.i.c.ClientConnectionManager] &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Trying to connect to cluster: seatunnel-420804
2026-01-09 16:57:52,815 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.i.c.ClientConnectionManager] &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Trying to connect to &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801
2026-01-09 16:57:52,823 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.p.t.AuthenticationMessageTask] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.priority-generic-operation.thread-0] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Received auth from Connection[id&lt;span class="o"&gt;=&lt;/span&gt;1, /127.0.0.1:5801-&amp;gt;/127.0.0.1:45255, &lt;span class="nv"&gt;qualifier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;null, &lt;span class="nv"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=[&lt;/span&gt;127.0.0.1]:45255, &lt;span class="nv"&gt;remoteUuid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3be41dca-f0e7-4603-ad27-4e5b3aecbb4a, &lt;span class="nv"&gt;alive&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;, &lt;span class="nv"&gt;connectionType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;JVM, &lt;span class="nv"&gt;planeIndex&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;, successfully authenticated, clientUuid: 3be41dca-f0e7-4603-ad27-4e5b3aecbb4a, client name: hz.client_1, client version: 5.1
2026-01-09 16:57:52,824 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.LifecycleService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] HazelcastClient 5.1 &lt;span class="o"&gt;(&lt;/span&gt;20220228 - 21f20e7&lt;span class="o"&gt;)&lt;/span&gt; is CLIENT_CONNECTED
2026-01-09 16:57:52,824 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.i.c.ClientConnectionManager] &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Authenticated with server &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801:579e9374-7d47-4d91-80d5-c7a9155d42d0, server version: 5.1, &lt;span class="nb"&gt;local &lt;/span&gt;address: /127.0.0.1:45255
2026-01-09 16:57:52,825 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.d.Diagnostics           &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Diagnostics disabled. To &lt;span class="nb"&gt;enable &lt;/span&gt;add &lt;span class="nt"&gt;-Dhazelcast&lt;/span&gt;.diagnostics.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true &lt;/span&gt;to the JVM arguments.
2026-01-09 16:57:52,829 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.i.s.ClientClusterService] &lt;span class="o"&gt;[&lt;/span&gt;hz.client_1.event-10] - hz.client_1 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] 

Members &lt;span class="o"&gt;[&lt;/span&gt;1] &lt;span class="o"&gt;{&lt;/span&gt;
        Member &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 - 579e9374-7d47-4d91-80d5-c7a9155d42d0 &lt;span class="o"&gt;[&lt;/span&gt;master node]
&lt;span class="o"&gt;}&lt;/span&gt;

2026-01-09 16:57:52,841 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.i.s.ClientStatisticsService] &lt;span class="o"&gt;[&lt;/span&gt;main] - Client statistics is enabled with period 5 seconds.
2026-01-09 16:57:52,858 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.c.s.u.ConfigBuilder     &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Loading config file from path: /config/my.config
2026-01-09 16:57:52,951 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.c.s.u.ConfigShadeUtils  &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Load config shade spi: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;base64&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
2026-01-09 16:57:52,967 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.c.s.u.ConfigBuilder     &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Parsed config file: 
&lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"env"&lt;/span&gt; : &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"parallelism"&lt;/span&gt; : 1,
        &lt;span class="s2"&gt;"job.mode"&lt;/span&gt; : &lt;span class="s2"&gt;"BATCH"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;,
    &lt;span class="s2"&gt;"source"&lt;/span&gt; : &lt;span class="o"&gt;[&lt;/span&gt;
        &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"url"&lt;/span&gt; : &lt;span class="s2"&gt;"http://dynamodb.us-east-1.amazonaws.com"&lt;/span&gt;,
            &lt;span class="s2"&gt;"region"&lt;/span&gt; : &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;,
            &lt;span class="s2"&gt;"access_key_id"&lt;/span&gt; : &lt;span class="s2"&gt;"&amp;lt;REDACTED&amp;gt;"&lt;/span&gt;,
            &lt;span class="s2"&gt;"secret_access_key"&lt;/span&gt; : &lt;span class="s2"&gt;"&amp;lt;REDACTED&amp;gt;"&lt;/span&gt;,
            &lt;span class="s2"&gt;"table"&lt;/span&gt; : &lt;span class="s2"&gt;"mySourceTable"&lt;/span&gt;,
            &lt;span class="s2"&gt;"schema"&lt;/span&gt; : &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;"fields"&lt;/span&gt; : &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="s2"&gt;"customerId"&lt;/span&gt; : &lt;span class="s2"&gt;"int"&lt;/span&gt;,
                    &lt;span class="s2"&gt;"customerName"&lt;/span&gt; : &lt;span class="s2"&gt;"string"&lt;/span&gt;,
                    &lt;span class="s2"&gt;"address"&lt;/span&gt; : &lt;span class="s2"&gt;"string"&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;,
            &lt;span class="s2"&gt;"plugin_name"&lt;/span&gt; : &lt;span class="s2"&gt;"Amazondynamodb"&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;]&lt;/span&gt;,
    &lt;span class="s2"&gt;"sink"&lt;/span&gt; : &lt;span class="o"&gt;[&lt;/span&gt;
        &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"host"&lt;/span&gt; : &lt;span class="s2"&gt;"host.docker.internal"&lt;/span&gt;,
            &lt;span class="s2"&gt;"port"&lt;/span&gt; : 6379,
            &lt;span class="s2"&gt;"support_custom_key"&lt;/span&gt; : &lt;span class="nb"&gt;true&lt;/span&gt;,
            &lt;span class="s2"&gt;"key"&lt;/span&gt; : &lt;span class="s2"&gt;"customer:{customerId}"&lt;/span&gt;,
            &lt;span class="s2"&gt;"data_type"&lt;/span&gt; : &lt;span class="s2"&gt;"hash"&lt;/span&gt;,
            &lt;span class="s2"&gt;"plugin_name"&lt;/span&gt; : &lt;span class="s2"&gt;"Redis"&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

2026-01-09 16:57:52,971 INFO  &lt;span class="o"&gt;[&lt;/span&gt;p.MultipleTableJobConfigParser] &lt;span class="o"&gt;[&lt;/span&gt;main] - add common jar &lt;span class="k"&gt;in &lt;/span&gt;plugins :[]
2026-01-09 16:57:52,978 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Load SeaTunnelSink Plugin from /opt/seatunnel/connectors
2026-01-09 16:57:52,980 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Discovery plugin jar &lt;span class="k"&gt;for&lt;/span&gt;: PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'source'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Amazondynamodb'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; at: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-amazondynamodb-2.3.12.jar]
2026-01-09 16:57:52,980 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - find connector jar and dependency &lt;span class="k"&gt;for &lt;/span&gt;PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'source'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Amazondynamodb'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-amazondynamodb-2.3.12.jar]
2026-01-09 16:57:52,982 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Load SeaTunnelSink Plugin from /opt/seatunnel/connectors
2026-01-09 16:57:52,985 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Load SeaTunnelSink Plugin from /opt/seatunnel/connectors
2026-01-09 16:57:52,986 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Discovery plugin jar &lt;span class="k"&gt;for&lt;/span&gt;: PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sink'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Redis'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; at: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-redis-2.3.12.jar]
2026-01-09 16:57:52,986 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - find connector jar and dependency &lt;span class="k"&gt;for &lt;/span&gt;PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sink'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Redis'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-redis-2.3.12.jar]
2026-01-09 16:57:52,988 INFO  &lt;span class="o"&gt;[&lt;/span&gt;p.MultipleTableJobConfigParser] &lt;span class="o"&gt;[&lt;/span&gt;main] - start generating all sources.
2026-01-09 16:57:53,003 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.a.t.f.FactoryUtil       &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - get the CatalogTable from &lt;span class="nb"&gt;source &lt;/span&gt;AmazonDynamodb: .default.default.default
2026-01-09 16:57:53,008 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Load SeaTunnelSource Plugin from /opt/seatunnel/connectors
2026-01-09 16:57:53,011 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Discovery plugin jar &lt;span class="k"&gt;for&lt;/span&gt;: PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'source'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Amazondynamodb'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; at: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-amazondynamodb-2.3.12.jar]
2026-01-09 16:57:53,011 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - find connector jar and dependency &lt;span class="k"&gt;for &lt;/span&gt;PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'source'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Amazondynamodb'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-amazondynamodb-2.3.12.jar]
2026-01-09 16:57:53,012 INFO  &lt;span class="o"&gt;[&lt;/span&gt;p.MultipleTableJobConfigParser] &lt;span class="o"&gt;[&lt;/span&gt;main] - start generating all transforms.
2026-01-09 16:57:53,012 INFO  &lt;span class="o"&gt;[&lt;/span&gt;p.MultipleTableJobConfigParser] &lt;span class="o"&gt;[&lt;/span&gt;main] - start generating all sinks.
2026-01-09 16:57:53,013 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Load SeaTunnelSink Plugin from /opt/seatunnel/connectors
2026-01-09 16:57:53,014 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - Discovery plugin jar &lt;span class="k"&gt;for&lt;/span&gt;: PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sink'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Redis'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; at: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-redis-2.3.12.jar]
2026-01-09 16:57:53,014 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.p.d.AbstractPluginDiscovery] &lt;span class="o"&gt;[&lt;/span&gt;main] - find connector jar and dependency &lt;span class="k"&gt;for &lt;/span&gt;PluginIdentifier&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;engineType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'seatunnel'&lt;/span&gt;, &lt;span class="nv"&gt;pluginType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sink'&lt;/span&gt;, &lt;span class="nv"&gt;pluginName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Redis'&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-redis-2.3.12.jar]
2026-01-09 16:57:53,024 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.a.t.f.FactoryUtil       &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Create sink &lt;span class="s1"&gt;'Redis'&lt;/span&gt; with upstream input catalog-table[database: default, schema: default, table: default]
2026-01-09 16:57:53,042 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.c.j.ClientJobProxy    &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - Start submit job, job &lt;span class="nb"&gt;id&lt;/span&gt;: 1062052604328017921, with plugin jar &lt;span class="o"&gt;[&lt;/span&gt;file:/opt/seatunnel/connectors/connector-amazondynamodb-2.3.12.jar, file:/opt/seatunnel/connectors/connector-redis-2.3.12.jar]
2026-01-09 16:57:53,047 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.e.s.r.AbstractResourceManager] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.client.thread-4] - Init ResourceManager
2026-01-09 16:57:53,047 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.e.s.r.AbstractResourceManager] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.client.thread-4] - initWorker... 
2026-01-09 16:57:53,047 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.e.s.r.AbstractResourceManager] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.client.thread-4] - init live nodes: &lt;span class="o"&gt;[[&lt;/span&gt;localhost]:5801]
2026-01-09 16:57:53,048 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.e.s.r.AbstractResourceManager] &lt;span class="o"&gt;[&lt;/span&gt;SeaTunnel-CompletableFuture-Thread-2] - received new worker register: &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801
2026-01-09 16:57:53,152 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.c.s.r.c.RedisParameters &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;hz.main.seaTunnel.task.thread-4] - Try to get redis version information from the jedis.info&lt;span class="o"&gt;()&lt;/span&gt; method
2026-01-09 16:57:53,152 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.c.s.r.c.RedisParameters &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;hz.main.seaTunnel.task.thread-4] - The version of Redis is :8.4.0
2026-01-09 16:57:53,154 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.a.s.m.MultiTableSinkWriter] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.seaTunnel.task.thread-4] - init multi table sink writer, queue size: 1
2026-01-09 16:57:53,233 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.a.e.LoggingEventHandler &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;hz.main.generic-operation.thread-12] - log event: ReaderOpenEvent&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;createdTime&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1767977873233, &lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;eventType&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;LIFECYCLE_READER_OPEN&lt;span class="o"&gt;)&lt;/span&gt;
2026-01-09 16:57:53,402 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.t.SourceSplitEnumeratorTask] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.seaTunnel.task.thread-4] - received reader register, readerID: TaskLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;taskGroupLocation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2&lt;span class="o"&gt;}&lt;/span&gt;, &lt;span class="nv"&gt;taskID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000200000000, &lt;span class="nv"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0&lt;span class="o"&gt;}&lt;/span&gt;
2026-01-09 16:57:53,429 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.e.s.c.CheckpointCoordinator] &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-coordinator-service-9] - checkpoint is disabled, because &lt;span class="k"&gt;in &lt;/span&gt;batch mode and &lt;span class="s1"&gt;'checkpoint.interval'&lt;/span&gt; of &lt;span class="nb"&gt;env &lt;/span&gt;is missing.
2026-01-09 16:57:53,506 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.AmazonDynamoDBSourceReader] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2&lt;span class="o"&gt;}]&lt;/span&gt; - AmazonDynamoDB Source Reader &lt;span class="o"&gt;[&lt;/span&gt;0] waiting &lt;span class="k"&gt;for &lt;/span&gt;splits
2026-01-09 16:57:53,506 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.AmazonDynamoDBSourceReader] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2&lt;span class="o"&gt;}]&lt;/span&gt; - AmazonDynamoDB Source Reader &lt;span class="o"&gt;[&lt;/span&gt;0] waiting &lt;span class="k"&gt;for &lt;/span&gt;splits
2026-01-09 16:57:53,529 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.t.SourceSplitEnumeratorTask] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;}]&lt;/span&gt; - received enough reader, starting enumerator...
2026-01-09 16:57:53,530 INFO  &lt;span class="o"&gt;[&lt;/span&gt;nDynamoDBSourceSplitEnumerator] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;}]&lt;/span&gt; - Assigning org.apache.seatunnel.connectors.seatunnel.amazondynamodb.source.AmazonDynamoDBSourceSplit@36fe5b1d to 0 reader.
2026-01-09 16:57:53,530 INFO  &lt;span class="o"&gt;[&lt;/span&gt;nDynamoDBSourceSplitEnumerator] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;}]&lt;/span&gt; - Assigning org.apache.seatunnel.connectors.seatunnel.amazondynamodb.source.AmazonDynamoDBSourceSplit@81bd425 to 0 reader.
2026-01-09 16:57:53,530 INFO  &lt;span class="o"&gt;[&lt;/span&gt;nDynamoDBSourceSplitEnumerator] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1&lt;span class="o"&gt;}]&lt;/span&gt; - Assign splits &lt;span class="o"&gt;[&lt;/span&gt;org.apache.seatunnel.connectors.seatunnel.amazondynamodb.source.AmazonDynamoDBSourceSplit@36fe5b1d, org.apache.seatunnel.connectors.seatunnel.amazondynamodb.source.AmazonDynamoDBSourceSplit@81bd425] to reader 0
2026-01-09 16:57:53,533 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.AmazonDynamoDBSourceReader] &lt;span class="o"&gt;[&lt;/span&gt;hz.main.generic-operation.thread-21] - Reader &lt;span class="o"&gt;[&lt;/span&gt;0] received noMoreSplit event.
2026-01-09 16:57:53,934 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.AmazonDynamoDBSourceReader] &lt;span class="o"&gt;[&lt;/span&gt;BlockingWorker-TaskGroupLocation&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;jobId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1062052604328017921, &lt;span class="nv"&gt;pipelineId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1, &lt;span class="nv"&gt;taskGroupId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2&lt;span class="o"&gt;}]&lt;/span&gt; - AmazonDynamoDB Source Reader &lt;span class="o"&gt;[&lt;/span&gt;0] waiting &lt;span class="k"&gt;for &lt;/span&gt;splits
2026-01-09 16:57:53,934 INFO  [a.s.AmazonDynamoDBSourceReader] [BlockingWorker-TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] - Closed the bounded amazonDynamodb source
2026-01-09 16:57:53,941 INFO  [.s.e.s.c.CheckpointCoordinator] [hz.main.generic-operation.thread-23] - skip schedule trigger checkpoint because checkpoint type is COMPLETED_POINT_TYPE
2026-01-09 16:57:53,943 INFO  [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-8] - wait checkpoint completed: 9223372036854775807
2026-01-09 16:57:55,974 INFO  [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-8] - pending checkpoint(9223372036854775807/1@1062052604328017921) notify finished!
2026-01-09 16:57:55,974 INFO  [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-8] - start notify checkpoint completed, job id: 1062052604328017921, pipeline id: 1, checkpoint id:9223372036854775807
2026-01-09 16:57:55,979 INFO  [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-8] - start clean pending checkpoint cause CheckpointCoordinator completed.
2026-01-09 16:57:55,979 INFO  [.s.e.s.c.CheckpointCoordinator] [seatunnel-coordinator-service-8] - Turn checkpoint_state_1062052604328017921_1 state from null to FINISHED
2026-01-09 16:57:55,987 INFO  [o.a.s.e.s.TaskExecutionService] [BlockingWorker-TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}] - [localhost]:5801 [seatunnel-420804] [5.1] taskDone, taskId = 1000100000000, taskGroup = TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}
2026-01-09 16:57:55,987 INFO  [o.a.s.a.e.LoggingEventHandler ] [hz.main.generic-operation.thread-32] - log event: EnumeratorCloseEvent(createdTime=1767977875987, jobId=1062052604328017921, eventType=LIFECYCLE_ENUMERATOR_CLOSE)
2026-01-09 16:57:55,998 INFO  [o.a.s.e.s.TaskExecutionService] [BlockingWorker-TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}] - [localhost]:5801 [seatunnel-420804] [5.1] taskGroup TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1} complete with FINISHED
2026-01-09 16:57:55,998 INFO  [o.a.s.e.s.TaskExecutionService] [hz.main.seaTunnel.task.thread-5] - [localhost]:5801 [seatunnel-420804] [5.1] Task TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1} complete with state FINISHED
2026-01-09 16:57:55,998 INFO  [o.a.s.e.s.CoordinatorService  ] [hz.main.seaTunnel.task.thread-5] - [localhost]:5801 [seatunnel-420804] [5.1] Received task end from execution TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}, state FINISHED
2026-01-09 16:57:56,000 INFO  [o.a.s.e.s.d.p.PhysicalVertex  ] [hz.main.seaTunnel.task.thread-5] - Job (1062052604328017921), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Amazondynamodb]-SplitEnumerator (1/1)], taskGroupLocation: [TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}] turned from state RUNNING to FINISHED.
2026-01-09 16:57:56,000 INFO  [o.a.s.e.s.d.p.PhysicalVertex  ] [hz.main.seaTunnel.task.thread-5] - Job (1062052604328017921), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Amazondynamodb]-SplitEnumerator (1/1)], taskGroupLocation: [TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}] state process is stopped
2026-01-09 16:57:56,000 INFO  [o.a.s.e.s.d.p.SubPlan         ] [seatunnel-coordinator-service-8] - Job (1062052604328017921), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Amazondynamodb]-SplitEnumerator (1/1)], taskGroupLocation: [TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=1}] future complete with state FINISHED
2026-01-09 16:57:56,039 INFO  [o.a.s.a.e.LoggingEventHandler ] [hz.main.generic-operation.thread-33] - log event: ReaderCloseEvent(createdTime=1767977876039, jobId=1062052604328017921, eventType=LIFECYCLE_READER_CLOSE)
2026-01-09 16:57:56,040 INFO  [o.a.s.e.s.TaskExecutionService] [BlockingWorker-TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] - [localhost]:5801 [seatunnel-420804] [5.1] taskDone, taskId = 1000200000000, taskGroup = TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}
2026-01-09 16:57:56,076 INFO  [o.a.s.e.s.TaskExecutionService] [BlockingWorker-TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] - [localhost]:5801 [seatunnel-420804] [5.1] taskDone, taskId = 1000200010000, taskGroup = TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}
2026-01-09 16:57:56,077 INFO  [o.a.s.a.e.LoggingEventHandler ] [hz.main.generic-operation.thread-34] - log event: WriterCloseEvent(createdTime=1767977876076, jobId=1062052604328017921, eventType=LIFECYCLE_WRITER_CLOSE)
2026-01-09 16:57:56,079 INFO  [o.a.s.e.s.TaskExecutionService] [BlockingWorker-TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] - [localhost]:5801 [seatunnel-420804] [5.1] taskGroup TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2} complete with FINISHED
2026-01-09 16:57:56,079 INFO  [o.a.s.e.s.TaskExecutionService] [hz.main.seaTunnel.task.thread-2] - [localhost]:5801 [seatunnel-420804] [5.1] Task TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2} complete with state FINISHED
2026-01-09 16:57:56,079 INFO  [o.a.s.e.s.CoordinatorService  ] [hz.main.seaTunnel.task.thread-2] - [localhost]:5801 [seatunnel-420804] [5.1] Received task end from execution TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}, state FINISHED
2026-01-09 16:57:56,080 INFO  [o.a.s.e.s.d.p.PhysicalVertex  ] [hz.main.seaTunnel.task.thread-2] - Job (1062052604328017921), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Amazondynamodb]-SourceTask (1/1)], taskGroupLocation: [TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] turned from state RUNNING to FINISHED.
2026-01-09 16:57:56,080 INFO  [o.a.s.e.s.d.p.PhysicalVertex  ] [hz.main.seaTunnel.task.thread-2] - Job (1062052604328017921), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Amazondynamodb]-SourceTask (1/1)], taskGroupLocation: [TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] state process is stopped
2026-01-09 16:57:56,080 INFO  [o.a.s.e.s.d.p.SubPlan         ] [seatunnel-coordinator-service-8] - Job (1062052604328017921), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Amazondynamodb]-SourceTask (1/1)], taskGroupLocation: [TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}] future complete with state FINISHED
2026-01-09 16:57:56,080 INFO  [o.a.s.e.s.d.p.SubPlan         ] [seatunnel-coordinator-service-8] - Job SeaTunnel_Job (1062052604328017921), Pipeline: [(1/1)] will end with state FINISHED
2026-01-09 16:57:56,081 INFO  [o.a.s.e.s.m.JobMaster         ] [hz.main.seaTunnel.task.thread-2] - release the task group resource TaskGroupLocation{jobId=1062052604328017921, pipelineId=1, taskGroupId=2}
2026-01-09 16:57:56,081 INFO  [o.a.s.e.s.d.p.SubPlan         ] [seatunnel-coordinator-service-8] - Job SeaTunnel_Job (1062052604328017921), Pipeline: [(1/1)] turned from state RUNNING to FINISHED.
2026-01-09 16:57:56,081 INFO  [a.s.e.s.s.s.DefaultSlotService] [hz.main.generic-operation.thread-35] - received slot release request, jobID: 1062052604328017921, slot: SlotProfile{worker=[localhost]:5801, slotID=2, ownerJobID=1062052604328017921, assigned=true, resourceProfile=ResourceProfile{cpu=CPU{core=0}, heapMemory=Memory{bytes=0}}, sequence='e992c180-0d29-42db-ac3f-249a1130c0b1'}
2026-01-09 16:57:56,105 INFO  [o.a.s.e.s.m.JobMaster         ] [seatunnel-coordinator-service-8] - release the pipeline Job SeaTunnel_Job (1062052604328017921), Pipeline: [(1/1)] resource
2026-01-09 16:57:56,105 INFO  [a.s.e.s.s.s.DefaultSlotService] [hz.main.generic-operation.thread-39] - received slot release request, jobID: 1062052604328017921, slot: SlotProfile{worker=[localhost]:5801, slotID=1, ownerJobID=1062052604328017921, assigned=true, resourceProfile=ResourceProfile{cpu=CPU{core=0}, heapMemory=Memory{bytes=0}}, sequence='e992c180-0d29-42db-ac3f-249a1130c0b1'}
2026-01-09 16:57:56,106 INFO  [o.a.s.e.s.d.p.SubPlan         ] [seatunnel-coordinator-service-8] - Job SeaTunnel_Job (1062052604328017921), Pipeline: [(1/1)] state process is stop
2026-01-09 16:57:56,106 INFO  [o.a.s.e.s.d.p.PhysicalPlan    ] [seatunnel-coordinator-service-7] - Job SeaTunnel_Job (1062052604328017921), Pipeline: [(1/1)] future complete with state FINISHED
2026-01-09 16:57:56,107 INFO  [o.a.s.e.s.d.p.PhysicalPlan    ] [seatunnel-coordinator-service-7] - Job SeaTunnel_Job (1062052604328017921) turned from state RUNNING to FINISHED.
2026-01-09 16:57:56,107 INFO  [o.a.s.e.s.d.p.PhysicalPlan    ] [seatunnel-coordinator-service-7] - Job SeaTunnel_Job (1062052604328017921) state process is stop
2026-01-09 16:57:56,116 INFO  [o.a.s.a.e.LoggingEventHandler ] [seatunnel-coordinator-service-7] - log event: JobStateEvent(jobId=1062052604328017921, jobName=SeaTunnel_Job, jobStatus=FINISHED, createdTime=1767977876116)
2026-01-09 16:57:56,117 INFO  [o.a.s.e.c.j.ClientJobProxy    ] [main] - Job (1062052604328017921) end with state FINISHED
2026-01-09 16:57:56,132 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - 
***********************************************
           Job Statistic Information
***********************************************
Start Time                : 2026-01-09 16:57:52
End Time                  : 2026-01-09 16:57:56
Total Time(s)             :                   3
Total Read Count          :                   2
Total Write Count         :                   2
Total Failed Count        :                   0
***********************************************

2026-01-09 16:57:56,133 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-420804] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTTING_DOWN
2026-01-09 16:57:56,135 INFO  [c.h.i.s.t.TcpServerConnection ] [hz.main.IO.thread-in-1] - [localhost]:5801 [seatunnel-420804] [5.1] Connection[id=1, /127.0.0.1:5801-&amp;gt;/127.0.0.1:45255, qualifier=null, endpoint=[127.0.0.1]:45255, remoteUuid=3be41dca-f0e7-4603-ad27-4e5b3aecbb4a, alive=false, connectionType=JVM, planeIndex=-1] closed. Reason: Connection closed by the other side
2026-01-09 16:57:56,135 INFO  [.c.i.c.ClientConnectionManager] [main] - hz.client_1 [seatunnel-420804] [5.1] Removed connection to endpoint: [localhost]:5801:579e9374-7d47-4d91-80d5-c7a9155d42d0, connection: ClientConnection{alive=false, connectionId=1, channel=NioChannel{/127.0.0.1:45255-&amp;gt;localhost/127.0.0.1:5801}, remoteAddress=[localhost]:5801, lastReadTime=2026-01-09 16:57:56.131, lastWriteTime=2026-01-09 16:57:56.117, closedTime=2026-01-09 16:57:56.134, connected server version=5.1}
2026-01-09 16:57:56,135 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-420804] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is CLIENT_DISCONNECTED
2026-01-09 16:57:56,136 INFO  [c.h.c.i.ClientEndpointManager ] [hz.main.event-1] - [localhost]:5801 [seatunnel-420804] [5.1] Destroying ClientEndpoint{connection=Connection[id=1, /127.0.0.1:5801-&amp;gt;/127.0.0.1:45255, qualifier=null, endpoint=[127.0.0.1]:45255, remoteUuid=3be41dca-f0e7-4603-ad27-4e5b3aecbb4a, alive=false, connectionType=JVM, planeIndex=-1], clientUuid=3be41dca-f0e7-4603-ad27-4e5b3aecbb4a, clientName=hz.client_1, authenticated=true, clientVersion=5.1, creationTime=1767977872821, latest clientAttributes=lastStatisticsCollectionTime=1767977872841,enterprise=false,clientType=JVM,clientVersion=5.1,clusterConnectionTimestamp=1767977872816,clientAddress=127.0.0.1,clientName=hz.client_1,credentials.principal=null,os.committedVirtualMemorySize=9714790400,os.freePhysicalMemorySize=2077376512,os.freeSwapSpaceSize=1073737728,os.maxFileDescriptorCount=1048576,os.openFileDescriptorCount=103,os.processCpuTime=4280000000,os.systemLoadAverage=3.26,os.totalPhysicalMemorySize=8321540096,os.totalSwapSpaceSize=1073737728,runtime.availableProcessors=14,runtime.freeMemory=286196824,runtime.maxMemory=477626368,runtime.totalMemory=376438784,runtime.uptime=2297,runtime.usedMemory=90241960, labels=[]}
2026-01-09 16:57:56,136 INFO  [c.h.c.LifecycleService        ] [main] - hz.client_1 [seatunnel-420804] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTDOWN
2026-01-09 16:57:56,136 INFO  [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
2026-01-09 16:57:56,136 INFO  [c.h.c.LifecycleService        ] [main] - [localhost]:5801 [seatunnel-420804] [5.1] [localhost]:5801 is SHUTTING_DOWN
2026-01-09 16:57:56,138 INFO  [c.h.i.p.i.MigrationManager    ] [hz.main.cached.thread-15] - [localhost]:5801 [seatunnel-420804] [5.1] Shutdown request of Member [localhost]:5801 - 579e9374-7d47-4d91-80d5-c7a9155d42d0 [master node] [active master] this is handled
2026-01-09 16:57:56,141 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-420804] [5.1] Shutting down connection manager...
2026-01-09 16:57:56,142 INFO  [c.h.i.i.Node                  ] [main] - [localhost]:5801 [seatunnel-420804] [5.1] Shutting down node engine...
2026-01-09 16:57:56,149 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.s.s.o.e.j.s.AbstractConnector] &lt;span class="o"&gt;[&lt;/span&gt;main] - Stopped ServerConnector@6e31d989&lt;span class="o"&gt;{&lt;/span&gt;HTTP/1.1, &lt;span class="o"&gt;(&lt;/span&gt;http/1.1&lt;span class="o"&gt;)}{&lt;/span&gt;0.0.0.0:8080&lt;span class="o"&gt;}&lt;/span&gt;
2026-01-09 16:57:56,149 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.s.o.e.j.s.session       &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - node0 Stopped scavenging
2026-01-09 16:57:56,150 INFO  &lt;span class="o"&gt;[&lt;/span&gt;a.s.s.o.e.j.s.h.ContextHandler] &lt;span class="o"&gt;[&lt;/span&gt;main] - Stopped o.a.s.s.o.e.j.s.ServletContextHandler@7069f076&lt;span class="o"&gt;{&lt;/span&gt;/,null,STOPPED&lt;span class="o"&gt;}&lt;/span&gt;
2026-01-09 16:57:56,151 INFO  &lt;span class="o"&gt;[&lt;/span&gt;.c.c.DefaultClassLoaderService] &lt;span class="o"&gt;[&lt;/span&gt;main] - close classloader service
2026-01-09 16:57:56,152 INFO  &lt;span class="o"&gt;[&lt;/span&gt;o.a.s.e.s.EventService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;event-forwarder-0] - Event forward thread interrupted
2026-01-09 16:57:56,155 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.i.NodeExtension         &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Destroying node NodeExtension.
2026-01-09 16:57:56,156 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.i.i.Node                  &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] Hazelcast Shutdown is completed &lt;span class="k"&gt;in &lt;/span&gt;18 ms.
2026-01-09 16:57:56,156 INFO  &lt;span class="o"&gt;[&lt;/span&gt;c.h.c.LifecycleService        &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;main] - &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 &lt;span class="o"&gt;[&lt;/span&gt;seatunnel-420804] &lt;span class="o"&gt;[&lt;/span&gt;5.1] &lt;span class="o"&gt;[&lt;/span&gt;localhost]:5801 is SHUTDOWN
2026-01-09 16:57:56,156 INFO  &lt;span class="o"&gt;[&lt;/span&gt;s.c.s.s.c.ClientExecuteCommand] &lt;span class="o"&gt;[&lt;/span&gt;main] - Closed HazelcastInstance ......
2026-01-09 16:57:56,156 INFO  &lt;span class="o"&gt;[&lt;/span&gt;s.c.s.s.c.ClientExecuteCommand] &lt;span class="o"&gt;[&lt;/span&gt;main] - Closed metrics executor service ......
2026-01-09 16:57:56,156 INFO  &lt;span class="o"&gt;[&lt;/span&gt;s.c.s.s.c.ClientExecuteCommand] &lt;span class="o"&gt;[&lt;/span&gt;SeaTunnel-CompletableFuture-Thread-7] - run shutdown hook because get close signal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the job finishes, the Docker container created to execute it will be terminated, and you will be able to see the records written to Redis.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F25yw63ie5lebystiomjv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F25yw63ie5lebystiomjv.png" alt="Viewing Data with Redis Insight" width="800" height="499"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As promised, this was as simple as writing a single configuration file. Of course, the numerous configuration options that both the source and sink connectors offer may take some time to grasp. You can check the configuration properties for Amazon DynamoDB &lt;a href="https://seatunnel.apache.org/docs/2.3.12/connector-v2/source/AmazonDynamoDB" rel="noopener noreferrer"&gt;here&lt;/a&gt; and those for Redis &lt;a href="https://seatunnel.apache.org/docs/2.3.12/connector-v2/sink/Redis" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 2: The Production Powerhouse (Cluster Mode)
&lt;/h3&gt;

&lt;p&gt;When you need an always-on data pipeline infrastructure that can handle multiple jobs, scale on demand, and provide REST API access, cluster mode is your friend.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 1: Set Up Your Infrastructure
&lt;/h4&gt;

&lt;p&gt;Create a &lt;code&gt;docker-compose.yaml&lt;/code&gt; that spins up a complete SeaTunnel cluster. It also includes Redis, which, for this use case, will be the sink of the data pipeline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.8'&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

  &lt;span class="na"&gt;seatunnel-master&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apache/seatunnel:2.3.12&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;seatunnel-master&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ST_DOCKER_MEMBER_LIST=172.16.0.3,172.16.0.4,172.16.0.5&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./config/hazelcast-master.yaml:/opt/seatunnel/config/hazelcast-master.yaml&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./config/seatunnel.yaml:/opt/seatunnel/config/seatunnel.yaml&lt;/span&gt;
    &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="s"&gt;/bin/sh -c "&lt;/span&gt;
      &lt;span class="s"&gt;/opt/seatunnel/bin/seatunnel-cluster.sh -r master&lt;/span&gt;
      &lt;span class="s"&gt;"    &lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5801:5801"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;  &lt;span class="c1"&gt;# REST API endpoint&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;seatunnel_network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ipv4_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;172.16.0.3&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;curl&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-f&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/system-monitoring-information&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;||&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;60s&lt;/span&gt;

  &lt;span class="na"&gt;seatunnel-worker1&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apache/seatunnel:2.3.12&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;seatunnel-worker1&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ST_DOCKER_MEMBER_LIST=172.16.0.3,172.16.0.4,172.16.0.5&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./config/hazelcast-worker.yaml:/opt/seatunnel/config/hazelcast-worker.yaml&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./config/seatunnel.yaml:/opt/seatunnel/config/seatunnel.yaml&lt;/span&gt;
    &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="s"&gt;/bin/sh -c "&lt;/span&gt;
      &lt;span class="s"&gt;/opt/seatunnel/bin/seatunnel-cluster.sh -r worker&lt;/span&gt;
      &lt;span class="s"&gt;" &lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;seatunnel-master&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;seatunnel_network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ipv4_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;172.16.0.4&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgrep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-f&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;seatunnel&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;||&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;45s&lt;/span&gt;

  &lt;span class="na"&gt;seatunnel-worker2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apache/seatunnel:2.3.12&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;seatunnel-worker2&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ST_DOCKER_MEMBER_LIST=172.16.0.3,172.16.0.4,172.16.0.5&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./config/hazelcast-worker.yaml:/opt/seatunnel/config/hazelcast-worker.yaml&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./config/seatunnel.yaml:/opt/seatunnel/config/seatunnel.yaml&lt;/span&gt;
    &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="s"&gt;/bin/sh -c "&lt;/span&gt;
      &lt;span class="s"&gt;/opt/seatunnel/bin/seatunnel-cluster.sh -r worker&lt;/span&gt;
      &lt;span class="s"&gt;" &lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;seatunnel-master&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;seatunnel_network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ipv4_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;172.16.0.5&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pgrep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-f&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;seatunnel&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;||&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;45s&lt;/span&gt;

  &lt;span class="na"&gt;redis-database&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-database&lt;/span&gt;
    &lt;span class="na"&gt;hostname&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-database&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:8.4.0&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379:6379"&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;seatunnel_network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ipv4_address&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;172.16.0.2&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis-cli&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ping&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;grep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;PONG"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;

&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;seatunnel_network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;
    &lt;span class="na"&gt;ipam&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;subnet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;172.16.0.0/24&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 2: Configure your Cluster
&lt;/h4&gt;

&lt;p&gt;Create &lt;code&gt;config/seatunnel.yaml&lt;/code&gt; to enable the REST API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;seatunnel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;enable-http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create &lt;code&gt;config/hazelcast-master.yaml&lt;/code&gt; and &lt;code&gt;config/hazelcast-worker.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;hazelcast&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;rest-api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;endpoint-groups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;CLUSTER_READ&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;CLUSTER_WRITE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;HEALTH_CHECK&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;DATA&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;auto-increment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;port-count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5801&lt;/span&gt;
    &lt;span class="na"&gt;join&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tcp-ip&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;member-list&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;172.16.0.3&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;172.16.0.4&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;172.16.0.5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 3: Launch Your Cluster
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your SeaTunnel cluster is now running with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 Master node (coordinator + API server)&lt;/li&gt;
&lt;li&gt;2 Worker nodes (execute the actual data pipelines)&lt;/li&gt;
&lt;li&gt;Redis database (your sink and target system)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Step 4: Submit Jobs via REST API
&lt;/h4&gt;

&lt;p&gt;Now the magic happens! Submit jobs using a simple HTTP POST:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/submit-job?jobName&lt;span class="o"&gt;=&lt;/span&gt;dynamodb-to-redis &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "env": {
        "parallelism": 2,
        "job.mode": "BATCH"
    },
    "source": [
        {
            "plugin_name": "Amazondynamodb",
            "plugin_output": "dynamodb_source",
            "url": "http://dynamodb.us-east-1.amazonaws.com",
            "region": "us-east-1",
            "access_key_id": "YOUR_ACCESS_KEY",
            "secret_access_key": "YOUR_SECRET_KEY",
            "table": "mySourceTable",
            "schema": {
                "fields": {
                    "customerId": "int",
                    "customerName": "string",
                    "address": "string"
                }
            }
        }
    ],
    "transform": [],
    "sink": [
        {
            "plugin_name": "Redis",
            "host": "redis-database",
            "port": 6379,
            "support_custom_key": true,
            "key": "customer:{customerId}",
            "data_type": "hash"
        }
    ]
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
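The `support_custom_key` option in the sink above lets the key template reference record fields, so `customer:{customerId}` becomes `customer:42` for a record whose `customerId` is 42. A rough sketch of that substitution (illustrative only — this mimics the observable behavior, it is not SeaTunnel's actual implementation):

```typescript
// Illustrative expansion of a Redis sink key template such as
// "customer:{customerId}" against a record's fields.
type Row = { [field: string]: string | number };

function renderKey(template: string, row: Row): string {
  // Replace each {field} placeholder with the record's value for that
  // field; unknown placeholders are left untouched.
  return template.replace(/\{(\w+)\}/g, (match, field) =>
    field in row ? String(row[field]) : match
  );
}

const key = renderKey("customer:{customerId}", {
  customerId: 42,
  customerName: "Ada Lovelace",
});
console.log(key); // prints "customer:42"
```

With `data_type` set to `hash`, each record is then stored as a Redis hash under its rendered key.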



&lt;p&gt;Once this job is accepted, it will start executing immediately. You can follow the job's execution in the web console that SeaTunnel exposes. Go to the URL &lt;code&gt;http://localhost:8080/#/overview&lt;/code&gt; and navigate to the &lt;code&gt;Jobs&lt;/code&gt; tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopmora1pvvg2m8qj8xu9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopmora1pvvg2m8qj8xu9.png" alt="SeaTunnel Web Console" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;With Apache SeaTunnel, syncing data from Amazon DynamoDB to Redis is a configuration exercise. Whether you need a one-time migration or an always-on data integration pipeline, Apache SeaTunnel has the tools for your needs.&lt;/p&gt;

&lt;p&gt;The beauty of this approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No code to maintain&lt;/strong&gt; – just configuration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Battle-tested connectors&lt;/strong&gt; – focus on your business logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexible deployment&lt;/strong&gt; – from laptop to production cluster&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable by design&lt;/strong&gt; – scale horizontally with your data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ready to give it a spin? Start with standalone mode for quick experiments, then graduate to cluster mode when you're ready for production.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have questions or want to share your specific data integration challenges? Please drop a comment below or find me on social media. Happy data syncing! 🚀&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>dynamodb</category>
      <category>redis</category>
      <category>seatunnel</category>
    </item>
    <item>
      <title>Organizing AI Applications: Lessons from traditional software architecture</title>
      <dc:creator>Ashwin Hariharan</dc:creator>
      <pubDate>Mon, 05 Jan 2026 13:30:00 +0000</pubDate>
      <link>https://dev.to/redis/organizing-ai-applications-lessons-from-traditional-software-architecture-mpe</link>
      <guid>https://dev.to/redis/organizing-ai-applications-lessons-from-traditional-software-architecture-mpe</guid>
      <description>&lt;p&gt;When I started learning AI and diving into frameworks like LangGraph, n8n, and the OpenAI APIs, I found plenty of great tutorials. They taught me how to build a simple chatbot, how to make my first LLM call, how to chain a few prompts together. Useful stuff for getting started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Great for learning. Less great for shipping.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After the first couple of weeks, I wanted to build an actual production-ready application that goes beyond standard POCs: something that uses AI but also involves dozens of routes, multiple features and services, database operations, and caching layers. But those beginner tutorials weren't enough. &lt;em&gt;Where do the embeddings live? How do I structure my agent workflows? Should my API routes call AI directly, or is there supposed to be a layer in between?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The documentation showed me how to use the framework abstractions and APIs. It didn't show me how to organize them.&lt;/p&gt;

&lt;p&gt;If you're like me, coming from a JavaScript/TypeScript background, you know the language gives you a lot of freedom - no enforced folder structure, no prescribed architecture. You can organize your code however you want. But that freedom comes at a price.&lt;/p&gt;

&lt;p&gt;Without clear patterns to guide where things should go, you might end up with working code in all the wrong places. Calls to OpenAI API scattered everywhere, business logic tangled with your routes, and you just know this is going to be painful to maintain later.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frcw45hz8q1lti7j8de7b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frcw45hz8q1lti7j8de7b.png" alt="XKCD comic"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
    &lt;a href="https://xkcd.com/2044/" rel="noopener noreferrer"&gt;The cost of 'just making it work'&lt;/a&gt;&lt;br&gt;
  &lt;/p&gt;

&lt;p&gt;Here's the thing: software architecture trends come and go. In 2015, everyone said microservices were the future. In 2018, serverless. In 2021, JAMstack. In 2023, everyone quietly went back to monoliths.&lt;/p&gt;

&lt;p&gt;But you know what remained constant through all these trends? The fundamental principle of separating concerns. These same software development principles apply to AI and agentic AI applications. Whether you're building traditional web apps or AI-powered systems, the need for clear architecture remains constant.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes AI projects maintainable
&lt;/h2&gt;

&lt;p&gt;Let's establish what properties a good AI project should have:&lt;/p&gt;

&lt;h3&gt;
  
  
  Clear Ownership Boundaries
&lt;/h3&gt;

&lt;p&gt;Clear ownership boundaries define which part of your system is responsible for what. A good boundary means that when something breaks or needs extension, you immediately know which component to check.&lt;/p&gt;

&lt;p&gt;Each distinct concern in your application should be handled by a single, well-defined module or component.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reusability across entry points
&lt;/h3&gt;

&lt;p&gt;Reusability means writing logic once and calling it from anywhere.&lt;/p&gt;

&lt;p&gt;Your core business logic should work the same way regardless of how it's triggered. Whether called from a web API, an AI agent, a scheduled job, a command-line tool, a message queue, or a test suite, the same functionality should be available without rewriting it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters&lt;/strong&gt;: Today it's a chat API. Tomorrow you may want a Slack bot. Next week, batch processing. And who knows? Maybe you'll discover that your users actually prefer the regular search over your fancy AI chatbot anyway. If your AI code is tied to, say, your controllers, you might have to rewrite it each time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Testability
&lt;/h3&gt;

&lt;p&gt;Testability is the degree to which a piece of software can be tested easily and effectively. It describes how simple it is to check that the software works as intended. High testability means tests can be written quickly, run reliably, and give clear results, while low testability leads to tests that are difficult, slow, or unclear.&lt;/p&gt;

&lt;p&gt;AI applications have many moving parts. Vector search slow? Cache not hitting? Agent hallucinating? Embedding generation failing? When you're debugging, you shouldn't have to hunt through your entire codebase to find the problem.&lt;/p&gt;
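&lt;p&gt;This is where layering pays off: when domain logic only talks to a repository object, a test can hand it a stub and verify business rules without Redis, embeddings, or an LLM in the loop. A hypothetical sketch - the function and data below are invented for illustration:&lt;/p&gt;

```javascript
// Domain function under test: filters restaurants by budget.
// It depends on a repository object, not on any database client.
async function findBudgetRestaurants(repository, maxPrice) {
  const all = await repository.listRestaurants();
  return all.filter(function (r) {
    return maxPrice >= r.avgPrice;
  });
}

// In a test, a stub repository replaces the real data layer entirely.
const stubRepository = {
  async listRestaurants() {
    return [
      { name: "Dhaba Express", avgPrice: 400 },
      { name: "Le Jardin", avgPrice: 2500 },
    ];
  },
};
```

&lt;p&gt;Swap the real repository for &lt;code&gt;stubRepository&lt;/code&gt; and the test runs in milliseconds, failing only when the business rule itself is wrong.&lt;/p&gt;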

&lt;h3&gt;
  
  
  Provider Independence
&lt;/h3&gt;

&lt;p&gt;Swapping from GPT-4 to Claude to Gemini shouldn't require changing business logic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters&lt;/strong&gt;: AI models evolve weekly. Today's best model is next month's deprecated one. Providers change pricing. Features get sunset. Your architecture should make provider switching very straightforward.&lt;/p&gt;
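&lt;p&gt;One way to get there is to hide each vendor SDK behind a small adapter that exposes a single shared function. Here's a minimal sketch - the provider objects and the &lt;code&gt;summarize()&lt;/code&gt; function are made up for illustration, not taken from the demo repo:&lt;/p&gt;

```javascript
// Each provider adapter exposes the same complete(prompt) function.
// Real adapters would call the vendor SDK; these are stubbed.
const openAIProvider = {
  async complete(prompt) {
    return "openai: " + prompt; // stand-in for an SDK call
  },
};

const geminiProvider = {
  async complete(prompt) {
    return "gemini: " + prompt; // stand-in for an SDK call
  },
};

// Domain logic depends only on the shared shape, never on a vendor SDK.
async function summarize(provider, text) {
  return provider.complete("Summarize: " + text);
}

// Switching providers is one line of wiring, not a business-logic change.
const activeProvider =
  process.env.LLM_PROVIDER === "gemini" ? geminiProvider : openAIProvider;
```

&lt;p&gt;Because &lt;code&gt;summarize()&lt;/code&gt; only knows about &lt;code&gt;complete()&lt;/code&gt;, deprecating a model or changing vendors touches the adapter layer alone.&lt;/p&gt;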




&lt;p&gt;If you're a software engineer who has started building AI applications and you want to move beyond simple code snippets and demos, let me show you what worked for me. In the following sections, I'll walk you through patterns I extracted from a real project - patterns that apply whether you're building e-commerce, SaaS tools, content platforms, or any AI-powered application.&lt;/p&gt;

&lt;p&gt;For illustration, I'll use an AI-powered restaurant discovery application. Think of a platform where users search for restaurants, browse by cuisine or location, and make reservations - the typical flow you'd see in apps like Zomato or Yelp. Now imagine you decide to add a chat feature to make restaurant discovery more conversational - a dining assistant in a corner chat window that enables finding the perfect restaurant through natural conversation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqud2lsewg66yjrrap737.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqud2lsewg66yjrrap737.png" alt="A simplified AI powered restaurant discovery app"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
    &lt;a href="https://github.com/redis-developer/restaurant-discovery-ai-agent-demo" rel="noopener noreferrer"&gt;View the implementation on GitHub&lt;/a&gt;&lt;br&gt;
  &lt;/p&gt;

&lt;p&gt;So instead of just filtering by &lt;em&gt;"Italian cuisine"&lt;/em&gt;, diners can now also ask questions like &lt;em&gt;"I need a romantic spot with live music for an anniversary dinner"&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;What this really comes down to:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How do you add these AI capabilities without major refactoring of your existing system? And without over-engineering a solution that's way more complex than it needs to be?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Throughout the code examples, I will use Node.js / JavaScript as its syntax is widely familiar to anyone building apps for the web and generalizes well to other languages. But these architectural patterns apply equally to Python, Java, or any other language you're working with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architectural patterns for AI apps
&lt;/h2&gt;

&lt;p&gt;After some research and quite a few experiments with different approaches, I settled on a few patterns that actually work. Before we dive in, know that these aren't necessarily AI-specific. They're the same principles that make any large application maintainable. But when you build with these patterns from the start, adding and testing AI features later becomes straightforward.&lt;/p&gt;

&lt;p&gt;Here's what they look like:&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Structure by business components or modules
&lt;/h3&gt;

&lt;p&gt;Rather than organizing code by technical function (all controllers together, all models together), organize by business components. Each module represents a bounded context with its own API, logic, and data access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters for AI&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;AI applications typically have multiple distinct domains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conversation management (chat history, session state)&lt;/li&gt;
&lt;li&gt;AI workflow orchestration (agents, tools, prompts)&lt;/li&gt;
&lt;li&gt;Business entities (restaurants, users, reservations)&lt;/li&gt;
&lt;li&gt;Data processing (embeddings, vector search)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mixing these concerns creates cognitive overload and makes it hard to reason about the system and maintain it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Bad: Organized by technical layers&lt;/span&gt;

&lt;span class="nx"&gt;src&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;controllers&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;chatController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;restaurantController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;reservationController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;services&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;restaurantService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;reservationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;repositories&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
    &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;chatRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
    &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;restaurantRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
    &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;reservationRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Problem: To understand the "chat" feature, you jump between three different directories. Adding a new feature touches files across the entire codebase.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Colocation&lt;/strong&gt;: For a feature, put related code close together. Code that changes for the same feature should be neighbors and a short navigation away.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is also popularly known as &lt;strong&gt;"domain-driven design"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you add a new feature to the chat system, you typically need to modify the API endpoint, update the business logic, and adjust the data access layer. With domain-driven design, all these files are in the same &lt;code&gt;chat/&lt;/code&gt; directory - you never leave that folder. Without it, you're jumping between &lt;code&gt;controllers/&lt;/code&gt;, &lt;code&gt;services/&lt;/code&gt;, and &lt;code&gt;repositories/&lt;/code&gt; directories, trying to remember which pieces connect.&lt;/p&gt;


  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcrjchqqpzuttzq6zyasl.png" alt="Domain-Driven Design Architecture showing chat, restaurants, and reservations modules"&gt;Each domain is self-contained with its own API, Domain, and Data layers
  


&lt;p&gt;Benefit: Everything related to "chat" lives in one place. Each business component is self-contained.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good: Organized by business components&lt;/span&gt;

&lt;span class="nx"&gt;modules&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;chatController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;chatRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;restaurants&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;restaurantController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;restaurantService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;   &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;restaurantRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;reservations&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
    &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;reservationController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
    &lt;span class="err"&gt;├──&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;reservationService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
    &lt;span class="err"&gt;└──&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;reservationRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;When you need to modify how restaurants are searched, you go directly to &lt;code&gt;modules/restaurants/&lt;/code&gt;. When you need to add a new AI tool, it goes in &lt;code&gt;modules/ai/agentic/tools&lt;/code&gt;. There's no guessing, no hunting through dozens of files.&lt;/p&gt;

&lt;p&gt;This also means you could extract any of these services into a separate microservice later without major refactoring - the boundaries are already defined.&lt;/p&gt;
&lt;h3&gt;
  
  
  Pattern 2: Layer your feature modules with 3-tier architecture
&lt;/h3&gt;

&lt;p&gt;While Pattern 1 is about grouping by business domain, Pattern 2 applies the same concept of &lt;em&gt;colocation&lt;/em&gt; &lt;em&gt;within&lt;/em&gt; each domain. The API, domain, and data layers for a feature stay together in the same feature module folder, not scattered across the codebase.&lt;/p&gt;


  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsqiat09pmg9v4ofmtjqk.png" alt="Three-tier architecture pattern showing entry points, domain logic, and data access layers"&gt;

  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujddszzifba01n7ztpcf.png" alt="Three-tier architecture pattern showing entry points, domain logic, and data access layers"&gt;

  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiias83cq8nwmtx6vljf9.png" alt="Three-tier architecture pattern showing entry points, domain logic, and data access layers"&gt;Three-tier architecture
  


&lt;p&gt;Within each module, maintain clear separation between three concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Entry Points&lt;/strong&gt; (API routes, message queue consumers, scheduled jobs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain Logic&lt;/strong&gt; (business rules, workflows, services)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Access&lt;/strong&gt; (database queries, external API calls) - this is also called the &lt;em&gt;repository pattern&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;


  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frsoojgm0ismted7g8by0.png" alt="Restaurants module three-tier architecture with controller, service, and repository layers"&gt;



  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffwm8h5jfoubd1gdy05dk.png" alt="Reservations module three-tier architecture with controller, service, and repository layers"&gt;


&lt;p&gt;&lt;strong&gt;The Critical Rule:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Never pass framework-specific objects (Express &lt;code&gt;Request&lt;/code&gt;/&lt;code&gt;Response&lt;/code&gt;, HTTP headers, etc.) into your domain layer. Domain logic should be pure and reusable across different entry points.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Don't do this: Domain logic coupled to Express&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// AI workflow logic mixed with HTTP handling&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;aiAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The problem with the code above is that the function only works with Express. But AI workflows often need to be triggered from multiple sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTP API requests (user chat)&lt;/li&gt;
&lt;li&gt;Scheduled jobs (batch processing)&lt;/li&gt;
&lt;li&gt;Message queues (async workflows)&lt;/li&gt;
&lt;li&gt;Tests (validation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your AI logic is tightly coupled to your HTTP layer, you can't reuse it elsewhere, and you won't be able to call it from a scheduled job, test, or CLI tool.&lt;/p&gt;

&lt;p&gt;Your AI workflow orchestration should be completely separate from the HTTP layer. Here's an example:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good: Clean separation&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;getResponseFromAgent&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@modules/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleChatMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getResponseFromAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ business logic - no HTTP dependencies&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getResponseFromAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;aiAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;toolsUsed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tools&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Now, &lt;code&gt;getResponseFromAgent()&lt;/code&gt; can be called from anywhere - HTTP endpoints, scheduled jobs, tests, or CLI scripts. &lt;/p&gt;

&lt;p&gt;The API layer now focuses only on handling HTTP concerns - receiving the request, extracting the session ID and message, and returning a response - while delegating all business logic to the domain layer.&lt;/p&gt;
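&lt;p&gt;For example, a batch job or CLI script can call the same domain function with no Express in sight. A hypothetical sketch with &lt;code&gt;getResponseFromAgent()&lt;/code&gt; stubbed out, since only the call shape matters here:&lt;/p&gt;

```javascript
// A hypothetical CLI entry point reusing the same domain function - no Express.
// getResponseFromAgent is stubbed here; the real one lives in the chat module.
async function getResponseFromAgent(message, sessionId) {
  // stand-in for the real agent call
  return { response: "echo: " + message, toolsUsed: [] };
}

async function runFromCli(message) {
  // same call shape as the HTTP controller, minus Request/Response
  const sessionId = "cli-session-" + Date.now();
  const result = await getResponseFromAgent(message, sessionId);
  return result.response;
}
```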

&lt;p&gt;Similarly, use the &lt;a href="https://martinfowler.com/eaaCatalog/repository.html" rel="noopener noreferrer"&gt;Repository pattern&lt;/a&gt; to prevent database details from leaking into business logic:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Don't put this in your services / domain logic&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redisClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;idx:restaurants&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// transform query response...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryEmbeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;restaurantRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorSearchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryEmbeddings&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ restaurant-repository.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RestaurantRepository&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;vectorSearchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;searchVector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Redis-specific implementation hidden&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transformToRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The domain layer remains pure and independent of HTTP and database implementation, returning structured results. Now if you need to switch databases based on performance or cost, your domain services don't change - only the repository implementation does.&lt;/p&gt;
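&lt;p&gt;Concretely, a database swap means writing a second repository with the same method names and changing the wiring. A minimal sketch using in-memory stand-ins - the real implementations would wrap Redis or another database client:&lt;/p&gt;

```javascript
// Two interchangeable repositories: the domain layer calls
// vectorSearchRestaurants on whichever one it is given.
const redisRestaurantRepository = {
  async vectorSearchRestaurants(searchVector) {
    // real version: redis.ft.search(...) plus result transformation
    return [{ name: "from-redis", dimensions: searchVector.length }];
  },
};

const postgresRestaurantRepository = {
  async vectorSearchRestaurants(searchVector) {
    // real version: a pgvector query plus result transformation
    return [{ name: "from-postgres", dimensions: searchVector.length }];
  },
};

// The domain service is written once, against the shared shape.
async function searchRestaurants(repository, queryEmbeddings) {
  return repository.vectorSearchRestaurants(queryEmbeddings);
}
```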
&lt;h3&gt;
  
  
  Pattern 3: Tools and prompts should call domain logic, not implement it
&lt;/h3&gt;

&lt;p&gt;When you're building AI agents with LangChain, LangGraph, or similar frameworks, you define "tools" - functions the AI can call to perform actions. Need to search products? Create a tool for that. Add items to cart? Tool. Get user preferences? Tool.&lt;/p&gt;

&lt;p&gt;There are two common anti-patterns where business logic ends up in the wrong place:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anti-pattern 1: Business logic in prompts&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
You are a restaurant discovery assistant. Follow these rules:
1. Only show budget-friendly restaurants (under ₹500 per person) for budget users
2. Apply member discounts on reservations for gold members
3. Suggest fine dining establishments to premium members
...
`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Problem: Your business rules now live in natural language. They're non-deterministic, untestable, and invisible to code review.&lt;/p&gt;
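&lt;p&gt;The deterministic alternative is to keep the rule in code, where it's testable and visible to code review, and let the prompt describe behavior rather than enforce policy. A hypothetical sketch of the budget rule from the prompt above (the tier names and threshold are illustrative):&lt;/p&gt;

```javascript
// The same "budget under ₹500 per person" rule, as testable code
// instead of prose in a prompt. Tiers and prices are illustrative.
function filterForUser(user, restaurants) {
  if (user.tier === "budget") {
    return restaurants.filter(function (r) {
      return 500 > r.pricePerPerson;
    });
  }
  return restaurants;
}
```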

&lt;p&gt;&lt;strong&gt;Anti-pattern 2: Business logic in tools&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It feels natural to write business logic directly in tool functions. During a conversation in the chat interface, the user wants to search products, so you write the search logic right there in the tool. You need database access, so you import the database client. You need to validate the search query, calculate relevance scores, apply business rules - all of it goes into the tool.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Bad: Business logic inside AI tool&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@langchain/core/tools&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;redis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchRestaurantsTool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Database access directly in tool&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;createClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;idx:restaurants&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...);&lt;/span&gt;

    &lt;span class="c1"&gt;// Business logic in tool&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filtered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;priceFor2&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sorted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;filtered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rating&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;search_restaurants&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Search for restaurants&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The problem with the above code is that the core logic is locked inside the AI tool. A few months later, you might realize you need that same search logic in a REST API endpoint, a scheduled job, or a different agent - but it's tightly coupled to the AI framework you used.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tools should be thin wrappers that translate AI intent into domain service calls.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's it. The tool's job is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receive parameters from the AI&lt;/li&gt;
&lt;li&gt;Validate/transform them if needed&lt;/li&gt;
&lt;li&gt;Call the appropriate domain service&lt;/li&gt;
&lt;li&gt;Return the result in a format the AI understands&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The actual business logic? That lives in domain services, completely independent of any AI framework.&lt;/p&gt;


  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn90q9ju5bukzkkt25l3m.png" alt="AI Tools Pattern"&gt;AI tools as thin wrappers calling domain services
  


&lt;p&gt;Here's how you can write it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good: logic lives separately&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryEmbeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;queryEmbeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;restaurants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;restaurantRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorSearchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;searchVector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;limit&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rating&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Tool simply calls domain service&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;searchRestaurants&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@modules/restaurants&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@langchain/core/tools&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchRestaurantsTool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Just a thin wrapper&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;restaurants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;searchRestaurants&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;search_restaurants&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Search for restaurants&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Benefit: Your business logic is now independent - plain functions that take input and return output, reusable anywhere. &lt;code&gt;searchRestaurants()&lt;/code&gt; can now be called from AI tools, HTTP endpoints, CLI scripts, or tests.&lt;/p&gt;

&lt;p&gt;Ask yourself: &lt;em&gt;"If I needed this logic in a REST API endpoint tomorrow, would I have to copy-paste code or could I just import a function?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If the answer is copy-paste, your logic is in the wrong place.&lt;/p&gt;
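&lt;p&gt;To make that concrete, here is a hedged sketch of the same domain function backing a REST-style handler with no copy-paste, just an import. The stub below stands in for the real &lt;code&gt;searchRestaurants&lt;/code&gt; import; all names are illustrative.&lt;/p&gt;

```typescript
// Hypothetical sketch: reusing the domain function outside the AI tool,
// behind a plain HTTP-style handler.
type Restaurant = { name: string; rating: number };

// Stand-in for the real domain service import:
// import { searchRestaurants } from '@modules/restaurants';
async function searchRestaurants(opts: { query: string; limit: number }): Promise<Restaurant[]> {
  return [{ name: 'Demo Diner', rating: 4.5 }].slice(0, opts.limit);
}

// REST-style handler: same logic the AI tool calls, no duplication.
export async function handleSearchRequest(query: string) {
  const restaurants = await searchRestaurants({ query, limit: 5 });
  return { status: 200, body: restaurants };
}
```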

&lt;p&gt;When your tools are thin adapters to domain services, you can test the core behavior locally without calling the LLM at all. This makes tests fast, reliable, and deterministic. Your code becomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reusable across HTTP, CLI, scheduled jobs, tests&lt;/li&gt;
&lt;li&gt;Testable without AI framework mocking&lt;/li&gt;
&lt;li&gt;Maintainable because business logic lives in one place&lt;/li&gt;
&lt;li&gt;Flexible because you can change AI frameworks without rewriting business logic&lt;/li&gt;
&lt;/ul&gt;
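&lt;p&gt;As a sketch of what such a test can look like (names hypothetical), the sorting rule can be verified with the repository stubbed out - no LLM call, no framework, just input and output:&lt;/p&gt;

```typescript
// Hypothetical sketch: testing domain behavior with a stubbed repository.
interface Restaurant {
  name: string;
  rating: number;
}

// Stand-in for restaurantRepository.vectorSearchRestaurants in tests:
async function stubVectorSearch(): Promise<Restaurant[]> {
  return [
    { name: 'A', rating: 4.1 },
    { name: 'B', rating: 4.8 },
  ];
}

// The core behavior under test: highest-rated results come first.
async function searchRestaurantsCore(fetchRestaurants: () => Promise<Restaurant[]>) {
  const results = await fetchRestaurants();
  return results.sort((a, b) => b.rating - a.rating);
}
```

&lt;p&gt;Because nothing here touches the network or a model, the test runs in milliseconds and never flakes.&lt;/p&gt;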
&lt;h3&gt;
  
  
  Pattern 4: Dependency Inversion
&lt;/h3&gt;

&lt;p&gt;High-level policy (your business logic) should not depend on low-level details (specific frameworks, providers, or databases). This follows the &lt;a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle" rel="noopener noreferrer"&gt;Dependency Inversion Principle&lt;/a&gt; from Robert C. Martin's SOLID principles: your code should depend on abstractions, not on concrete implementations.&lt;/p&gt;

&lt;p&gt;In simpler terms: when you use external services (whether OpenAI, AWS Bedrock, LangChain, or Redis), hide them behind interfaces that match &lt;em&gt;your domain's&lt;/em&gt; needs, not theirs.&lt;/p&gt;
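&lt;p&gt;A minimal sketch of what such a domain-shaped interface might look like (the &lt;code&gt;EmbeddingProvider&lt;/code&gt; name and fake implementation are assumptions for illustration, not from any SDK):&lt;/p&gt;

```typescript
// Hypothetical sketch: the interface speaks your domain's language
// ("generate embeddings for these texts"), not the vendor's
// ("call the v1/embeddings endpoint with this model id").
export interface EmbeddingProvider {
  generateEmbeddings(texts: string[]): Promise<number[][]>;
}

// Any provider (OpenAI, Bedrock, or a test fake) can satisfy it.
export const fakeEmbeddings: EmbeddingProvider = {
  async generateEmbeddings(texts) {
    return texts.map(() => [0, 0, 0]); // fixed vectors, handy in tests
  },
};
```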

&lt;p&gt;&lt;strong&gt;Why this matters for AI:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI applications introduce volatile dependencies that change frequently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI Providers switch constantly&lt;/strong&gt;: Today's OpenAI becomes tomorrow's Anthropic or AWS Bedrock&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Models evolve rapidly&lt;/strong&gt;: GPT-4 becomes GPT-4 Turbo becomes GPT-4o&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frameworks shift&lt;/strong&gt;: LangChain might become LangGraph, or you might switch to a different agentic framework entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector databases compete&lt;/strong&gt;: Pinecone, Weaviate, Redis Vector Search - requirements change.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without DIP, switching any of these requires changes across your entire codebase. With DIP, you change one file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world example: switching AI providers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Consider an embeddings service. Without DIP, you'd scatter OpenAI SDK calls throughout your codebase:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Bad: Tightly coupled to OpenAI everywhere&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// search logic using embedding...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Now imagine your company decides to switch to AWS Bedrock for cost savings. You'd need to find and modify every place that calls OpenAI's embedding API.&lt;/p&gt;

&lt;p&gt;With DIP, high-level policy does not depend on low-level details:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good: embeddings.ts&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Implementation hidden behind interface&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Generate embedding for the search query&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;textEmbeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="c1"&gt;// Use repository for pure data operations&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;restaurantRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorSearchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;textEmbeddings&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The implementation file changes based on your provider (here, a Bedrock version followed by an OpenAI version), but the &lt;em&gt;interface&lt;/em&gt; stays the same:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;BedrockEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@langchain/aws&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BedrockEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;awsRegion&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// openai-version/embeddings.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CONFIG&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;openAiApiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;texts&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Your domain services import from the abstraction:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;generateEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@modules/ai/helpers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;findRestaurantsBySemanticSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryEmbeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateEmbeddings&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchVector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;queryEmbeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;restaurantRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorSearchRestaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;searchVector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Switching providers now means changing one file&lt;/strong&gt;, not hunting through your entire codebase. The LLM becomes just one dependency behind a clear interface, which we can replace with mocks or fixtures during testing.&lt;/p&gt;
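&lt;p&gt;One way to wire that single point of change, sketched with stand-in implementations (the provider functions below are placeholders, not real SDK calls, and &lt;code&gt;activeProvider&lt;/code&gt; is a hypothetical config reader):&lt;/p&gt;

```typescript
// Hypothetical sketch: one seam decides the provider; callers never change.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Placeholders standing in for the real OpenAI and Bedrock implementations:
const openAiEmbeddings: EmbedFn = async (texts) => texts.map(() => [1, 0]);
const bedrockEmbeddings: EmbedFn = async (texts) => texts.map(() => [0, 1]);

// In a real app this would read CONFIG or an environment variable.
function activeProvider(): 'openai' | 'bedrock' {
  return 'openai';
}

// The single seam you touch when switching providers:
export const generateEmbeddings: EmbedFn =
  activeProvider() === 'bedrock' ? bedrockEmbeddings : openAiEmbeddings;
```

&lt;p&gt;Everything else in the codebase keeps importing &lt;code&gt;generateEmbeddings&lt;/code&gt; and is none the wiser.&lt;/p&gt;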

&lt;p&gt;&lt;strong&gt;Example: Abstracting agent frameworks&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The same principle applies to agentic AI frameworks. Your business logic shouldn't know whether you're using LangGraph, CrewAI, or AutoGen.&lt;/p&gt;

&lt;p&gt;Instead of spreading LangGraph-specific code everywhere, isolate your agentic AI workflow implementation in its own module:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;HumanMessage&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@langchain/core/messages&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;restaurantReservationsWorkflow&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@modules/ai/workflows&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;processUserQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;restaurantReservationsWorkflow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="nx"&gt;sessionId&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;cacheStatus&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cacheStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;toolsUsed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;toolsUsed&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;With Dependency Inversion, you can experiment with different models without touching business logic and move between frameworks gradually as your needs evolve. It does introduce extra abstraction, which can sometimes feel like over-engineering, but it's most valuable when dependencies are volatile, you’re evaluating multiple options, or vendor lock-in is a real risk.&lt;/p&gt;
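&lt;p&gt;As a minimal sketch of that inversion (the names below are illustrative, not taken from the demo project), the business logic depends only on a &lt;code&gt;generate()&lt;/code&gt; contract, and providers become swappable stubs:&lt;/p&gt;

```javascript
// Hypothetical sketch of Dependency Inversion for model providers.
// Names like `answerQuestion`, `fakeOpenAi`, and `fakeBedrock` are
// illustrative, not part of the article's demo codebase.

// The business logic depends only on this abstraction: anything with
// a `generate(prompt)` method qualifies as a chat model.
function answerQuestion(model, question) {
  return model.generate(`Answer briefly: ${question}`);
}

// Two interchangeable implementations. In a real app these would wrap
// the OpenAI and Bedrock SDKs; here they are stubs for illustration.
const fakeOpenAi = {
  generate(prompt) { return `[openai] ${prompt}`; }
};
const fakeBedrock = {
  generate(prompt) { return `[bedrock] ${prompt}`; }
};

// Swapping providers does not touch the business logic.
console.log(answerQuestion(fakeOpenAi, 'What is Redis?'));
console.log(answerQuestion(fakeBedrock, 'What is Redis?'));
```

&lt;p&gt;Swapping a provider is now a one-line change at the call site; the business logic never imports an SDK.&lt;/p&gt;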
&lt;h3&gt;
  
  
  Pattern 5: Use environment-aware, secure, and hierarchical config
&lt;/h3&gt;

&lt;p&gt;AI applications have complex configuration needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple AI provider credentials (OpenAI, Anthropic, AWS), model identifiers and versions&lt;/li&gt;
&lt;li&gt;Rate limits and timeouts&lt;/li&gt;
&lt;li&gt;Feature flags for different AI capabilities&lt;/li&gt;
&lt;li&gt;Vector database connections&lt;/li&gt;
&lt;li&gt;Caching configuration&lt;/li&gt;
&lt;li&gt;Guardrail settings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So your configuration should be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Environment-aware&lt;/strong&gt;: Different values for dev/staging/production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secure&lt;/strong&gt;: Secrets never committed to version control&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validated&lt;/strong&gt;: Fail fast on startup if config is invalid&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hierarchical&lt;/strong&gt;: Organized for easy discovery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type-safe (optional)&lt;/strong&gt;: Preferably with TypeScript or runtime validation
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Bad: Secrets hardcoded, scattered config&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sk-abc123...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Hardcoded secret!&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bedrockModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;anthropic.claude-v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Magic string&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redisHost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;localhost&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Where's production config?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The problem with the snippet above is that configuration is scattered everywhere, and there’s no clean way to switch between models or environments.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Good: Centralized, validated, environment-aware&lt;/span&gt;
&lt;span class="c1"&gt;// config.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requiredEnvVars&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="c1"&gt;// Fail fast on startup if config is missing&lt;/span&gt;
&lt;span class="nx"&gt;requiredEnvVars&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;varName&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;varName&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Missing required environment variable: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;varName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// AI Providers&lt;/span&gt;
  &lt;span class="na"&gt;openAi&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_MODEL&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;

  &lt;span class="na"&gt;aws&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_REGION&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;us-east-1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;accessKeyId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;secretAccessKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;bedrockModelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;BEDROCK_MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;anthropic.claude-v2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;

  &lt;span class="c1"&gt;// Databases&lt;/span&gt;
  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;6379&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_PASSWORD&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# .env (never committed to git)&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-abc123...
&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;AKIA...
&lt;span class="nv"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;localhost

&lt;span class="c"&gt;# .env.production (deployed separately)&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-prod123...
&lt;span class="nv"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;AKIA...
&lt;span class="nv"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis.production.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The benefit now is that all configuration lives in one place, it's automatically validated on startup, switching between environments is easy, and secrets are no longer hardcoded in the codebase.&lt;/p&gt;
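&lt;p&gt;Presence checks catch missing variables, but value validation catches misconfiguration too. A hedged sketch, assuming a hypothetical &lt;code&gt;requireIntInRange&lt;/code&gt; helper (not part of any library), could extend the config module above:&lt;/p&gt;

```javascript
// Hypothetical extension of the config module: validate values, not
// just presence. `requireIntInRange` is an illustrative helper.
function requireIntInRange(name, raw, min, max, fallback) {
  const value = parseInt(raw ?? String(fallback), 10);
  if (Number.isNaN(value) || value < min || value > max) {
    throw new Error(`${name} must be an integer between ${min} and ${max}`);
  }
  return value;
}

// Fail fast on startup with a precise message instead of a confusing
// runtime error later (e.g. a Redis client silently using port NaN).
const redisPort = requireIntInRange('REDIS_PORT', process.env.REDIS_PORT, 1, 65535, 6379);
```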
&lt;h3&gt;
  
  
  Pattern 6: Separate persistent data from agent memory
&lt;/h3&gt;

&lt;p&gt;AI agents need to remember things during conversations, but not everything should live in your main database. User preferences? Database. The fact that someone just asked about "cozy sweaters" 30 seconds ago? Agent memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters for AI:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mixing persistent and ephemeral data leads to bloated databases and slow AI responses. Different types of data have different storage requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database&lt;/strong&gt;: User profiles, restaurant catalog, reservation history - things that need to persist indefinitely&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Memory&lt;/strong&gt;: Conversation context, temporary preferences, session state - things that can expire&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Store&lt;/strong&gt;: Restaurant embeddings, semantic search indexes - specialized AI data structures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Implementation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;services/
├── chat/
│   ├── services/
│   │   ├── workflow.js         &lt;span class="c"&gt;# LangGraph orchestration&lt;/span&gt;
│   │   └── memory.js           &lt;span class="c"&gt;# Session state, conversation context&lt;/span&gt;
│   └── data/
│       └── session-store.js    &lt;span class="c"&gt;# Fast, ephemeral storage&lt;/span&gt;
├── restaurants/
│   └── data/
│       └── restaurant-repository.js    &lt;span class="c"&gt;# Persistent restaurant vector store&lt;/span&gt;
└── &lt;span class="nb"&gt;users&lt;/span&gt;/
    └── data/
        └── user-store.js       &lt;span class="c"&gt;# User profiles, preferences&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faster AI responses (no database queries for temporary data)&lt;/li&gt;
&lt;li&gt;Cleaner main database (only persistent data)&lt;/li&gt;
&lt;li&gt;Automatic cleanup (session data expires)&lt;/li&gt;
&lt;li&gt;Better performance (right storage for each data type)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Getting the storage layer right is crucial for AI performance.&lt;/p&gt;
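&lt;p&gt;As a minimal sketch of the ephemeral side, assume an illustrative &lt;code&gt;SessionStore&lt;/code&gt; class. In the article's stack this would be backed by Redis key expiry (&lt;code&gt;SET key value EX seconds&lt;/code&gt;); a &lt;code&gt;Map&lt;/code&gt; stands in here so the example is self-contained:&lt;/p&gt;

```javascript
// Minimal sketch of an ephemeral session store with TTL semantics.
// A Map stands in for Redis so the sketch is self-contained; all
// names are illustrative.
class SessionStore {
  constructor(now = Date.now) {
    this.now = now;           // injectable clock, handy for tests
    this.entries = new Map(); // sessionId -> { value, expiresAt }
  }

  set(sessionId, value, ttlMs) {
    this.entries.set(sessionId, { value, expiresAt: this.now() + ttlMs });
  }

  get(sessionId) {
    const entry = this.entries.get(sessionId);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) { // expired: behave like Redis TTL
      this.entries.delete(sessionId);
      return undefined;
    }
    return entry.value;
  }
}

// Conversation context expires on its own; user profiles would live
// in the persistent user-store instead.
const store = new SessionStore();
store.set('session-42', { lastQuery: 'cozy sweaters' }, 30000);
```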

&lt;p&gt;Check out &lt;strong&gt;&lt;a href="https://university.redis.io/learningpath/hbykf3qrnhwccy?tab=details" rel="noopener noreferrer"&gt;Redis for AI learning path&lt;/a&gt;&lt;/strong&gt; for hands-on experience implementing these memory patterns and vector search capabilities.&lt;/p&gt;



&lt;p&gt;Your final project architecture could look something like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;modules/
│
├── restaurants/             &lt;span class="c"&gt;# Restaurants Domain&lt;/span&gt;
│   ├── api/                   &lt;span class="c"&gt;# HTTP layer&lt;/span&gt;
│   ├── service/                &lt;span class="c"&gt;# Business logic&lt;/span&gt;
│   └── data/                  &lt;span class="c"&gt;# Data access&lt;/span&gt;
│
├── reservations/            &lt;span class="c"&gt;# Reservations Domain&lt;/span&gt;
│   ├── api/
│   ├── service/
│   └── data/
│
├── chat/                    &lt;span class="c"&gt;# Conversation Domain&lt;/span&gt;
│   ├── api/
│   ├── service/
│   └── data/
├── ai/                      &lt;span class="c"&gt;# AI Domain&lt;/span&gt;
│   ├── agentic-restaurant-workflow/
│   │   ├── index.js          &lt;span class="c"&gt;# Workflow orchestration&lt;/span&gt;
│   │   ├── nodes.js          &lt;span class="c"&gt;# Agent definitions&lt;/span&gt;
│   │   ├── tools.js          &lt;span class="c"&gt;# AI tools&lt;/span&gt;
│   │   └── state.js          &lt;span class="c"&gt;# State management&lt;/span&gt;
│   └── helpers/
│       ├── embeddings.js     &lt;span class="c"&gt;# Embedding generation&lt;/span&gt;
│       └── caching.js        &lt;span class="c"&gt;# Cache logic&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgal43ra1n3vehd9fsvg9.png" alt="Complete Architecture with All Patterns"&gt;Complete architecture showing all six patterns working together
  


&lt;p&gt;Check out the example implementation on GitHub!&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/redis-developer" rel="noopener noreferrer"&gt;
        redis-developer
      &lt;/a&gt; / &lt;a href="https://github.com/redis-developer/restaurant-discovery-ai-agent-demo" rel="noopener noreferrer"&gt;
        restaurant-discovery-ai-agent-demo
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      An Agentic AI restaurant discovery platform that combines Redis's speed with LangGraph's intelligent workflow orchestration. Get personalized restaurant recommendations, make reservations, and get lightning-fast responses through semantic caching.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🍽️ Restaurant Discovery AI Agent&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Redis-powered restaurant discovery with intelligent dining assistance.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An AI-powered restaurant discovery platform that combines Redis's speed with LangGraph's intelligent workflow orchestration. Get personalized restaurant recommendations, smart dining suggestions, and lightning-fast responses through semantic caching.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;App screenshots&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/redis-developer/restaurant-discovery-ai-agent-demo/./docs/screenshots/home-screen.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fredis-developer%2Frestaurant-discovery-ai-agent-demo%2F.%2Fdocs%2Fscreenshots%2Fhome-screen.png" alt="App home page"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Tech Stack&lt;/h2&gt;
&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://nodejs.org/" rel="nofollow noopener noreferrer"&gt;Node.js (v24+)&lt;/a&gt;&lt;/strong&gt; + &lt;strong&gt;Express&lt;/strong&gt; - Backend runtime and API framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://redis.io" rel="nofollow noopener noreferrer"&gt;Redis&lt;/a&gt;&lt;/strong&gt; - Restaurant store, agentic AI memory, conversational history, and semantic caching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://redis.io/langcache/" rel="nofollow noopener noreferrer"&gt;Redis LangCache API&lt;/a&gt;&lt;/strong&gt; - Semantic caching for LLM responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph&lt;/strong&gt; - AI workflow orchestration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://platform.openai.com/account/api-keys" rel="nofollow noopener noreferrer"&gt;OpenAI API&lt;/a&gt;&lt;/strong&gt; - GPT-4 for intelligent responses and embeddings for vector search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTML + CSS + Vanilla JS&lt;/strong&gt; - Frontend UI&lt;/li&gt;
&lt;/ul&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Product Features&lt;/h2&gt;

&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Smart Restaurant Discovery&lt;/strong&gt;: AI-powered assistant helps you find restaurants, discover cuisines, and manage your reservations. Both text and vector-based search across restaurants&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dining Intelligence&lt;/strong&gt;: Get restaurant recommendations with detailed information for any cuisine or occasion using RAG (Retrieval Augmented Generation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Demo Reservation System&lt;/strong&gt;: Reservation management…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/redis-developer/restaurant-discovery-ai-agent-demo" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;Applied AI is quite fluid today - patterns, frameworks, and libraries are changing constantly. Teams are expected to deliver features fast. In this environment, your architecture needs flexibility. You can't afford to lock your code into rigid structures that make change expensive.&lt;/p&gt;

&lt;p&gt;I have been working with the patterns we discussed for quite some time, and they have served me well. Remember, these are not rigid rules. Learn what works for your project, and adapt. Think of your project not as a monolithic whole, but as independent, composable features with clear interfaces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If your code is scattered across technical layers, start grouping it by domain.&lt;/li&gt;
&lt;li&gt;If your AI tools contain database queries, extract them into domain services.&lt;/li&gt;
&lt;li&gt;If you're tightly coupled to OpenAI, add an abstraction layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're working in the Node.js ecosystem, frameworks like &lt;a href="https://nestjs.com/" rel="noopener noreferrer"&gt;NestJS&lt;/a&gt; can help you implement many of these patterns out of the box - modules, dependency injection, layered architecture. But here's the thing: these patterns aren't tied to any specific framework or even any specific language.&lt;/p&gt;

&lt;p&gt;Choose the tools that work for your team, and these patterns will help you organize your code so that refactoring becomes a routine, low-risk activity. They will also pave the way to full-blown microservices once your app grows. You'll likely refactor anyway after you've built something real.&lt;/p&gt;




&lt;p&gt;Thanks to &lt;a href="https://www.linkedin.com/in/raphaeldelio" rel="noopener noreferrer"&gt;Raphael De Lio&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/bhavana-anant-giri-0a43109b" rel="noopener noreferrer"&gt;Bhavana Giri&lt;/a&gt; for the detailed technical review and great suggestions on the article's structure.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>javascript</category>
      <category>programming</category>
    </item>
    <item>
      <title>From PostgreSQL to Redis: Accelerating Your Applications with Redis Data Integration</title>
      <dc:creator>Ricardo Ferreira</dc:creator>
      <pubDate>Wed, 17 Dec 2025 01:57:25 +0000</pubDate>
      <link>https://dev.to/redis/from-postgresql-to-redis-accelerating-your-applications-with-redis-data-integration-3dn2</link>
      <guid>https://dev.to/redis/from-postgresql-to-redis-accelerating-your-applications-with-redis-data-integration-3dn2</guid>
      <description>&lt;p&gt;Here's a statistic that might surprise you: 90% of all relational OLTP workloads are pure reads. Let that sink in. Nine out of ten database operations in your transactional system are simply fetching data, not modifying it. Yet these reads are competing for the same resources as your critical write operations. Resources like CPU, disk I/O, and network bandwidth.&lt;/p&gt;

&lt;p&gt;Let me illustrate the impact of this with a practical example. Say you are responsible for an e-commerce platform. Orders are flowing in, customers are browsing products, and your PostgreSQL database is handling transactions as expected. However, a problem lies beneath the surface, one that becomes apparent during peak shopping hours. Page load times creep up. Product searches feel sluggish. Cart updates lag just enough to frustrate users. Yes, this is a bad user experience for sure. In the world of e-commerce, where Amazon has accustomed customers to expect nothing less than sub-second responses, every millisecond of delay translates to lost revenue.&lt;/p&gt;

&lt;p&gt;The root cause? Disk and network I/O are hindering your transactions. Your perfectly normalized PostgreSQL database, while excellent at maintaining data consistency and handling complex transactions, wasn't designed for the read-heavy, millisecond-response-time demands of modern applications. Every product view, every category browse, and every user profile fetch requires a round trip to disk-based storage, and this is expensive as it competes for resources with write operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cache-Aside Pattern: A Band-Aid, Not a Cure
&lt;/h2&gt;

&lt;p&gt;For years, developers have turned to the cache-aside pattern as the go-to solution for the load challenges of read-intensive applications. The logic seems sound: serve reads primarily from Redis, hit the source database only on cache misses, and update the cache with the fresh data. It's the "happy path" developers all dream about.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyp55hqdduc8358ceebwa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyp55hqdduc8358ceebwa.png" alt="Cache-Aside Pattern" width="800" height="576"&gt;&lt;/a&gt;&lt;/p&gt;
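&lt;p&gt;A minimal sketch of the read path, with &lt;code&gt;Map&lt;/code&gt;s standing in for Redis and PostgreSQL so the example is self-contained (all names are illustrative):&lt;/p&gt;

```javascript
// Minimal sketch of the cache-aside read path. In a real app `cache`
// would be a Redis client and `db.get` a SQL query.
const cache = new Map();
const db = new Map([['product:1', { name: 'Espresso Machine' }]]);

let dbReads = 0; // instrumentation to show the miss/hit behavior

function getProduct(id) {
  const key = `product:${id}`;
  const cached = cache.get(key);
  if (cached !== undefined) return cached; // cache hit: no DB round trip

  dbReads += 1;                            // cache miss: hit the source DB...
  const fresh = db.get(key);
  if (fresh !== undefined) cache.set(key, fresh); // ...and repopulate the cache
  return fresh;
}

getProduct(1); // miss: reads the database
getProduct(1); // hit: served from cache
```

&lt;p&gt;Every application that needs the data has to carry this same logic, which is exactly where the flaws below creep in.&lt;/p&gt;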

&lt;p&gt;Everything is great until reality sets in. The cache-aside pattern quickly reveals three critical flaws:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Repetitive Update Logic&lt;/strong&gt;: Every application must implement the same caching logic. Each microservice, each new feature, each development team reinvents the wheel. It's challenging to maintain best practices across projects, and every database schema change risks breaking the caching logic in each new release.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The Thundering Herd Problem&lt;/strong&gt;: When many cache keys expire simultaneously, thousands of requests hammer your database at once; imagine a flash sale starting at midnight. Your database must be sized not for average load, but for these sporadic read spikes. Query times slow to a crawl, eventually causing cascading failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Data Invalidation Nightmares&lt;/strong&gt;: What happens when records are deleted from the database? How do you handle updates that affect multiple cached entries? There's no atomic way to write to both Redis and your database, leading to inconsistency windows that corrupt user experiences.&lt;/p&gt;

&lt;p&gt;After years of running into problems like these, developers came up with another pattern that extends cache-aside with a more proactive approach. This pattern is known as refresh-ahead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Refresh-Ahead Pattern: You Don't Call Me; I Call You!
&lt;/h2&gt;

&lt;p&gt;Right, so you know reads must be served by Redis, as it is faster than disk-based databases. But with cache-aside, the source database is only read after a request comes in and misses the cache. Why not change this paradigm and let the cache be populated proactively by a dedicated update engine?&lt;/p&gt;

&lt;p&gt;This is what the refresh-ahead pattern is all about. You leverage an engine that will be responsible for pulling records from your source database and moving the data to Redis for eventual reads. The same engine must also be responsible for periodically monitoring the source database to identify changes and updating Redis accordingly. This includes monitoring deleted records to trigger key invalidation at Redis.&lt;/p&gt;
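&lt;p&gt;One refresh cycle of such an engine can be sketched as follows; &lt;code&gt;Map&lt;/code&gt;s stand in for PostgreSQL and Redis, and the version-based change detection is illustrative (a real engine would use CDC):&lt;/p&gt;

```javascript
// Minimal sketch of one refresh-ahead cycle: an update engine pulls
// changed rows from the source and pushes them into the cache, so
// reads never have to fall back to the database. All names are
// illustrative.
function refreshCycle(source, cache, lastSyncVersion) {
  let maxVersion = lastSyncVersion;
  for (const [key, row] of source) {
    if (row.version <= lastSyncVersion) continue; // unchanged since last sync
    if (row.deleted) {
      cache.delete(key);        // deletes trigger key invalidation
    } else {
      cache.set(key, row.data); // inserts/updates refresh the cache
    }
    maxVersion = Math.max(maxVersion, row.version);
  }
  return maxVersion; // checkpoint for the next cycle
}

const source = new Map([
  ['product:1', { version: 1, data: { name: 'Espresso Machine' } }],
  ['product:2', { version: 2, deleted: true }]
]);
const cache = new Map([['product:2', { name: 'Old Grinder' }]]);
const checkpoint = refreshCycle(source, cache, 0);
```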

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb9tsimuj0o3o1s3m74m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb9tsimuj0o3o1s3m74m.png" alt="Refresh Ahead Pattern" width="800" height="551"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a great pattern to implement in conjunction with Redis for read-intensive use cases. Some teams turn to Change Data Capture (CDC) using tools like Apache Kafka and Debezium to achieve this. Others decide to implement complex ETL pipelines. Regardless of the implementation stack, the idea is the same: capture database changes as events and stream them to Redis. However, this approach introduces what we call the "distributed systems hole".&lt;/p&gt;

&lt;p&gt;It is a complexity trap that consumes entire development teams whose main job was never to maintain data pipelines. Implementing the refresh-ahead pattern manually often creates the following problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Developer Overutilization&lt;/strong&gt;: Your best engineers will spend months building and maintaining data pipelines instead of working on the systems that actually drive the company's revenue, creating the perception that they are not contributing to the organization's goals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Expertise Tax&lt;/strong&gt;: Apache Kafka, Debezium, and ETL experts command premium salaries; they are hard to retain and, more importantly, to replace. Unless the team is carefully planned from day one, it will be hard to justify to the business why an important launch must slip because a key engineer has left.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operational Complexity&lt;/strong&gt;: Every schema change necessitates pipeline updates, and every deployment carries the risk of data inconsistency. Teams end up on call whenever the domain model changes, because such changes can break the integration that sustains the data pipeline.&lt;/p&gt;

&lt;p&gt;Let's go back to the original problem. All you wanted was to speed up your application because reads are more frequent than writes. Instead, you've created a distributed systems monster that requires constant feeding and care.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing the Refresh-Ahead Pattern with RDI
&lt;/h2&gt;

&lt;p&gt;This is where &lt;a href="https://redis.io/data-integration/" rel="noopener noreferrer"&gt;Redis Data Integration (RDI)&lt;/a&gt; changes the game entirely. RDI implements the refresh-ahead pattern with a future-proof solution that moves data proactively from your source database to Redis, keeping both in perfect sync without the complexity overhead. Unlike traditional CDC solutions, RDI requires no expertise in distributed systems. It's configuration, not code. It's operational simplicity, not complexity. It's a solution that doesn't hold you back.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6jhz44ivivv4u6mainn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6jhz44ivivv4u6mainn.png" alt="RDI Architecture" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Major enterprises, &lt;a href="https://redis.io/customers/axis-bank" rel="noopener noreferrer"&gt;such as Axis Bank&lt;/a&gt;, are already utilizing RDI to accelerate their applications, and guess what: you can use it too. RDI is available for on-premise deployments via &lt;a href="https://redis.io/software" rel="noopener noreferrer"&gt;Redis Enterprise&lt;/a&gt;, and for cloud users via &lt;a href="https://redis.io/cloud" rel="noopener noreferrer"&gt;Redis Cloud&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let's see how this works with a real e-commerce dataset stored in PostgreSQL to showcase RDI's capabilities. For this example, you can use the Docker Compose file below, which creates a pre-configured PostgreSQL database with CDC enabled.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;postgres&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;debezium/postgres:15-alpine&lt;/span&gt;
    &lt;span class="na"&gt;hostname&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;POSTGRES_USER=postgres&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;POSTGRES_PASSWORD=postgres&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;POSTGRES_DB=postgres&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;5432:5432&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pg_isready"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./pgdata:/var/lib/postgresql/data&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./scripts/initial-load.sql:/docker-entrypoint-initdb.d/initial-load.sql&lt;/span&gt;
  &lt;span class="na"&gt;pgadmin&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dpage/pgadmin4&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pgadmin4&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8888:80"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;PGADMIN_DEFAULT_EMAIL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin@postgres.com&lt;/span&gt;
      &lt;span class="na"&gt;PGADMIN_DEFAULT_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pgadmin4pwd&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wget"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-O"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:80/misc/ping"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./pgadmin:/var/lib/pgadmin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As the container with the PostgreSQL database starts up, a script runs to create the necessary tables. You can find this script &lt;a href="https://github.com/redis-developer/postgres-to-redis-rdi-demo/blob/main/source-db/scripts/initial-load.sql" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The database contains normalized tables, some of which have foreign key relationships. These tables are exactly what you'd expect in a typical transactional system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Categories&lt;/strong&gt; and &lt;strong&gt;Products&lt;/strong&gt; with a one-to-many relationship&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customers&lt;/strong&gt; placing &lt;strong&gt;Orders&lt;/strong&gt; containing multiple &lt;strong&gt;OrderItems&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Suppliers&lt;/strong&gt; connected to &lt;strong&gt;Products&lt;/strong&gt; through a many-to-many relationship&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These tables represent your source of truth—but optimized for consistency, not speed.&lt;/p&gt;

&lt;h3&gt;
  
  
  How RDI Works
&lt;/h3&gt;

&lt;p&gt;Once you have installed RDI and deployed your data pipeline, here is what happens behind the scenes. First, RDI performs an initial cache load, populating Redis with all your existing data. In the demo below, you can see 78 records being synchronized initially, creating data streams for each table. The RDI dashboard displays real-time metrics, including records inserted, updated, and deleted, with timestamps accurate to the millisecond.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Failhqfwuu63g1l21sxf9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Failhqfwuu63g1l21sxf9.png" alt="RDI Analytics" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After this, every time you insert a new user through pgAdmin, it appears in Redis within milliseconds. When you update an order status, Redis reflects the change instantly. When you delete a product, it is automatically removed from Redis. This isn't eventual consistency with fingers crossed. It's guaranteed synchronization through CDC.&lt;/p&gt;

&lt;p&gt;By default, records are written to Redis using the hash data type. Hashes suit simple entities, such as categories, that have flat, predictable fields. With hashes, the primary key forms part of the Redis key (e.g., &lt;code&gt;category:1&lt;/code&gt;), which gives you O(1) access to any record: no table scans, no index lookups. This is an example of how the data looks in Redis using hashes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ni3yyrv1ew4q3166j11.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ni3yyrv1ew4q3166j11.png" alt="Data at Redis with Hashes" width="800" height="473"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For more complex entities that need nested data, arrays, or flexible schemas, you can use the JSON data type in Redis instead. This is an example of how the data looks in Redis using JSON:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7rpgy9btzzcafq0288o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7rpgy9btzzcafq0288o.png" alt="Data at Redis with JSON" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But RDI goes beyond simple replication. The stream processor layer continuously reshapes data into your preferred data model and layout via the transformations you define. For example, consider the &lt;code&gt;users&lt;/code&gt; table with the following record:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;52&lt;/span&gt;
&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;riferrei&lt;/span&gt;  
&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Ricardo&lt;/span&gt;
&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Ferreira&lt;/span&gt;
&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ricardo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ferreira&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You want to create two additional fields in the final record that will be written to Redis. You can create a transformation using YAML like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;custom-job&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;public&lt;/span&gt;
  &lt;span class="na"&gt;table&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user&lt;/span&gt;
&lt;span class="na"&gt;transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;add_field&lt;/span&gt;
    &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;expression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;first_name || ' ' || last_name&lt;/span&gt;
      &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;display_name&lt;/span&gt;
      &lt;span class="na"&gt;language&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sql&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;add_field&lt;/span&gt;
    &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;expression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="s"&gt;CASE&lt;/span&gt;
          &lt;span class="s"&gt;WHEN email LIKE '%@example.com' THEN 'internal'&lt;/span&gt;
          &lt;span class="s"&gt;ELSE 'external'&lt;/span&gt;
        &lt;span class="s"&gt;END&lt;/span&gt;
      &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user_type&lt;/span&gt;
      &lt;span class="na"&gt;language&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sql&lt;/span&gt;
&lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis.write&lt;/span&gt;
    &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;connection&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;target&lt;/span&gt;
      &lt;span class="na"&gt;data_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the transformation that implements data enrichment, the final record becomes a Redis JSON document with computed fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;52&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"riferrei"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"first_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Ricardo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
  &lt;/span&gt;&lt;span class="nl"&gt;"last_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Ferreira"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ricardo.ferreira@example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"display_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Ricardo Ferreira"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"internal"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the two new fields that don't exist in PostgreSQL:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;display_name&lt;/code&gt;: Concatenated from first and last names&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;user_type&lt;/code&gt;: Computed based on email domain logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This transformation happens in the RDI stream processor, which parses the transformations from your YAML configuration. No code is required in your application, and no cache invalidation is needed. Just pure, configuration-driven transformation.&lt;/p&gt;
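&lt;p&gt;To make the enrichment concrete, here is an illustrative Python mirror of what the two &lt;code&gt;add_field&lt;/code&gt; steps compute for each change record. RDI evaluates the SQL expressions itself inside the stream processor; this sketch only reproduces the logic:&lt;/p&gt;

```python
# Illustrative mirror of the two add_field transformations from the YAML job.
def enrich(record: dict) -> dict:
    out = dict(record)
    # display_name: first_name || ' ' || last_name
    out["display_name"] = f"{record['first_name']} {record['last_name']}"
    # user_type: CASE WHEN email LIKE '%@example.com' THEN 'internal' ELSE 'external' END
    out["user_type"] = ("internal" if record["email"].endswith("@example.com")
                        else "external")
    return out

row = {"id": 52, "username": "riferrei", "first_name": "Ricardo",
       "last_name": "Ferreira", "email": "ricardo.ferreira@example.com"}
enriched = enrich(row)
print(enriched["display_name"])  # Ricardo Ferreira
print(enriched["user_type"])     # internal
```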

&lt;p&gt;If you want to try this demo yourself, follow the instructions in this GitHub repository:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/redis-developer/postgres-to-redis-rdi-demo" rel="noopener noreferrer"&gt;https://github.com/redis-developer/postgres-to-redis-rdi-demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The beauty of this repository is that you can run it entirely on your local machine using Kubernetes. The repository includes everything you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automated deployment scripts for RDI (both local and cloud options)&lt;/li&gt;
&lt;li&gt;A pre-configured PostgreSQL database with sample e-commerce data&lt;/li&gt;
&lt;li&gt;Transformation job examples showing JSON and Hash outputs&lt;/li&gt;
&lt;li&gt;Step-by-step instructions with visual guides&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Within minutes, you'll have a complete CDC pipeline streaming data from PostgreSQL to Redis, transforming relational tables into high-performance key-value structures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond Simple Caching: A Living Data Layer
&lt;/h2&gt;

&lt;p&gt;What sets Redis apart from other caching solutions is what RDI turns it into. This isn't a cache that might be stale or that needs complex invalidation logic. It's a real-time materialized view of your source database, transformed and optimized for high-speed access. RDI's configuration-driven approach means you can evolve your data pipeline without touching application code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Need to add a new computed field? Just update the YAML configuration file.&lt;/li&gt;
&lt;li&gt;Want to change how data is structured in Redis? Modify the transformation job.&lt;/li&gt;
&lt;li&gt;Schema changed in the source? RDI adapts automatically. No action needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No redeployment, no code changes, no downtime. Just operational simplicity that lets developers focus on innovation instead of infrastructure. RDI resolves the fundamental tension in modern application architecture: you no longer have to choose between consistency and speed, between simplicity and performance, or between developer productivity and operational excellence.&lt;/p&gt;

&lt;p&gt;RDI delivers exactly that: an out-of-the-box data pipeline that offloads reads to Redis, speeding up both your applications and your learning curve.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future Is Refresh-Ahead
&lt;/h2&gt;

&lt;p&gt;As we move toward software architectures where data is always in flux, where latency is measured in microseconds, and where scale is assumed rather than planned, the ability to seamlessly synchronize and transform data between complementary stores becomes essential.&lt;/p&gt;

&lt;p&gt;Redis Data Integration represents more than a technical solution. It's a paradigm shift in how we think about data architecture. It's the realization that we don't need to accept the trade-offs we've lived with for years. We can have transactional consistency in disk-based databases and blazing-fast reads in Redis, without taking on the complexity.&lt;/p&gt;

&lt;p&gt;The question isn't whether you need real-time data synchronization; it's when you need it. It's whether you can afford to keep solving the 90% problem with yesterday's solutions. Welcome to the refresh-ahead revolution. Your applications and your users will thank you.&lt;/p&gt;

</description>
      <category>redis</category>
      <category>rdi</category>
      <category>database</category>
      <category>etl</category>
    </item>
    <item>
      <title>Implementing Semantic Anomaly Detection with OpenTelemetry and Redis</title>
      <dc:creator>Ricardo Ferreira</dc:creator>
      <pubDate>Tue, 09 Dec 2025 17:07:16 +0000</pubDate>
      <link>https://dev.to/redis/implementing-semantic-anomaly-detection-with-opentelemetry-and-redis-4506</link>
      <guid>https://dev.to/redis/implementing-semantic-anomaly-detection-with-opentelemetry-and-redis-4506</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;Detecting Anomalies in OpenTelemetry Logs Using Vector Embeddings and Redis&lt;/strong&gt;
&lt;/h2&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Finding Unknown Unknowns in Your Logs&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Picture this: It's 3 AM, and your payment service starts failing. The errors don't match any of your alert rules. The log messages are textually distinct from anything you've seen before, but semantically, they're similar to a database connection issue that occurred six months ago. Your rule-based system missed it because the error message format changed. Your keyword searches failed because the new error uses different terminology. By the time your team discovers the issue through customer complaints, you have already lost thousands in revenue.&lt;/p&gt;

&lt;p&gt;This scenario plays out daily across engineering teams. Modern applications generate millions of logs, and buried within them are critical anomalies, such as security breaches, performance degradations, and system failures. The challenge isn't just the volume; it's that we don't know what we're looking for. New failure modes emerge constantly, attackers develop novel techniques, and dependencies fail in unexpected ways. Meanwhile, traditional monitoring relies on flawed assumptions, such as anticipating all failure modes (rule-based detection) and assuming that anomalies will use predictable keywords (search-based detection). But what if, instead of looking for specific patterns, we could teach our system to understand what "normal" looks like and flag anything that deviates from that understanding?&lt;/p&gt;

&lt;p&gt;This is where semantic anomaly detection comes in. By converting logs into vector embeddings that capture their meaning, we can identify anomalies based on how different they are from normal behavior, even if we've never seen that specific error before. It's like teaching your monitoring system to understand context and meaning, not just pattern matching.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Understanding the Solution: Embeddings and Vector Search&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Before diving into code, let's understand what makes this approach powerful. An embedding represents data as an array of numbers (a vector) that captures its semantic meaning. When we convert the log message "User authentication failed for admin account" into an embedding, we obtain a vector like [0.23, -0.45, 0.67, ...] in a 768-dimensional space, where similar meanings result in similar vectors. We can then find related logs by using a vector store, such as Redis, and executing a &lt;a href="https://www.youtube.com/watch?v=o3XN4dImESE" rel="noopener noreferrer"&gt;semantic search&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here's the key insight: logs that describe similar events will have similar embeddings, even if they use different words. "Authentication failed" and "Login unsuccessful" will produce vectors that are close together in the 768-dimensional space. Meanwhile, "Authentication failed" and "Payment processed successfully" will be far apart. This distance becomes our anomaly score.&lt;/p&gt;
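&lt;p&gt;A toy illustration of distance as an anomaly score, using made-up 4-dimensional vectors in place of real 768-dimensional embeddings:&lt;/p&gt;

```python
# Toy illustration of semantic distance as an anomaly score; the vectors
# are invented stand-ins for real embedding-model output.
import math

def cosine_distance(a, b):
    # 0.0 means identical direction (same meaning); values near 1.0 or more
    # mean the vectors point in very different directions.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

auth_failed  = [0.9, 0.1, 0.0, 0.2]  # "Authentication failed"
login_failed = [0.8, 0.2, 0.1, 0.3]  # "Login unsuccessful"
payment_ok   = [0.0, 0.9, 0.8, 0.1]  # "Payment processed successfully"

# Similar meanings yield a small distance; unrelated ones a large distance.
assert cosine_distance(auth_failed, login_failed) < cosine_distance(auth_failed, payment_ok)
```

&lt;p&gt;In the full system, the vector store computes these distances server-side during search; the application only has to pick a distance threshold above which a log counts as anomalous.&lt;/p&gt;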

&lt;p&gt;This is powerful, but it can be further enhanced. When we combine this with OpenTelemetry's structured logging, we can add additional context to the semantic search. OpenTelemetry (or OTEL for short) provides standardized fields for observability data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;service.name&lt;/code&gt; tells us which service generated the log&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;http.status_code&lt;/code&gt; indicates success or failure
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;net.peer.ip&lt;/code&gt; shows who's connecting&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;trace.id&lt;/code&gt; links related logs together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are only a few examples of fields that can be provided by OTEL structured logging. They were included in the OTEL project in 2023 when Elastic donated its structured approach to logs, known as Elastic Common Schema (ECS), to OTEL. You can read more about this &lt;a href="https://opentelemetry.io/blog/2023/ecs-otel-semconv-convergence/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;By incorporating this context into our embeddings, we create rich representations that not only understand what happened, but also where, when, and under what circumstances. A "connection timeout" from your database service at 3 AM is very different from the same message from a third-party API during business hours, and our embeddings will reflect that.&lt;/p&gt;
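&lt;p&gt;One simple way to fold that context into the embedding is to append selected OTEL attributes to the log body before embedding the combined text. The helper below is hypothetical; the chosen fields and the &lt;code&gt;key=value&lt;/code&gt; formatting are assumptions for illustration, not an OTEL or RedisVL API:&lt;/p&gt;

```python
# Hypothetical helper: fold selected OTEL attributes into the text that gets
# embedded, so the vector captures context alongside the message itself.
from datetime import datetime, timezone

CONTEXT_FIELDS = ("service.name", "http.status_code", "net.peer.ip")

def log_to_embedding_text(body: str, attrs: dict, ts: datetime) -> str:
    parts = [body]
    for key in CONTEXT_FIELDS:
        if key in attrs:
            parts.append(f"{key}={attrs[key]}")
    # Time of day matters: a 3 AM timeout is not a business-hours timeout.
    parts.append(f"hour_of_day={ts.hour}")
    return " | ".join(parts)

text = log_to_embedding_text(
    "connection timeout",
    {"service.name": "billing-db", "http.status_code": 504},
    datetime(2025, 12, 9, 3, 0, tzinfo=timezone.utc),
)
print(text)  # connection timeout | service.name=billing-db | http.status_code=504 | hour_of_day=3
```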

&lt;h2&gt;
  
  
  &lt;strong&gt;Building the Anomaly Detection System&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let's build a production-ready anomaly detector using &lt;a href="https://redis.io/open-source/" rel="noopener noreferrer"&gt;Redis Open Source&lt;/a&gt;, which includes support for &lt;a href="https://redis.io/docs/latest/develop/ai/search-and-query/" rel="noopener noreferrer"&gt;Redis Query Engine (RQE)&lt;/a&gt;. This will allow us to easily store vector embeddings and implement semantic search, which is required for this use case.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Setting Up Dependencies&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The Docker Compose file below provides you with an easy way to spin up Redis, as well as &lt;a href="https://redis.io/insight/" rel="noopener noreferrer"&gt;Redis Insight&lt;/a&gt;, which you can use to browse and inspect data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

  &lt;span class="na"&gt;redis-database&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-database&lt;/span&gt;
    &lt;span class="na"&gt;hostname&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-database&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:8.4.0&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./data:/data&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;REDIS_ARGS&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--save 30 &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379:6379"&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis-cli&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ping&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;grep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;PONG"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;

  &lt;span class="na"&gt;redis-insight&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-insight&lt;/span&gt;
    &lt;span class="na"&gt;hostname&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-insight&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis/redisinsight:2.70.1&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;redis-database&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;RI_REDIS_HOST&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redis-database"&lt;/span&gt;
      &lt;span class="na"&gt;RI_REDIS_PORT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379"&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5540:5540"&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-c"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wget&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-q&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-O-&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://redis-insight:5540/api/health&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;grep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-q&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;up&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;'"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When both services are up, you can access your Redis database using a browser. Navigate to &lt;code&gt;http://localhost:5540&lt;/code&gt;, and you should see the following page:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0259sjp1kur6uxk49oth.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0259sjp1kur6uxk49oth.png" alt="Redis Insight" width="800" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To interact with RQE, we will use &lt;a href="https://github.com/redis/redis-vl-python" rel="noopener noreferrer"&gt;RedisVL (Redis Vector Library)&lt;/a&gt;, which provides a clean Python interface for vector operations in Redis. We'll start with the core components and gradually build up to a fully functional system.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Setting Up RedisVL and the Vector Index&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;First, we need to create a vector index that can store our log embeddings and enable fast similarity search. RedisVL makes this easy while handling the complexity of index management behind the scenes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redisvl.schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;IndexSchema&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redisvl.index&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SearchIndex&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OTELAnomalyDetector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        Initialize our anomaly detection system with RedisVL.

        We&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re using sentence-transformers for embeddings because:
        1. They&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re specifically trained for semantic similarity
        2. They run efficiently on CPU (no GPU required)
        3. They produce fixed-size vectors perfect for similarity search
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

        &lt;span class="c1"&gt;# Initialize the embedding model
&lt;/span&gt;        &lt;span class="c1"&gt;# all-MiniLM-L6-v2 gives us 768-dimensional embeddings
&lt;/span&gt;        &lt;span class="c1"&gt;# It's fast (2000+ sentences/sec on CPU) and accurate
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Define our Redis schema
&lt;/span&gt;        &lt;span class="c1"&gt;# This tells RedisVL how to store and index our data
&lt;/span&gt;        &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;otel-logs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prefix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;log:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# All keys will start with "log:"
&lt;/span&gt;                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;storage_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Redis hash for flexible field storage
&lt;/span&gt;            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fields&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="c1"&gt;# Vector field for embeddings - this is where the magic happens
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attrs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dims&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;768&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Dimensions from our model
&lt;/span&gt;                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distance_metric&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cosine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Cosine similarity for semantic matching
&lt;/span&gt;                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;algorithm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hnsw&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Hierarchical Navigable Small World for fast search
&lt;/span&gt;                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;datatype&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float32&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="c1"&gt;# OTEL semantic convention fields for context
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;service_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# Which service?
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# ERROR, WARN, INFO
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http_status_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;numeric&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# 200, 404, 500
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;numeric&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# When did this happen?
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# Original log message
&lt;/span&gt;                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# For tracing correlation
&lt;/span&gt;            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;# Create the RedisVL index
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SearchIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IndexSchema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;overwrite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Don't overwrite if exists
&lt;/span&gt;
        &lt;span class="c1"&gt;# Anomaly detection threshold
&lt;/span&gt;        &lt;span class="c1"&gt;# Cosine distance &amp;gt; 0.7 indicates an anomaly
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;anomaly_threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schema design is crucial. We're not just storing vectors; we're creating a searchable space where we can filter by service, time, or severity before applying vector similarity. This contextual search is what makes our anomaly detection intelligent rather than just mathematical.&lt;/p&gt;
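&lt;p&gt;As a sanity check on the &lt;code&gt;anomaly_threshold&lt;/code&gt; above: cosine distance is 1 minus cosine similarity, so a distance above 0.7 means the closest known log points in a substantially different semantic direction. A minimal, dependency-free sketch of that arithmetic (the toy vectors below stand in for real embeddings):&lt;/p&gt;

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; 0.0 = identical direction, up to 2.0 = opposite
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

ANOMALY_THRESHOLD = 0.7  # same threshold the detector uses

routine = [0.9, 0.1, 0.0]  # toy stand-in for a "normal" log embedding
similar = [0.8, 0.2, 0.1]  # a semantically close log
odd     = [0.0, 0.1, 0.9]  # a semantically unrelated log

print(cosine_distance(routine, similar) > ANOMALY_THRESHOLD)  # False: looks normal
print(cosine_distance(routine, odd) > ANOMALY_THRESHOLD)      # True: flagged as anomalous
```

&lt;p&gt;RedisVL computes this distance inside Redis during the search; the snippet only illustrates what the threshold compares against.&lt;/p&gt;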

&lt;h3&gt;
  
  
  &lt;strong&gt;Processing OTEL Logs into Embeddings&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Now let's transform OpenTelemetry logs into embeddings. The key is creating a text representation that preserves the semantic meaning and context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_otel_log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Process an OpenTelemetry log record through our anomaly detection pipeline.

    This method is the heart of our system. It takes structured OTEL data,
    converts it to a semantic representation, and determines if it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s anomalous.

    Returns:
        - is_anomaly: Boolean indicating if this log is anomalous
        - anomaly_score: Float between 0 and 1 (higher = more anomalous)
        - explanation: Human-readable explanation of the decision
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 1: Create a semantic representation of the log
&lt;/span&gt;    &lt;span class="c1"&gt;# We're not just concatenating fields - we're creating a narrative
&lt;/span&gt;    &lt;span class="c1"&gt;# that captures the meaning and context
&lt;/span&gt;
    &lt;span class="n"&gt;log_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_create_semantic_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 2: Generate the embedding
&lt;/span&gt;    &lt;span class="c1"&gt;# This converts our text into a 768-dimensional vector
&lt;/span&gt;    &lt;span class="c1"&gt;# The model understands semantic relationships, so similar events
&lt;/span&gt;    &lt;span class="c1"&gt;# produce similar vectors even with different wording
&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;convert_to_numpy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 3: Search for similar historical logs
&lt;/span&gt;    &lt;span class="c1"&gt;# This is where we determine if this log is normal or anomalous
&lt;/span&gt;
    &lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similar_logs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_check_anomaly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 4: Store this log for future comparisons
&lt;/span&gt;    &lt;span class="c1"&gt;# Every log helps improve our understanding of "normal"
&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_store_log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 5: Generate explanation
&lt;/span&gt;    &lt;span class="c1"&gt;# This helps operators understand why something was flagged
&lt;/span&gt;
    &lt;span class="n"&gt;explanation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_generate_explanation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;explanation&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_create_semantic_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Convert OTEL structured data into semantic text for embedding.

    The order and format matter! We&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re creating a consistent narrative
    that helps the embedding model understand context. Think of this as
    writing a one-sentence story about what happened.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Start with the service context
&lt;/span&gt;    &lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;In service &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add severity context
&lt;/span&gt;    &lt;span class="n"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;severity_text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;INFO&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ERROR&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;an error occurred&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;severity&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;WARN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a warning was raised&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;an event happened&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add HTTP context if present
&lt;/span&gt;    &lt;span class="n"&gt;attributes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http.method&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http.method&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http.route&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;during &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; with status &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add network context
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;net.peer.ip&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from IP &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;net.peer.ip&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add the actual message
&lt;/span&gt;    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;no message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add trace context for distributed tracing correlation
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[trace:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Crafting this semantic text is more art than science. We're building a narrative that preserves the important context while staying consistent enough for the embedding model to find patterns. The model has been trained on billions of sentences, so it understands that "error occurred during POST /api/login with status 401" indicates a failed authentication attempt.&lt;/p&gt;
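&lt;p&gt;As a standalone sketch of the idea (the field names mirror the OTel-style records used above; the exact assembly inside the article's class may differ), the narrative for a failed login could be built like this:&lt;/p&gt;

```python
def create_semantic_text(log_record: dict) -> str:
    # Flatten a structured log record into a natural-language
    # sentence the embedding model can find patterns in.
    parts = [f"{log_record.get('severity_text', 'INFO')} log"]

    attrs = log_record.get('attributes', {})
    if 'http.method' in attrs and 'http.target' in attrs:
        parts.append(f"during {attrs['http.method']} {attrs['http.target']}")
    if 'http.status_code' in attrs:
        parts.append(f"with status {attrs['http.status_code']}")

    parts.append(f": {log_record.get('body', 'no message')}")
    return " ".join(parts)

record = {
    'severity_text': 'ERROR',
    'attributes': {'http.method': 'POST', 'http.target': '/api/login',
                   'http.status_code': 401},
    'body': 'invalid credentials',
}
print(create_semantic_text(record))
# ERROR log during POST /api/login with status 401 : invalid credentials
```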

&lt;h3&gt;
  
  
  &lt;strong&gt;Detecting Anomalies with Semantic Search&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here's where Redis really shines. We'll implement a fast similarity search that decides whether a log is anomalous, along with a companion method that stores each processed log back in Redis so future comparisons have more history to draw on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redisvl.query&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VectorQuery&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_check_anomaly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Determine if a log is anomalous by comparing it to historical patterns.

    The core insight: normal logs cluster together in vector space,
    while anomalies are outliers. We use k-NN (k-Nearest Neighbors)
    to find similar logs and calculate how different this one is.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Build a vector similarity query with context filters
&lt;/span&gt;    &lt;span class="c1"&gt;# We're not searching all logs - we're searching logs from the same service
&lt;/span&gt;    &lt;span class="c1"&gt;# in a recent time window for more accurate anomaly detection
&lt;/span&gt;
    &lt;span class="n"&gt;service_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;one_hour_ago&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;

    &lt;span class="c1"&gt;# Create RedisVL vector query
&lt;/span&gt;    &lt;span class="c1"&gt;# This combines semantic similarity with metadata filtering
&lt;/span&gt;    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VectorQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;vector_field_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;num_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Find 20 most similar logs
&lt;/span&gt;        &lt;span class="n"&gt;return_fields&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;service_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;filter_expression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@service_name:{{&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;service_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;}} @timestamp:[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;one_hour_ago&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;current_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Execute the search
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Not enough historical data - can't determine if anomalous
&lt;/span&gt;        &lt;span class="c1"&gt;# Default to not anomalous to avoid false positives
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate anomaly score based on distances to nearest neighbors
&lt;/span&gt;    &lt;span class="n"&gt;distances&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;similar_logs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;  &lt;span class="c1"&gt;# Use top 10 for scoring
&lt;/span&gt;        &lt;span class="c1"&gt;# RedisVL returns cosine distance (0 = identical, 2 = opposite)
&lt;/span&gt;        &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vector_distance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;distances&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;similarity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Convert to similarity score
&lt;/span&gt;        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate anomaly score using statistics
&lt;/span&gt;    &lt;span class="c1"&gt;# We use both mean and minimum distance for robustness
&lt;/span&gt;    &lt;span class="n"&gt;mean_distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distances&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;min_distance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distances&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Weighted combination - if even the closest log is far, it's very anomalous
&lt;/span&gt;    &lt;span class="n"&gt;anomaly_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;mean_distance&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;min_distance&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Determine if anomalous based on threshold
&lt;/span&gt;    &lt;span class="n"&gt;is_anomaly&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anomaly_score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;anomaly_threshold&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;anomaly_score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Return top 3 for explanation
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_store_log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndarray&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;anomaly_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Store the processed log with its embedding in RedisVL.

    This builds our historical baseline. Every normal log makes our
    anomaly detection more accurate by better defining &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;normal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate unique ID using timestamp and trace ID
&lt;/span&gt;    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;trace_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;notrace&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;log_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;log:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;trace_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# Prepare document for RedisVL
&lt;/span&gt;    &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# RedisVL handles serialization
&lt;/span&gt;        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;service_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;severity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;severity_text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;INFO&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http_status_code&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;trace_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;anomaly_score&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;anomaly_score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;raw_log&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Store original for debugging
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Store in Redis via RedisVL
&lt;/span&gt;    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;log_id&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The beauty of this approach is its simplicity. We're not training complex models or maintaining elaborate rule sets. We're just asking: "Have we seen something like this before?" If the answer is no (high distance to all historical logs), it's an anomaly worth investigating.&lt;/p&gt;
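&lt;p&gt;To see the scoring rule in isolation (the distances below are made-up values and 0.75 is a placeholder threshold; the 0.7/0.3 weights match the code above):&lt;/p&gt;

```python
import numpy as np

def anomaly_score(distances: list) -> float:
    # Weighted blend of mean and minimum cosine distance to the
    # nearest historical logs: even one close match pulls the
    # score down, but only so far.
    return 0.7 * float(np.mean(distances)) + 0.3 * float(np.min(distances))

THRESHOLD = 0.75  # placeholder; tune against your own traffic

# A log surrounded by near-duplicates scores low...
print(anomaly_score([0.05, 0.08, 0.10]) > THRESHOLD)  # False
# ...while one far from everything seen before scores high.
print(anomaly_score([0.9, 1.1, 1.3]) > THRESHOLD)     # True
```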

&lt;h3&gt;
  
  
  &lt;strong&gt;Making Results Actionable&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;An anomaly detection system is only useful if it produces actionable insights. Let's add explanation generation so operators can understand why something was flagged:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_generate_explanation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                         &lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Generate human-readable explanations for anomaly decisions.

    This is crucial for building trust in the system. Operators need
    to understand not just that something is anomalous, but why.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Normal behavior. Similar to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; recent logs &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                   &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from the same service. Closest match: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                   &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;with &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;similarity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; similarity.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Normal behavior. Insufficient historical data for detailed comparison.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# For anomalies, provide detailed explanation
&lt;/span&gt;    &lt;span class="n"&gt;severity_assessment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;High&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Medium&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Low&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;explanation_parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;severity_assessment&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; severity anomaly detected (score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;).&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This log is significantly different from recent patterns in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;the service&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Show what normal looks like for comparison
&lt;/span&gt;        &lt;span class="n"&gt;explanation_parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Most similar normal log: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;but with only &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;similar_logs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;similarity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; similarity.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;explanation_parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No similar logs found in recent history.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add specific indicators if present
&lt;/span&gt;    &lt;span class="n"&gt;attributes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;explanation_parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Server error status code detected.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;log_record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;explanation_parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error keyword present in message.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;explanation_parts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A variation of this implementation could use an LLM to turn these fragments into a more structured, human-like narrative. We decided to keep things simple here, but it is worth knowing the art of the possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Testing Everything with an Example&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let's see our anomaly detector in action with real OpenTelemetry logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Initialize the detector
&lt;/span&gt;&lt;span class="n"&gt;detector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OTELAnomalyDetector&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Example OTEL logs - some normal, some anomalous
&lt;/span&gt;&lt;span class="n"&gt;test_logs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="c1"&gt;# Normal authentication log
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1699564800.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INFO&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User login successful for user123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth-service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.route&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/api/login&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;net.peer.ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;192.168.1.100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;abc123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="c1"&gt;# Another normal log - similar pattern
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1699564860.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INFO&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authentication completed for user456&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth-service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.route&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/api/login&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;net.peer.ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;192.168.1.101&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;def456&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="c1"&gt;# Suspicious log - SQL injection attempt
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1699564920.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ERROR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid input detected: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;; DROP TABLE users; --&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth-service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.route&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/api/login&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;net.peer.ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;185.220.101.45&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Known malicious IP range
&lt;/span&gt;        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ghi789&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="c1"&gt;# Another anomaly - unusual error pattern
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1699564980.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ERROR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Database connection pool exhausted - unable to serve requests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth-service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attributes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.route&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/api/login&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http.status_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;net.peer.ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;192.168.1.102&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jkl012&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Process logs and detect anomalies
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ANOMALY DETECTION RESULTS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_logs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;is_anomaly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;explanation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;detector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process_otel_log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Log: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Timestamp: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromtimestamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Service: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;service.name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Anomaly: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;🔴 YES&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;is_anomaly&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;🟢 NO&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explanation: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;explanation&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this, you'll see the system correctly identify the SQL injection attempt and database error as anomalies, while recognizing the normal authentication logs as expected behavior. The explanations help you understand why each decision was made. Our anomaly detection system works because embeddings capture semantic meaning, not just text patterns.&lt;/p&gt;

&lt;p&gt;When we process "Invalid input detected: '; DROP TABLE users; --", the embedding model recognizes this as semantically different from normal authentication logs. It understands that DROP TABLE is a SQL command, that the semicolon and comment markers indicate injection attempts, and that this pattern is inconsistent with successful logins.&lt;/p&gt;

&lt;p&gt;The anomaly scores tell us how unusual each log is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;0.0 - 0.3&lt;/strong&gt;: Very normal, closely matches historical patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0.3 - 0.7&lt;/strong&gt;: Somewhat unusual but likely benign&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0.7 - 0.9&lt;/strong&gt;: Anomalous, worth investigating&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0.9+&lt;/strong&gt;: Highly anomalous, immediate attention needed&lt;/li&gt;
&lt;/ul&gt;
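&lt;p&gt;These bands translate directly into a small triage helper. The function below is our own illustration (the name and labels are not part of the detector's API); the thresholds mirror the list above:&lt;/p&gt;

```python
def classify_anomaly_score(score: float) -> str:
    """Map a raw anomaly score to a triage label using the bands above."""
    if score < 0.3:
        return "normal"      # closely matches historical patterns
    if score < 0.7:
        return "unusual"     # somewhat unusual but likely benign
    if score < 0.9:
        return "anomalous"   # worth investigating
    return "critical"        # immediate attention needed


for s in (0.12, 0.55, 0.81, 0.97):
    print(f"{s:.2f} -> {classify_anomaly_score(s)}")
```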

&lt;p&gt;The contextual filtering (by service and time) is crucial. A "database connection timeout" might be normal for a batch processing service, but anomalous for your authentication service. By comparing logs within the same context, we avoid false positives from cross-service differences.&lt;/p&gt;
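&lt;p&gt;Stripped of the vector search, the contextual filter amounts to restricting comparisons to the same service within a recent time window. This is a minimal plain-Python sketch of that idea, assuming each stored log carries &lt;code&gt;service&lt;/code&gt; and &lt;code&gt;timestamp&lt;/code&gt; fields (the field names and window size are illustrative, not taken from the original implementation):&lt;/p&gt;

```python
import time


def filter_candidates(logs, service, now=None, window_seconds=3600):
    """Keep only logs from the same service within a recent time window,
    so anomaly scores compare like with like."""
    now = now if now is not None else time.time()
    return [
        log for log in logs
        if log["service"] == service
        and now - log["timestamp"] <= window_seconds
    ]


logs = [
    {"service": "auth-service", "timestamp": 1000.0, "body": "login ok"},
    {"service": "batch-service", "timestamp": 1000.0, "body": "db timeout"},
    {"service": "auth-service", "timestamp": 100.0, "body": "old login"},
]

# Only the recent auth-service log survives: the batch-service log is a
# different context, and the old auth-service log falls outside the window.
recent = filter_candidates(logs, "auth-service", now=1500.0, window_seconds=600)
print([log["body"] for log in recent])
```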

&lt;h2&gt;
  
  
  &lt;strong&gt;Monitoring the Monitor&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As with any system, you must ensure your anomaly detector itself is up and running so you can rely on it when you need it most. As a best practice, track key metrics to confirm it is working correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_system_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Get metrics about the anomaly detection system itself.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total_logs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;num_docs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;index_memory_mb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;memory_usage_mb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;avg_embedding_time_ms&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_measure_embedding_speed&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;avg_search_time_ms&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_measure_search_speed&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;anomaly_rate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_calculate_recent_anomaly_rate&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;index_fragmentation&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fragmentation_ratio&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
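&lt;p&gt;The timing helpers referenced above (&lt;code&gt;_measure_embedding_speed&lt;/code&gt;, &lt;code&gt;_measure_search_speed&lt;/code&gt;) are not shown in the article. A plausible sketch simply averages the wall-clock time of a few sample calls; the &lt;code&gt;fn&lt;/code&gt; callable below is a stand-in for the real embedding or search call:&lt;/p&gt;

```python
import time


def measure_call_speed_ms(fn, sample_input, runs=5):
    """Average wall-clock time of fn(sample_input) in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        fn(sample_input)
    elapsed = time.perf_counter() - start
    return (elapsed / runs) * 1000.0


# Stand-in for an embedding call; any callable with one argument works.
avg_ms = measure_call_speed_ms(lambda text: [0.0] * 384, "sample log line")
print(f"avg embedding time: {avg_ms:.3f} ms")
```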



&lt;h2&gt;
  
  
  &lt;strong&gt;Why This Approach Works&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The combination of OpenTelemetry structure, semantic embeddings, and Redis's fast vector search capabilities creates a powerful anomaly detection system that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Understands Context&lt;/strong&gt;: Unlike regex patterns, embeddings understand that "authentication failed" and "login unsuccessful" mean the same thing. This is powerful. Anomalies don't have to be fancy IP-masking attempts like the ones we see in movies; often it's just a familiar log message phrased differently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Learns Continuously&lt;/strong&gt;: Every log processed improves the baseline, making detection more accurate over time. In this case, more data literally means more value.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scales Efficiently&lt;/strong&gt;: RedisVL's HNSW index maintains fast search times even with millions of logs stored. Redis's in-memory design enables queries to execute with extremely low latency, while providing the tools to scale out as needed—both vertically and horizontally.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Requires No Training&lt;/strong&gt;: Pre-trained embedding models work out of the box, no ML expertise required. Often, this is what hurts engineering teams the most: needing ML expertise from day one just to get things working.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Provides Explanations&lt;/strong&gt;: Operators understand why something was flagged, building trust in the system. They may not be experts, but they can grasp the essence of what went wrong and take the required action.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
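&lt;p&gt;The "understands context" point rests on vector similarity: with real embeddings, "authentication failed" and "login unsuccessful" land close together in vector space while a SQL injection payload lands far away. A toy cosine-similarity function shows the arithmetic; the three-dimensional vectors below are made up for illustration and stand in for real embedding outputs:&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors: the first two are near-parallel (similar meaning),
# the third points elsewhere (a different kind of message).
auth_failed = [0.9, 0.1, 0.0]
login_unsuccessful = [0.85, 0.15, 0.05]
drop_table = [0.0, 0.2, 0.95]

print(round(cosine_similarity(auth_failed, login_unsuccessful), 3))
print(round(cosine_similarity(auth_failed, drop_table), 3))
```

&lt;p&gt;The first pair scores close to 1.0, the second close to 0.0, which is exactly the signal the detector thresholds on.&lt;/p&gt;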

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In this blog post, we've built an anomaly detection system that understands the semantic meaning of logs, not just their text patterns. By combining OpenTelemetry's structured observability with embedding-based semantic search in Redis, we can detect novel anomalies that rule-based systems would likely miss.&lt;/p&gt;

&lt;p&gt;The beauty of this approach is its simplicity. We're not training complex models or maintaining hundreds of rules. We're just asking, "Is this log semantically similar to what we've seen before?" When the answer is no, we've found an anomaly worth investigating. You can verify this by deploying this implementation for one critical service and allowing it to learn for a week. You'll be surprised how quickly it starts catching issues your existing monitoring misses. As you gain confidence, expand to more services, each maintaining its own baseline of normal behavior.&lt;/p&gt;

&lt;p&gt;The future of observability isn't about writing more alerts or hiring more analysts. It's about systems that understand meaning, learn patterns, and surface what truly matters. By using OpenTelemetry for structure, embeddings for understanding, and RedisVL for scale, that future is accessible to any engineering team today.&lt;/p&gt;




</description>
      <category>opentelemetry</category>
      <category>vectorsearch</category>
      <category>anomalydetection</category>
      <category>redis</category>
    </item>
    <item>
      <title>Designing Data Systems with Vector Embeddings using Redis Vector Sets</title>
      <dc:creator>Ricardo Ferreira</dc:creator>
      <pubDate>Sun, 03 Aug 2025 15:10:07 +0000</pubDate>
      <link>https://dev.to/redis/designing-vector-embeddings-a-finding-nemo-journey-through-redis-vector-sets-3i9j</link>
      <guid>https://dev.to/redis/designing-vector-embeddings-a-finding-nemo-journey-through-redis-vector-sets-3i9j</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Every software engineer has faced this challenge: how do you model complex, multifaceted relationships in your data? Whether it's products in an e-commerce catalog, documents in a search engine, or content in a recommendation system, we often struggle to represent data entities with multiple attributes and complex relationships.&lt;/p&gt;

&lt;p&gt;Today, we'll explore this fundamental problem through an unexpected lens: &lt;a href="https://en.wikipedia.org/wiki/Finding_Nemo" rel="noopener noreferrer"&gt;Pixar's Finding Nemo&lt;/a&gt;. By modeling Marlin's journey to find his son, Nemo, you'll learn how Redis Vector Sets and vector embeddings can elegantly solve problems that most traditional approaches struggle with.&lt;/p&gt;

&lt;p&gt;Best of all? You'll learn how to do this hands-on.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: Data Design
&lt;/h2&gt;

&lt;p&gt;Imagine you're tasked with building a system representing Marlin's journey in Finding Nemo. If you haven't watched the movie (well, if that is true, then shame on you), here is a &lt;strong&gt;TL;DR&lt;/strong&gt; of the storyline.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Nemo, a clownfish, gets lost in the sea and is dragged to Sydney&lt;/li&gt;
&lt;li&gt;Marlin, his father, starts a rescue journey alone and afraid&lt;/li&gt;
&lt;li&gt;He meets Dory, a helpful but rather forgetful blue tang&lt;/li&gt;
&lt;li&gt;They encounter Bruce and his shark friends, reformed predators&lt;/li&gt;
&lt;li&gt;A school of moonfish gives Marlin and Dory directions&lt;/li&gt;
&lt;li&gt;Crush and the sea turtles help them ride the sea to the EAC&lt;/li&gt;
&lt;li&gt;A whale swallows them and takes them to Sydney by accident&lt;/li&gt;
&lt;li&gt;Nigel the pelican recognizes Marlin and shares where Nemo is&lt;/li&gt;
&lt;li&gt;Finally, Marlin reunites with Nemo&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These are the requirements your system needs to keep in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Preserve the journey order&lt;/strong&gt; — Who does Marlin meet first, second, third?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Find similar characters&lt;/strong&gt; — Which characters are most like Dory?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter by attributes&lt;/strong&gt; — Show me all the "helpers" or "large creatures"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handle flexible insertion&lt;/strong&gt; — Add characters in any order, not just chronologically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Support proximity queries&lt;/strong&gt; — Who does Marlin meet after the sharks?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which well-known approaches would you use to address these requirements? Let's discuss some options and their trade-offs.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Using Linked Lists&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Character&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Next&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Character&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;marlin&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Character&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Marlin"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;dory&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Character&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Dory"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;marlin&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dory&lt;/span&gt;
    &lt;span class="c"&gt;// Same for next ones...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Linked Lists are great for preserving the journey order. They let you navigate the relationships in different directions: forward-only, bidirectional, or circular. But here is the problem:&lt;/p&gt;

&lt;p&gt;💡 You can't query by attributes, there is no similarity search, and the insertion order is rigid.&lt;/p&gt;

&lt;p&gt;🗄️ &lt;strong&gt;Relational Database&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;journey&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;character_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;position&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;species&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;helpfulness&lt;/span&gt; &lt;span class="nb"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;size&lt;/span&gt; &lt;span class="nb"&gt;FLOAT&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Relational databases are fairly attractive because they provide a proven programming model. If your data is in a table, you probably know how to query it with SQL. But here is the problem:&lt;/p&gt;

&lt;p&gt;💡 Similarity queries require complex JOINs, "find nearest neighbors" is expensive, and multi-attribute distance calculations are cumbersome.&lt;/p&gt;

&lt;p&gt;🕸️ &lt;strong&gt;Graph Database&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;marlin:&lt;/span&gt;&lt;span class="n"&gt;Character&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:MEETS&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;dory:&lt;/span&gt;&lt;span class="n"&gt;Character&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:MEETS&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="py"&gt;sharks:&lt;/span&gt;&lt;span class="n"&gt;Character&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Graph databases excel in scenarios requiring complex relationships. Let's face it, this is how the world looks most of the time. But here is the problem:&lt;/p&gt;

&lt;p&gt;💡 While suitable for relationships, calculating multi-dimensional similarity is still complex, and ordering isn't inherent.&lt;/p&gt;

&lt;p&gt;📄 &lt;strong&gt;Document Store with Full-Text Search&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"character"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Dory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"attributes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"helpful"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"forgetful"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"blue tang"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"position"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Document stores are great because they provide flexible data models that allow you to adapt your query needs faster and painlessly. But here is the problem:&lt;/p&gt;

&lt;p&gt;💡 Text matching isn't the same as mathematical similarity, and there is no way to compute accurate distances.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's Missing?
&lt;/h3&gt;

&lt;p&gt;All these approaches treat the journey order and character attributes as separate concerns. As developers, we often pick the approach we're most used to and then struggle to implement the remaining requirements, almost hoping it will all work out in the end, like magic.&lt;/p&gt;

&lt;p&gt;What if we could encode both in a unified mathematical representation? What if each character could exist as a point in multi-dimensional space, where its position stores both when it appears and what it's like? Well, this is exactly what vector embeddings are for.&lt;/p&gt;

&lt;h2&gt;
  
  
  A New Perspective with Vectors
&lt;/h2&gt;

&lt;p&gt;Instead of thinking of characters as records, nodes, or documents, imagine them as points in a multi-dimensional space. Each dimension represents an attribute that helps you design your data entity correctly:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Journey Position&lt;/strong&gt; (when Marlin meets them)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helpfulness&lt;/strong&gt; (how much they assist)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Size&lt;/strong&gt; (physical dimensions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Swimming Style&lt;/strong&gt; (movement patterns)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Courage&lt;/strong&gt; (bravery level)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With these five dimensions, finding "who comes next in the journey" now becomes a nearest-neighbor search. Finding "similar characters" is just measuring distances in this space. Magic? No. Mathematics. Using dimensions in a vector store gives you the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flexible Insertion&lt;/strong&gt;: Add data in any order; relationships are maintained by vector dimensions with loose references across different records.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Attribute Similarity&lt;/strong&gt;: Distance calculations consider all dimensions simultaneously. There are no more per-field comparisons.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Queries&lt;/strong&gt;: Search by example or by constructing meaningful query vectors. Search by meaning instead of purely precision.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid Search&lt;/strong&gt;: Combine vector similarity with attribute filtering. Vectors for semantic search, filters for additional result precision.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: HNSW indices make even high-dimensional searches fast. Search massive amounts of data with approximately O(log N) performance.&lt;/li&gt;
&lt;/ul&gt;
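&lt;p&gt;To make "similarity is just distance" concrete, here is a minimal sketch in plain Python. The character vectors and the Euclidean metric are illustrative choices, not canonical values from any particular embedding model:&lt;/p&gt;

```python
import math

# Five dimensions: journey position, helpfulness, size, swimming style, courage.
# These example values are illustrative, not canonical.
characters = {
    "marlin": [0.0, 0.5, 0.2, 0.1, 0.1],
    "dory":   [1.0, 1.0, 0.2, 0.2, 0.2],
    "whale":  [5.0, 0.9, 1.0, 0.9, 0.6],
}

def euclidean(a, b):
    """Straight-line distance between two points in 5-D space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_similar(name):
    """Nearest neighbor of `name`, excluding itself."""
    me = characters[name]
    others = {k: v for k, v in characters.items() if k != name}
    return min(others, key=lambda k: euclidean(me, others[k]))

print(most_similar("marlin"))  # dory: far closer to marlin than the whale
```

All five attributes contribute to the distance at once, which is exactly the multi-attribute similarity that the table, graph, and document approaches above struggle to express.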

&lt;h2&gt;
  
  
  Building Marlin's Journey with Redis Vector Sets
&lt;/h2&gt;

&lt;p&gt;Let's build this step by step. We'll use &lt;a href="https://redis.io/docs/latest/develop/data-types/vector-sets/" rel="noopener noreferrer"&gt;Redis Vector Sets&lt;/a&gt;, providing high-performance vector similarity search and additional filtering capabilities. You will need &lt;a href="https://redis.io/docs/latest/operate/oss_and_stack/install/" rel="noopener noreferrer"&gt;Redis Open Source&lt;/a&gt; for this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Creating Our Universe
&lt;/h3&gt;

&lt;p&gt;First, let's check that we're starting fresh and understand what we're building. Let's see what data type we're creating:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TYPE finding-nemo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"none"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great! We're starting with a clean slate. Now let's add our first character—Marlin. He is the anxious father starting his journey.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VADD finding-nemo VALUES 5 0.0 0.5 0.2 0.1 0.1 marlin SETATTR '{"species":"clownfish","type":"father","quote":"I have to find my son!"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(integer) 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What just happened?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We've created a new vector set, under the key &lt;code&gt;finding-nemo&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;We've added Marlin as a 5-dimensional point at position (0.0, 0.5, 0.2, 0.1, 0.1)&lt;/li&gt;
&lt;li&gt;We've attached metadata to Marlin about species, role, and his iconic quote&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The vector embedding &lt;code&gt;[0.0, 0.5, 0.2, 0.1, 0.1]&lt;/code&gt; tells us Marlin starts at the first position (0.0), has moderate helpfulness (0.5), is small (0.2), swims cautiously (0.1), and begins with low courage (0.1). In this example, we have used only five dimensions because these are all the attributes we need to implement the scenario. However, you are not limited to this. You can use as many dimensions as you need. Have you ever heard of OpenAI's embedding models that produce 1536 dimensions? That is one example of how far you can go!&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Building the Journey (Out of Order!)
&lt;/h3&gt;

&lt;p&gt;Here's where it gets interesting. In traditional systems, we'd need to insert characters in order. But with vectors, watch this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VADD finding-nemo VALUES 5 6.0 0.9 0.5 0.7 0.7 nigel SETATTR '{"species":"pelican","type":"informant","quote":"Hop inside my mouth!"}'
VADD finding-nemo VALUES 5 1.0 1.0 0.2 0.2 0.2 dory SETATTR '{"species":"blue tang","type":"helper","quote":"Just keep swimming!"}'
VADD finding-nemo VALUES 5 5.0 0.9 1.0 0.9 0.6 whale SETATTR '{"species":"whale","type":"transporter","quote":"*whale sounds*"}'
VADD finding-nemo VALUES 5 3.0 0.7 0.7 0.4 0.3 moonfish SETATTR '{"species":"moonfish","type":"guides","quote":"Follow the EAC!"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice we're adding Nigel (position 6.0) before Dory (position 1.0). In a Linked List, this would break our ordering. But vectors don't care about insertion order. They care about position in space.&lt;/p&gt;

&lt;p&gt;Let's complete our cast:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VADD finding-nemo VALUES 5 7.0 0.0 0.2 0.1 0.8 nemo SETATTR '{"species":"clownfish","type":"son","quote":"Dad!"}'
VADD finding-nemo VALUES 5 4.0 0.9 0.6 0.8 0.5 turtles SETATTR '{"species":"sea turtles","type":"transporters","quote":"Righteous! Righteous!"}'
VADD finding-nemo VALUES 5 2.0 0.7 0.8 0.3 0.3 sharks SETATTR '{"species":"sharks","type":"reformed predators","quote":"Fish are friends, not food!"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Verifying Our Vector Universe
&lt;/h3&gt;

&lt;p&gt;Let's examine what we've built. What type of data structure did we create?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TYPE finding-nemo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vectorset
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vector Sets provide a way for you to inspect your data very easily. Use the command &lt;code&gt;VCARD&lt;/code&gt; to count how many characters are in our journey.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VCARD finding-nemo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(integer) 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What if you want to know how many dimensions you are using? This is quite common, as the team that loads vectors into the database is not always the same one that queries them. Use the command &lt;code&gt;VDIM&lt;/code&gt; to find how many dimensions each character has.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VDIM finding-nemo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(integer) 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you need to retrieve information about your vector set, use the command &lt;code&gt;VINFO&lt;/code&gt; for this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VINFO finding-nemo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1) "quant-type"
2) "int8"
3) "hnsw-m"
4) "16"
5) "vector-dim"
6) "5"
7) "projection-input-dim"
8) "0"
9) "size"
10) "8"
11) "max-level"
12) "1"
13) "attributes-count"
14) "8"
15) "vset-uid"
16) "0"
17) "hnsw-max-node-uid"
18) "8"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Tracing the Journey
&lt;/h3&gt;

&lt;p&gt;Now for the revealing moment. Despite inserting characters in random order, can we trace Marlin's journey correctly? The answer is yes: query from a point "before" the journey starts (position -1.0), and the nearest neighbors come back in journey order.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VSIM finding-nemo VALUES 5 -1.0 0.5 0.2 0.1 0.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1) "marlin"
2) "dory"
3) "sharks"
4) "moonfish"
5) "turtles"
6) "whale"
7) "nigel"
8) "nemo"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🎉 Perfect! But how did this work?&lt;/p&gt;

&lt;p&gt;The query vector &lt;code&gt;[-1.0, 0.5, 0.2, 0.1, 0.1]&lt;/code&gt; is cleverly designed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Position -1.0 places us "before" the journey starts&lt;/li&gt;
&lt;li&gt;The other values (0.5, 0.2, 0.1, 0.1) match Marlin's characteristics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis finds the nearest neighbors in order, effectively tracing the path from start to finish. The significant gaps between journey positions (0, 1, 2... 7) ensure the first dimension dominates the distance calculation.&lt;/p&gt;
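&lt;p&gt;You can verify this with a quick back-of-the-envelope calculation. The sketch below uses plain Euclidean distance as a stand-in for &lt;code&gt;VSIM&lt;/code&gt;'s similarity metric, with a subset of the vectors from the &lt;code&gt;VADD&lt;/code&gt; commands above:&lt;/p&gt;

```python
import math

# Journey query: position -1.0 (before the start), rest matches Marlin.
query = [-1.0, 0.5, 0.2, 0.1, 0.1]

# A subset of the vectors from the VADD commands above.
journey = {
    "marlin": [0.0, 0.5, 0.2, 0.1, 0.1],
    "dory":   [1.0, 1.0, 0.2, 0.2, 0.2],
    "sharks": [2.0, 0.7, 0.8, 0.3, 0.3],
    "nemo":   [7.0, 0.0, 0.2, 0.1, 0.8],
}

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Sorting by distance to the query reproduces the journey order, because
# the gaps in the first dimension (0, 1, 2, ..., 7) dwarf the sub-1.0
# differences in the other four dimensions.
order = sorted(journey, key=lambda name: euclidean(query, journey[name]))
print(order)  # ['marlin', 'dory', 'sharks', 'nemo']
```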

&lt;h3&gt;
  
  
  Step 5: Finding Similar Characters
&lt;/h3&gt;

&lt;p&gt;Vector sets shine at similarity search. Let's explore relationships.  For instance, who are the three characters most similar to Marlin?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VSIM finding-nemo ELE marlin COUNT 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1) "marlin"
2) "dory"    # Makes sense - small fish, next in journey
3) "sharks"  # Next encounter after Dory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Who's closest to Nemo?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VSIM finding-nemo ELE nemo COUNT 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1) "nemo"
2) "nigel"   # Met right before reunion
3) "whale"   # Carried Marlin to Sydney
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The similarity considers all dimensions, not just journey position, but also size, helpfulness, and other attributes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 6: Finding Helpers with Filtered Searches
&lt;/h3&gt;

&lt;p&gt;Here's where vector sets truly excel over traditional approaches. Let's find all helpers and transporters near Dory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VSIM finding-nemo ELE dory FILTER '.type == "helper" || .type == "transporters"' COUNT 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1) "dory"
2) "turtles"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This combines vector similarity with attribute filtering, which would require complex queries in traditional databases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 7: Semantic Queries
&lt;/h3&gt;

&lt;p&gt;Let's find the largest creatures by searching near a point that represents a large, helpful creature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VSIM finding-nemo VALUES 5 3.5 0.8 1.0 0.5 0.5 COUNT 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;➡️ Expected output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2) "moonfish" # Reef largest (size=0.8)
1) "whale"    # Largest (size=1.0)
3) "turtles"  # Also sizeable (size=0.6)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We didn't need to write &lt;code&gt;WHERE size &amp;gt; 0.7&lt;/code&gt; because the vector space naturally clusters large creatures together. Semantic search is a powerful type of querying that exploits proximity in the data rather than exact matches.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;We've solved challenges that stump traditional approaches by representing Marlin's journey as vectors. Through the elegant mathematics of vector spaces, we can query by order, find similar entities, filter by attributes, and add data with flexible insertion. While the Finding Nemo example in this blog post may have been whimsical, all the underlying principles are foundational to modern AI and search systems.&lt;/p&gt;

&lt;p&gt;Vector Sets, part of &lt;a href="https://redis.io/open-source/" rel="noopener noreferrer"&gt;Redis Open Source&lt;/a&gt;, provide the perfect environment for exploring data modeling with vector embeddings, while providing a robust implementation of the HNSW algorithm.&lt;/p&gt;

&lt;p&gt;So, the next time you face a complex data modeling challenge, ask yourself, could this be a vector? Who knows. Sometimes the best solutions come from seeing your data from a different dimension. Pun intended.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>vectordatabase</category>
      <category>redis</category>
      <category>vectorsets</category>
    </item>
    <item>
      <title>Semantic Caching with Spring AI &amp; Redis</title>
      <dc:creator>Raphael De Lio</dc:creator>
      <pubDate>Thu, 31 Jul 2025 09:37:38 +0000</pubDate>
      <link>https://dev.to/redis/semantic-caching-with-spring-ai-redis-2aa4</link>
      <guid>https://dev.to/redis/semantic-caching-with-spring-ai-redis-2aa4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; You’re building a semantic caching system using Spring AI and Redis to improve LLM application performance.&lt;/p&gt;

&lt;p&gt;Unlike traditional caching that requires exact query matches, semantic caching understands the meaning behind queries and can return cached responses for semantically similar questions.&lt;/p&gt;

&lt;p&gt;It works by storing query-response pairs as vector embeddings in Redis, allowing your application to retrieve cached answers for similar questions without calling the expensive LLM, reducing both latency and costs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstkb1wkl2hb1arkilbff.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstkb1wkl2hb1arkilbff.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  The Problem with Traditional LLM Applications
&lt;/h1&gt;

&lt;p&gt;LLMs are powerful but expensive. Every API call costs money and takes time. When users ask similar questions like “What beer goes with grilled meat?” and “Which beer pairs well with barbecue?”, traditional systems would make separate LLM calls even though these queries are essentially asking the same thing.&lt;/p&gt;

&lt;p&gt;Traditional exact-match caching only works if users ask the identical question word-for-word. But in real applications, users phrase questions differently while seeking the same information.&lt;/p&gt;

&lt;h1&gt;
  
  
  How Semantic Caching Works
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://www.youtube.com/watch?v=AtVTT_s8AGc&amp;amp;t=1s" rel="noopener noreferrer"&gt;What is a semantic cache?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Semantic caching solves this by understanding the &lt;strong&gt;&lt;em&gt;meaning&lt;/em&gt;&lt;/strong&gt; behind queries rather than matching exact text. When a user asks a question:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; The system converts the query into a vector embedding&lt;/li&gt;
&lt;li&gt; It searches for semantically similar cached queries using vector similarity&lt;/li&gt;
&lt;li&gt; If a similar query exists above a certain threshold, it returns the cached response&lt;/li&gt;
&lt;li&gt; If not, it calls the LLM, gets a response, and caches both the query and response for future use&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Behind the scenes, this works thanks to vector similarity search. It turns text into vectors (embeddings) — lists of numbers — stores them in a vector database, and then finds the ones closest to your query when checking for cached responses.&lt;/p&gt;
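&lt;p&gt;The four steps above can be sketched in a few lines of plain Python. This is an illustrative, in-memory toy: the &lt;code&gt;SemanticCache&lt;/code&gt; class, the 2-D vectors, and the &lt;code&gt;embed&lt;/code&gt; function are all hypothetical stand-ins for Redis and a real embedding model, not the Spring AI API used in this article:&lt;/p&gt;

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class SemanticCache:
    """Toy in-memory semantic cache illustrating the lookup flow."""
    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # function: text -> vector
        self.threshold = threshold  # minimum similarity for a cache hit
        self.entries = []           # list of (vector, response) pairs

    def get(self, query):
        qv = self.embed(query)
        for vec, response in self.entries:
            if cosine_sim(qv, vec) >= self.threshold:
                return response     # semantic hit: skip the LLM call
        return None                 # miss: caller invokes the LLM, then put()s

    def put(self, query, response):
        self.entries.append((self.embed(query), response))

# Hypothetical 2-D "embeddings" standing in for a real embedding model.
toy_vectors = {
    "What beer goes with grilled meat?":    [1.00, 0.10],
    "Which beer pairs well with barbecue?": [0.95, 0.15],
    "What wine goes with fish?":            [0.00, 1.00],
}
cache = SemanticCache(embed=toy_vectors.get)

cache.put("What beer goes with grilled meat?", "Try an amber ale.")
print(cache.get("Which beer pairs well with barbecue?"))  # hit: cached answer
print(cache.get("What wine goes with fish?"))             # None: cache miss
```

In the real application, Redis stores the embeddings and performs the similarity search at scale; the control flow, however, is exactly this get-or-compute-then-put pattern.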

&lt;p&gt;Today, we’re gonna build a semantic caching system for a beer recommendation assistant. It will remember previous responses to similar questions, dramatically improving response times and reducing API costs.&lt;/p&gt;

&lt;p&gt;To do that, we’ll build a Spring Boot app from scratch and use Redis as our semantic cache store. It’ll handle vector embeddings for similarity matching, enabling our application to provide lightning-fast responses for semantically similar queries.&lt;/p&gt;

&lt;h1&gt;
  
  
  Redis as a Semantic Cache for AI Applications
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://www.youtube.com/watch?v=Yhv19le0sBw&amp;amp;t=1s" rel="noopener noreferrer"&gt;What's a vector database&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Redis Open Source 8 not only turns the community version of Redis into a Vector Database, but also makes it the fastest and most scalable database in the market today. Redis 8 allows you to scale to one billion vectors without penalizing latency.&lt;/p&gt;

&lt;p&gt;For semantic caching, Redis serves as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  A vector store using Redis JSON and the Redis Query Engine for storing query embeddings&lt;/li&gt;
&lt;li&gt;  A metadata store for cached responses and additional context&lt;/li&gt;
&lt;li&gt;  A high-performance search engine for finding semantically similar queries&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Spring AI and Redis
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://www.youtube.com/watch?v=0U1S0WSsPuE&amp;amp;t=1s" rel="noopener noreferrer"&gt;What’s an embedding model?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Spring AI provides a unified API for working with various AI models and vector stores. Combined with Redis, it allows developers to easily build semantic caching systems that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Store and retrieve vector embeddings for semantic search&lt;/li&gt;
&lt;li&gt;  Cache LLM responses with semantic similarity matching&lt;/li&gt;
&lt;li&gt;  Reduce API costs by avoiding redundant LLM calls&lt;/li&gt;
&lt;li&gt;  Improve response times for similar queries&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Building the Application
&lt;/h1&gt;

&lt;p&gt;Our application will be built using Spring Boot with Spring AI and Redis. It will implement a beer recommendation assistant that caches responses semantically, providing fast answers to similar questions about beer pairings.&lt;/p&gt;

&lt;h2&gt;
  
  
  0. GitHub Repository
&lt;/h2&gt;

&lt;p&gt;The full application can be found on GitHub:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/redis-developer/redis-springboot-resources/tree/main/artificial-intelligence/semantic-caching-with-spring-ai" rel="noopener noreferrer"&gt;https://github.com/redis-developer/redis-springboot-resources/tree/main/artificial-intelligence/semantic-caching-with-spring-ai&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Add the required dependencies
&lt;/h2&gt;

&lt;p&gt;From a Spring Boot application, add the following dependencies to your Maven or Gradle file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;implementation("org.springframework.ai:spring-ai-transformers:1.0.0")
implementation("org.springframework.ai:spring-ai-starter-vector-store-redis")
implementation("org.springframework.ai:spring-ai-starter-model-openai")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Configure the Semantic Cache Vector Store
&lt;/h2&gt;

&lt;p&gt;We’ll use Spring AI’s &lt;code&gt;RedisVectorStore&lt;/code&gt; to store and search vector embeddings of cached queries and responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Configuration
class SemanticCacheConfig {
    @Bean
    fun semanticCachingVectorStore(
        embeddingModel: TransformersEmbeddingModel,
        jedisPooled: JedisPooled
    ): RedisVectorStore {
        return RedisVectorStore.builder(jedisPooled, embeddingModel)
            .indexName("semanticCachingIdx")
            .contentFieldName("content")
            .embeddingFieldName("embedding")
            .metadataFields(
                RedisVectorStore.MetadataField("answer", Schema.FieldType.TEXT)
            )
            .prefix("semantic-caching:")
            .initializeSchema(true)
            .vectorAlgorithm(RedisVectorStore.Algorithm.HNSW)
            .build()
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s break this down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Index Name&lt;/strong&gt;: &lt;code&gt;semanticCachingIdx&lt;/code&gt; — Redis will create an index with this name for searching cached responses&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Content Field&lt;/strong&gt;: &lt;code&gt;content&lt;/code&gt; — The raw prompt that will be embedded&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Embedding Field&lt;/strong&gt;: &lt;code&gt;embedding&lt;/code&gt; — The field that will store the resulting vector embedding&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Metadata Fields&lt;/strong&gt;: &lt;code&gt;answer&lt;/code&gt;, a TEXT field for storing the LLM's response&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Prefix&lt;/strong&gt;: &lt;code&gt;semantic-caching:&lt;/code&gt; — All keys in Redis will be prefixed with this to organize the data&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Vector Algorithm&lt;/strong&gt;: HNSW — Hierarchical Navigable Small World algorithm for efficient approximate nearest neighbor search&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Implement the Semantic Caching Service
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;SemanticCachingService&lt;/code&gt; handles storing and retrieving cached responses from Redis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Service
class SemanticCachingService(
    private val semanticCachingVectorStore: RedisVectorStore
) {
    private val logger = LoggerFactory.getLogger(SemanticCachingService::class.java)
    fun storeInCache(prompt: String, answer: String) {
        // Create a document for the vector store
        val document = Document(
            prompt,
            mapOf("answer" to answer)
        )
        // Store the document in the vector store
        semanticCachingVectorStore.add(listOf(document))

        logger.info("Stored response in semantic cache for prompt: ${prompt.take(50)}...")
    }
    fun getFromCache(prompt: String, similarityThreshold: Double = 0.8): String? {
        // Execute similarity search
        val results = semanticCachingVectorStore.similaritySearch(
            SearchRequest.builder()
                .query(prompt)
                .topK(1)
                .build()
        )
        // Check if we found a semantically similar query above threshold
        if (results?.isNotEmpty() == true) {
            val score = results[0].score ?: 0.0
            if (score &amp;gt; similarityThreshold) {
                logger.info("Cache hit! Similarity score: $score")
                return results[0].metadata["answer"] as String
            } else {
                logger.info("Similar query found but below threshold. Score: $score")
            }
        }
        logger.info("No cached response found for prompt")
        return null
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key features of the semantic caching service:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Stores query-response pairs as vector embeddings in Redis&lt;/li&gt;
&lt;li&gt;  Retrieves cached responses using vector similarity search&lt;/li&gt;
&lt;li&gt;  Configurable similarity threshold for cache hits&lt;/li&gt;
&lt;li&gt;  Comprehensive logging for debugging and monitoring&lt;/li&gt;
&lt;/ul&gt;
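&lt;p&gt;The same cache-first idea can be sketched framework-agnostically. Below is a toy in-memory version in Python; the word-hashing embedder and class names are illustrative stand-ins, not the Spring AI API, and the article's service delegates all of this to &lt;code&gt;RedisVectorStore&lt;/code&gt;:&lt;/p&gt;

```python
import zlib
import numpy as np

def toy_embed(text):
    # Deterministic stand-in for a real embedding model:
    # hash each word into one of 64 buckets, then L2-normalize
    v = np.zeros(64)
    for w in text.lower().split():
        v[zlib.crc32(w.encode()) % 64] += 1.0
    return v / np.linalg.norm(v)

class SemanticCache:
    """Toy in-memory semantic cache using cosine similarity over prompts."""

    def __init__(self, embed, threshold=0.8):
        self.embed = embed
        self.threshold = threshold
        self.entries = []            # list of (unit-norm embedding, answer)

    def store(self, prompt, answer):
        self.entries.append((self.embed(prompt), answer))

    def get(self, prompt):
        if not self.entries:
            return None
        q = self.embed(prompt)
        scores = [q @ emb for emb, _ in self.entries]
        best = int(np.argmax(scores))
        # Only return a hit if the best match clears the threshold
        return self.entries[best][1] if scores[best] > self.threshold else None

cache = SemanticCache(toy_embed)
cache.store("what beer pairs well with pizza", "Try an American IPA.")
print(cache.get("which beer pairs well with pizza"))  # a near-duplicate question
```

&lt;p&gt;A near-duplicate phrasing of a stored question can score above the threshold and return the cached answer without an LLM call, which is exactly the behavior the Kotlin service implements against Redis.&lt;/p&gt;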

&lt;h2&gt;
  
  
  4. Integrate with the RAG Service
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;RagService&lt;/code&gt; orchestrates the semantic caching with the standard RAG pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Service
class RagService(
    private val chatModel: ChatModel,
    private val vectorStore: RedisVectorStore,
    private val semanticCachingService: SemanticCachingService
) {
    private val logger = LoggerFactory.getLogger(RagService::class.java)
    fun retrieve(message: String): RagResult {
        // Check semantic cache first
        val startCachingTime = System.currentTimeMillis()
        val cachedAnswer = semanticCachingService.getFromCache(message, 0.8)
        val cachingTimeMs = System.currentTimeMillis() - startCachingTime
        if (cachedAnswer != null) {
            logger.info("Returning cached response")
            return RagResult(
                generation = Generation(AssistantMessage(cachedAnswer)),
                metrics = RagMetrics(
                    embeddingTimeMs = 0,
                    searchTimeMs = 0,
                    llmTimeMs = 0,
                    cachingTimeMs = cachingTimeMs,
                    fromCache = true
                )
            )
        }
        // Standard RAG process if no cache hit
        logger.info("No cache hit, proceeding with RAG pipeline")

        // Retrieve relevant documents
        val startEmbeddingTime = System.currentTimeMillis()
        val searchResults = vectorStore.similaritySearch(
            SearchRequest.builder()
                .query(message)
                .topK(5)
                .build()
        )
        val embeddingTimeMs = System.currentTimeMillis() - startEmbeddingTime
        // Create context from retrieved documents
        val context = searchResults.joinToString("\n") { it.text ?: "" }

        // Generate response using LLM
        val startLlmTime = System.currentTimeMillis()
        val prompt = createPromptWithContext(message, context)
        val response = chatModel.call(prompt)
        val llmTimeMs = System.currentTimeMillis() - startLlmTime
        // Store the response in semantic cache for future use
        val responseText = response.result.output.text ?: ""
        semanticCachingService.storeInCache(message, responseText)
        return RagResult(
            generation = response.result,
            metrics = RagMetrics(
                embeddingTimeMs = embeddingTimeMs,
                searchTimeMs = 0, // Combined with embedding time
                llmTimeMs = llmTimeMs,
                cachingTimeMs = 0,
                fromCache = false
            )
        )
    }
    private fun createPromptWithContext(query: String, context: String): Prompt {
        val systemMessage = SystemMessage("""
            You are a beer recommendation assistant. Use the provided context to answer 
            questions about beer pairings, styles, and recommendations.

            Context: $context
        """.trimIndent())

        val userMessage = UserMessage(query)

        return Prompt(listOf(systemMessage, userMessage))
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key features of the integrated RAG service:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Checks semantic cache before expensive LLM calls&lt;/li&gt;
&lt;li&gt;  Falls back to standard RAG pipeline for cache misses&lt;/li&gt;
&lt;li&gt;  Automatically caches new responses for future use&lt;/li&gt;
&lt;li&gt;  Provides detailed performance metrics including cache hit indicators&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Running the Demo
&lt;/h1&gt;

&lt;p&gt;The easiest way to run the demo is with Docker Compose, which sets up all required services in one command.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Clone the repository
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/redis-developer/redis-springboot-resources.git
cd redis-springboot-resources/artificial-intelligence/semantic-caching-with-spring-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Configure your environment
&lt;/h2&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file with your OpenAI API key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=sk-your-api-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Start the services
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker compose up --build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will start:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;redis&lt;/strong&gt;: for storing both vector embeddings and cached responses&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;redis-insight&lt;/strong&gt;: a UI to explore the Redis data&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;semantic-caching-app&lt;/strong&gt;: the Spring Boot app that implements the semantic caching system&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 4: Use the application
&lt;/h2&gt;

&lt;p&gt;When all services are running, go to &lt;code&gt;localhost:8080&lt;/code&gt; to access the demo. You'll see a beer recommendation interface:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5lv3axkphkkkqa213woq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5lv3axkphkkkqa213woq.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you click on &lt;code&gt;Start Chat&lt;/code&gt;, the embeddings may still be being created, and you'll see a message asking you to wait for this operation to complete. This is the step where the documents we'll search through are turned into vectors and stored in the database. It runs only the first time the app starts up and is required regardless of which vector database you use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgq0n7l8a9n6qdeze522b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgq0n7l8a9n6qdeze522b.png" width="554" height="235"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once all the embeddings have been created, you can start asking your chatbot questions. It will semantically search through the documents we have stored, try to find the best answer for your questions, and cache the responses semantically in Redis:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfcg5ebqkdyhq9rdu13f.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfcg5ebqkdyhq9rdu13f.gif" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you ask something similar to a question that has already been asked, your chatbot will retrieve the answer from the cache instead of sending the query to the LLM, returning a response much faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34zqfpbqosu9bl779g02.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34zqfpbqosu9bl779g02.gif" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Exploring the Data in Redis Insight
&lt;/h1&gt;

&lt;p&gt;Redis Insight provides a visual interface for exploring the cached data in Redis. Access it at &lt;code&gt;localhost:5540&lt;/code&gt; to see:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Semantic Cache Entries&lt;/strong&gt;: Stored as JSON documents with vector embeddings&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Vector Index Schema&lt;/strong&gt;: The schema used for similarity search&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Performance Metrics&lt;/strong&gt;: Monitor cache hit rates and response times&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fno3i763jso2a7nrfhb22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fno3i763jso2a7nrfhb22.png" alt="captionless image" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you run the &lt;code&gt;FT.INFO semanticCachingIdx&lt;/code&gt; command in the Redis Insight workbench, you'll see the details of the vector index schema that enables efficient semantic matching.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fix5rqlugz9lv1uoxecat.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fix5rqlugz9lv1uoxecat.png" alt="captionless image" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrapping up
&lt;/h1&gt;

&lt;p&gt;And that’s it — you now have a working semantic caching system using Spring Boot and Redis.&lt;/p&gt;

&lt;p&gt;Instead of making expensive LLM calls for every similar question, your application can now intelligently cache and retrieve responses based on semantic meaning. Redis handles the vector storage and similarity search with the performance and scalability it's known for.&lt;/p&gt;

&lt;p&gt;With Spring AI and Redis, you get an easy way to integrate semantic caching into your Java applications. The combination of vector similarity search for semantic matching and efficient caching gives you a powerful foundation for building cost-effective, high-performance AI applications.&lt;/p&gt;

&lt;p&gt;Whether you’re building chatbots, recommendation engines, or question-answering systems, this semantic caching architecture gives you the tools to dramatically reduce costs while maintaining response quality and improving user experience.&lt;/p&gt;

&lt;p&gt;Try it out, experiment with different similarity thresholds, explore other embedding models, and see how much you can save on LLM costs while delivering faster responses!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Stay Curious!&lt;/strong&gt;
&lt;/h2&gt;

</description>
      <category>redis</category>
      <category>systemdesign</category>
      <category>springboot</category>
      <category>ai</category>
    </item>
    <item>
      <title>Is your Vector Database Really Fast?</title>
      <dc:creator>Ricardo Ferreira</dc:creator>
      <pubDate>Tue, 22 Jul 2025 15:56:57 +0000</pubDate>
      <link>https://dev.to/redis/is-your-vector-database-really-fast-i62</link>
      <guid>https://dev.to/redis/is-your-vector-database-really-fast-i62</guid>
      <description>&lt;p&gt;A few weeks ago, I was re-watching &lt;a href="https://en.wikipedia.org/wiki/Ford_v_Ferrari" rel="noopener noreferrer"&gt;Ford v Ferrari&lt;/a&gt; (a great movie, by the way), and there's this scene where Carroll Shelby explains that winning isn't just about having the fastest car. It's about the perfect lap. The driver, the weather, the tires, the brake assembly, and even the timing of gear shifts matter. Everything matters.&lt;/p&gt;

&lt;p&gt;That got me thinking about vector databases. We spend so much time debating Postgres vs. Redis vs. Pinecone vs. Weaviate vs. Qdrant vs. Milvus, comparing benchmarks, arguing about which one is "fastest." But here's the thing: we're missing the forest for the trees. Like in racing, the database is only one component of a complex system.&lt;/p&gt;

&lt;p&gt;After spending the last few weeks researching vector search systems, I've learned that the difference between a blazing-fast vector search and a sluggish one rarely comes down to which database you picked. It's about understanding the entire system and optimizing each component. Let me share what I've learned about what really makes vector databases fast.&lt;/p&gt;

&lt;p&gt;💡 &lt;strong&gt;Heads up&lt;/strong&gt;: if you are more of a video person and prefer to learn about this from the comfort of your couch, here is one that summarizes this blog post.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/3ZOt3iwBNdE"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Indexing Algorithms: The Engine Under the Hood
&lt;/h2&gt;

&lt;p&gt;Let's start with the heart of any vector database: the indexing algorithm. This allows us to search millions or billions of vectors without comparing every single one. Let's review the most popular ones.&lt;/p&gt;

&lt;h3&gt;
  
  
  HNSW (Hierarchical Navigable Small World)
&lt;/h3&gt;

&lt;p&gt;HNSW is like building a multi-story parking garage for your vectors. Each level has connections between vectors, with the top levels having long-range connections (think express highways) and lower levels having local connections (neighborhood streets).&lt;/p&gt;

&lt;p&gt;Here's what happens during a search:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start at the top layer with few, long-range connections&lt;/li&gt;
&lt;li&gt;Greedily traverse to find the general neighborhood&lt;/li&gt;
&lt;li&gt;Descend to lower layers for increasingly precise navigation&lt;/li&gt;
&lt;li&gt;Final layer has all vectors with dense local connections&lt;/li&gt;
&lt;/ol&gt;
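&lt;p&gt;The four steps above can be sketched as a toy greedy descent in Python. This is an illustration only, with adjacency dicts standing in for a real graph index, not a production HNSW implementation:&lt;/p&gt;

```python
import numpy as np

def greedy_layer_search(neighbors, vectors, query, entry):
    # Greedy walk on one layer: hop to the closest neighbor until stuck
    current = entry
    best = np.linalg.norm(vectors[current] - query)
    improved = True
    while improved:
        improved = False
        for nb in neighbors.get(current, []):
            dist = np.linalg.norm(vectors[nb] - query)
            if dist < best:
                current, best, improved = nb, dist, True
    return current

def hnsw_search(layers, vectors, query, entry):
    # Descend layer by layer, reusing each result as the next entry point
    for neighbors in layers:                      # top (sparse) layer first
        entry = greedy_layer_search(neighbors, vectors, query, entry)
    return entry

# Toy 1-D example: 4 vectors, a sparse top layer and a dense bottom layer
vectors = np.array([[0.0], [1.0], [2.0], [3.0]])
top = {0: [3], 3: [0]}                            # long-range "express" links
bottom = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # dense local links
result = hnsw_search([top, bottom], vectors, np.array([2.1]), entry=0)
print(result)  # index of the nearest vector
```

&lt;p&gt;The top layer jumps the search into the right neighborhood quickly; the bottom layer then refines the result with local hops.&lt;/p&gt;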

&lt;p&gt;The beauty of HNSW is its logarithmic scaling. Doubling your dataset size only adds one more hop to your search path. But this speed comes at a cost in memory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Memory usage = Vector data + (M × 2 × sizeof(int) × number_of_vectors × average_layers)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where &lt;code&gt;M&lt;/code&gt; is the number of bi-directional links per node. With M=16, which is a common default, you're looking at roughly 64-128 bytes of overhead per vector. For a million 1536-dimensional float32 vectors, that's:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vector data: 5.7 GB&lt;/li&gt;
&lt;li&gt;Index overhead: ~100 MB&lt;/li&gt;
&lt;li&gt;Total: ~5.8 GB&lt;/li&gt;
&lt;/ul&gt;
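&lt;p&gt;The arithmetic above is easy to sanity-check in Python. This is a sketch of the formula only; real implementations add allocator and metadata overhead on top:&lt;/p&gt;

```python
GiB, MiB = 2**30, 2**20

def hnsw_memory(num_vectors, dim, m=16, avg_layers=1.0, bytes_per_float=4):
    # Raw vector storage, plus graph links: M bi-directional links per node,
    # stored as 4-byte ints, per layer the node appears in
    vector_bytes = num_vectors * dim * bytes_per_float
    index_bytes = int(m * 2 * 4 * num_vectors * avg_layers)
    return vector_bytes, index_bytes

vec, idx = hnsw_memory(1_000_000, 1536)
print(f"vectors: {vec / GiB:.1f} GiB, index: {idx / MiB:.0f} MiB")
```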

&lt;p&gt;The key insight: HNSW trades memory for speed. If you have enough RAM to hold your entire dataset, it's hard to beat.&lt;/p&gt;

&lt;h3&gt;
  
  
  IVF (Inverted File Index)
&lt;/h3&gt;

&lt;p&gt;IVF takes a different approach. It's like organizing a library by topic before searching. During index building:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run k-means clustering to create nlist centroids&lt;/li&gt;
&lt;li&gt;Assign each vector to its nearest centroid&lt;/li&gt;
&lt;li&gt;During search, find the nprobe nearest centroids&lt;/li&gt;
&lt;li&gt;Only search vectors within those clusters&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The math here is elegant. If you have N vectors split into √N clusters, each cluster holds about N/√N = √N vectors, so probing only nprobe of them means examining roughly nprobe × √N vectors instead of N. That's a massive reduction.&lt;/p&gt;
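&lt;p&gt;Assuming roughly balanced clusters, the candidate-count reduction is easy to verify (a sketch that ignores cluster imbalance):&lt;/p&gt;

```python
import math

def ivf_candidates(num_vectors, nlist, nprobe):
    # Each of nlist roughly balanced clusters holds N / nlist vectors,
    # so probing nprobe clusters examines nprobe * N / nlist candidates
    return nprobe * num_vectors // nlist

n = 1_000_000
nlist = int(math.sqrt(n))                  # 1,000 clusters of ~1,000 vectors
print(ivf_candidates(n, nlist, nprobe=10))  # candidates examined, out of 1M
```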

&lt;p&gt;But here's where it gets interesting. The optimal nlist isn't always √N. It depends on your data distribution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# For uniformly distributed data
&lt;/span&gt;&lt;span class="n"&gt;optimal_nlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_vectors&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# For clustered data (common with text embeddings)
&lt;/span&gt;&lt;span class="n"&gt;optimal_nlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_vectors&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# Start higher
&lt;/span&gt;
&lt;span class="c1"&gt;# For highly skewed distributions
# You might need even more clusters
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Product Quantization: The Compression Game
&lt;/h3&gt;

&lt;p&gt;PQ is fascinating—it's like JPEG compression for vectors. Instead of storing exact coordinates, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Split your vector into m subvectors&lt;/li&gt;
&lt;li&gt;Learn a codebook of 256 centroids for each subspace&lt;/li&gt;
&lt;li&gt;Replace each subvector with its nearest centroid ID (1 byte)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A 1536-dimensional float32 vector (6KB) can be compressed to 96 bytes with m=96, a 98.4% reduction. The tradeoff is accuracy—you're essentially rounding each subvector to one of 256 possible values.&lt;/p&gt;

&lt;p&gt;The clever bit: you can precompute distances between codebook entries. During search, you're just doing lookups and additions, not floating-point multiplications.&lt;/p&gt;
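&lt;p&gt;Here's a minimal numpy sketch of both ideas: encoding each subvector to a 1-byte centroid ID, and asymmetric distance computed via per-query lookup tables. The codebooks below are random for illustration; real systems learn them with k-means per subspace:&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k = 1536, 96, 256            # dims, subvectors, centroids per subspace
sub = d // m                       # 16 dims per subvector

# Toy codebooks (in practice, learned with k-means on training vectors)
codebooks = rng.standard_normal((m, k, sub)).astype(np.float32)

def encode(x):
    # Replace each subvector with the ID of its nearest centroid (1 byte each)
    codes = np.empty(m, dtype=np.uint8)
    for j in range(m):
        diffs = codebooks[j] - x[j*sub:(j+1)*sub]
        codes[j] = np.argmin((diffs ** 2).sum(axis=1))
    return codes

def adc_distance(q, codes):
    # Build per-subspace distance tables once per query,
    # then scoring a stored code is just lookups and additions
    tables = np.stack([((codebooks[j] - q[j*sub:(j+1)*sub]) ** 2).sum(axis=1)
                       for j in range(m)])
    return tables[np.arange(m), codes].sum()

x = rng.standard_normal(d).astype(np.float32)
codes = encode(x)
print(codes.nbytes)                # 96 bytes instead of 6,144
```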

&lt;h3&gt;
  
  
  Here's my decision framework
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;th&gt;Query Time&lt;/th&gt;
&lt;th&gt;Build Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HNSW&lt;/td&gt;
&lt;td&gt;&amp;lt;1M vectors, &amp;lt;5ms latency&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Fastest&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IVF&lt;/td&gt;
&lt;td&gt;1M-100M vectors&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IVF-PQ&lt;/td&gt;
&lt;td&gt;&amp;gt;100M vectors&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&amp;lt; 1M vectors, need &amp;lt; 5ms latency&lt;/strong&gt;: HNSW, no question&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1M-100M vectors, can tolerate 10-20ms&lt;/strong&gt;: IVF with tuned parameters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100M-1B vectors, memory constrained&lt;/strong&gt;: IVF-PQ or optimized product quantization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1B+ vectors&lt;/strong&gt;: Distributed IVF-PQ or specialized systems&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. Hardware Optimization: The Track Matters
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The GPU Acceleration Paradox
&lt;/h3&gt;

&lt;p&gt;I was surprised that GPUs could slow vector search for many real-world workloads. The issue is data transfer overhead. Moving vectors from CPU to GPU memory takes time—often more than the computation for small batches.&lt;/p&gt;

&lt;p&gt;Consider these benchmarks on a typical setup (NVIDIA A100, PCIe Gen4):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Single query (1536d vector, 1M dataset):
- CPU (AVX-512): 2.3ms
- GPU (including transfer): 5.1ms

Batch of 100 queries:
- CPU: 180ms
- GPU: 22ms

Batch of 1000 queries:
- CPU: 1,750ms
- GPU: 87ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The breakeven point is typically around 50-100 concurrent queries. If your workload is primarily single queries, believe it or not, the CPU might be faster!&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory vs. Disk: The Brutal Truth
&lt;/h3&gt;

&lt;p&gt;Everyone knows memory is faster than disk, but the magnitude might surprise you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Random access latency:
- RAM: ~100 nanoseconds
- NVMe SSD: ~100 microseconds (1,000x slower)
- SATA SSD: ~500 microseconds (5,000x slower)
- HDD: ~10 milliseconds (100,000x slower)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This translates directly to query latency for vector search. A memory-based search might traverse 20 nodes in 2ms, while the same search hitting disk could take 200ms or more.&lt;/p&gt;

&lt;p&gt;But here's the thing—modern NVMe drives with proper prefetching can narrow this gap. I've seen well-tuned disk-based systems achieve 20-30ms latencies for million-scale datasets. The key is minimizing random access:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use larger page sizes (64KB instead of 4KB)&lt;/li&gt;
&lt;li&gt;Implement aggressive prefetching&lt;/li&gt;
&lt;li&gt;Keep hot paths (graph upper layers for HNSW) in memory&lt;/li&gt;
&lt;li&gt;Use memory-mapped files with proper madvise hints&lt;/li&gt;
&lt;/ol&gt;
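&lt;p&gt;Point 4 is accessible even from Python. A small sketch follows; the file and sizes are placeholders, and &lt;code&gt;madvise&lt;/code&gt; hints are Linux/macOS-specific and advisory only:&lt;/p&gt;

```python
import mmap
import os
import tempfile

# Create a small stand-in "index" file (a real one would be the on-disk graph)
path = os.path.join(tempfile.mkdtemp(), "index.bin")
with open(path, "wb") as f:
    f.write(os.urandom(1 << 20))        # 1 MiB of dummy index data

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Advisory hint: we expect to read this region soon, so prefetch it
    mm.madvise(mmap.MADV_WILLNEED, 0, 1 << 16)
    hot = mm[:4096]                     # e.g., keep upper HNSW layers hot
    mm.close()

print(len(hot))
```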

&lt;h3&gt;
  
  
  CPU Vectorization: Free Performance
&lt;/h3&gt;

&lt;p&gt;Modern CPUs have SIMD instructions that can process multiple vector elements simultaneously. The impact is substantial:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// To compile with AVX-512 support:&lt;/span&gt;
&lt;span class="c1"&gt;// gcc -mavx512f -O3 vector_ops.c -o vector_ops&lt;/span&gt;

&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;&amp;lt;immintrin.h&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="c1"&gt;// Scalar dot product (simplified)&lt;/span&gt;
&lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="nf"&gt;dot_product_scalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// AVX-512 dot product (processes 16 floats at once)&lt;/span&gt;
&lt;span class="kt"&gt;float&lt;/span&gt; &lt;span class="nf"&gt;dot_product_avx512&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;__m512&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_mm512_setzero_ps&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;__m512&lt;/span&gt; &lt;span class="n"&gt;va&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_mm512_loadu_ps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="n"&gt;__m512&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_mm512_loadu_ps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
        &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_mm512_fmadd_ps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;va&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_mm512_reduce_add_ps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Distance calculations can be 8-16x faster on modern Intel/AMD processors. Most vector databases enable this automatically, but verify that yours does.&lt;/p&gt;
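&lt;p&gt;You can see the effect without writing intrinsics. The sketch below compares a scalar Python loop against NumPy's &lt;code&gt;np.dot&lt;/code&gt;, which dispatches to a SIMD-vectorized BLAS kernel; the exact speedup depends on your CPU and BLAS build.&lt;/p&gt;

```python
import time
import numpy as np

d = 1024
a = np.random.rand(d).astype(np.float32)
b = np.random.rand(d).astype(np.float32)

# Scalar path: one multiply-add per iteration, no vectorization
start = time.perf_counter()
for _ in range(100):
    s = 0.0
    for x, y in zip(a, b):
        s += x * y
loop_time = time.perf_counter() - start

# Vectorized path: np.dot calls into a SIMD-optimized BLAS kernel
start = time.perf_counter()
for _ in range(100):
    s_np = float(np.dot(a, b))
simd_time = time.perf_counter() - start

print(f"scalar: {loop_time:.4f}s, vectorized: {simd_time:.4f}s")
```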

&lt;h2&gt;
  
  
  3. Distance Metrics &amp;amp; Dimensionality: The Physics of Similarity
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Computational Cost Hierarchy
&lt;/h3&gt;

&lt;p&gt;Not all distance metrics are created equal. Here's the computational cost for d-dimensional vectors:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dot Product&lt;/strong&gt;: d multiplications, d-1 additions&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Time complexity: O(d)
Operations: 2d - 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Euclidean Distance&lt;/strong&gt;: d subtractions, d multiplications, d-1 additions, 1 square root&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Time complexity: O(d)
Operations: 3d
Note: Can skip square root for nearest neighbor search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cosine Similarity&lt;/strong&gt;: 3d multiplications, 3d-2 additions, 1 division, 2 square roots&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Time complexity: O(d)
Operations: 6d + 1 (if not pre-normalized)
Operations: 2d - 1 (if pre-normalized)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The lesson? If you're using cosine similarity, &lt;strong&gt;always pre-normalize your vectors&lt;/strong&gt;. This one-time cost at insertion may save 67% of computation on every query.&lt;/p&gt;
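&lt;p&gt;A minimal sketch of what pre-normalization buys you: after a one-time division by the L2 norm at insertion time, the query-time cosine computation collapses to a plain dot product.&lt;/p&gt;

```python
import numpy as np

vectors = np.random.randn(1000, 384).astype(np.float32)
query = np.random.randn(384).astype(np.float32)

# One-time cost at insertion: divide each vector by its L2 norm
unit_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
unit_query = query / np.linalg.norm(query)

# Query time: cosine similarity is now just a dot product
fast_cosine = unit_vectors @ unit_query

# Full cosine formula on the raw vectors, for comparison
full_cosine = (vectors @ query) / (
    np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
)

print(np.allclose(fast_cosine, full_cosine, atol=1e-5))
```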

&lt;h3&gt;
  
  
  The Curse of Dimensionality
&lt;/h3&gt;

&lt;p&gt;High dimensions aren't just about more computation. They fundamentally change the geometry of your search space. With higher dimensions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;All vectors become approximately equidistant&lt;/li&gt;
&lt;li&gt;The ratio between nearest and furthest neighbors approaches 1&lt;/li&gt;
&lt;li&gt;Traditional indexing structures become less effective&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I've measured this effect directly. With random vectors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;10 dimensions: Nearest/furthest neighbor ratio ≈ 0.5&lt;/li&gt;
&lt;li&gt;100 dimensions: Ratio ≈ 0.8&lt;/li&gt;
&lt;li&gt;1000 dimensions: Ratio ≈ 0.95&lt;/li&gt;
&lt;/ul&gt;
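&lt;p&gt;The effect is easy to reproduce. This sketch measures the nearest/furthest distance ratio for uniform random vectors; the exact numbers depend on the distribution, but the climb toward 1 is consistent.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(42)

def nn_ratio(dim, n=2000):
    # Ratio of nearest to furthest neighbor distance from a random query point
    points = rng.random((n, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return dists.min() / dists.max()

for dim in (10, 100, 1000):
    print(dim, round(nn_ratio(dim), 2))
```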

&lt;p&gt;This is why dimension reduction can paradoxically improve search quality while reducing computation. Here's what I've used to verify this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.decomposition&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PCA&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.random_projection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GaussianRandomProjection&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# Analyze intrinsic dimensionality
# Assuming 'vectors' is a numpy array of shape (n_samples, n_features)
&lt;/span&gt;&lt;span class="n"&gt;pca&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PCA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_components&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Retain 95% variance
&lt;/span&gt;&lt;span class="n"&gt;pca&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Intrinsic dimensionality: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pca&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;n_components_&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# If significant reduction possible, consider:
# 1. PCA for optimal reduction (slower, better quality)
# 2. Random projection for fast reduction (faster, good quality)
# 3. Autoencoder for non-linear reduction (slowest, best quality)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Embedding Models: The Hidden Performance Lever
&lt;/h2&gt;

&lt;p&gt;Your choice of embedding model doesn't just affect search quality. It has massive performance implications. Here is what I've learned.&lt;/p&gt;

&lt;h3&gt;
  
  
  Model Characteristics That Matter
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Dimensionality&lt;/strong&gt;: This is obvious but often overlooked. OpenAI's text-embedding-ada-002 produces 1536-dimensional vectors, while many open-source models produce 384 or 768 dimensions, which means a quarter or half the computation and memory per vector.&lt;/p&gt;
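&lt;p&gt;A back-of-envelope calculation makes the difference concrete (raw float32 storage only; index structures add overhead on top):&lt;/p&gt;

```python
def vector_storage_gb(num_vectors, dims, bytes_per_component=4):
    # Raw float32 vector storage; graph links, metadata, etc. come on top
    return num_vectors * dims * bytes_per_component / 1024**3

# One million vectors at 1536 dims vs. a 384-dim open-source model
print(round(vector_storage_gb(1_000_000, 1536), 2))  # 5.72 GB
print(round(vector_storage_gb(1_000_000, 384), 2))   # 1.43 GB
```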

&lt;p&gt;&lt;strong&gt;Vector Distribution&lt;/strong&gt;: Some models produce vectors with very different statistical properties:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Analyzing vector distributions
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Average norm
&lt;/span&gt;    &lt;span class="n"&gt;norms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Avg norm: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ± &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;std&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;norms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Sparsity (near-zero components)
&lt;/span&gt;    &lt;span class="n"&gt;sparsity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sparsity: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sparsity&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Component distribution
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Component range: [&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I've found that models with more uniform distributions (higher entropy) create more challenging search spaces. Models with sparse activations can benefit from specialized indexes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Numerical Precision&lt;/strong&gt;: Many embeddings maintain quality with reduced precision:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# Test precision reduction impact
# Assuming you have a function to load your vectors
# original_vectors = load_vectors()  # float32
&lt;/span&gt;
&lt;span class="c1"&gt;# Example with random vectors for demonstration:
&lt;/span&gt;&lt;span class="n"&gt;original_vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;384&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;float16_vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;original_vectors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;int8_vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_vectors&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;127&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Measure recall degradation
# Often &amp;lt; 1% loss for float16, &amp;lt; 5% for int8
# You would need to implement a recall measurement function
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
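&lt;p&gt;A sketch of one way to measure that recall degradation, comparing exact brute-force neighbors before and after a float16 round-trip. &lt;code&gt;recall_at_k&lt;/code&gt;, &lt;code&gt;top_k&lt;/code&gt;, and the dataset sizes are illustrative, not a standard API.&lt;/p&gt;

```python
import numpy as np

def recall_at_k(true_neighbors, approx_neighbors, k=10):
    # Fraction of the true top-k that the approximate result also returned
    hits = sum(
        len(set(t[:k]).intersection(a[:k]))
        for t, a in zip(true_neighbors, approx_neighbors)
    )
    return hits / (k * len(true_neighbors))

def top_k(base, q, k=10):
    # Exact brute-force nearest neighbors by Euclidean distance
    return np.argsort(np.linalg.norm(base - q, axis=1))[:k]

rng = np.random.default_rng(0)
base = rng.standard_normal((5000, 64)).astype(np.float32)
queries = rng.standard_normal((20, 64)).astype(np.float32)

exact = [top_k(base, q) for q in queries]
fp16_base = base.astype(np.float16).astype(np.float32)
approx = [top_k(fp16_base, q) for q in queries]

print(recall_at_k(exact, approx))
```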



&lt;h3&gt;
  
  
  The Model-Database Co-Design Opportunity
&lt;/h3&gt;

&lt;p&gt;Here's an insight that's often missed: your embedding model and vector database should be designed together. Example optimizations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Binary embeddings&lt;/strong&gt; (SBERT with binary quantization) can use Hamming distance, enabling bitwise operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sparse embeddings&lt;/strong&gt; (SPLADE, ColBERT) benefit from inverted indexes rather than dense indexes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-vector embeddings&lt;/strong&gt; need specialized retrieval strategies&lt;/li&gt;
&lt;/ol&gt;
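&lt;p&gt;The first optimization is easy to sketch in plain NumPy: binarize by sign, pack the bits, and Hamming distance becomes an XOR plus a popcount.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(1)
embeddings = rng.standard_normal((4, 256)).astype(np.float32)

# Binarize by sign, then pack 256 bits into 32 bytes per vector
bits = np.packbits((embeddings > 0).astype(np.uint8), axis=1)

def hamming(a, b):
    # XOR the packed bytes, then count the set bits (popcount)
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

print(hamming(bits[0], bits[1]))  # number of differing bits, out of 256
```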

&lt;h2&gt;
  
  
  5. Index Tuning: The Art of Configuration
&lt;/h2&gt;

&lt;p&gt;Default parameters are rarely optimal. Here's how to systematically tune your index:&lt;/p&gt;

&lt;h3&gt;
  
  
  HNSW Tuning Strategy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tune_hnsw_parameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ground_truth&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    This is a template function. You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ll need to implement:
    - build_hnsw: function to build HNSW index with given parameters
    - benchmark_index: function to measure recall and queries per second
    - index.memory_usage(): placeholder below; substitute your library's memory accounting

    Example usage with FAISS:
    import faiss

    def build_hnsw(vectors, M, ef_construction):
        index = faiss.IndexHNSWFlat(vectors.shape[1], M)
        index.hnsw.efConstruction = ef_construction
        index.add(vectors)
        return index
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Test M values (connectivity)
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="c1"&gt;# Test ef_construction values (build quality)
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ef_c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_hnsw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ef_construction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ef_c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Test ef_search values (search quality)
&lt;/span&gt;            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;ef_s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;recall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;benchmark_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ground_truth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ef_search&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ef_s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;M&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ef_construction&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ef_c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ef_search&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ef_s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recall&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;recall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;qps&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;qps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;memory_mb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;memory_usage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key insights from tuning hundreds of indexes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;M=16 is a sweet spot for most workloads&lt;/li&gt;
&lt;li&gt;ef_construction can often be lower than defaults (200 vs 500) with minimal impact&lt;/li&gt;
&lt;li&gt;ef_search can be adjusted per query based on importance&lt;/li&gt;
&lt;/ul&gt;
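&lt;p&gt;M also drives memory. A commonly quoted rule of thumb (hnswlib's documentation uses a similar estimate) is roughly d*4 bytes of raw floats plus about M*2 neighbor links of 4 bytes each per vector; the helper below is just that arithmetic, not a library call.&lt;/p&gt;

```python
def hnsw_memory_mb(num_vectors, dims, M):
    # Rough estimate: float32 components plus ~M*2 neighbor links of 4 bytes each
    bytes_per_vector = dims * 4 + M * 2 * 4
    return num_vectors * bytes_per_vector / 1024**2

# 1M vectors, 384 dims: see how memory grows with connectivity
for M in (8, 16, 32, 64):
    print(M, round(hnsw_memory_mb(1_000_000, 384, M)))
```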

&lt;h3&gt;
  
  
  IVF Tuning Strategy
&lt;/h3&gt;

&lt;p&gt;The nlist/nprobe relationship is critical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# Theoretical optimal for uniform data
# Assuming you have the number of vectors
&lt;/span&gt;&lt;span class="n"&gt;num_vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000000&lt;/span&gt;  &lt;span class="c1"&gt;# Example
&lt;/span&gt;&lt;span class="n"&gt;nlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_vectors&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# But real data isn't uniform. Measure cluster imbalance:
# This assumes you've already performed clustering
# Example implementation:
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.cluster&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KMeans&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_cluster_imbalance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nlist&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;kmeans&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KMeans&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_clusters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;nlist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kmeans&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cluster_sizes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bincount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;imbalance_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_sizes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster_sizes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;imbalance_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Highly imbalanced - need more clusters
&lt;/span&gt;        &lt;span class="n"&gt;nlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;nlist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;imbalance_ratio&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
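&lt;p&gt;The intuition behind that relationship is worth making explicit: with roughly balanced clusters, each probe scans about num_vectors/nlist candidates, so nprobe/nlist is the fraction of the dataset you touch per query. A back-of-envelope helper (illustrative, not a library API):&lt;/p&gt;

```python
import numpy as np

def ivf_scan_fraction(num_vectors, nprobe, nlist=None):
    # With nlist ~ sqrt(N) and balanced clusters, each probe scans ~N/nlist vectors
    if nlist is None:
        nlist = int(np.sqrt(num_vectors))
    vectors_scanned = nprobe * (num_vectors / nlist)
    return nlist, vectors_scanned / num_vectors

nlist, fraction = ivf_scan_fraction(1_000_000, nprobe=10)
print(nlist, fraction)  # 1000 clusters, 1% of vectors scanned per query
```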



&lt;h2&gt;
  
  
  6. Chunking Strategies: The Multiplier Effect
&lt;/h2&gt;

&lt;p&gt;Chunking might seem like a preprocessing detail, but it's actually a major performance factor. Your chunking strategy determines:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Vector count&lt;/strong&gt; (linear impact on search time)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector quality&lt;/strong&gt; (affects search precision)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index efficiency&lt;/strong&gt; (impacts clustering and traversal)&lt;/li&gt;
&lt;/ol&gt;
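&lt;p&gt;The vector-count impact is pure arithmetic: chunk size and overlap directly set how many vectors a corpus produces. A small illustrative helper, assuming fixed-size chunking:&lt;/p&gt;

```python
def chunk_count(total_tokens, chunk_size, overlap=0):
    # Number of vectors produced by fixed-size chunking with optional overlap
    stride = chunk_size - overlap
    return -(-max(total_tokens - overlap, 0) // stride)  # ceiling division

# A 1M-token corpus: overlap raises the vector count, and search cost with it
print(chunk_count(1_000_000, 512))       # 1954 vectors
print(chunk_count(1_000_000, 512, 128))  # 2604 vectors
```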

&lt;h3&gt;
  
  
  Performance-Oriented Chunking
&lt;/h3&gt;

&lt;p&gt;This is how I tested my chunks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;optimize_chunking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Template function for chunking optimization.

    You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ll need to implement:
    - estimate_max_vectors: based on your benchmarks
    - use_hierarchical_chunking: your chunking strategy
    - test_semantic_coherence: your quality measurement
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Example implementation
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_max_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Based on your benchmarks, e.g., 1ms per 10k vectors
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_latency_ms&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Estimate vectors needed for target latency
&lt;/span&gt;    &lt;span class="n"&gt;max_vectors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;estimate_max_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate optimal chunk size
&lt;/span&gt;    &lt;span class="c1"&gt;# Assuming documents have a 'tokens' attribute
&lt;/span&gt;    &lt;span class="n"&gt;total_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;optimal_chunk_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;total_tokens&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;max_vectors&lt;/span&gt;

    &lt;span class="c1"&gt;# Adjust for semantic boundaries
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;optimal_chunk_size&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Too small - lose context
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Consider hierarchical chunking&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# use_hierarchical_chunking()
&lt;/span&gt;    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;optimal_chunk_size&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Might be too large - test precision
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Consider testing semantic coherence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# test_semantic_coherence()
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;optimal_chunk_size&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Putting It All Together
&lt;/h2&gt;

&lt;p&gt;After all this technical detail, here's the practical framework I use for optimization:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Optimization Checklist
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Profile First&lt;/strong&gt;: Measure where time is actually spent&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Break latency down by stage (embedding, index traversal, post-filtering) to find the real culprit&lt;/li&gt;
&lt;li&gt;Adopt an observability-first strategy: instrument before you tune, so every change is measurable&lt;/li&gt;
&lt;li&gt;Isolate outliers and validate them; tail latency often has different causes than the median&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Low-Hanging Fruit&lt;/strong&gt; (often 2-5x improvement):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pre-normalize vectors for cosine similarity&lt;/li&gt;
&lt;li&gt;Enable CPU vectorization&lt;/li&gt;
&lt;li&gt;Tune batch sizes for your hardware&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Algorithmic Changes&lt;/strong&gt; (can be 10x+ improvement):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose the right index for your scale&lt;/li&gt;
&lt;li&gt;Consider quantization for large datasets&lt;/li&gt;
&lt;li&gt;Optimize chunking strategy&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;System-Level Optimization&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Co-locate compute and data&lt;/li&gt;
&lt;li&gt;Use connection pooling&lt;/li&gt;
&lt;li&gt;Implement caching for repeated queries&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
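&lt;p&gt;As a concrete sketch of the first "low-hanging fruit" item, here's what pre-normalization looks like in NumPy (the function name is mine, for illustration): normalize once at indexing time, and every cosine-similarity query collapses into a single dot product.&lt;/p&gt;

```python
import numpy as np

def l2_normalize(vectors):
    # Divide each row by its L2 norm once, at indexing time,
    # so per-query norm computations disappear entirely.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / norms

rng = np.random.default_rng(42)
corpus = l2_normalize(rng.standard_normal((10_000, 384)))
query = l2_normalize(rng.standard_normal((1, 384)))[0]

# With unit-length vectors, cosine similarity is just a dot product.
scores = corpus @ query
top5 = np.argsort(scores)[-5:][::-1]
print(top5)
```

&lt;p&gt;The same trick applies regardless of which vector database sits underneath, because the savings happen before the index is ever queried.&lt;/p&gt;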

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Vector database performance isn't about picking the "fastest" database—it's about understanding and optimizing the entire system. Depending on how you configure and use it, the same database can be blazing fast or frustratingly slow.&lt;/p&gt;

&lt;p&gt;The good news? You don't need a multi-million-dollar budget to achieve great performance. You need to understand the system and optimize methodically. What's your experience with vector database optimization? Have you found other bottlenecks I didn't cover? Let me know—I'm always learning, and the best insights often come from real-world production challenges.&lt;/p&gt;

</description>
      <category>vectordatabase</category>
      <category>database</category>
      <category>dataengineering</category>
      <category>performance</category>
    </item>
    <item>
      <title>Agent Long-term Memory with Spring AI &amp; Redis</title>
      <dc:creator>Raphael De Lio</dc:creator>
      <pubDate>Wed, 16 Jul 2025 19:57:23 +0000</pubDate>
      <link>https://dev.to/redis/agent-memory-with-spring-ai-redis-58g5</link>
      <guid>https://dev.to/redis/agent-memory-with-spring-ai-redis-58g5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;TL;DR:&lt;br&gt;
We're building an AI agent with memory using Spring AI and Redis.&lt;/p&gt;

&lt;p&gt;Unlike traditional chatbots that forget previous interactions, memory-enabled agents can recall past conversations and facts.&lt;/p&gt;

&lt;p&gt;It works by storing two types of memory in Redis: short-term (conversation history) and long-term (facts and experiences as vectors), allowing agents to provide personalized, context-aware responses.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;LLMs respond to each message in isolation, treating every interaction as if it's the first time they've spoken with a user. They lack the ability to remember previous conversations, preferences, or important facts.&lt;/p&gt;

&lt;p&gt;Memory-enabled AI agents, on the other hand, can maintain context across multiple interactions. They remember who you are, what you've told them before, and can use that information to provide more personalized, relevant responses.&lt;/p&gt;

&lt;p&gt;In a travel assistant scenario, for example, if a user mentions "I'm allergic to shellfish" in one conversation, and later asks for restaurant recommendations in Boston, a memory-enabled agent would recall the allergy information and filter out inappropriate suggestions, creating a much more helpful and personalized experience.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://youtu.be/0U1S0WSsPuE" rel="noopener noreferrer"&gt;What is an embedding model?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Behind the scenes, this works thanks to vector similarity search. It turns text into vectors (embeddings) — lists of numbers — stores them in a vector database, and then finds the ones closest to your query when relevant information needs to be recalled.&lt;/p&gt;
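&lt;p&gt;To make that concrete, here's a deliberately tiny sketch. The bag-of-words "embedding" below is a toy stand-in for a real embedding model (which would place similar meanings close together even without shared words), but the recall mechanics are the same: embed the stored memories, embed the query, and return the nearest match.&lt;/p&gt;

```python
import numpy as np

VOCAB = ["shellfish", "allergy", "restaurant", "boston", "flight", "hotel"]

def embed(text):
    # Toy bag-of-words "embedding", for illustration only;
    # a real system would call a trained embedding model here.
    words = set(text.lower().split())
    vec = np.array([1.0 if w in words else 0.0 for w in VOCAB])
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

memories = [
    "user has a shellfish allergy",
    "user prefers hotel stays near the beach",
    "user booked a flight to boston",
]
index = np.stack([embed(m) for m in memories])

# Cosine similarity between the query and every stored memory.
query = embed("restaurant ideas given the shellfish allergy")
scores = index @ query
print(memories[int(np.argmax(scores))])
```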

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://youtu.be/o3XN4dImESE" rel="noopener noreferrer"&gt;What is semantic search?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Today, we're going to build a memory-enabled AI agent that helps users plan travel. It will remember user preferences, past trips, and important details across multiple conversations — even if the user leaves and comes back later.&lt;/p&gt;

&lt;p&gt;To do that, we'll build a Spring Boot app from scratch and use Redis as our memory store. It'll handle both short-term memory (conversation history) and long-term memory (facts and preferences as vector embeddings), enabling our agent to provide truly personalized assistance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Redis as a Memory Store for AI Agents
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://youtu.be/Yhv19le0sBw" rel="noopener noreferrer"&gt;What is a vector database?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Over the last 15 years, Redis has become foundational infrastructure for real-time applications. Today, with Redis Open Source 8, it's committed to becoming foundational infrastructure for AI applications as well.&lt;/p&gt;

&lt;p&gt;Redis Open Source 8 not only turns the community version of Redis into a vector database, but also makes it the fastest and most scalable vector database on the market today: Redis 8 lets you scale to one billion vectors without penalizing latency.&lt;/p&gt;

&lt;p&gt;Learn more: &lt;a href="https://redis.io/blog/searching-1-billion-vectors-with-redis-8/" rel="noopener noreferrer"&gt;https://redis.io/blog/searching-1-billion-vectors-with-redis-8/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For AI agents, Redis serves as both:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A short-term memory store using Redis Lists to maintain conversation history&lt;/li&gt;
&lt;li&gt;A long-term memory store using Redis JSON and the Redis Query Engine that enables vector search to store and retrieve facts and experiences&lt;/li&gt;
&lt;/ol&gt;
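&lt;p&gt;A quick way to picture the difference between these two roles (plain-Python stand-ins here, not actual Redis calls; in Redis itself these would be LPUSH/LTRIM/LRANGE on a List, and a vector query against the index):&lt;/p&gt;

```python
from collections import deque

# Short-term memory: an ordered, capped conversation history,
# like a Redis List kept to the most recent N entries with LTRIM.
short_term = deque(maxlen=4)
for turn in [
    "hi",
    "I'm allergic to shellfish",
    "plan a trip to Boston",
    "any restaurant suggestions?",
    "thanks!",
]:
    short_term.append(turn)

# Long-term memory: durable facts recalled by meaning rather than
# recency (stored as JSON documents plus embeddings in Redis).
long_term = {"allergy": "shellfish", "preferred_city": "Boston"}

print(list(short_term))  # the oldest turn ("hi") has been trimmed away
print(long_term["allergy"])
```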

&lt;h2&gt;
  
  
  Spring AI and Redis
&lt;/h2&gt;

&lt;p&gt;Spring AI provides a unified API for working with various AI models and vector stores. Combined with Redis, it makes it easy to build memory-enabled AI agents that can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Store and retrieve vector embeddings for semantic search&lt;/li&gt;
&lt;li&gt;Maintain conversation context across sessions&lt;/li&gt;
&lt;li&gt;Extract and deduplicate memories from conversations&lt;/li&gt;
&lt;li&gt;Summarize long conversations to prevent context window overflow&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Building the Application
&lt;/h2&gt;

&lt;p&gt;Our application will be built using Spring Boot with Spring AI and Redis. It will implement a travel assistant that remembers user preferences and past trips, providing personalized recommendations based on this memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  0. GitHub Repository
&lt;/h3&gt;

&lt;p&gt;The full application can be found on GitHub: &lt;a href="https://github.com/redis-developer/redis-springboot-resources/tree/main/artificial-intelligence/agent-long-term-memory-with-spring-ai" rel="noopener noreferrer"&gt;https://github.com/redis-developer/redis-springboot-resources/tree/main/artificial-intelligence/agent-long-term-memory-with-spring-ai&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Add the required dependencies
&lt;/h3&gt;

&lt;p&gt;In your Spring Boot application, add the following dependencies to your Maven or Gradle build file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"org.springframework.ai:spring-ai-transformers:1.0.0"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"org.springframework.ai:spring-ai-starter-vector-store-redis"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"org.springframework.ai:spring-ai-starter-model-openai"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;implementation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"com.redis.om:redis-om-spring:1.0.0-RC3"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Define the Memory model
&lt;/h3&gt;

&lt;p&gt;The core of our implementation is the &lt;code&gt;Memory&lt;/code&gt; class, which represents items stored in long-term memory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="kd"&gt;data class&lt;/span&gt; &lt;span class="nc"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"{}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;enum&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;EPISODIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Personal experiences and preferences&lt;/span&gt;
    &lt;span class="nc"&gt;SEMANTIC&lt;/span&gt;   &lt;span class="c1"&gt;// General knowledge and facts&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Configure the Vector Store
&lt;/h3&gt;

&lt;p&gt;We'll use Spring AI's &lt;code&gt;RedisVectorStore&lt;/code&gt; to store and search vector embeddings of memories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Configuration&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryVectorStoreConfig&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Bean&lt;/span&gt;
    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;memoryVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;jedisPooled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;JedisPooled&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jedisPooled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddingModel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;indexName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"longTermMemoryIdx"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contentFieldName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embeddingFieldName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;metadataFields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;MetadataField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"memoryType"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FieldType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TAG&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;MetadataField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FieldType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;MetadataField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FieldType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TAG&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;MetadataField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"createdAt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FieldType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"long-term-memory:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initializeSchema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;vectorAlgorithm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Algorithm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HSNW&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's break this down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Index Name&lt;/strong&gt;: &lt;code&gt;longTermMemoryIdx&lt;/code&gt; - Redis will create an index with this name for searching memories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content Field&lt;/strong&gt;: &lt;code&gt;content&lt;/code&gt; - The raw memory content that will be embedded&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding Field&lt;/strong&gt;: &lt;code&gt;embedding&lt;/code&gt; - The field that will store the resulting vector embedding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metadata Fields&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;memoryType&lt;/code&gt;: TAG field for filtering by memory type (EPISODIC or SEMANTIC)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;metadata&lt;/code&gt;: TEXT field for storing additional context about the memory&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;userId&lt;/code&gt;: TAG field for filtering by user ID&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;createdAt&lt;/code&gt;: TEXT field for storing the creation timestamp&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
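&lt;p&gt;Because &lt;code&gt;initializeSchema(true)&lt;/code&gt; is set, Spring AI creates this index for you on startup. Conceptually, the result is roughly equivalent to an &lt;code&gt;FT.CREATE&lt;/code&gt; along these lines (illustrative only: the exact schema Spring AI emits may differ, and the vector dimension depends on your embedding model — 384 below assumes the default transformers model):&lt;/p&gt;

```
FT.CREATE longTermMemoryIdx ON JSON PREFIX 1 "long-term-memory:" SCHEMA
  $.content    AS content    TEXT
  $.embedding  AS embedding  VECTOR HNSW 6 TYPE FLOAT32 DIM 384 DISTANCE_METRIC COSINE
  $.memoryType AS memoryType TAG
  $.metadata   AS metadata   TEXT
  $.userId     AS userId     TAG
  $.createdAt  AS createdAt  TEXT
```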

&lt;h3&gt;
  
  
  4. Implement the Memory Service
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;MemoryService&lt;/code&gt; handles storing and retrieving memories from Redis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MemoryService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memoryVectorStore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;RedisVectorStore&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;systemUserId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"system"&lt;/span&gt;

    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;storeMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"{}"&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;StoredMemory&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Check if a similar memory already exists to avoid duplicates&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;similarMemoryExists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;StoredMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;memoryType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="n"&gt;systemUserId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;createdAt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Create a document for the vector store&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;document&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nf"&gt;mapOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"memoryType"&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"metadata"&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"userId"&lt;/span&gt; &lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="n"&gt;systemUserId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="s"&gt;"createdAt"&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;// Store the document in the vector store&lt;/span&gt;
        &lt;span class="n"&gt;memoryVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;listOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;StoredMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;memoryType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="n"&gt;systemUserId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;createdAt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;retrieveMemories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;distanceThreshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Float&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.9f&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StoredMemory&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Build filter expression&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;b&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FilterExpressionBuilder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;filterList&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mutableListOf&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;FilterExpressionBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;// Add user filter&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;effectiveUserId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="n"&gt;systemUserId&lt;/span&gt;
        &lt;span class="n"&gt;filterList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;effectiveUserId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;systemUserId&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

        &lt;span class="c1"&gt;// Add memory type filter if specified&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memoryType&lt;/span&gt; &lt;span class="p"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;filterList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"memoryType"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;memoryType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Combine filters&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;filterExpression&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filterList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;
            &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;filterList&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;filterList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expr&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;and&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;// Execute search&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;searchResults&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memoryVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similaritySearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;SearchRequest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;topK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filterExpression&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filterExpression&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;// Transform results to StoredMemory objects&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;searchResults&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mapNotNull&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;distanceThreshold&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;
                &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memoryObj&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;memoryType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;valueOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"memoryType"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SEMANTIC&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s"&gt;"{}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="n"&gt;systemUserId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;createdAt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"createdAt"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?)&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nc"&gt;StoredMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memoryObj&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;null&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key features of the memory service:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stores memories as vector embeddings in Redis&lt;/li&gt;
&lt;li&gt;Retrieves memories using vector similarity search&lt;/li&gt;
&lt;li&gt;Filters memories by user ID and memory type&lt;/li&gt;
&lt;li&gt;Prevents duplicate memories through similarity checking&lt;/li&gt;
&lt;/ul&gt;
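The score-thresholding step behind that retrieval behavior can be exercised in isolation. A minimal pure-Kotlin sketch (the `Hit` type and `filterByThreshold` name are ours, hypothetical stand-ins for a vector search result, not the article's actual code):

```kotlin
// Hypothetical stand-in for a vector search hit: content plus an optional similarity score.
data class Hit(val content: String, val score: Double?)

// Mirrors the service's filtering step: keep only hits whose score exceeds
// the threshold, treating a missing score as 1.0 (i.e. keep the hit).
fun filterByThreshold(hits: List<Hit>, distanceThreshold: Double): List<String> =
    hits.mapNotNull { hit ->
        if (distanceThreshold < (hit.score ?: 1.0)) hit.content else null
    }
```

With a threshold of 0.5, for example, a hit scoring 0.9 survives, a hit scoring 0.2 is dropped, and a hit with no score is kept.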

&lt;h3&gt;
  
  
  5. Implement Spring AI Advisors
&lt;/h3&gt;

&lt;p&gt;We’re going to rely on the Spring AI Advisors API. Advisors are a way to intercept, modify, and enhance AI-driven interactions.&lt;br&gt;
We will implement two advisors: one for retrieval and one for recording. These advisors will be plugged into our ChatClient and will intercept every interaction with the LLM.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1 Advisor for Long-term memory retrieval
&lt;/h3&gt;

&lt;p&gt;The retrieval advisor runs before each LLM call. It takes the user’s current message, performs a vector similarity search over Redis, and injects the most relevant memories into the system portion of the prompt so the model can ground its answer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LongTermMemoryRetrievalAdvisor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memoryService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;MemoryService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;CallAdvisor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Ordered&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;companion&lt;/span&gt; &lt;span class="k"&gt;object&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;USER_ID&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ltm_user_id"&lt;/span&gt;   
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;TOP_K&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ltm_top_k"&lt;/span&gt;      
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getOrder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ordered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HIGHEST_PRECEDENCE&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;
  &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getName&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"LongTermMemoryRetrievalAdvisor"&lt;/span&gt;

  &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;adviseCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatClientRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;CallAdvisorChain&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;ChatClientResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;context&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="nc"&gt;USER_ID&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s"&gt;"system"&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;k&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;context&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="nc"&gt;TOP_K&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;query&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memories&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memoryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieveRelevantMemories&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memoryBlock&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildString&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;appendLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Use the MEMORY below if relevant. Keep answers factual and concise."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="nf"&gt;appendLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"----- MEMORY -----"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEachIndexed&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;appendLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"${i+1}. ${m.memory.content}"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nf"&gt;appendLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"------------------"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;enrichedPrompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;augmentSystemMessage&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt;
      &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;existing&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
      &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nf"&gt;buildString&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;appendLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;memoryBlock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isNotBlank&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="nf"&gt;appendLine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
              &lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;enrichedReq&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enrichedPrompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;nextCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enrichedReq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
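The MEMORY block assembly is easy to verify on its own. A self-contained sketch of the same `buildString` formatting (the function name `buildMemoryBlock` is ours, not the article's):

```kotlin
// Formats retrieved memories the same way the advisor does before
// prepending them to the system message.
fun buildMemoryBlock(memories: List<String>): String = buildString {
    appendLine("Use the MEMORY below if relevant. Keep answers factual and concise.")
    appendLine("----- MEMORY -----")
    memories.forEachIndexed { i, m -> appendLine("${i + 1}. $m") }
    appendLine("------------------")
}
```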



&lt;h3&gt;
  
  
  5.2 Advisor for Long-term memory recording
&lt;/h3&gt;

&lt;p&gt;The recorder advisor runs after the assistant responds. It looks at the last user message and the assistant’s reply, asks the model to extract atomic, useful facts (episodic or semantic), deduplicates them, and stores them in Redis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Component&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LongTermMemoryRecorderAdvisor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memoryService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;MemoryService&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;chatModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatModel&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;CallAdvisor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Ordered&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

  &lt;span class="kd"&gt;data class&lt;/span&gt; &lt;span class="nc"&gt;MemoryCandidate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;MemoryType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?)&lt;/span&gt;
  &lt;span class="kd"&gt;data class&lt;/span&gt; &lt;span class="nc"&gt;ExtractionResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;MemoryCandidate&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;emptyList&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;extractorConverter&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeanOutputConverter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ExtractionResult&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getOrder&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Ordered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HIGHEST_PRECEDENCE&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
  &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getName&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"LongTermMemoryRecorderAdvisor"&lt;/span&gt;

  &lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;adviseCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatClientRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;CallAdvisorChain&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;ChatClientResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1) Proceed with the normal call (other advisors may have enriched the prompt)&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;res&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;nextCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// 2) Build extraction prompt (user + assistant text of *this* turn)&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;userText&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;userMessage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;assistantText&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chatResponse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;?.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;

    &lt;span class="c1"&gt;// 3) Ask the model to extract long-term memories as structured JSON&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;schemaHint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extractorConverter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;jsonSchema&lt;/span&gt; &lt;span class="c1"&gt;// JSON schema string for the POJO&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;extractSystem&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
            You extract LONG-TERM MEMORIES from a dialogue turn.

            A memory is either:

            1. EPISODIC MEMORIES: Personal experiences and user-specific preferences
               Examples: "User prefers Delta airlines", "User visited Paris last year"

            2. SEMANTIC MEMORIES: General domain knowledge and facts
               Examples: "Singapore requires passport", "Tokyo has excellent public transit"

            Only extract clear, factual information. Do not make assumptions or infer information that isn't explicitly stated.
            If no memories can be extracted, return an empty array.

            The instance must conform to this JSON Schema (for validation, do not output it):
              $schemaHint

            Do not include code fences, schema, or properties. Output a single-line JSON object.
        """&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trimIndent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;extractUser&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
            USER SAID:
            $userText

            ASSISTANT REPLIED:
            $assistantText

            Extract up to 5 memories with correct type; set userId if present/known.
        """&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trimIndent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatOptions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAiChatOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;responseFormat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ResponseFormat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ResponseFormat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;JSON_OBJECT&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;extraction&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nc"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;listOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nc"&gt;UserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extractUser&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extractSystem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;options&lt;/span&gt;
      &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;parsed&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extractorConverter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;extraction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="nc"&gt;ExtractionResult&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;// 4) Persist memories (MemoryService handles dedupe/thresholding)&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"ltm_user_id"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// optional per-call param&lt;/span&gt;
    &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memories&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt;
      &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;owner&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;
      &lt;span class="n"&gt;memoryService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;storeMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;memoryType&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;owner&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Plugging the advisors into our ChatClient
&lt;/h3&gt;

&lt;p&gt;In our ChatConfig class, we configure our ChatClient as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;    &lt;span class="nd"&gt;@Bean&lt;/span&gt;
    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;// chatMemory: ChatMemory, (Necessary for short-term memory)&lt;/span&gt;
        &lt;span class="n"&gt;longTermRecorder&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LongTermMemoryRecorderAdvisor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;longTermMemoryRetrieval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;LongTermMemoryRetrievalAdvisor&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chatModel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defaultAdvisors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="c1"&gt;// MessageChatMemoryAdvisor.builder(chatMemory).build(),&lt;/span&gt;
                &lt;span class="n"&gt;longTermRecorder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;longTermMemoryRetrieval&lt;/span&gt;
            &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Implement the Chat Service
&lt;/h3&gt;

&lt;p&gt;Since the advisors are plugged into the ChatClient itself, we don’t need to manage memory ourselves when interacting with the LLM. The only thing we need to ensure is that every interaction sends the expected parameters, namely the session or user ID, so that the advisors know which history to look at.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;shortTermMemoryRepository&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ShortTermMemoryRepository&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;travelAgentSystemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;chatMemoryRepository&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatMemoryRepository&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;log&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LoggerFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChatService&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;ChatResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Use userId as the key for conversation history and long-term memory&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Processing message from user $userId: $message"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;travelAgentSystemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="nc"&gt;UserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;advisors&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChatMemory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CONVERSATION_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;param&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ltm_user_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChatResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chatResponse&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;!!&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;


    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getConversationHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;?&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chatMemoryRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findByConversationId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;clearConversationHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;shortTermMemoryRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deleteById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Cleared conversation history for user $userId from Redis"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  8. Configure the Agent System Prompt
&lt;/h3&gt;

&lt;p&gt;The agent is configured with a system prompt that explains its capabilities and access to different types of memory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Bean&lt;/span&gt;
&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;travelAgentSystemPrompt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;promptText&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
        You are a travel assistant helping users plan their trips. You remember user preferences
        and provide personalized recommendations based on past interactions.

        You have access to the following types of memory:
        1. Short-term memory: The current conversation thread
        2. Long-term memory:
           - Episodic: User preferences and past trip experiences (e.g., "User prefers window seats")
           - Semantic: General knowledge about travel destinations and requirements

        Always be helpful, personal, and context-aware in your responses.

        Always answer in text format. No markdown or special formatting.
    """&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trimIndent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;promptText&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  9. Create the REST Controller
&lt;/h3&gt;

&lt;p&gt;The REST controller exposes endpoints for chat and memory management:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="nd"&gt;@RestController&lt;/span&gt;
&lt;span class="nd"&gt;@RequestMapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatController&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@PostMapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/chat"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@RequestBody&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ChatRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;ChatResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ChatResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/history/{userId}"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@PathVariable&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;MessageDto&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getConversationHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt;
            &lt;span class="nc"&gt;MessageDto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nc"&gt;SystemMessage&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"system"&lt;/span&gt;
                    &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nc"&gt;UserMessage&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"user"&lt;/span&gt;
                    &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nc"&gt;AssistantMessage&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"assistant"&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"unknown"&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nc"&gt;SystemMessage&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
                    &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nc"&gt;UserMessage&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
                    &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="nc"&gt;AssistantMessage&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nd"&gt;@DeleteMapping&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/history/{userId}"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;clearHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@PathVariable&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;chatService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clearConversationHistory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
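&lt;p&gt;With the controller in place, the API can be exercised straight from the command line. A quick sketch, assuming the app is running on &lt;code&gt;localhost:8080&lt;/code&gt; and that &lt;code&gt;ChatRequest&lt;/code&gt; is bound from a JSON body with &lt;code&gt;message&lt;/code&gt; and &lt;code&gt;userId&lt;/code&gt; fields (the two properties the controller reads):&lt;/p&gt;

```shell
# Send a chat message for user "raphael"; the advisors record and retrieve memories
curl -X POST localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Help me plan a trip to Paris", "userId": "raphael"}'

# Fetch the conversation history for the same user
curl localhost:8080/api/history/raphael

# Clear the short-term memory (conversation history) for the user
curl -X DELETE localhost:8080/api/history/raphael
```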



&lt;h2&gt;
  
  
  Running the Demo
&lt;/h2&gt;

&lt;p&gt;The easiest way to run the demo is with Docker Compose, which sets up all required services in one command.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Clone the repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/redis/redis-springboot-recipes.git
&lt;span class="nb"&gt;cd &lt;/span&gt;redis-springboot-recipes/artificial-intelligence/agent-long-term-memory-with-spring-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Configure your environment
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file with your OpenAI API key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=sk-your-api-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Start the services
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;--build&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will start:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;redis: for storing both vector embeddings and chat history&lt;/li&gt;
&lt;li&gt;redis-insight: a UI to explore the Redis data&lt;/li&gt;
&lt;li&gt;agent-memory-app: the Spring Boot app that implements the memory-aware AI agent&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4: Use the application
&lt;/h3&gt;

&lt;p&gt;When all services are running, go to &lt;code&gt;localhost:8080&lt;/code&gt; to access the demo. You'll see a travel assistant interface with a chat panel and a memory management sidebar:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjtn2ks4srwwtmu3mme8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjtn2ks4srwwtmu3mme8.png" alt="Screenshot of the Redis Agent Memory demo web interface. The interface is titled “Travel Agent with Redis Memory” and features two main panels: a “Memory Management” section on the left with tabs for Episodic and Semantic memories (currently showing “No episodic memories yet”), and a “Travel Assistant” chat on the right displaying a welcome message. At the top right, there’s a field to enter a user ID and buttons to start or clear the chat. The interface is clean and styled with Redis branding." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Enter a user ID and click "Start Chat":&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzc9982peiet2f9hsb0py.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzc9982peiet2f9hsb0py.png" alt="Close-up screenshot of the user ID input and chat controls. The label “User ID:” appears on the left with a text input field containing the value “raphael”. To the right are two red buttons labeled “Start Chat” and “Clear Chat”." width="363" height="42"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Send a message like: "Hi, my name's Raphael. I went to Paris back in 2009 with my wife for our honeymoon and we had a lovely time. For our 10-year anniversary we're planning to go back. Help us plan the trip!"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbavzfshheyvac8o3zs3x.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbavzfshheyvac8o3zs3x.gif" alt="Animated screen recording of a user sending a message in the Redis Agent Memory demo. The user, identified as “raphael”, types a detailed message into the chat input box: “Hi, my name’s Raphael. I went to Paris back in 2009 with my wife for our honeymoon and we had a lovely time. For our 10-year anniversary we’re planning to go back. Help us plan the trip!” The cursor then clicks the red “Send” button, initiating the interaction with the AI travel assistant." width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The system will reply to your message and, if it identifies potential memories worth keeping, store them as either semantic or episodic memories. You can see the stored memories in the "Memory Management" sidebar.&lt;/p&gt;

&lt;p&gt;On top of that, with each message, the system will also return performance metrics.&lt;/p&gt;

&lt;p&gt;If you refresh the page, you will see that all memories and the chat history are gone. &lt;/p&gt;

&lt;p&gt;If you re-enter the same user ID, the long-term memories will be reloaded in the sidebar, and the short-term memory (the chat history) will be reloaded as well:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsi3t9hf3563r6tqnedjl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsi3t9hf3563r6tqnedjl.gif" alt="Animated screen recording of the Redis Agent Memory demo after sending a message. The sidebar under “Episodic Memories” now shows two stored entries: one noting that the user went to Paris in 2009 for their honeymoon, and another about planning a return for their 10-year anniversary. The chat assistant responds with a personalized message suggesting activities and asking follow-up questions. The browser page is then refreshed, clearing both the chat history and memory display. After re-entering the same user ID, the agent reloads the long-term memories in the sidebar and restores the conversation history, demonstrating persistent memory retrieval." width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;If you refresh the page and enter the same user ID, your memories and conversation history will be reloaded&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6eided1rhbxxoes2eaht.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6eided1rhbxxoes2eaht.gif" alt="Animated screen recording of a cleared chat session in the Redis Agent Memory demo. The “Episodic Memories” panel still shows two past memories about a trip to Paris. In the chat panel, the message “Conversation cleared. How can I assist you today?” appears, indicating that the short-term memory has been reset. The user is about to start a new conversation. This demonstrates that although the short-term context is gone, the agent retains access to long-term memories, allowing it to respond with relevant information from past interactions." width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploring the Data in Redis Insight
&lt;/h2&gt;

&lt;p&gt;RedisInsight provides a visual interface for exploring the data stored in Redis. Access it at &lt;code&gt;localhost:5540&lt;/code&gt; to see:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Short-term memory (conversation history) stored in Redis Lists&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzq3d8c7jub35362acujt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzq3d8c7jub35362acujt.png" alt="Screenshot of RedisInsight displaying the contents of the conversation:raphael key. The selected key is a Redis list representing a conversation history. On the right panel, the list shows four indexed elements: system prompts defining the assistant’s role and memory access, a user message asking “Where did I go back in 2009?”, and the assistant’s reply recalling a previous trip to Paris. Below this, several memory entries stored as JSON keys are also visible. This illustrates how short-term chat history is preserved in Redis and replayed per user session." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
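&lt;p&gt;You can inspect the same data directly from the CLI. A quick sketch, assuming the key names shown in the screenshot above (the conversation list for user "raphael", and JSON documents under the &lt;code&gt;memory:&lt;/code&gt; prefix):&lt;/p&gt;

```shell
# List every message in raphael's conversation history (stored as a Redis List)
redis-cli LRANGE conversation:raphael 0 -1

# Count the stored messages
redis-cli LLEN conversation:raphael

# List the long-term memory documents (JSON keys with the memory: prefix)
redis-cli --scan --pattern "memory:*"
```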

&lt;ol&gt;
&lt;li&gt;Long-term memory (facts and experiences) stored as JSON documents with vector embeddings&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5e9x8q1d0pb2mizac9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5e9x8q1d0pb2mizac9k.png" alt="Screenshot of RedisInsight showing a semantic memory stored in Redis. The selected key is a JSON object with the name memory:04d04.... The right panel displays the memory’s fields: createdAt timestamp, empty metadata, memoryType set to “SEMANTIC”, an embedding vector (collapsed), userId set to “system”, and the memory content: “Paris is a beautiful city known for celebrating love”. This illustrates how general knowledge is stored as semantic memory in the AI agent." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The vector index schema used for similarity search&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you run the &lt;code&gt;FT.INFO longTermMemoryIdx&lt;/code&gt; command in the RedisInsight workbench, you'll see the details of the vector index schema that enables efficient memory retrieval.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hwu6hquvpfik5lhrs31.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hwu6hquvpfik5lhrs31.png" alt="Screenshot of RedisInsight Workbench showing the schema details of the longTermMemoryIdx vector index. The result of the FT.INFO longTermMemoryIdx command displays an index on JSON documents prefixed with memory:. The schema includes: •    $.content as a TEXT field named content  •    $.embedding as a VECTOR field using HNSW with 384-dimension FLOAT32 vectors and COSINE distance  •    $.memoryType and $.userId as TAG fields  •    $.metadata and $.createdAt as TEXT fields  This shows how memory data is structured and searchable in Redis using RediSearch vector similarity." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;
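&lt;p&gt;With that schema, you can also query memories by hand in the Workbench. The retrieval advisor itself performs a KNN vector search with the query embedding; the filtered form below is just a sketch for exploring the data, using the &lt;code&gt;memoryType&lt;/code&gt; and &lt;code&gt;userId&lt;/code&gt; TAG fields from the index definition:&lt;/p&gt;

```shell
# Episodic memories recorded for user "raphael"
FT.SEARCH longTermMemoryIdx "@memoryType:{EPISODIC} @userId:{raphael}" RETURN 2 content memoryType
```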

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;And that's it — you now have a working AI agent with memory using Spring Boot and Redis.&lt;/p&gt;

&lt;p&gt;Instead of forgetting everything between conversations, your agent can now remember user preferences, past experiences, and important facts. Redis handles both short-term memory (conversation history) and long-term memory (vector embeddings) — all with the performance and scalability Redis is known for.&lt;/p&gt;

&lt;p&gt;With Spring AI and Redis, you get an easy way to integrate this into your Java applications. The combination of vector similarity search for semantic retrieval and traditional data structures for conversation history gives you a powerful foundation for building truly intelligent agents.&lt;/p&gt;

&lt;p&gt;Whether you're building customer service bots, personal assistants, or domain-specific experts, this memory architecture gives you the tools to create more helpful, personalized, and context-aware AI experiences.&lt;/p&gt;

&lt;p&gt;Try it out, experiment with different memory types, explore other embedding models, and see how far you can push the boundaries of AI agent capabilities!&lt;/p&gt;

&lt;p&gt;Stay Curious!&lt;/p&gt;

</description>
      <category>springboot</category>
      <category>ai</category>
      <category>redis</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>Semantic Search with Spring Boot &amp; Redis</title>
      <dc:creator>Raphael De Lio</dc:creator>
      <pubDate>Tue, 29 Apr 2025 08:48:59 +0000</pubDate>
      <link>https://dev.to/redis/semantic-search-with-spring-boot-redis-48l0</link>
      <guid>https://dev.to/redis/semantic-search-with-spring-boot-redis-48l0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;br&gt;
You’re building a &lt;strong&gt;semantic search app&lt;/strong&gt; using &lt;strong&gt;Spring Boot&lt;/strong&gt; and &lt;strong&gt;Redis&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of matching exact words, semantic search finds &lt;strong&gt;meaning&lt;/strong&gt; using &lt;strong&gt;Vector Similarity Search (VSS)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It works by turning movie synopses into &lt;strong&gt;vectors&lt;/strong&gt; with &lt;strong&gt;embedding models&lt;/strong&gt;, storing them in &lt;strong&gt;Redis&lt;/strong&gt; (as a vector database), and finding the closest matches to user queries.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9xqm8mh0qh137eo9uab.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9xqm8mh0qh137eo9uab.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://www.youtube.com/watch?v=o3XN4dImESE" rel="noopener noreferrer"&gt;What is semantic search?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A traditional search system works by matching the words a user types with the words stored in a database or document collection. It usually looks for exact or partial matches without understanding the meaning behind the words.&lt;/p&gt;

&lt;p&gt;Semantic search, on the other hand, tries to understand the meaning behind what the user is asking. &lt;strong&gt;It focuses on the concepts, not just the keywords,&lt;/strong&gt; making it much easier for users to find what they really want.&lt;/p&gt;

&lt;p&gt;In a movie streaming service, for example, if a movie’s synopsis is stored in a database as &lt;strong&gt;“A cowboy doll feels threatened when a new space toy becomes his owner’s favorite,”&lt;/strong&gt; but the user searches for &lt;strong&gt;“jealous toy struggles with new rival,”&lt;/strong&gt; a traditional search system might not find the movie because the exact words don’t line up. &lt;/p&gt;

&lt;p&gt;But a semantic search system can still connect the two ideas and bring up the right movie. It understands the &lt;em&gt;meaning&lt;/em&gt; behind your query, not just the exact words.&lt;/p&gt;

&lt;p&gt;Behind the scenes, this works thanks to &lt;strong&gt;vector similarity search&lt;/strong&gt;. It turns text (or images, or audio) into vectors, which are lists of numbers, stores them in a vector database, and then finds the ones closest to your query. &lt;/p&gt;
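
&lt;p&gt;To make "closest" concrete, here is a tiny, self-contained sketch of cosine similarity, one of the distance metrics commonly used for vector similarity search (and the one we'll configure later). The three-dimensional vectors below are made up purely for illustration; real embedding models produce hundreds or thousands of dimensions, and Redis runs this comparison server-side at scale:&lt;/p&gt;

```java
// Illustrative sketch only: cosine similarity over tiny hand-made vectors.
// Real embeddings have hundreds or thousands of dimensions.
public class CosineSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {0.9, 0.1, 0.2};     // hypothetical embedding of "jealous toy struggles with new rival"
        double[] toyStory = {0.8, 0.2, 0.3};  // hypothetical embedding of the Toy Story synopsis
        double[] unrelated = {0.1, 0.9, 0.7}; // hypothetical embedding of an unrelated synopsis
        System.out.printf("query vs Toy Story: %.3f%n", cosineSimilarity(query, toyStory));
        System.out.printf("query vs unrelated: %.3f%n", cosineSimilarity(query, unrelated));
    }
}
```

&lt;p&gt;The search simply returns the documents whose vectors score highest against the query vector, which is why the wording of the synopsis and the query can differ completely.&lt;/p&gt;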

&lt;p&gt;Today, &lt;strong&gt;we’re going to build a vector similarity search app that lets users find movies based on the &lt;em&gt;meaning&lt;/em&gt; of their synopsis, not just exact keyword matches&lt;/strong&gt;. Even if users don’t know the title, they can still find the right movie from a generic description of the plot.&lt;/p&gt;

&lt;p&gt;To do that, we’ll build a Spring Boot app from scratch and plug in &lt;strong&gt;Redis OM Spring&lt;/strong&gt;. It’ll handle turning our data into vectors, storing them in Redis, and running fast vector searches when users send a query.&lt;/p&gt;

&lt;h2&gt;
  
  
  Redis as a Vector Database
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Video: &lt;a href="https://www.youtube.com/watch?v=Yhv19le0sBw" rel="noopener noreferrer"&gt;What is a vector database?&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Over the last 15 years, Redis has become the foundational infrastructure for real-time applications. Today, with Redis 8, it’s committed to becoming the foundational infrastructure for AI applications as well. &lt;/p&gt;

&lt;p&gt;Redis 8 not only turns the community version of Redis into a Vector Database, but also makes it the fastest and most scalable database in the market today. &lt;strong&gt;Redis 8 allows you to scale to one billion vectors without penalizing latency.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Learn more: &lt;a href="https://redis.io/blog/searching-1-billion-vectors-with-redis-8/" rel="noopener noreferrer"&gt;https://redis.io/blog/searching-1-billion-vectors-with-redis-8/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Redis OM Spring&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;To allow our users and customers to take full advantage of everything Redis can do — with the speed Redis is known for — we decided to implement &lt;strong&gt;Redis OM Spring&lt;/strong&gt;, a library built on top of Spring Data Redis. &lt;/p&gt;

&lt;p&gt;Redis OM Spring allows our users to easily communicate with Redis, model their entities as &lt;strong&gt;JSON&lt;/strong&gt; documents or Hashes, efficiently query them by leveraging the &lt;strong&gt;Redis Query Engine&lt;/strong&gt;, and even take advantage of probabilistic data structures such as &lt;strong&gt;Count-min Sketch, Bloom Filters, Cuckoo Filters&lt;/strong&gt;, and more. &lt;/p&gt;

&lt;p&gt;Redis OM Spring on GitHub: &lt;a href="https://github.com/redis/redis-om-spring" rel="noopener noreferrer"&gt;https://github.com/redis/redis-om-spring&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Dataset
&lt;/h2&gt;

&lt;p&gt;The dataset we’ll be working with is a catalog of thousands of movies. Each of these movies has metadata such as its title, cast, genre, year, and synopsis. The JSON file representing this dataset can be found in the repository that accompanies this article.&lt;/p&gt;

&lt;p&gt;Sample:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "title": "Toy Story",
  "year": 1995,
  "cast": [
   "Tim Allen",
   "Tom Hanks",
   "Don Rickles"
  ],
  "genres": [
   "Animated",
   "Comedy"
  ],
  "href": "Toy_Story",
  "extract": "Toy Story is a 1995 American computer-animated comedy film directed by John Lasseter, produced by Pixar Animation Studios and released by Walt Disney Pictures. The first installment in the  Toy Story franchise, it was the first entirely computer-animated feature film, as well as the first feature film from Pixar. It was written by Joss Whedon, Andrew Stanton, Joel Cohen, and Alec Sokolow from a story by Lasseter, Stanton, Pete Docter, and Joe Ranft. The film features music by Randy Newman, was produced by Bonnie Arnold and Ralph Guggenheim, and was executive-produced by Steve Jobs and Edwin Catmull. The film features the voices of Tom Hanks, Tim Allen, Don Rickles, Jim Varney, Wallace Shawn, John Ratzenberger, Annie Potts, R. Lee Ermey, John Morris, Laurie Metcalf, and Erik von Detten.",
  "thumbnail": "https://upload.wikimedia.org/wikipedia/en/1/13/Toy_Story.jpg",
  "thumbnail_width": 250,
  "thumbnail_height": 373
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Building the Application
&lt;/h2&gt;

&lt;p&gt;Our application will be built using Spring Boot with Redis OM Spring. &lt;strong&gt;It will allow movies to be searched by their synopsis using semantic search rather than keyword matching.&lt;/strong&gt; Besides that, our application will also allow its users to perform &lt;strong&gt;hybrid search&lt;/strong&gt;, &lt;strong&gt;a technique that combines vector similarity with traditional filtering and sorting.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  0. GitHub Repository
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The full application can be found on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/redis/redis-om-spring/tree/main/demos/roms-vss-movies/src" rel="noopener noreferrer"&gt;https://github.com/redis/redis-om-spring/tree/main/demos/roms-vss-movies/&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Add the required dependencies
&lt;/h3&gt;

&lt;p&gt;From a Spring Boot application, add the following dependencies to your Maven or Gradle file: &lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;!-- Redis OM Spring for Redis object mapping and vector search --&amp;gt;
&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;com.redis.om.spring&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;redis-om-spring&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;0.9.11&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;

&amp;lt;!-- Redis OM Spring uses Spring AI for creating embeddings (vectors) --&amp;gt;
&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;spring-ai-openai&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;1.0.0-M6&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;
&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;org.springframework.ai&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;spring-ai-transformers&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;1.0.0-M6&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  2. Define the Movie entity
&lt;/h3&gt;

&lt;p&gt;Redis OM Spring provides two annotations that make it easy to vectorize data and perform vector similarity search from within Spring Boot.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;@Vectorize: Automatically generates vector embeddings from the text field&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;@Indexed: Enables vector indexing on a field for efficient search&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core of the implementation is the Movie class with Redis vector indexing annotations:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@RedisHash // This annotation is used by Redis OM Spring to store the entity as a hash in Redis
public class Movie {

    @Id // The title serves as the entity ID (Redis OM Spring generates a ULID when the ID is left null)
    private String title;

    @Indexed(sortable = true) // This annotation enables indexing on the field for filtering and sorting
    private int year;

    @Indexed
    private List&amp;lt;String&amp;gt; cast;

    @Indexed
    private List&amp;lt;String&amp;gt; genres;

    private String href;

    // This annotation automatically generates vector embeddings from the text
    @Vectorize(
            destination = "embeddedExtract", // The field where the embedding will be stored
            embeddingType = EmbeddingType.SENTENCE, // Type of embedding to generate (SENTENCE, IMAGE, FACE, or WORD)
            provider = EmbeddingProvider.OPENAI, // The provider for generating embeddings (OpenAI, Transformers, VertexAI, etc.)
            openAiEmbeddingModel = OpenAiApi.EmbeddingModel.TEXT_EMBEDDING_3_LARGE // The specific OpenAI model to use for embeddings
    )
    private String extract;

    // This defines the vector field that will store the embeddings
    // The indexed annotation enables vector search on this field
    @Indexed(
            schemaFieldType = SchemaFieldType.VECTOR, // Defines the field type as a vector
            algorithm = VectorField.VectorAlgorithm.FLAT, // The algorithm used for vector search (FLAT or HNSW)
            type = VectorType.FLOAT32,
            dimension = 3072, // The dimension of the vector (must match the embedding model)
            distanceMetric = DistanceMetric.COSINE, // The distance metric used for similarity search (Cosine or Euclidean)
            initialCapacity = 10
    )
    private byte[] embeddedExtract;

    private String thumbnail;
    private int thumbnailWidth;
    private int thumbnailHeight;

    // Getters and setters...
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this example we're using OpenAI's embedding model, which requires an OpenAI API key to be set in the &lt;code&gt;application.properties&lt;/code&gt; file of your application:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;redis.om.spring.ai.open-ai.api-key=${OPEN_AI_KEY}&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If an embedding model is not specified, Redis OM Spring will use Hugging Face’s Transformers model (all-MiniLM-L6-v2) by default. In this case, make sure you set the dimension in the @Indexed annotation to 384, which is the number of dimensions produced by the default embedding model.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Repository Interface
&lt;/h3&gt;

&lt;p&gt;Next, create a simple repository interface that extends RedisEnhancedRepository. It will be used to load the data into Redis via the saveAll() method:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public interface MovieRepository extends RedisEnhancedRepository&amp;lt;Movie, String&amp;gt; {}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This provides basic CRUD operations for Movie entities, with the first generic parameter being the entity type and the second being the ID type.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Search Service&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The search service uses two beans provided by Redis OM Spring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;EntityStream: Creates a stream of entities to perform searches. It should not be confused with the Java Streams API: the EntityStream generates a Redis command that is sent to Redis, so the searching, filtering, and sorting happen efficiently on the server side.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Embedder: Used for generating the embedding for the query sent by the user. The embedding follows the configuration of the @Vectorize annotation defined in the Movie class.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The search functionality is implemented in the SearchService:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Service
public class SearchService {

    private static final Logger logger = LoggerFactory.getLogger(SearchService.class);
    private final EntityStream entityStream;
    private final Embedder embedder;

    public SearchService(EntityStream entityStream, Embedder embedder) {
        this.entityStream = entityStream;
        this.embedder = embedder;
    }

    public List&amp;lt;Pair&amp;lt;Movie, Double&amp;gt;&amp;gt; search(
            String query,
            Integer yearMin,
            Integer yearMax,
            List&amp;lt;String&amp;gt; cast,
            List&amp;lt;String&amp;gt; genres,
            Integer numberOfNearestNeighbors) {
        logger.info("Received text: {}", query);
        logger.info("Received yearMin: {} yearMax: {}", yearMin, yearMax);
        logger.info("Received cast: {}", cast);
        logger.info("Received genres: {}", genres);

        if (numberOfNearestNeighbors == null) numberOfNearestNeighbors = 3;
        if (yearMin == null) yearMin = 1900;
        if (yearMax == null) yearMax = 2100;

        // Convert query text to vector embedding
        byte[] embeddedQuery = embedder.getTextEmbeddingsAsBytes(List.of(query), Movie$.EXTRACT).getFirst();

        // Perform vector search with additional filters
        SearchStream&amp;lt;Movie&amp;gt; stream = entityStream.of(Movie.class);
        return stream
                // KNN search for nearest vectors
                .filter(Movie$.EMBEDDED_EXTRACT.knn(numberOfNearestNeighbors, embeddedQuery))
                // Additional metadata filters (hybrid search)
                .filter(Movie$.YEAR.between(yearMin, yearMax))
                .filter(Movie$.CAST.eq(cast))
                .filter(Movie$.GENRES.eq(genres))
                // Sort by similarity score
                .sorted(Movie$._EMBEDDED_EXTRACT_SCORE)
                // Return both the movie and its similarity score
                .map(Fields.of(Movie$._THIS, Movie$._EMBEDDED_EXTRACT_SCORE))
                .collect(Collectors.toList());
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Key features of the search service:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Uses EntityStream to create a search stream for Movie entities&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Converts the text query into a vector embedding&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Uses K-nearest neighbors (KNN) search to find similar vectors&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Applies additional filters for hybrid search (combining vector and traditional search)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Returns pairs of movies and their similarity scores&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
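
&lt;p&gt;To see what the hybrid search is doing conceptually, here is a brute-force, in-memory equivalent in plain Java: filter by metadata, rank the survivors by vector distance, and keep the k nearest. This is illustrative only (names, vectors, and data are hypothetical); in the real app, Redis performs all of this server-side against its index:&lt;/p&gt;

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only: an in-memory version of hybrid search (metadata
// filter + KNN by cosine distance). Redis does this server-side at scale.
public class HybridSearchSketch {
    record Doc(String title, int year, double[] vector) {}

    // Cosine distance = 1 - cosine similarity; lower means closer
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<Doc> search(List<Doc> docs, double[] query, int yearMin, int yearMax, int k) {
        return docs.stream()
                .filter(d -> d.year() >= yearMin && d.year() <= yearMax)                     // metadata filter
                .sorted(Comparator.comparingDouble((Doc d) -> cosineDistance(d.vector(), query))) // nearest first
                .limit(k)                                                                     // KNN: keep k closest
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Toy Story", 1995, new double[]{0.9, 0.1}),
                new Doc("Back to the Future", 1985, new double[]{0.2, 0.8}),
                new Doc("E.T.", 1982, new double[]{0.6, 0.4}));
        // A "time travel"-flavored query vector, restricted to 1980s releases
        List<Doc> hits = search(docs, new double[]{0.2, 0.8}, 1980, 1989, 1);
        System.out.println(hits.get(0).title());
    }
}
```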

&lt;h3&gt;
  
  
  5. Movie Service for Data Loading
&lt;/h3&gt;

&lt;p&gt;The MovieService handles loading movie data into Redis. It reads a JSON file containing movie data and saves the movies into Redis. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It may take a minute or two to load the data for the thousands of movies in the file because embeddings are generated during the save. The @Vectorize annotation generates the embedding for the extract field before each movie is saved into Redis.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Service
public class MovieService {

    private static final Logger log = LoggerFactory.getLogger(MovieService.class);
    private final ObjectMapper objectMapper;
    private final ResourceLoader resourceLoader;
    private final MovieRepository movieRepository;

    public MovieService(ObjectMapper objectMapper, ResourceLoader resourceLoader, MovieRepository movieRepository) {
        this.objectMapper = objectMapper;
        this.resourceLoader = resourceLoader;
        this.movieRepository = movieRepository;
    }

    public void loadAndSaveMovies(String filePath) throws Exception {
        Resource resource = resourceLoader.getResource("classpath:" + filePath);
        try (InputStream is = resource.getInputStream()) {
            List&amp;lt;Movie&amp;gt; movies = objectMapper.readValue(is, new TypeReference&amp;lt;&amp;gt;() {});
            List&amp;lt;Movie&amp;gt; unprocessedMovies = movies.stream()
                    .filter(movie -&amp;gt; !movieRepository.existsById(movie.getTitle()) &amp;amp;&amp;amp;
                            movie.getYear() &amp;gt; 1980
                    ).toList();
            long systemMillis = System.currentTimeMillis();
            movieRepository.saveAll(unprocessedMovies);
            long elapsedMillis = System.currentTimeMillis() - systemMillis;
            log.info("Saved " + unprocessedMovies.size() + " movies in " + elapsedMillis + " ms");
        }
    }

    public boolean isDataLoaded() {
        return movieRepository.count() &amp;gt; 0;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  6. Search Controller
&lt;/h3&gt;

&lt;p&gt;The REST controller exposes the search endpoint:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@RestController
public class SearchController {

    private final SearchService searchService;

    public SearchController(SearchService searchService) {
        this.searchService = searchService;
    }

    @GetMapping("/search")
    public Map&amp;lt;String, Object&amp;gt; search(
            @RequestParam(required = false) String text,
            @RequestParam(required = false) Integer yearMin,
            @RequestParam(required = false) Integer yearMax,
            @RequestParam(required = false) List&amp;lt;String&amp;gt; cast,
            @RequestParam(required = false) List&amp;lt;String&amp;gt; genres,
            @RequestParam(required = false) Integer numberOfNearestNeighbors
    ) {
        List&amp;lt;Pair&amp;lt;Movie, Double&amp;gt;&amp;gt; matchedMovies = searchService.search(
                text,
                yearMin,
                yearMax,
                cast,
                genres,
                numberOfNearestNeighbors
        );
        return Map.of(
                "matchedMovies", matchedMovies,
                "count", matchedMovies.size()
        );
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  7. Application Bootstrap
&lt;/h3&gt;

&lt;p&gt;The main application class initializes Redis OM Spring and loads data. The @EnableRedisEnhancedRepositories annotation activates Redis OM Spring's repository support:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@SpringBootApplication
@EnableRedisEnhancedRepositories(basePackages = {"dev.raphaeldelio.redis8demo*"})
public class Redis8DemoVectorSimilaritySearchApplication {

    public static void main(String[] args) {
        SpringApplication.run(Redis8DemoVectorSimilaritySearchApplication.class, args);
    }

    @Bean
    CommandLineRunner loadData(MovieService movieService) {
        return args -&amp;gt; {
            if (movieService.isDataLoaded()) {
                System.out.println("Data already loaded. Skipping data load.");
                return;
            }
            movieService.loadAndSaveMovies("movies.json");
        };
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  8. Sample Requests
&lt;/h3&gt;

&lt;p&gt;You can make requests to the search endpoint:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET http://localhost:8082/search?text=A movie about a young boy who goes to a wizardry school

GET http://localhost:8082/search?numberOfNearestNeighbors=1&amp;amp;yearMin=1970&amp;amp;yearMax=1990&amp;amp;text=A movie about a kid and a scientist who go back in time

GET http://localhost:8082/search?cast=Dee Wallace,Henry Thomas&amp;amp;text=A boy who becomes friend with an alien
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Sample request:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET http://localhost:8082/search?numberOfNearestNeighbors=1&amp;amp;yearMin=1970&amp;amp;yearMax=1990&amp;amp;text=A movie about a kid and a scientist who go back in time
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Sample response:&lt;/strong&gt;&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "count": 1,
  "matchedMovies": [
    {
      "first": { // matched movie
        "title": "Back to the Future",
        "year": 1985,
        "cast": [
          "Michael J. Fox",
          "Christopher Lloyd"
        ],
        "genres": [
          "Science Fiction"
        ],
        "extract": "Back to the Future is a 1985 American science fiction film directed by Robert Zemeckis and written by Zemeckis, and Bob Gale. It stars Michael J. Fox, Christopher Lloyd, Lea Thompson, Crispin Glover, and Thomas F. Wilson. Set in 1985, it follows Marty McFly (Fox), a teenager accidentally sent back to 1955 in a time-traveling DeLorean automobile built by his eccentric scientist friend Emmett \"Doc\" Brown (Lloyd), where he inadvertently prevents his future parents from falling in love – threatening his own existence – and is forced to reconcile them and somehow get back to the future.",
        "thumbnail": "https://upload.wikimedia.org/wikipedia/en/d/d2/Back_to_the_Future.jpg"
      },
      "second": 0.463297247887 // similarity score (lower means closer)
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;And that’s it — you now have a working semantic search app using Spring Boot and Redis. &lt;/p&gt;

&lt;p&gt;Instead of relying on exact keyword matches, your app understands the meaning behind the query. Redis handles the heavy part: embedding storage, similarity search, and even traditional filters — all at lightning speed.&lt;/p&gt;

&lt;p&gt;With Redis OM Spring, you get an easy way to integrate this into your Java apps. You only need two annotations, @Vectorize and @Indexed, and two beans, EntityStream and Embedder. &lt;/p&gt;

&lt;p&gt;Whether you’re building search, recommendations, or AI-powered assistants, this setup gives you a solid and scalable foundation.&lt;/p&gt;

&lt;p&gt;Try it out, tweak the filters, explore other models, and see how far you can go!&lt;/p&gt;

&lt;h3&gt;
  
  
  More AI Resources
&lt;/h3&gt;

&lt;p&gt;The best way to stay on the path of learning AI is by following the recipes available on the Redis AI Resources GitHub repository. There you can find dozens of recipes that will get you started building AI apps, fast!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/redis-developer/redis-ai-resources/tree/main" rel="noopener noreferrer"&gt;&lt;strong&gt;GitHub - redis-developer/redis-ai-resources: ✨ A curated list of awesome community resources, integrations, and examples of Redis in the AI ecosystem.&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Stay Curious!
&lt;/h3&gt;

</description>
      <category>java</category>
      <category>vectordatabase</category>
      <category>springboot</category>
      <category>redis</category>
    </item>
    <item>
      <title>Token Bucket Rate Limiter (Redis &amp; Java)</title>
      <dc:creator>Raphael De Lio</dc:creator>
      <pubDate>Mon, 13 Jan 2025 14:15:29 +0000</pubDate>
      <link>https://dev.to/redis/token-bucket-rate-limiter-redis-java-4pi3</link>
      <guid>https://dev.to/redis/token-bucket-rate-limiter-redis-java-4pi3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/cfF6nXIpDwE" rel="noopener noreferrer"&gt;This article is also available on YouTube!&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft58edaghnxrbmjdyhkf7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft58edaghnxrbmjdyhkf7.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Token Bucket&lt;/strong&gt; algorithm is a flexible and efficient rate-limiting mechanism. It works by filling a bucket with tokens at a fixed rate (e.g., one token per second). Each request consumes a token, and if no tokens are available, the request is rejected. The bucket has a maximum capacity, so it can handle bursts of traffic as long as the burst doesn’t exceed the number of tokens in the bucket.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Looking for a different rate limiter algorithm? &lt;a href="https://raphaeldelio.medium.com/rate-limiting-with-redis-an-essential-guide-df798b1c63db?source=user_profile_page---------1-------------17e03c232bd9---------------" rel="noopener noreferrer"&gt;Check the essential guide.&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Index
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;How the Token Bucket Rate Limiter Works&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Implementation with Redis and Java&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Testing with TestContainers and AssertJ&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conclusion (GitHub Repo)&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F2160%2F1%2A7cDKq5yh5RD0ygvb3mVwfQ.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F2160%2F1%2A7cDKq5yh5RD0ygvb3mVwfQ.gif" width="1080" height="608"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Define a Token Refill Rate&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Set a rate at which tokens are added to the bucket, such as 1 token per second or 10 tokens per minute.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. &lt;strong&gt;Track Token Consumption&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;For each incoming request, deduct one token from the bucket.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. &lt;strong&gt;Refill Tokens&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Continuously refill the bucket at the defined rate, up to its maximum capacity, ensuring unused tokens can accumulate for future bursts.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. &lt;strong&gt;Rate Limit Check&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Before processing a request, check if there are enough tokens in the bucket. If the bucket is empty, reject the request until tokens are replenished.&lt;/p&gt;
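
&lt;p&gt;The four steps above can be sketched in plain Java before bringing Redis into the picture. This in-memory version is illustrative only: it works for a single JVM, and the Redis implementation below is what makes the same bookkeeping work across application instances. The clock is passed in explicitly to keep the sketch deterministic:&lt;/p&gt;

```java
// Illustrative sketch only: token-bucket bookkeeping with in-memory state.
// The Redis-backed implementation later in the article replaces these fields
// with Redis keys so the bucket is shared across instances.
public class TokenBucketSketch {
    private final int capacity;              // maximum tokens the bucket can hold
    private final double refillRatePerSecond; // tokens added per second
    private double tokens;
    private long lastRefillMillis;

    public TokenBucketSketch(int capacity, double refillRatePerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillRatePerSecond = refillRatePerSecond;
        this.tokens = capacity;              // start with a full bucket
        this.lastRefillMillis = nowMillis;
    }

    // Returns true if the request is allowed; nowMillis is injected for testability.
    public boolean isAllowed(long nowMillis) {
        // Refill based on elapsed time, capped at the bucket's capacity
        double elapsedSeconds = (nowMillis - lastRefillMillis) / 1000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillRatePerSecond);
        lastRefillMillis = nowMillis;

        if (tokens >= 1) {
            tokens -= 1;                     // consume one token for this request
            return true;
        }
        return false;                        // bucket empty: reject
    }
}
```

&lt;p&gt;With a capacity of 2 and a rate of 1 token per second, two back-to-back requests succeed, a third is rejected, and a request one second later succeeds again once a token has been refilled.&lt;/p&gt;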

&lt;h2&gt;
  
  
  How to Implement It with Redis and Java
&lt;/h2&gt;

&lt;p&gt;For the &lt;strong&gt;Token Bucket Rate Limiter&lt;/strong&gt;, Redis provides an efficient way to track tokens and implement the algorithm. Here’s how to do it:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Retrieve current token count and last refill time
&lt;/h3&gt;

&lt;p&gt;First, retrieve the current token count and the last refill time:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET rate_limit:&amp;lt;clientId&amp;gt;:count  
GET rate_limit:&amp;lt;clientId&amp;gt;:lastRefill  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If these keys don’t exist, initialize the token count to the bucket’s maximum capacity and set the current time as the last refill time using SET.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Refill tokens if necessary and update the bucket
&lt;/h3&gt;

&lt;p&gt;Update the token count and the last refill time after processing each request:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET rate_limit:&amp;lt;clientId&amp;gt;:count &amp;lt;new_token_count&amp;gt;  
SET rate_limit:&amp;lt;clientId&amp;gt;:lastRefill &amp;lt;current_time&amp;gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  3. Allow or reject the request
&lt;/h3&gt;

&lt;p&gt;If tokens are available, allow the request and decrement the count by one using:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DECR rate_limit:&amp;lt;clientId&amp;gt;:count
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Implementing it with Jedis
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jedis&lt;/strong&gt; is a popular Java library used to interact with &lt;strong&gt;Redis&lt;/strong&gt;, and we will use it to implement our rate limiter because it provides a simple and intuitive API for executing Redis commands from JVM applications.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Add Jedis to Your Maven File&lt;/strong&gt;:
&lt;/h3&gt;

&lt;p&gt;Check the latest version &lt;a href="https://redis.io/docs/latest/develop/clients/jedis/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;redis.clients&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;jedis&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;5.2.0&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Create a &lt;strong&gt;TokenBucketRateLimiter&lt;/strong&gt; class:
&lt;/h3&gt;

&lt;p&gt;The class will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Accept a Jedis instance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Define the maximum capacity of the token bucket.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Specify the token refill rate (tokens per second).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package io.redis;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class TokenBucketRateLimiter {
    private final Jedis jedis;
    private final int bucketCapacity; // Maximum tokens the bucket can hold
    private final double refillRate; // Tokens refilled per second

    public TokenBucketRateLimiter(Jedis jedis, int bucketCapacity, double refillRate) {
        this.jedis = jedis;
        this.bucketCapacity = bucketCapacity;
        this.refillRate = refillRate;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Validate the Requests
&lt;/h3&gt;

&lt;p&gt;The main task of this rate limiter is to determine whether a client has sufficient tokens to process their request. If yes, the request is allowed, and tokens are deducted. If not, the request is blocked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Generate the keys&lt;/strong&gt;&lt;br&gt;
We’ll store each client’s token count and last refill time in Redis using unique keys. The keys will look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public boolean isAllowed(String clientId) {
    String keyCount = "rate_limit:" + clientId + ":count";
    String keyLastRefill = "rate_limit:" + clientId + ":lastRefill";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, if the client ID is user123, their keys would be rate_limit:user123:count and rate_limit:user123:lastRefill.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Fetch Current State&lt;/strong&gt;&lt;br&gt;
We use Redis’s GET command to retrieve the current token count and the last refill time. If the keys don’t exist, we assume the bucket is full, and the last refill time is the current timestamp.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public boolean isAllowed(String clientId) {
    String keyCount = "rate_limit:" + clientId + ":count";
    String keyLastRefill = "rate_limit:" + clientId + ":lastRefill";

    Transaction transaction = jedis.multi();
    transaction.get(keyLastRefill);
    transaction.get(keyCount);
    var results = transaction.exec();

    long currentTime = System.currentTimeMillis();
    long lastRefillTime = results.get(0) != null ? Long.parseLong((String) results.get(0)) : currentTime;
    int tokenCount = results.get(1) != null ? Integer.parseInt((String) results.get(1)) : bucketCapacity;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Refill Tokens&lt;/strong&gt;&lt;br&gt;
Calculate how many tokens should be added based on the time elapsed since the last refill. Ensure the bucket doesn’t exceed its maximum capacity.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;long elapsedTimeMs = currentTime - lastRefillTime;
double elapsedTimeSecs = elapsedTimeMs / 1000.0;
int tokensToAdd = (int) (elapsedTimeSecs * refillRate);

tokenCount = Math.min(bucketCapacity, tokenCount + tokensToAdd);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
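&lt;p&gt;To see what this arithmetic does in practice, here is a minimal standalone sketch (the class and method names are illustrative, not part of the limiter). Note that the int cast discards any fractional remainder, so elapsed time shorter than one full token interval adds nothing:&lt;/p&gt;

```java
// Standalone check of the refill arithmetic: the int cast truncates
// fractional tokens, and Math.min caps the bucket at its capacity.
public class RefillMathExample {
    static int tokensToAdd(long elapsedTimeMs, double refillRate) {
        double elapsedTimeSecs = elapsedTimeMs / 1000.0;
        return (int) (elapsedTimeSecs * refillRate); // truncates, never rounds up
    }

    public static void main(String[] args) {
        System.out.println(tokensToAdd(2500, 1.0)); // 2: 2.5 s at 1 token/s
        System.out.println(tokensToAdd(900, 1.0));  // 0: under one second adds nothing
        // Refilling 60 s worth of tokens into a bucket of capacity 5 holding 3:
        System.out.println(Math.min(5, 3 + tokensToAdd(60000, 1.0))); // 5: capped
    }
}
```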

&lt;p&gt;&lt;strong&gt;Step 4: Check Token Availability&lt;/strong&gt;&lt;br&gt;
Compare the current token count to determine if the request can be allowed. &lt;strong&gt;If tokens are available, deduct one token; otherwise, block the request.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;boolean isAllowed = tokenCount &amp;gt; 0;

if (isAllowed) {
    tokenCount--;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Step 5: Update Redis&lt;/strong&gt;&lt;br&gt;
We update the token count and the last refill time in Redis, using a MULTI/EXEC transaction so both keys are written together:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Transaction transaction = jedis.multi();
transaction.set(keyLastRefill, String.valueOf(currentTime)); // Update last refill time
transaction.set(keyCount, String.valueOf(tokenCount));       // Update token count
transaction.exec();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Complete Implementation
&lt;/h3&gt;

&lt;p&gt;Here’s the full code for the TokenBucketRateLimiter class:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package io.redis;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class TokenBucketRateLimiter {
    private final Jedis jedis;
    private final int bucketCapacity; // Maximum tokens the bucket can hold
    private final double refillRate; // Tokens refilled per second

    public TokenBucketRateLimiter(Jedis jedis, int bucketCapacity, double refillRate) {
        this.jedis = jedis;
        this.bucketCapacity = bucketCapacity;
        this.refillRate = refillRate;
    }

    public boolean isAllowed(String clientId) {
        String keyCount = "rate_limit:" + clientId + ":count";
        String keyLastRefill = "rate_limit:" + clientId + ":lastRefill";

        long currentTime = System.currentTimeMillis();

        // Fetch current state
        Transaction transaction = jedis.multi();
        transaction.get(keyLastRefill);
        transaction.get(keyCount);
        var results = transaction.exec();

        long lastRefillTime = results.get(0) != null ? Long.parseLong((String) results.get(0)) : currentTime;
        int tokenCount = results.get(1) != null ? Integer.parseInt((String) results.get(1)) : bucketCapacity;

        // Refill tokens
        long elapsedTimeMs = currentTime - lastRefillTime;
        double elapsedTimeSecs = elapsedTimeMs / 1000.0;
        int tokensToAdd = (int) (elapsedTimeSecs * refillRate);
        tokenCount = Math.min(bucketCapacity, tokenCount + tokensToAdd);

        // Check if the request is allowed
        boolean isAllowed = tokenCount &amp;gt; 0;

        if (isAllowed) {
            tokenCount--; // Consume one token
        }

        // Update Redis state
        transaction = jedis.multi();
        transaction.set(keyLastRefill, String.valueOf(currentTime));
        transaction.set(keyCount, String.valueOf(tokenCount));
        transaction.exec();

        return isAllowed;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And we’re ready to start testing its behavior!&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing our Rate Limiter
&lt;/h2&gt;

&lt;p&gt;To ensure our Token Bucket Rate Limiter behaves as expected, we’ll write tests for various scenarios. For this, we’ll use three tools:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Redis TestContainers&lt;/strong&gt;: This library spins up an isolated Redis container for testing. This means we don’t need to rely on an external Redis server during our tests. Once the tests are done, the container is stopped, leaving no leftover data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;JUnit 5&lt;/strong&gt;: Our main testing framework, which helps us define and structure tests with lifecycle methods like @BeforeEach and @AfterEach.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AssertJ&lt;/strong&gt;: A library that makes assertions readable and expressive, like assertThat(result).isTrue().&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let’s begin by adding the necessary dependencies to our pom.xml.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adding Dependencies
&lt;/h3&gt;

&lt;p&gt;Here’s what you’ll need in your Maven pom.xml file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;org.junit.jupiter&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;junit-jupiter-engine&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;5.10.0&amp;lt;/version&amp;gt;
    &amp;lt;scope&amp;gt;test&amp;lt;/scope&amp;gt;
&amp;lt;/dependency&amp;gt;
&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;com.redis&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;testcontainers-redis&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;2.2.2&amp;lt;/version&amp;gt;
    &amp;lt;scope&amp;gt;test&amp;lt;/scope&amp;gt;
&amp;lt;/dependency&amp;gt;
&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;org.assertj&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;assertj-core&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;3.11.1&amp;lt;/version&amp;gt;
    &amp;lt;scope&amp;gt;test&amp;lt;/scope&amp;gt;
&amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Once you’ve added these dependencies, you’re ready to start writing your test class.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Test Class
&lt;/h3&gt;

&lt;p&gt;The first step is to create a test class named TokenBucketRateLimiterTest. Inside, we’ll define three main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Redis Test Container&lt;/strong&gt;: This launches a Redis instance in a Docker container.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Jedis Instance&lt;/strong&gt;: This connects to the Redis container for sending commands.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rate Limiter&lt;/strong&gt;: The actual TokenBucketRateLimiter instance we’re testing.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s how the skeleton of our test class looks:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class TokenBucketRateLimiterTest {

    private static RedisContainer redisContainer;
    private Jedis jedis;
    private TokenBucketRateLimiter rateLimiter;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Preparing the Environment Before Each Test
&lt;/h3&gt;

&lt;p&gt;Before running any test, we need to ensure a clean Redis environment. Here’s what we’ll do:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Connect to Redis&lt;/strong&gt;: Use a Jedis instance to connect to the Redis container.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flush Data&lt;/strong&gt;: Clear any leftover data in Redis to ensure consistent results for each test.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We’ll start the Redis container once for all tests in a method annotated with @BeforeAll, and set up the connection and flush in a method annotated with @BeforeEach, which runs before every test case.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@BeforeAll
static void startContainer() {
    redisContainer = new RedisContainer("redis:latest");
    redisContainer.withExposedPorts(6379).start();
}

@BeforeEach
void setup() {
    jedis = new Jedis(redisContainer.getHost(), redisContainer.getFirstMappedPort());
    jedis.flushAll();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;FLUSHALL is an actual Redis command that deletes all the keys of all the existing databases. &lt;a href="https://redis.io/docs/latest/commands/flushall/" rel="noopener noreferrer"&gt;Read more about it in the official documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Cleaning Up After Each Test
&lt;/h3&gt;

&lt;p&gt;After each test, we need to close the Jedis connection to free up resources. This ensures no lingering connections interfere with subsequent tests.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@AfterEach
void tearDown() {
    jedis.close();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Full Setup
&lt;/h3&gt;

&lt;p&gt;Here’s how the complete test class looks with everything in place:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class TokenBucketRateLimiterTest {

    private static RedisContainer redisContainer;
    private Jedis jedis;
    private TokenBucketRateLimiter rateLimiter;

    @BeforeAll
    static void startContainer() {
        redisContainer = new RedisContainer("redis:latest");
        redisContainer.withExposedPorts(6379).start();
    }

    @AfterAll
    static void stopContainer() {
        redisContainer.stop();
    }

    @BeforeEach
    void setup() {
        jedis = new Jedis(redisContainer.getHost(), redisContainer.getFirstMappedPort());
        jedis.flushAll();
    }

    @AfterEach
    void tearDown() {
        jedis.close();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Verifying Requests Within the Bucket Capacity
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter allows requests within the defined bucket capacity.&lt;/p&gt;

&lt;p&gt;We configure it with a &lt;strong&gt;capacity of 5 tokens&lt;/strong&gt; and a &lt;strong&gt;refill rate of one token per second&lt;/strong&gt;, then call isAllowed("client-1") &lt;strong&gt;5 times&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Each call should return true, confirming the rate limiter correctly tracks and permits requests within the capacity.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
void shouldAllowRequestsWithinBucketCapacity() {
    rateLimiter = new TokenBucketRateLimiter(jedis, 5, 1.0);
    for (int i = 1; i &amp;lt;= 5; i++) {
        assertThat(rateLimiter.isAllowed("client-1"))
            .withFailMessage("Request %d should be allowed within bucket capacity", i)
            .isTrue();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Verifying Requests Are Denied When Bucket is Empty
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter correctly denies requests once the bucket is empty.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;capacity of 5 tokens&lt;/strong&gt; and a &lt;strong&gt;refill rate of one token per second&lt;/strong&gt;, we call isAllowed("client-1") &lt;strong&gt;5 times&lt;/strong&gt; and expect all calls to return true.&lt;/p&gt;

&lt;p&gt;On the 6th call, it should return false, verifying the rate limiter blocks requests once the bucket is empty.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
void shouldDenyRequestsOnceBucketIsEmpty() {
    rateLimiter = new TokenBucketRateLimiter(jedis, 5, 1.0);
    for (int i = 1; i &amp;lt;= 5; i++) {
        assertThat(rateLimiter.isAllowed("client-1"))
            .withFailMessage("Request %d should be allowed within bucket capacity", i)
            .isTrue();
    }
    assertThat(rateLimiter.isAllowed("client-1"))
        .withFailMessage("Request beyond bucket capacity should be denied")
        .isFalse();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Verifying Bucket is Gradually Refilled
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter refills the bucket correctly after every second.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;capacity of 5 tokens&lt;/strong&gt; and a &lt;strong&gt;refill rate of one token per second&lt;/strong&gt;, the first 5 requests (isAllowed("client-1")) return true, while the 6th request is denied (false).&lt;/p&gt;

&lt;p&gt;After waiting for two seconds, the next two requests are allowed and the third one is denied, confirming the refill behavior works as expected.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
void shouldRefillTokensGraduallyAndAllowRequestsOverTime() throws InterruptedException {
    rateLimiter = new TokenBucketRateLimiter(jedis, 5, 1.0);
    String clientId = "client-1";

    for (int i = 1; i &amp;lt;= 5; i++) {
        assertThat(rateLimiter.isAllowed(clientId))
            .withFailMessage("Request %d should be allowed within bucket capacity", i)
            .isTrue();
    }
    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("Request beyond bucket capacity should be denied")
        .isFalse();

    TimeUnit.SECONDS.sleep(2);

    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("Request after partial refill should be allowed")
        .isTrue();
    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("Second request after partial refill should be allowed")
        .isTrue();
    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("Request beyond available tokens should be denied")
        .isFalse();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Verifying Independent Handling of Multiple Clients
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter handles multiple clients independently.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;capacity of 5 tokens&lt;/strong&gt; and a &lt;strong&gt;refill rate of one token per second&lt;/strong&gt;, the first 5 requests (isAllowed("client-1")) return true, while the 6th request is denied (false).&lt;/p&gt;

&lt;p&gt;Simultaneously, all 5 requests from &lt;strong&gt;client-2&lt;/strong&gt; are allowed (true), confirming the rate limiter maintains separate counters for each client.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
void shouldHandleMultipleClientsIndependently() {
    rateLimiter = new TokenBucketRateLimiter(jedis, 5, 1.0);

    String clientId1 = "client-1";
    String clientId2 = "client-2";

    for (int i = 1; i &amp;lt;= 5; i++) {
        assertThat(rateLimiter.isAllowed(clientId1))
            .withFailMessage("Client 1 request %d should be allowed", i)
            .isTrue();
    }
    assertThat(rateLimiter.isAllowed(clientId1))
        .withFailMessage("Client 1 request beyond bucket capacity should be denied")
        .isFalse();

    for (int i = 1; i &amp;lt;= 5; i++) {
        assertThat(rateLimiter.isAllowed(clientId2))
            .withFailMessage("Client 2 request %d should be allowed", i)
            .isTrue();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Verifying Token Refill Does Not Exceed Bucket Capacity
&lt;/h3&gt;

&lt;p&gt;This test verifies that the token bucket rate limiter correctly refills tokens up to the defined capacity without exceeding it.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;capacity of 3 tokens&lt;/strong&gt; and a &lt;strong&gt;refill rate of 2 tokens per second&lt;/strong&gt;, the first 3 requests (isAllowed("client-1")) return true, while the 4th request is denied (false), indicating the bucket is empty.&lt;/p&gt;

&lt;p&gt;After waiting 3 seconds (enough to refill 6 tokens), the bucket refills only up to its maximum capacity of 3 tokens. The next 3 requests are allowed (true), but any additional request is denied (false), confirming that the rate limiter maintains the specified capacity limit regardless of refill surplus.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
void shouldRefillTokensUpToCapacityWithoutExceedingIt() throws InterruptedException {
    int capacity = 3;
    double refillRate = 2.0;
    String clientId = "client-1";
    rateLimiter = new TokenBucketRateLimiter(jedis, capacity, refillRate);

    for (int i = 1; i &amp;lt;= capacity; i++) {
        assertThat(rateLimiter.isAllowed(clientId))
            .withFailMessage("Request %d should be allowed within initial bucket capacity", i)
            .isTrue();
    }
    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("Request beyond bucket capacity should be denied")
        .isFalse();

    TimeUnit.SECONDS.sleep(3);

    for (int i = 1; i &amp;lt;= capacity; i++) {
        assertThat(rateLimiter.isAllowed(clientId))
            .withFailMessage("Request %d should be allowed as bucket refills up to capacity", i)
            .isTrue();
    }
    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("Request beyond bucket capacity should be denied")
        .isFalse();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Verifying Denied Requests Do Not Affect Token Count
&lt;/h3&gt;

&lt;p&gt;This test ensures that the token bucket rate limiter does not count denied requests when updating the token count.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;capacity of 3 tokens&lt;/strong&gt; and a &lt;strong&gt;refill rate of 0.5 tokens per second&lt;/strong&gt;, the first 3 requests (isAllowed("client-1")) are allowed (true), depleting the bucket. The 4th request is denied (false), confirming the bucket is empty.&lt;/p&gt;

&lt;p&gt;The Redis token count (rate_limit:client-1:count) is then verified to ensure it accurately reflects the remaining tokens (0 in this case) and does not include denied requests. This confirms that the rate limiter updates the token count only when requests are successfully processed.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
void testRateLimitDeniedRequestsAreNotCounted() {
    int capacity = 3;
    double refillRate = 0.5;
    String clientId = "client-1";
    rateLimiter = new TokenBucketRateLimiter(jedis, capacity, refillRate);

    for (int i = 1; i &amp;lt;= capacity; i++) {
        assertThat(rateLimiter.isAllowed(clientId))
            .withFailMessage("Request %d should be allowed", i)
            .isTrue();
    }
    assertThat(rateLimiter.isAllowed(clientId))
        .withFailMessage("This request should be denied")
        .isFalse();

    String key = "rate_limit:" + clientId + ":count";
    int requestCount = Integer.parseInt(jedis.get(key));
    assertThat(requestCount)
        .withFailMessage("The count should match remaining tokens and not include denied requests")
        .isEqualTo(0);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Is there any other behavior we should verify? Let me know in the comments!&lt;/p&gt;

&lt;p&gt;The Token Bucket Rate Limiter is a flexible and efficient way to manage request rates, and &lt;strong&gt;Redis&lt;/strong&gt; makes it incredibly fast and reliable.&lt;/p&gt;

&lt;p&gt;By leveraging commands like GET, SET, and MULTI/EXEC, we implemented a solution that tracks token counts, refills tokens dynamically based on time elapsed, and ensures the bucket never exceeds its defined capacity.&lt;/p&gt;

&lt;p&gt;Using &lt;strong&gt;Jedis&lt;/strong&gt;, we built a clear and intuitive &lt;strong&gt;Java&lt;/strong&gt; implementation, and with thorough testing using Redis TestContainers, JUnit 5, and AssertJ, we can confidently verify that it works as expected.&lt;/p&gt;

&lt;p&gt;This approach offers a robust foundation for managing request limits while allowing for burst handling and gradual refill, making it adaptable for more advanced rate-limiting scenarios when needed.&lt;/p&gt;
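&lt;p&gt;To make that burst-then-gradual-refill behavior concrete, the same algorithm can be sketched entirely in memory, with the clock passed in as a parameter instead of read from System.currentTimeMillis(). This is a hypothetical simplification (no Redis, single process, illustrative names), not the Redis-backed implementation above:&lt;/p&gt;

```java
// Hypothetical in-memory version of the token bucket, with an injectable
// clock so the refill behavior can be exercised without sleeping.
public class InMemoryTokenBucket {
    private final int bucketCapacity;
    private final double refillRate; // tokens per second
    private int tokenCount;
    private long lastRefillTime;

    public InMemoryTokenBucket(int bucketCapacity, double refillRate, long nowMs) {
        this.bucketCapacity = bucketCapacity;
        this.refillRate = refillRate;
        this.tokenCount = bucketCapacity; // bucket starts full
        this.lastRefillTime = nowMs;
    }

    public boolean isAllowed(long nowMs) {
        // Refill based on elapsed time, capped at capacity (same math as above)
        double elapsedSecs = (nowMs - lastRefillTime) / 1000.0;
        int tokensToAdd = (int) (elapsedSecs * refillRate);
        tokenCount = Math.min(bucketCapacity, tokenCount + tokensToAdd);
        lastRefillTime = nowMs;

        if (tokenCount > 0) {
            tokenCount--; // consume one token
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        InMemoryTokenBucket bucket = new InMemoryTokenBucket(5, 1.0, 0);
        int allowed = 0;
        for (int i = 0; i != 6; i++) {
            if (bucket.isAllowed(0)) allowed++;
        }
        System.out.println(allowed);                // 5: the burst drains the bucket
        System.out.println(bucket.isAllowed(2000)); // true: tokens refilled after 2 s
    }
}
```

&lt;p&gt;Two seconds after the burst exhausts the bucket, two tokens are available again, matching the gradual-refill test we wrote earlier.&lt;/p&gt;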

&lt;h3&gt;
  
  
  GitHub Repo
&lt;/h3&gt;

&lt;p&gt;You can find this implementation in &lt;strong&gt;Java&lt;/strong&gt; and &lt;strong&gt;Kotlin&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Java (&lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-java-example/tree/main/src/main/java/io/redis" rel="noopener noreferrer"&gt;Implementation&lt;/a&gt;, &lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-java-example/blob/main/src/test/java/io/redis/TokenBucketRateLimiterTest.java" rel="noopener noreferrer"&gt;Test&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Kotlin (&lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-kotlin-example/blob/main/src/main/kotlin/org/example/TokenBucketRateLimiter.kt" rel="noopener noreferrer"&gt;Implementation&lt;/a&gt;, &lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-kotlin-example/blob/main/src/test/kotlin/org/example/TokenBucketRateLimiterTest.kt" rel="noopener noreferrer"&gt;Test&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Stay Curious!
&lt;/h3&gt;

</description>
      <category>redis</category>
      <category>java</category>
      <category>systemdesign</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Fixed Window Counter Rate Limiter (Redis &amp; Java)</title>
      <dc:creator>Raphael De Lio</dc:creator>
      <pubDate>Mon, 30 Dec 2024 13:30:24 +0000</pubDate>
      <link>https://dev.to/redis/fixed-window-counter-rate-limiter-redis-java-dik</link>
      <guid>https://dev.to/redis/fixed-window-counter-rate-limiter-redis-java-dik</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/Ki3WKSNpdRU" rel="noopener noreferrer"&gt;This article is also available on YouTube!&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2j5ljyqccpn0v4aa2kkv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2j5ljyqccpn0v4aa2kkv.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Fixed Window Counter&lt;/strong&gt; is the simplest and most straightforward rate-limiting algorithm. It divides time into fixed intervals (e.g., seconds, minutes, or hours) and counts the number of requests within each interval. If the count exceeds a predefined threshold, the requests are rejected until the next interval begins.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Looking for a more precise algorithm? Take a look at the Sliding Window Log implementation. (Coming soon)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Index
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;How the Fixed Window Counter Rate Limiter Works&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Implementation with Redis and Java&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Testing with TestContainers and AssertJ&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conclusion (GitHub Repo)&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;How It Works&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi5vldnjqp6aos1afq9et.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi5vldnjqp6aos1afq9et.gif" width="1080" height="608"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Define a Window Interval&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Choose a time interval, such as 1 second, 1 minute, or 1 hour.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. &lt;strong&gt;Track Requests&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Use a counter to track the number of requests made during the current window.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. &lt;strong&gt;Reset Counter:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;At the end of the time window, reset the counter to zero and start counting again for the new window.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. &lt;strong&gt;Rate Limit Check:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Compare the counter against the allowed limit. If it exceeds the limit, reject further requests until the next window.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Implement It with Redis and Java
&lt;/h2&gt;

&lt;p&gt;There are two ways to implement the Fixed Window Rate Limiter with Redis. The simplest approach consists of the following steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Use the INCR command to increment the counter in Redis each time a request is allowed
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INCR my_counter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If there's no counter set yet, the INCR command will create one as zero and then increment it to one.&lt;/p&gt;

&lt;p&gt;If the counter is already set, the INCR command will simply increment it by one.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Set the key to expire in one minute if it’s newly created
&lt;/h3&gt;

&lt;p&gt;If the counter doesn’t exist, we need to set a time-to-live to ensure the time window lasts only for the specified period. &lt;strong&gt;But we should only set an expiration if it doesn’t already exist&lt;/strong&gt;. Otherwise, Redis would reset the expiration, and older requests could be counted beyond the allowed time.&lt;/p&gt;

&lt;p&gt;We’ll use the EXPIRE command with the NX flag on the key. &lt;strong&gt;The NX flag ensures the expiration is only set if the key doesn’t already have one.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This approach is smart because the counter will only track requests during the key’s lifespan. &lt;strong&gt;Once the key expires and is removed, the counter resets, ensuring we only account for requests within the intended time window.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EXPIRE my_counter 60 NX
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  3. Check the counter for each new request
&lt;/h3&gt;

&lt;p&gt;When a new request comes in, check the counter to see how many requests have been made. If it’s below the threshold, allow the process and increment the counter. If not, block the process from proceeding.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the key doesn’t exist, assume the counter starts at 0.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET my_counter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
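&lt;p&gt;The three steps above can be sketched in plain Java for a single client, with an in-memory counter standing in for the Redis key and a manual window reset standing in for key expiry. This is a hypothetical illustration (the names are made up), not the Jedis implementation we build next:&lt;/p&gt;

```java
// Hypothetical in-memory sketch of the fixed-window check for one client.
// The lazy window reset plays the role of Redis deleting the expired key.
public class FixedWindowSketch {
    private final int limit;
    private final long windowSizeMs;
    private boolean started = false;
    private long windowStart;
    private int count;

    public FixedWindowSketch(int limit, long windowSizeMs) {
        this.limit = limit;
        this.windowSizeMs = windowSizeMs;
    }

    public boolean isAllowed(long nowMs) {
        // Window elapsed: reset the counter, as Redis would via key expiry
        if (!started || nowMs - windowStart >= windowSizeMs) {
            started = true;
            windowStart = nowMs;
            count = 0;
        }
        // Same check as comparing GET my_counter against the limit
        if (limit > count) {
            count++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FixedWindowSketch limiter = new FixedWindowSketch(3, 60000);
        System.out.println(limiter.isAllowed(0));     // true
        System.out.println(limiter.isAllowed(10));    // true
        System.out.println(limiter.isAllowed(20));    // true
        System.out.println(limiter.isAllowed(30));    // false: limit of 3 reached
        System.out.println(limiter.isAllowed(60000)); // true: new window, counter reset
    }
}
```

&lt;p&gt;Just as with the Redis version, the reset happens lazily: nothing changes until the next request arrives after the window has elapsed.&lt;/p&gt;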

&lt;p&gt;Cool! Now that we understand the basics of our implementation, let’s implement it in Java with Jedis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing it with Jedis
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jedis&lt;/strong&gt; is a popular Java library for interacting with &lt;strong&gt;Redis&lt;/strong&gt;. We will use it to implement our rate limiter because it provides a simple, intuitive API for executing Redis commands from JVM applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Start by adding the Jedis library to your Maven file:
&lt;/h3&gt;

&lt;p&gt;Check the latest version &lt;a href="https://redis.io/docs/latest/develop/clients/jedis/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;redis.clients&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;jedis&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;5.2.0&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create a FixedWindowRateLimiter class:
&lt;/h3&gt;

&lt;p&gt;The class will take:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;A Jedis instance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A time window size (e.g., 60 seconds).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The maximum number of allowed requests.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package io.redis;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;
import redis.clients.jedis.args.ExpiryOption;

public class FixedWindowRateLimiter {

    private final Jedis jedis;
    private final int windowSize;
    private final int limit;

    public FixedWindowRateLimiter(Jedis jedis, int windowSize, int limit) {
        this.jedis = jedis;
        this.limit = limit;
        this.windowSize = windowSize;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Validate the Requests
&lt;/h3&gt;

&lt;p&gt;The main job of this rate limiter is to check if a client is within their allowed request limit. If yes, the request is allowed, and the counter is updated. If not, the request is blocked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Generate a key&lt;/strong&gt;&lt;br&gt;
We’ll store each client’s request count as a Redis key. To make keys unique for each client, we’ll format them like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    public boolean isAllowed(String clientId) {
        String key = "rate_limit:" + clientId;
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, if the client ID is user123, their key would be rate_limit:user123.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Fetch the Current Counter&lt;/strong&gt;&lt;br&gt;
We’ll use Redis’s GET command to check how many requests the client has made so far. If the key doesn’t exist, we assume the client hasn’t made any requests, so the counter is 0.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    public boolean isAllowed(String clientId) {
        String key = "rate_limit:" + clientId;
        String currentCountStr = jedis.get(key);
        int currentCount = currentCountStr != null ? Integer.parseInt(currentCountStr) : 0;
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Check the Request Limit&lt;/strong&gt;&lt;br&gt;
Next, we compare the current count to the allowed limit. If the counter is less than the limit, the request is allowed. Otherwise, it’s blocked.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    public boolean isAllowed(String clientId) {
        String key = "rate_limit:" + clientId;
        String currentCountStr = jedis.get(key);
        int currentCount = currentCountStr != null ? Integer.parseInt(currentCountStr) : 0;

        boolean isAllowed = currentCount &amp;lt; limit;
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4: Increment the Counter and Set Expiration&lt;/strong&gt;&lt;br&gt;
If the request is allowed, we need to do two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Increment the Counter&lt;/strong&gt;: Use the Redis INCR command to increase the request count by 1.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set an Expiration&lt;/strong&gt;: Use the EXPIRE command to ensure the counter resets at the end of the time window. To make sure the expiration isn’t reset every time we increment the counter, we also need to set the NX flag.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We’ll do this in a &lt;strong&gt;transaction&lt;/strong&gt; to ensure that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Both INCR and EXPIRE happen together, avoiding race conditions.&lt;/li&gt;
&lt;li&gt;Both INCR and EXPIRE are pipelined (sent in a batch to Redis) to reduce the number of network trips, improving performance.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    if (isAllowed) {
        Transaction transaction = jedis.multi();
        transaction.incr(key); // Increment the counter
        transaction.expire(key, windowSize, ExpiryOption.NX); // Set expiration only if not already set
        transaction.exec(); // Execute both commands atomically
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;The first request marks the start of the time window. Any subsequent requests during this window’s lifespan will increment the counter.&lt;br&gt;
 Once the window expires, the key is automatically removed from Redis. The next request after that will define the start of a new window.&lt;br&gt;
 If we didn’t set the NX flag, the expiration would be reset every time the counter is incremented, extending the lifespan of the window.&lt;/p&gt;
&lt;/blockquote&gt;
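
&lt;p&gt;As an aside, a common variation worth knowing about: instead of letting the first request open the window (as the EXPIRE NX approach here does), some fixed-window limiters align windows to the clock and embed the window start in the key (e.g. rate_limit:user123:120). The following is a minimal sketch of that bucket arithmetic in plain Java; WindowBuckets and windowStartFor are illustrative names, not part of the implementation in this article.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    public class WindowBuckets {

        // Aligns a timestamp (in seconds) to the start of its fixed window.
        static long windowStartFor(long epochSeconds, int windowSizeSeconds) {
            return (epochSeconds / windowSizeSeconds) * windowSizeSeconds;
        }

        public static void main(String[] args) {
            // With a 60-second window, requests at t=125s and t=170s fall into
            // the same window (start 120), while t=180s opens a new one (start 180).
            System.out.println(windowStartFor(125, 60));
            System.out.println(windowStartFor(170, 60));
            System.out.println(windowStartFor(180, 60));
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With clock-aligned keys, the counter key itself changes when the window rolls over, so expiration only needs to garbage-collect old keys rather than define the window boundary.&lt;/p&gt;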
&lt;h3&gt;
  
  
  &lt;strong&gt;Complete Implementation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here’s the full code for the FixedWindowRateLimiter class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package io.redis;

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Transaction;
    import redis.clients.jedis.args.ExpiryOption;

    public class FixedWindowRateLimiter {

        private final Jedis jedis;
        private final int windowSize;
        private final int limit;

        public FixedWindowRateLimiter(Jedis jedis, int windowSize, int limit) {
            this.jedis = jedis;
            this.limit = limit;
            this.windowSize = windowSize;
        }

        public boolean isAllowed(String clientId) {
            String key = "rate_limit:" + clientId;
            String currentCountStr = jedis.get(key);
            int currentCount = currentCountStr != null ? Integer.parseInt(currentCountStr) : 0;

            boolean isAllowed = currentCount &amp;lt; limit;

            if (isAllowed) {
                Transaction transaction = jedis.multi();
                transaction.incr(key);
                transaction.expire(key, windowSize, ExpiryOption.NX); // Set expire only if not set
                transaction.exec();
            }

            return isAllowed;
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And we’re ready to start testing its behavior!&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing our Rate Limiter
&lt;/h2&gt;

&lt;p&gt;To ensure our Fixed Window Rate Limiter behaves as expected, we’ll write tests for various scenarios. For this, we’ll use three tools:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Redis TestContainers&lt;/strong&gt;: This library spins up an isolated Redis container for testing. This means we don’t need to rely on an external Redis server during our tests. Once the tests are done, the container is stopped, leaving no leftover data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;JUnit 5&lt;/strong&gt;: Our main testing framework, which helps us define and structure tests with lifecycle methods like @BeforeEach and @AfterEach.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AssertJ&lt;/strong&gt;: A library that makes assertions readable and expressive, like assertThat(result).isTrue().&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let’s begin by adding the necessary dependencies to our pom.xml.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adding Dependencies
&lt;/h3&gt;

&lt;p&gt;Here’s what you’ll need in your Maven pom.xml file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;dependency&amp;gt;
        &amp;lt;groupId&amp;gt;org.junit.jupiter&amp;lt;/groupId&amp;gt;
        &amp;lt;artifactId&amp;gt;junit-jupiter-engine&amp;lt;/artifactId&amp;gt;
        &amp;lt;version&amp;gt;5.10.0&amp;lt;/version&amp;gt;
        &amp;lt;scope&amp;gt;test&amp;lt;/scope&amp;gt;
    &amp;lt;/dependency&amp;gt;

    &amp;lt;dependency&amp;gt;
        &amp;lt;groupId&amp;gt;com.redis&amp;lt;/groupId&amp;gt;
        &amp;lt;artifactId&amp;gt;testcontainers-redis&amp;lt;/artifactId&amp;gt;
        &amp;lt;version&amp;gt;2.2.2&amp;lt;/version&amp;gt;
        &amp;lt;scope&amp;gt;test&amp;lt;/scope&amp;gt;
    &amp;lt;/dependency&amp;gt;

    &amp;lt;dependency&amp;gt;
        &amp;lt;groupId&amp;gt;org.assertj&amp;lt;/groupId&amp;gt;
        &amp;lt;artifactId&amp;gt;assertj-core&amp;lt;/artifactId&amp;gt;
        &amp;lt;version&amp;gt;3.11.1&amp;lt;/version&amp;gt;
        &amp;lt;scope&amp;gt;test&amp;lt;/scope&amp;gt;
    &amp;lt;/dependency&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you’ve added these dependencies, you’re ready to start writing your test class.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Test Class
&lt;/h3&gt;

&lt;p&gt;The first step is to create a test class named FixedWindowRateLimiterTest. Inside, we’ll define three main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Redis Test Container&lt;/strong&gt;: This launches a Redis instance in a Docker container.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Jedis Instance&lt;/strong&gt;: This connects to the Redis container for sending commands.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rate Limiter&lt;/strong&gt;: The actual FixedWindowRateLimiter instance we’re testing.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s how the skeleton of our test class looks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class FixedWindowRateLimiterTest {

        private static final RedisContainer redisContainer = new RedisContainer("redis:latest")
                .withExposedPorts(6379);

        private Jedis jedis;
        private FixedWindowRateLimiter rateLimiter;

        // Start Redis container once before any tests run
        static {
            redisContainer.start();
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Preparing the Environment Before Each Test&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Before running any test, we need to ensure a clean Redis environment. Here’s what we’ll do:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Connect to Redis&lt;/strong&gt;: Use a Jedis instance to connect to the Redis container.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flush Data&lt;/strong&gt;: Clear any leftover data in Redis to ensure consistent results for each test.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We’ll set this up in a method annotated with @BeforeEach, which runs before every test case.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @BeforeEach
    public void setup() {
        jedis = new Jedis(redisContainer.getHost(), redisContainer.getFirstMappedPort());
        jedis.flushAll();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;FLUSHALL is an actual Redis command that deletes all the keys of all the existing databases. &lt;a href="https://redis.io/docs/latest/commands/flushall/" rel="noopener noreferrer"&gt;Read more about it in the official documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Cleaning Up After Each Test&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;After each test, we need to close the Jedis connection to free up resources. This ensures no lingering connections interfere with subsequent tests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @AfterEach
    public void tearDown() {
        jedis.close();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Full Setup&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here’s how the complete test class looks with everything in place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    public class FixedWindowRateLimiterTest {
        private static final RedisContainer redisContainer = new RedisContainer("redis:latest")
                .withExposedPorts(6379);

        private Jedis jedis;
        private FixedWindowRateLimiter rateLimiter;

        static {
            redisContainer.start();
        }

        @BeforeEach
        public void setup() {
            jedis = new Jedis(redisContainer.getHost(), redisContainer.getFirstMappedPort());
            jedis.flushAll();
        }

        @AfterEach
        public void tearDown() {
            jedis.close();
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Verifying Requests Within the Limit&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter allows requests within the defined limit.&lt;/p&gt;

&lt;p&gt;We configure it with a &lt;strong&gt;limit of 5 requests&lt;/strong&gt; and a &lt;strong&gt;10-second window&lt;/strong&gt;, then call isAllowed("client-1") &lt;strong&gt;5 times&lt;/strong&gt;. Each call should return true, confirming the rate limiter correctly tracks and permits requests under the limit.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Test
    public void shouldAllowRequestsWithinLimit() {
        rateLimiter = new FixedWindowRateLimiter(jedis, 10, 5);
        for (int i = 1; i &amp;lt;= 5; i++) {
            assertThat(rateLimiter.isAllowed("client-1"))
                    .withFailMessage("Request " + i + " should be allowed")
                    .isTrue();
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verifying &lt;strong&gt;Requests Beyond the Limit&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter correctly denies requests once the defined limit is exceeded.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;limit of 5 requests&lt;/strong&gt; in a &lt;strong&gt;60-second window&lt;/strong&gt;, we call isAllowed("client-1") &lt;strong&gt;5 times&lt;/strong&gt; and expect all to return true. On the 6th call, it should return false, verifying the rate limiter blocks requests beyond the allowed limit.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Test
    public void shouldDenyRequestsOnceLimitIsExceeded() {
        rateLimiter = new FixedWindowRateLimiter(jedis, 60, 5);
        for (int i = 1; i &amp;lt;= 5; i++) {
            assertThat(rateLimiter.isAllowed("client-1"))
                    .withFailMessage("Request " + i + " should be allowed")
                    .isTrue();
        }

        assertThat(rateLimiter.isAllowed("client-1"))
                .withFailMessage("Request beyond limit should be denied")
                .isFalse();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Verifying Requests After Window Reset&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter resets correctly after the fixed window expires.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;limit of 5 requests&lt;/strong&gt; and a &lt;strong&gt;1-second window&lt;/strong&gt;, the first 5 requests (isAllowed("client-1")) return true, while the 6th request is denied (false).&lt;/p&gt;

&lt;p&gt;After waiting for the window to expire, the next request is allowed (true), confirming the reset behavior works as expected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Test
    public void shouldAllowRequestsAgainAfterFixedWindowResets() throws InterruptedException {
        int limit = 5;
        String clientId = "client-1";
        int windowSize = 1;
        rateLimiter = new FixedWindowRateLimiter(jedis, windowSize, limit);

        for (int i = 1; i &amp;lt;= limit; i++) {
            assertThat(rateLimiter.isAllowed(clientId))
                    .withFailMessage("Request " + i + " should be allowed")
                    .isTrue();
        }

        assertThat(rateLimiter.isAllowed(clientId))
                .withFailMessage("Request beyond limit should be denied")
                .isFalse();

        Thread.sleep((windowSize + 1) * 1000);

        assertThat(rateLimiter.isAllowed(clientId))
                .withFailMessage("Request after window reset should be allowed")
                .isTrue();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Verifying Independent Handling of Multiple Clients&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter handles multiple clients independently.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;limit of 5 requests&lt;/strong&gt; and a &lt;strong&gt;10-second window&lt;/strong&gt;, the first 5 requests from &lt;strong&gt;client-1&lt;/strong&gt; are allowed (true), while the 6th is denied (false).&lt;/p&gt;

&lt;p&gt;Simultaneously, all 5 requests from &lt;strong&gt;client-2&lt;/strong&gt; are allowed (true), confirming the rate limiter maintains separate counters for each client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Test
    public void shouldHandleMultipleClientsIndependently() {
        int limit = 5;
        String clientId1 = "client-1";
        String clientId2 = "client-2";
        int windowSize = 10;
        rateLimiter = new FixedWindowRateLimiter(jedis, windowSize, limit);

        for (int i = 1; i &amp;lt;= limit; i++) {
            assertThat(rateLimiter.isAllowed(clientId1))
                    .withFailMessage("Client 1 request " + i + " should be allowed")
                    .isTrue();
        }

        assertThat(rateLimiter.isAllowed(clientId1))
                .withFailMessage("Client 1 request beyond limit should be denied")
                .isFalse();

        for (int i = 1; i &amp;lt;= limit; i++) {
            assertThat(rateLimiter.isAllowed(clientId2))
                    .withFailMessage("Client 2 request " + i + " should be allowed")
                    .isTrue();
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Verifying Requests Are Denied Until Fixed Window Resets&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This test ensures the rate limiter denies additional requests until the fixed window expires.&lt;/p&gt;

&lt;p&gt;Configured with a &lt;strong&gt;limit of 3 requests&lt;/strong&gt; and a &lt;strong&gt;5-second window&lt;/strong&gt;, the first 3 requests (isAllowed("client-1")) are allowed (true), while the 4th is denied (false).&lt;/p&gt;

&lt;p&gt;After waiting for half the window duration (2.5 seconds), requests are still denied (false).&lt;/p&gt;

&lt;p&gt;Once the window fully resets (after another 2.5 seconds), the next request is allowed (true), confirming proper behavior during and after the fixed window.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Test
    public void shouldDenyAdditionalRequestsUntilFixedWindowResets() throws InterruptedException {
        int limit = 3;
        int windowSize = 5;
        String clientId = "client-1";
        rateLimiter = new FixedWindowRateLimiter(jedis, windowSize, limit);

        for (int i = 1; i &amp;lt;= limit; i++) {
            assertThat(rateLimiter.isAllowed(clientId))
                    .withFailMessage("Request " + i + " should be allowed within limit")
                    .isTrue();
        }

        assertThat(rateLimiter.isAllowed(clientId))
                .withFailMessage("Request beyond limit should be denied")
                .isFalse();

        Thread.sleep(2500);

        assertThat(rateLimiter.isAllowed(clientId))
                .withFailMessage("Request should still be denied within the same fixed window")
                .isFalse();

        Thread.sleep(2500);

        assertThat(rateLimiter.isAllowed(clientId))
                .withFailMessage("Request should be allowed after fixed window reset")
                .isTrue();
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Verifying Denied Requests Are Not Counted&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This test ensures that requests denied by the rate limiter are not included in the request count.&lt;/p&gt;

&lt;p&gt;Configured with a limit of 3 requests and a 5-second window, the first 3 requests (isAllowed("client-1")) are allowed (true), while the 4th is denied (false).&lt;/p&gt;

&lt;p&gt;Afterward, the Redis key for the client is checked to confirm the stored count equals the limit (3), ensuring denied requests do not increase the counter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    @Test
    public void testRateLimitDeniedRequestsAreNotCounted() {
        int limit = 3;
        int windowSize = 5;
        String clientId = "client-1";
        rateLimiter = new FixedWindowRateLimiter(jedis, windowSize, limit);

        for (int i = 1; i &amp;lt;= limit; i++) {
            assertThat(rateLimiter.isAllowed(clientId))
                    .withFailMessage("Request " + i + " should be allowed")
                    .isTrue();
        }

        assertThat(rateLimiter.isAllowed(clientId))
                .withFailMessage("This request should be denied")
                .isFalse();

        String key = "rate_limit:" + clientId;
        int requestCount = Integer.parseInt(jedis.get(key));
        assertThat(requestCount)
                .withFailMessage("The count (" + requestCount + ") should be equal to the limit (" + limit + "), not counting the denied request")
                .isEqualTo(limit);
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Is there any other behavior we should verify? Let me know in the comments!&lt;/p&gt;

&lt;p&gt;The Fixed Window Rate Limiter is a simple yet effective way to manage request rates, and &lt;strong&gt;Redis&lt;/strong&gt; makes it incredibly fast and reliable.&lt;/p&gt;

&lt;p&gt;By using commands like INCR and EXPIRE, we created a solution that tracks and limits requests while automatically resetting counters when the time window expires.&lt;/p&gt;

&lt;p&gt;With &lt;strong&gt;Jedis&lt;/strong&gt;, we built an easy-to-understand Java implementation, and thanks to thorough testing with Redis TestContainers, JUnit 5, and AssertJ, we can trust it works as expected.&lt;/p&gt;

&lt;p&gt;This approach is a great starting point for handling request limits and can easily be adapted for more complex scenarios if needed.&lt;/p&gt;
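
&lt;p&gt;If you want to experiment with the fixed-window counting rule without a running Redis server, the same logic can be sketched as a plain in-memory Java class. This is an illustrative analogue for local experimentation only; the class name and structure are hypothetical and it is not the Redis-backed implementation above (a synchronized method stands in for the atomicity that Redis gives us, and key expiration becomes a timestamp check).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical in-memory analogue of the Redis-backed limiter above.
    public class InMemoryFixedWindowRateLimiter {

        private static class Window {
            final long startMillis; // set by the first request, like EXPIRE NX
            int count;
            Window(long startMillis) { this.startMillis = startMillis; }
        }

        private final long windowMillis;
        private final int limit;
        private final Map&amp;lt;String, Window&amp;gt; windows = new HashMap&amp;lt;&amp;gt;();

        public InMemoryFixedWindowRateLimiter(int windowSizeSeconds, int limit) {
            this.windowMillis = windowSizeSeconds * 1000L;
            this.limit = limit;
        }

        public synchronized boolean isAllowed(String clientId) {
            long now = System.currentTimeMillis();
            Window w = windows.get(clientId);
            if (w == null || now - w.startMillis &amp;gt;= windowMillis) {
                w = new Window(now); // expired or absent: start a new window
                windows.put(clientId, w);
            }
            if (w.count &amp;gt;= limit) {
                return false;        // denied requests are not counted
            }
            w.count++;
            return true;
        }
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The trade-off is the usual one: an in-memory map only limits a single process, while the Redis counter is shared across every instance of your application.&lt;/p&gt;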

&lt;h3&gt;
  
  
  GitHub Repo
&lt;/h3&gt;

&lt;p&gt;You can find this implementation in &lt;strong&gt;Java&lt;/strong&gt; and &lt;strong&gt;Kotlin&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Java (&lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-java-example/blob/main/src/main/java/io/redis/FixedWindowRateLimiter.java" rel="noopener noreferrer"&gt;Implementation&lt;/a&gt;, &lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-java-example/blob/main/src/test/java/io/redis/FixedWindowRateLimiterTest.java" rel="noopener noreferrer"&gt;Test&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Kotlin (&lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-kotlin-example/blob/main/src/main/kotlin/org/example/FixedWindowRateLimiter.kt" rel="noopener noreferrer"&gt;Implementation&lt;/a&gt;, &lt;a href="https://github.com/raphaeldelio/redis-rate-limiter-kotlin-example/blob/main/src/test/kotlin/org/example/FixedWindowRateLimiterTest.kt" rel="noopener noreferrer"&gt;Test&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Stay Curious!
&lt;/h3&gt;

</description>
      <category>java</category>
      <category>redis</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
