<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Roman Tsypuk</title>
    <description>The latest articles on DEV Community by Roman Tsypuk (@tsypuk).</description>
    <link>https://dev.to/tsypuk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3046434%2F28893b0e-f611-479d-9d03-17aca3a1f8f1.jpeg</url>
      <title>DEV Community: Roman Tsypuk</title>
      <link>https://dev.to/tsypuk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tsypuk"/>
    <language>en</language>
    <item>
      <title>AWS Golden Jacket: My Journey of Learning and Certification</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Mon, 08 Jun 2026 05:05:58 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-golden-jacket-my-journey-of-learning-and-certification-fab</link>
      <guid>https://dev.to/aws-builders/aws-golden-jacket-my-journey-of-learning-and-certification-fab</guid>
      <description>&lt;p&gt;Cloud technologies have been one of my biggest professional passions for many years, and AWS has played a significant role in that journey.&lt;/p&gt;

&lt;p&gt;Through continuous learning, hands-on experience, and the pursuit of AWS certifications, I gained knowledge that helped me make better technical, architectural, and business decisions across a wide range of projects. Over time, I expanded my expertise across multiple AWS domains and eventually completed every AWS certification available at that time.&lt;br&gt;
In recognition of these achievements, I was honored to receive the AWS Golden Jacket. This post is the story of that journey.&lt;/p&gt;

&lt;p&gt;I hope it motivates individuals who are considering a similar path and helps remove some of the uncertainty surrounding AWS certifications.&lt;/p&gt;

&lt;h2&gt;
  
  
  When AWS was a DevOps-Only land
&lt;/h2&gt;

&lt;p&gt;Long before I started pursuing AWS certifications, I still remember visiting the Certification Center located on the campus, where I completed multiple Oracle Java certifications. Every visit was an opportunity to meet other professionals preparing for exams—developers, DevOps engineers, project managers, analysts, and architects—each pursuing their own goals.&lt;/p&gt;

&lt;p&gt;One thing stood out: AWS certifications were taken almost exclusively by DevOps engineers. At the time, many organizations treated AWS as an infrastructure-only domain. More than that, developers rarely had access to the AWS console; all of that was a DevOps responsibility. Perhaps this was due to the career path requirements the company had established for DevOps engineers.&lt;/p&gt;

&lt;p&gt;Fortunately, I was working on a large distributed system hosted across AWS and Rackspace infrastructure. Because our development team participated in on-call rotations and incident response, gaining AWS access became a necessity rather than a privilege. That was the moment when it all started: EC2, S3, RabbitMQ, SQS, SES, Spark, RDS, EMR, Redshift. Working as a developer, I was constantly learning the cloud. It gave me back a feeling I had missed since my telco days: having access to and control over equipment and server infrastructure distributed around the globe.&lt;/p&gt;

&lt;h2&gt;
  
  
  SAA-C01: First AWS certification
&lt;/h2&gt;

&lt;p&gt;Later, working at different startups with Amazon Web Services experience behind me, I was already driving project architectures,&lt;br&gt;
taking part in discoveries and PoCs, and helping others learn AWS. At that point I wanted to structure the practical cloud experience I had gained and reinforce it with a certification focused on real use cases and how they can be implemented with AWS. By then I had passed all the existing Oracle Java certifications and wanted to expand my knowledge from core languages to cloud infrastructure.&lt;/p&gt;

&lt;p&gt;Why did I choose the Architect certification? It gave a quick, broad view of the AWS ecosystem, its core principles, and its sources of knowledge. At that point I knew nothing about the Golden Jacket; I simply wanted to deepen my AWS expertise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next Certifications
&lt;/h2&gt;

&lt;p&gt;Right after SAA-C01: Architect came DVA-C01: Developer and SOA-C01: SysOps, which were a great extension of my AWS knowledge, going deeper into the VPC level and CI/CD tools.&lt;br&gt;
Next was the BDS-C00: Big Data exam, an amazing exam with content you will not find today: at that time it was an all-in-one covering big data, databases, and machine learning. I enjoyed it a lot and have applied many of its concepts on many projects.&lt;/p&gt;

&lt;p&gt;SCS-C01: Security is a must-have when you work with AWS. Only after studying for it do you have a full, clear picture of IAM and permissions, which lets you operate at the level of a multi-account AWS organisation. I have experience working with huge enterprises that used multiple accounts aggregated into an organisation, and the knowledge from this exam always helped me find vulnerabilities and misconfigurations, show them to clients, and fix the breaches.&lt;/p&gt;

&lt;p&gt;SAP-C01: Professional Architect is a hard exam that moves you to the next level: instead of just knowing the landscape and how services integrate, it lets you dive deep into full end-to-end architecture and into each service in isolation. DOP-C01: DevOps Professional is a logical continuation. Its materials closely correspond to the Architect exam, with an added focus on IaC and infrastructure. This pairing lets you operate fluently at complex infrastructure levels.&lt;/p&gt;

&lt;p&gt;CLF-C01: Cloud Practitioner was more of a check mark. At that time there were opportunities and customer requests for free AWS certification, so why not.&lt;/p&gt;

&lt;p&gt;Around that time the Big Data exam was retired and split into 3 new certifications: DBS-C01: Database, DAS-C01: Data Analytics, and MLS-C01: Machine Learning. It was the right decision, since each certification lets you learn its domain in depth. Today all 3 of them are retired and their topics have migrated into different domain areas of other exams. This data-oriented set of exams is great to have in a portfolio: every system in a startup or enterprise has challenges and issues in the persistence layer that require optimisation.&lt;/p&gt;

&lt;p&gt;AIF-C01: AI Practitioner opens the hot AI topic in AWS certification, which continues with DEA-C01: Data Engineering and MLA-C01: ML Engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  ANS-C01: AWS Advanced Networking - the Toughest Certification
&lt;/h2&gt;

&lt;p&gt;Having a telecommunications background, I initially assumed this certification would be one of the easier ones. I had hands-on experience with radio and fiber-optic networks and had completed networking-related certifications from vendors such as Ericsson, Nokia, Huawei, and Siemens. Networking was already part of my professional foundation.&lt;/p&gt;

&lt;p&gt;However, my first attempt ended under circumstances completely outside my control. During the exam, a large-scale attack on Ukraine's energy infrastructure caused a nationwide blackout. Power and internet connectivity disappeared instantly. By that point, I had already answered approximately 40 questions.&lt;br&gt;
When I later received the score report, I saw that I had achieved around 65% of the required score. It was impossible not to wonder whether I could have passed if I had been able to finish the exam.&lt;/p&gt;

&lt;p&gt;My second attempt was even more frustrating. I scored 731 points, just 19 points below the passing score of 750.&lt;br&gt;
This time, the exam focused heavily on topics such as AWS Direct Connect, BGP routing, hybrid networking, and enterprise-scale connectivity patterns—areas that many cloud engineers rarely encounter in day-to-day work. Another challenge was the gap between exam preparation materials and the rapidly evolving AWS platform. Some learning resources contained information that was no longer fully aligned with current AWS services, limits, and best practices. While studying, I often found myself asking:&lt;br&gt;
Should I answer according to the latest AWS documentation, or according to the knowledge that existed when the exam questions were created?&lt;br&gt;
I chose to follow the most current AWS guidance, even when I suspected the exam might expect an older answer.&lt;/p&gt;

&lt;p&gt;After this attempt I switched to other certifications that were more closely related to my project work and decided that ANS-C01 would be the last in the chain. Once I had completed all the other certifications, I finally returned to ANS-C01. This time I took a different approach: by then I had acquired some outdated Cisco network equipment and a few tiny clients, and had assembled a home lab.&lt;br&gt;
With this setup, I experimented with technologies and scenarios: setting up VLANs and advanced routing, connecting self-hosted Proxmox to the cloud over a site-to-site VPN, layering DNS resolvers in the cloud and on-prem, deep packet inspection with Snort, and pfSense setup and integration with VLANs.&lt;br&gt;
Now I use this stack in a CCNA course and for PoCs. These practical setups let me quickly refresh my knowledge and try different configurations. And finally, I completed the exam with confidence, even with some time to spare.&lt;/p&gt;

&lt;h2&gt;
  
  
  ANS-C01: But it is for sure the most in-depth certification
&lt;/h2&gt;

&lt;p&gt;Advanced Networking ultimately became one of the most rewarding certifications in my AWS journey.&lt;br&gt;
Once you understand networking at a deeper level, you begin to see AWS differently. You gain insight into how services communicate internally, how new AWS services are integrated into the broader ecosystem, and how managed services operate behind the scenes from their own accounts and over data plane connections.&lt;br&gt;
This certification goes far beyond networking itself. It changes how you think about architecture, scalability, resilience, and the design decisions that power modern cloud platforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  My full AWS certification flow:
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;SAA-C01: AWS Certified Solutions Architect - Associate&lt;/li&gt;
&lt;li&gt;DVA-C01: AWS Certified Developer - Associate&lt;/li&gt;
&lt;li&gt;SOA-C01: AWS Certified SysOps Administrator - Associate&lt;/li&gt;
&lt;li&gt;BDS-C00: AWS Certified Big Data - Specialty&lt;/li&gt;
&lt;li&gt;SCS-C01: AWS Certified Security - Specialty&lt;/li&gt;
&lt;li&gt;SAP-C01: AWS Certified Solutions Architect - Professional&lt;/li&gt;
&lt;li&gt;DOP-C01: AWS Certified DevOps Engineer - Professional&lt;/li&gt;
&lt;li&gt;CLF-C01: AWS Certified Cloud Practitioner&lt;/li&gt;
&lt;li&gt;ANS-C01: AWS Certified Advanced Networking - Specialty (Failed)&lt;/li&gt;
&lt;li&gt;ANS-C01: AWS Certified Advanced Networking - Specialty (Failed)&lt;/li&gt;
&lt;li&gt;DBS-C01: AWS Certified Database - Specialty&lt;/li&gt;
&lt;li&gt;DAS-C01: AWS Certified Data Analytics - Specialty&lt;/li&gt;
&lt;li&gt;MLS-C01: AWS Certified Machine Learning - Specialty&lt;/li&gt;
&lt;li&gt;DEA-C01: AWS Certified Data Engineer - Associate&lt;/li&gt;
&lt;li&gt;SAP-C02: AWS Certified Solutions Architect - Professional (Recertified)&lt;/li&gt;
&lt;li&gt;DOP-C02: AWS Certified DevOps Engineer - Professional (Recertified)&lt;/li&gt;
&lt;li&gt;AIF-C01: AWS Certified AI Practitioner&lt;/li&gt;
&lt;li&gt;SCS-C02: AWS Certified Security - Specialty (Recertified)&lt;/li&gt;
&lt;li&gt;MLA-C01: AWS Certified Machine Learning Engineer - Associate&lt;/li&gt;
&lt;li&gt;ANS-C01: AWS Certified Advanced Networking - Specialty&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Build your own path based on your interests, tasks, and demand
&lt;/h2&gt;

&lt;p&gt;One lesson I learned throughout my certification journey is that there is no universal roadmap. The best certifications are the ones that align with your interests, your responsibilities, and the problems you solve every day. I always chose certifications that were relevant to the projects I was working on. The knowledge gained during preparation helped me improve application performance, optimize infrastructure costs, strengthen security, and make better architectural decisions.&lt;br&gt;
Certifications are most valuable when they support real-world experience. Choose the path that helps you grow in the areas that matter most to your career, and the benefits will extend far beyond passing an exam.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why learn AWS and certify
&lt;/h2&gt;

&lt;p&gt;I quickly realized that AWS certifications provide a well-structured and detailed learning path that helps you focus on specific domains while also building a broad understanding of cloud technologies. The preparation materials are high quality and organized in a way that makes it easier to grow across different areas of expertise.&lt;br&gt;
Studying for these certifications strengthens knowledge across multiple domains, including AI/ML, DevOps, architecture, networking, and databases. This foundation allows you not only to communicate effectively with engineers and business stakeholders but also to contribute to a wide range of project-level decisions.&lt;/p&gt;

&lt;p&gt;For me, AWS certifications became a powerful tool for self-improvement and continuous growth across roles such as architect, engineer, DevOps, networking, data, and AI. This hands-on expertise has helped me support multiple companies in solving complex, real-world challenges.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Golden Jacket is not a collection of badges; it is your constant learning and growth
&lt;/h2&gt;

&lt;p&gt;Knowledge is not a table of facts or absolutes that you quote daily; it is connected information drawn from your experience across different domains, industries, languages, people, and so on.&lt;br&gt;
The more bridges you can establish between these isolated islands, the more efficiently you can use that knowledge, because the same principles of resilience and high availability apply in financial systems and telco equipment. AWS certifications let you not only grow these islands but also rebuild them into a mainland. With every new certification you rebuild your entire understanding and look at the same components from multiple sides.&lt;br&gt;
It is very important to examine targets from all angles and positions, not only the way everybody is used to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Everything Fails — Including Certification Attempts
&lt;/h2&gt;

&lt;p&gt;Even failure is part of the process. There is no perfect study plan, perfect environment, or perfect exam day. Unexpected events can affect your preparation, concentration, internet connectivity, health, personal life, or simply your performance during the exam itself.&lt;br&gt;
Because of that, it is important not to view certification exams as something extraordinary or unattainable. You may fail an exam attempt, and that is perfectly normal.&lt;br&gt;
As engineers, we design systems with the assumption that failures will happen. Servers fail. Networks fail. Storage devices fail. Entire regions can fail. Resilient architectures are not built by pretending failures do not exist—they are built by planning for them and recovering from them.&lt;/p&gt;

&lt;p&gt;The same principle applies to certifications. A failed exam is not the end of the journey. It is feedback.&lt;br&gt;
The score report highlights areas where your knowledge can be strengthened. It reveals gaps in understanding and often points to topics that deserve deeper practical experience. Every unsuccessful attempt provides valuable information that can help you prepare more effectively for the next one.&lt;br&gt;
I passed all of my certifications on the first attempt except one, which required additional study, more hands-on practice, and a different approach. Looking back, the certifications that challenged me the most often became the ones from which I learned the most.&lt;/p&gt;

&lt;p&gt;Success is not defined by never failing. Success is defined by how quickly you analyze what happened, adjust your approach, and continue moving forward.&lt;br&gt;
Treat certification failures the same way you would treat a production incident: investigate the root cause, identify improvements, implement corrective actions, and try again.&lt;br&gt;
The goal is not perfection. The goal is continuous growth.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Certifications evolve with the IT Industry
&lt;/h2&gt;

&lt;p&gt;One of the most common arguments against AWS certifications is that they expire after three years.&lt;br&gt;
At first glance, that may seem like a disadvantage. Why invest the time and effort if you need to recertify in the future?&lt;br&gt;
Your perspective will change once you understand how quickly AWS evolves.&lt;/p&gt;

&lt;p&gt;Every year, AWS launches new services, introduces major features, updates architectural best practices, and expands into entirely new technology domains. Areas such as serverless computing, containers, machine learning, generative AI, observability, and security have changed dramatically over the last few years alone.&lt;br&gt;
Three years in cloud technology is a long time.&lt;br&gt;
The purpose of recertification is not simply to renew a badge. It is an opportunity to revisit your knowledge, learn what has changed, and stay current with the latest developments across the AWS ecosystem.&lt;/p&gt;

&lt;p&gt;Another important aspect of the AWS certification program is its tiered structure. As you progress to higher-level certifications, AWS automatically renews related associate-level certifications. This means that maintaining advanced and professional certifications often helps keep your broader certification portfolio current as well.&lt;br&gt;
Several certifications that I earned no longer exist in their original form. AWS Certified Big Data – Specialty, AWS Certified Data Analytics – Specialty, and AWS Certified Database – Specialty have all been retired or replaced by newer certifications that better reflect the current industry focus.&lt;/p&gt;

&lt;p&gt;Although the certifications themselves changed, the knowledge behind them did not disappear.&lt;br&gt;
The principles of data processing, analytics, database design, scalability, performance optimization, and architecture remain relevant regardless of how AWS restructures its certification portfolio. Technologies evolve, but foundational engineering concepts continue to provide value throughout your career.&lt;br&gt;
AWS has also experimented with highly specialized certifications over the years. Some, such as the AWS Certified Alexa Skill Builder or SAP Specialty, targeted very specific domains and audiences. Others focused on technologies that were relevant to a particular period of AWS growth. This year the AWS Certified Machine Learning - Specialty is also retired, and the AWS GenAI Professional Developer certification is the new alternative to it.&lt;/p&gt;

&lt;p&gt;This evolution is a sign of a healthy certification program.&lt;br&gt;
As the cloud industry changes, certifications must change as well. New technologies emerge, demand shifts, and learning paths adapt to reflect what engineers need to know today rather than what was relevant five years ago.&lt;/p&gt;

&lt;p&gt;Instead of viewing the three-year validity period as a limitation, I see it as an invitation to continue learning.&lt;br&gt;
The cloud industry never stands still, and neither should we.&lt;/p&gt;

&lt;h2&gt;
  
  
  Non-Certification Challenges
&lt;/h2&gt;

&lt;p&gt;Preparing for and passing AWS certifications is not always about studying services, architectures, and best practices. Sometimes the biggest challenges come from circumstances completely outside your control.&lt;/p&gt;

&lt;p&gt;During the COVID-19 pandemic, testing centers around the world suspended operations, making remote-proctored exams the primary certification option. Later, when the war in Ukraine began, the situation became even more complicated. Remote sessions were often the only available way to continue pursuing certifications.&lt;/p&gt;

&lt;p&gt;For many candidates, a certification exam means going to a comfortable testing center. For some of us, it also meant planning around air raid alerts, power outages, and uncertainty about whether connectivity would remain available throughout the exam session.&lt;br&gt;
A large-scale blackout interrupting your exam reminds you that even the best preparation cannot eliminate every external risk.&lt;/p&gt;

&lt;p&gt;COVID-19. War. Blackouts. These were challenges, not excuses. If I was able to complete my certification journey under these circumstances, then temporary setbacks should never stop you from pursuing your own goals. Focus on what you can control, keep learning, and continue moving forward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Once certified: how to request the AWS Golden Jacket
&lt;/h2&gt;

&lt;p&gt;After completing all AWS certifications, there is no direct “order button” or automated system that tracks eligibility.&lt;br&gt;
Instead, access to the Golden Jacket program is typically handled through specific AWS-related communities and programs.&lt;br&gt;
Depending on your involvement with AWS, the path may go through one of the following channels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS Employee&lt;/li&gt;
&lt;li&gt;AWS Ambassador&lt;/li&gt;
&lt;li&gt;AWS Community Builder&lt;/li&gt;
&lt;li&gt;AWS Partner Organisation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One important detail that is often surprising: even AWS Technical Account Managers (TAM) may not always be familiar with the Golden Jacket process. In such cases, it is common to connect with other TAMs or AWS representatives who have previously supported similar requests.&lt;/p&gt;

&lt;p&gt;Yes, this is your next challenge. For me this process stretched to 1 year.&lt;/p&gt;

&lt;p&gt;Delivery options differ depending on the program you applied through. Some can deliver the jacket to your home; others require your in-person presence at events such as re:Invent or an AWS Summit. These events host a special certification lounge where you can pick up swag, receive your jacket, and network with others.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hidden Benefit of Learning AWS
&lt;/h2&gt;

&lt;p&gt;One unexpected benefit of my AWS certification journey was how much it expanded my knowledge beyond AWS itself.&lt;br&gt;
As you grow in cloud engineering, you begin to look beyond services and APIs. You become interested in what happens behind the scenes—operating systems, networking stacks, virtualization platforms, orchestration systems, databases, storage engines, and the open-source technologies that power modern cloud platforms.&lt;br&gt;
The deeper I went into AWS, the more I found myself exploring the technologies underneath it.&lt;br&gt;
This curiosity eventually led me to build my own home lab environment. Today, I run a self-hosted infrastructure based on Proxmox, pfSense, and a variety of open-source tools and services. It provides a practical environment for experimentation, learning, and validating architectural ideas before applying them in production environments.&lt;/p&gt;

&lt;p&gt;This broader perspective makes it easier to design hybrid architectures, integrate cloud and on-premises systems, work across multiple cloud providers, and understand the technologies that power modern distributed systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Subject-matter expert
&lt;/h2&gt;

&lt;p&gt;I'm excited to have been invited to join the AWS Subject Matter Expert program, where I can contribute to certifications and review exam details. This AWS certification program focuses on improving exam quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Golden Jacket is not a final stop
&lt;/h2&gt;

&lt;p&gt;My plan is to continue the learning journey.&lt;br&gt;
Once I have the opportunity, I will be thrilled to visit re:Invent and the global AWS Summits in person and network with other AWS Golden Jacket holders.&lt;/p&gt;

&lt;p&gt;For those interested in AWS learning and certifications, I have created an application that facilitates faster learning and growth. You are welcome to join it.&lt;/p&gt;

&lt;p&gt;I would like to thank everyone who supported, encouraged, challenged, and inspired me throughout this AWS journey — your guidance, knowledge, and motivation made this achievement possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://play.google.com/store/apps/details?id=io.github.devsteamit.goldenjacket&amp;amp;hl=en" rel="noopener noreferrer"&gt;Android APP: Golden Jacket&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>career</category>
      <category>cloud</category>
      <category>learning</category>
    </item>
    <item>
      <title>Running local AI LLM anywhere: from EC2 instances to Edge Devices</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Sun, 07 Jun 2026 06:27:50 +0000</pubDate>
      <link>https://dev.to/aws-builders/running-local-ai-llm-anywhere-from-ec2-instances-to-edge-devices-2hhm</link>
      <guid>https://dev.to/aws-builders/running-local-ai-llm-anywhere-from-ec2-instances-to-edge-devices-2hhm</guid>
      <description>&lt;p&gt;Llama.cpp is one of the most efficient frameworks for running Large Language Models locally. Written in pure C/C++, it is optimized for performance and low resource consumption, making it a popular choice for developers who want direct control over model inference without additional runtime layers.&lt;/p&gt;

&lt;p&gt;It supports a broad ecosystem of open-source models, including Llama, Mistral, Gemma, Phi, Qwen, Falcon, Yi, Solar, and many others, while providing efficient execution across CPUs, GPUs, mobile devices, and edge hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benefits and use cases of running an LLM locally on edge devices
&lt;/h2&gt;

&lt;p&gt;Being able to run an LLM on edge devices opens up multiple use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimal inference latency: no 3rd-party hops and no man in the middle&lt;/li&gt;
&lt;li&gt;Offline operation: a local LLM does not require a permanent Internet connection. A connection is needed only for model updates or for uploading inference results. The system can keep running offline for a year&lt;/li&gt;
&lt;li&gt;The edge device can run inference on a smaller, faster local model. Only if the local results do not pass a defined threshold is the bigger, more powerful cloud model called&lt;/li&gt;
&lt;li&gt;With quantization, a trained model can be compacted to fit the available CPU and RAM resources&lt;/li&gt;
&lt;li&gt;Embeddings can be calculated on the edge, allowing integration with RAG and vector stores&lt;/li&gt;
&lt;li&gt;Privacy: no provider logs all your requests and responses or uses your data to train its next model&lt;/li&gt;
&lt;li&gt;Sensitive data is never sent from the device to 3rd parties&lt;/li&gt;
&lt;li&gt;No cellular or Wi-Fi connection is needed&lt;/li&gt;
&lt;li&gt;Infrastructure cost savings: there is no need to run a big fleet of LLM inference endpoints for all edge connections. The cloud model is used as a fallback when local LLM results are out of thresholds&lt;/li&gt;
&lt;/ul&gt;
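
&lt;p&gt;The cloud-fallback idea from the list above can be sketched as a small routing function. This is only a minimal illustration: the &lt;code&gt;choose_model&lt;/code&gt; name, the 0-100 confidence score, and the threshold value are assumptions made for the example and are not part of llama.cpp.&lt;/p&gt;

```shell
# Hypothetical helper: decide which model should answer a request.
# $1 = confidence score of the local model's answer (integer 0-100, assumed)
# $2 = minimum score that is accepted without calling the cloud model
choose_model() {
  local_score="$1"
  threshold="$2"
  if [ "$local_score" -ge "$threshold" ]; then
    echo "local"   # the local answer is good enough, stay on the device
  else
    echo "cloud"   # below the threshold, fall back to the bigger cloud model
  fi
}

choose_model 82 70   # prints: local
choose_model 41 70   # prints: cloud
```

&lt;p&gt;How the score is produced (for example from token probabilities or a validation step) depends on the application and is left open here.&lt;/p&gt;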

&lt;h2&gt;
  
  
  Installation on ARM Edge hardware
&lt;/h2&gt;

&lt;p&gt;Once you have SSH access to the EC2 instance, run these commands to build llama.cpp:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/ggerganov/llama.cpp
&lt;span class="nb"&gt;cd &lt;/span&gt;llama.cpp
&lt;span class="nb"&gt;mkdir &lt;/span&gt;build
&lt;span class="nb"&gt;cd &lt;/span&gt;build
cmake .. &lt;span class="nt"&gt;-DCMAKE_CXX_FLAGS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"-mcpu=native"&lt;/span&gt; &lt;span class="nt"&gt;-DCMAKE_C_FLAGS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"-mcpu=native"&lt;/span&gt;
cmake &lt;span class="nt"&gt;--build&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nt"&gt;--config&lt;/span&gt; Release &lt;span class="nt"&gt;-j&lt;/span&gt; &lt;span class="sb"&gt;`&lt;/span&gt;&lt;span class="nb"&gt;nproc&lt;/span&gt;&lt;span class="sb"&gt;`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Installation on Android Devices
&lt;/h2&gt;

&lt;p&gt;First install the Termux terminal emulator on the device. You can then enable the sshd service, connect to your device remotely from any machine, and run the installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ &lt;span class="nv"&gt;$ &lt;/span&gt;pkg upgrade &lt;span class="nt"&gt;-y&lt;/span&gt;
~ &lt;span class="nv"&gt;$ &lt;/span&gt;pkg &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; clang wget cmake git
clang is already the newest version &lt;span class="o"&gt;(&lt;/span&gt;21.1.8-2&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
wget is already the newest version &lt;span class="o"&gt;(&lt;/span&gt;1.25.0-1&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
cmake is already the newest version &lt;span class="o"&gt;(&lt;/span&gt;4.3.2-1&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
git is already the newest version &lt;span class="o"&gt;(&lt;/span&gt;2.54.0&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
Summary:
  Upgrading: 0, Installing: 0, Removing: 0, Not Upgrading: 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ &lt;span class="nv"&gt;$ &lt;/span&gt;git clone https://github.com/ggerganov/llama.cpp
Cloning into &lt;span class="s1"&gt;'llama.cpp'&lt;/span&gt;...
remote: Enumerating objects: 95392, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
remote: Counting objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;80/80&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
remote: Compressing objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;45/45&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
remote: Total 95392 &lt;span class="o"&gt;(&lt;/span&gt;delta 51&lt;span class="o"&gt;)&lt;/span&gt;, reused 35 &lt;span class="o"&gt;(&lt;/span&gt;delta 35&lt;span class="o"&gt;)&lt;/span&gt;, pack-reused 95312 &lt;span class="o"&gt;(&lt;/span&gt;from 3&lt;span class="o"&gt;)&lt;/span&gt;
Receiving objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;95392/95392&lt;span class="o"&gt;)&lt;/span&gt;, 393.55 MiB | 13.94 MiB/s, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
Resolving deltas: 100% &lt;span class="o"&gt;(&lt;/span&gt;67912/67912&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;llama.cpp
~/llama.cpp &lt;span class="nv"&gt;$ &lt;/span&gt;cmake &lt;span class="nt"&gt;-B&lt;/span&gt; build
~/llama.cpp &lt;span class="nv"&gt;$ &lt;/span&gt;cmake &lt;span class="nt"&gt;--build&lt;/span&gt; build &lt;span class="nt"&gt;--config&lt;/span&gt; Release
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Download GGUF models
&lt;/h2&gt;

&lt;p&gt;One of the key innovations in the llama.cpp ecosystem is the GGUF model format. GGUF was designed to optimize model storage and inference by supporting a wide range of quantization schemes that reduce the precision of model weights without significantly impacting model quality.&lt;/p&gt;

&lt;p&gt;Instead of storing weights exclusively in FP32 or FP16 formats, GGUF allows models to be quantized to lower-bit representations such as 8-bit, 6-bit, 5-bit, or even 4-bit integers. This substantially decreases memory usage, reduces bandwidth requirements, and improves inference performance.&lt;/p&gt;

&lt;p&gt;These optimizations enable modern LLMs to run efficiently on CPUs, including Arm-based processors, making local AI inference practical on laptops, edge devices, and other resource-constrained systems.&lt;/p&gt;

&lt;p&gt;Here are a few Hugging Face GGUF models I have experimented with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wget https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q4_k_m.gguf
wget https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-GGUF/resolve/b69aef112e9f895e6f98d7ae0949f72ff09aa401/Llama-3.2-1B-Instruct-Q3_K_M.gguf
wget https://huggingface.co/QuantFactory/SmolLM2-1.7B-Instruct-GGUF/resolve/main/SmolLM2-1.7B-Instruct.Q3_K_M.gguf
wget https://huggingface.co/sensura/Qwen3-0.6B-Q3_K_M-GGUF/resolve/main/qwen3-0.6b-q3_k_m.gguf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
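
&lt;p&gt;The download URLs above all follow the same layout: repository, then &lt;code&gt;resolve&lt;/code&gt;, then a revision (a branch such as &lt;code&gt;main&lt;/code&gt; or a commit hash), then the file name. As a minimal sketch, the same download can be scripted in Python. The helper names &lt;code&gt;hf_resolve_url&lt;/code&gt; and &lt;code&gt;download_model&lt;/code&gt; and the &lt;code&gt;models&lt;/code&gt; target directory are illustrative assumptions, not part of any library:&lt;/p&gt;

```python
import os
from urllib.parse import quote
from urllib.request import urlretrieve


def hf_resolve_url(repo_id, filename, revision="main"):
    # Same layout as the wget URLs above: {repo}/resolve/{revision}/{file}
    return "https://huggingface.co/{}/resolve/{}/{}".format(
        repo_id, quote(revision), quote(filename)
    )


def download_model(repo_id, filename, target_dir="models", revision="main"):
    # Fetches the file into target_dir; not called here because it hits the network.
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, filename)
    urlretrieve(hf_resolve_url(repo_id, filename, revision), target)
    return target


print(hf_resolve_url("Qwen/Qwen2.5-0.5B-Instruct-GGUF",
                     "qwen2.5-0.5b-instruct-q4_k_m.gguf"))
```

&lt;p&gt;Pinning &lt;code&gt;revision&lt;/code&gt; to a commit hash, as one of the URLs above does, keeps the download reproducible even if the repository is updated later.&lt;/p&gt;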



&lt;h2&gt;
  
  
  Quantization
&lt;/h2&gt;

&lt;p&gt;Consider the model file Llama-3.2-1B-Instruct-Q4_K_M.gguf. Each part of the name provides information about the model architecture, size, and quantization level.&lt;/p&gt;

&lt;p&gt;The most important component for inference efficiency is Q4_K_M. The Q4 indicates that the model weights have been quantized to nominally 4-bit precision, so each parameter needs roughly 4 bits of storage instead of the 16 bits typically used by FP16 models. In the K_M suffix, K denotes the k-quant family of quantization schemes used by llama.cpp and M its medium variant, which balances model quality and performance.&lt;/p&gt;

&lt;p&gt;By reducing the precision of billions of model parameters, quantization dramatically decreases both model size and memory requirements. As a result, models that would otherwise require several gigabytes—or even hundreds of gigabytes for larger parameter counts—can run on consumer hardware with significantly less RAM while maintaining good inference quality.&lt;/p&gt;
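
&lt;p&gt;A rough back-of-the-envelope calculation makes the saving concrete. The sketch below uses the figures that &lt;code&gt;llama-bench&lt;/code&gt; reports for this model in the Benchmark section (1.24 B parameters, 762.81 MiB file size); the function names are illustrative. It counts weights only and ignores runtime memory such as the KV cache:&lt;/p&gt;

```python
MIB = 1024 ** 2


def model_size_mib(n_params, bits_per_weight):
    # Weights only: parameters * bits, converted to MiB.
    return n_params * bits_per_weight / 8 / MIB


def effective_bits_per_weight(size_mib, n_params):
    # Inverse calculation from a reported file size.
    return size_mib * MIB * 8 / n_params


n_params = 1.24e9  # parameter count reported by llama-bench
print(round(model_size_mib(n_params, 16)))  # FP16 baseline -> 2365
print(round(model_size_mib(n_params, 4)))   # idealized 4-bit -> 591
print(round(effective_bits_per_weight(762.81, n_params), 2))  # -> 5.16
```

&lt;p&gt;The effective value of about 5.16 bits per weight is higher than the nominal 4 bits, most likely because the format also stores per-block scaling data and keeps some tensors at higher precision. Even so, the file is roughly a third of the FP16 size.&lt;/p&gt;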

&lt;h3&gt;
  
  
  Start the llama.cpp LLM server
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./llama.cpp/build/bin/llama-server &lt;span class="nt"&gt;-t&lt;/span&gt; 4 &lt;span class="nt"&gt;-m&lt;/span&gt; models/Llama-3.2-1B-Instruct-Q4_K_M.gguf &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 9090 &lt;span class="nt"&gt;-np&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run llama-server with MCP server support, pass an additional flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./llama.cpp/build/bin/llama-server &lt;span class="nt"&gt;-t&lt;/span&gt; 4 &lt;span class="nt"&gt;-m&lt;/span&gt; models/qwen2.5-0.5b-instruct-q4_k_m.gguf &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 9090 &lt;span class="nt"&gt;-np&lt;/span&gt; 1 &lt;span class="nt"&gt;--webui-mcp-proxy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run llama-server with the embeddings endpoint enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./llama.cpp/build/bin/llama-server &lt;span class="nt"&gt;-m&lt;/span&gt; models/qwen2.5-0.5b-instruct-q4_k_m.gguf &lt;span class="nt"&gt;-t&lt;/span&gt; 4 &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 9090  &lt;span class="nt"&gt;--embeddings&lt;/span&gt; &lt;span class="nt"&gt;--pooling&lt;/span&gt; cls
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
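
&lt;p&gt;With the server started in embeddings mode, an OpenAI-style embeddings request can be sent to it. The following is a minimal standard-library sketch, assuming the server listens on port 9090 as in the command above and serves the OpenAI-compatible &lt;code&gt;/v1/embeddings&lt;/code&gt; route. The function names are illustrative:&lt;/p&gt;

```python
import json
from urllib.request import Request, urlopen


def build_embeddings_request(base_url, model, text):
    # OpenAI-style embeddings payload: a model name plus the input text.
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(base_url, model, text):
    # Sends the request; requires the server above to be running.
    with urlopen(build_embeddings_request(base_url, model, text)) as resp:
        payload = json.load(resp)
    return payload["data"][0]["embedding"]


req = build_embeddings_request(
    "http://127.0.0.1:9090/v1", "qwen2.5-0.5b-instruct-q4_k_m.gguf", "hello"
)
print(req.get_method(), req.full_url)
```

&lt;p&gt;Calling &lt;code&gt;fetch_embedding&lt;/code&gt; with the same arguments should return a list of floats once the server is up.&lt;/p&gt;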



&lt;h2&gt;
  
  
  llama.cpp has a built-in web UI frontend
&lt;/h2&gt;

&lt;p&gt;Here are the results of running the &lt;code&gt;qwen2.5-0.5b-instruct-q4_k_m.gguf&lt;/code&gt; model on an extra-small ARM instance with 1 GB of free RAM:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2cay06nolna2lzs2cna.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2cay06nolna2lzs2cna.png" alt="img1.png" width="799" height="335"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are the results of running the same &lt;code&gt;qwen2.5-0.5b-instruct-q4_k_m.gguf&lt;/code&gt; model on an Android phone with 8 cores and 8 GB of RAM:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmh0kxbs2iwmr195987bm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmh0kxbs2iwmr195987bm.png" alt="img2.png" width="800" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Token generation speed differs dramatically between the two devices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connect OpenAI client to Model
&lt;/h2&gt;

&lt;p&gt;With this framework you can host the available models even on your Android phone. Once your local AI is running,&lt;br&gt;
the same SDKs can be used to work with it after adjusting just a few parameters (URL, KEY, MODEL_NAME):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dummy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;admin_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dummy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://127.0.0.1:9090/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qwen2.5-0.5b-instruct-q4_k_m.gguf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;introduce yourself&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;As an artificial intelligence language model, I am Qwen and I was created by Alibaba Cloud. I am a language model designed to assist users in various tasks such as writing, communication, and processing text. I am programmed to understand, generate, and respond to text, and I am constantly learning and improving based on the data I have been trained on. I am here to help and to support you in any way that I can.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
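
&lt;p&gt;Since only the URL, key, and model name change between a hosted API and the local server, it can be convenient to read them from the environment. This is a small sketch of that idea. The variable names &lt;code&gt;LLM_BASE_URL&lt;/code&gt;, &lt;code&gt;LLM_API_KEY&lt;/code&gt;, and &lt;code&gt;LLM_MODEL&lt;/code&gt; are hypothetical, and the defaults simply mirror the values used in the client code above:&lt;/p&gt;

```python
import os


def client_settings(env=None):
    # Hypothetical variable names; defaults match the local server used above.
    env = os.environ if env is None else env
    return {
        "base_url": env.get("LLM_BASE_URL", "http://127.0.0.1:9090/v1"),
        "api_key": env.get("LLM_API_KEY", "dummy"),
        "model": env.get("LLM_MODEL", "qwen2.5-0.5b-instruct-q4_k_m.gguf"),
    }


# Override only the model name, keep the local URL and dummy key.
settings = client_settings({"LLM_MODEL": "Llama-3.2-1B-Instruct-Q4_K_M.gguf"})
print(settings["base_url"], settings["model"])
```

&lt;p&gt;The &lt;code&gt;base_url&lt;/code&gt; and &lt;code&gt;api_key&lt;/code&gt; values would then be passed to the &lt;code&gt;OpenAI(...)&lt;/code&gt; constructor and &lt;code&gt;model&lt;/code&gt; to &lt;code&gt;chat.completions.create(...)&lt;/code&gt;, so switching devices or models needs no code change.&lt;/p&gt;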



&lt;p&gt;Here are the detailed request and response for this model execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"introduce yourself"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"qwen2.5-0.5b-instruct-q4_k_m.gguf"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"finish_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"As an artificial intelligence language model, I am Qwen and I was created by Alibaba Cloud. I am a language model designed to assist users in various tasks such as writing, communication, and processing text. I am programmed to understand, generate, and respond to text, and I am constantly learning and improving based on the data I have been trained on. I am here to help and to support you in any way that I can."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1780758287&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chatcmpl-eZuyspAITmOmnYdCQf5blxdQGJviy41W"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"qwen2.5-0.5b-instruct-q4_k_m.gguf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chat.completion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"system_fingerprint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"b9295-95405ac65"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"timings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"cache_n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4865.387&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_per_second"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;18.08694765699008&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_per_token_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;55.28848863636363&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;557.404&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_per_second"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;57.4089888124233&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_per_token_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;17.418875&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens_details"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"cached_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"total_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
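
&lt;p&gt;The &lt;code&gt;timings&lt;/code&gt; block in the response is internally consistent: the per-second rates are simply token counts divided by elapsed time. A short sketch, using the values copied from the response above, reproduces them:&lt;/p&gt;

```python
def tokens_per_second(n_tokens, elapsed_ms):
    # Token count divided by elapsed time, converted from ms to seconds.
    return n_tokens / elapsed_ms * 1000.0


timings = {  # values copied from the response above
    "prompt_n": 32, "prompt_ms": 557.404,
    "predicted_n": 88, "predicted_ms": 4865.387,
}
prompt_tps = tokens_per_second(timings["prompt_n"], timings["prompt_ms"])
gen_tps = tokens_per_second(timings["predicted_n"], timings["predicted_ms"])
print(round(prompt_tps, 2), round(gen_tps, 2))  # 57.41 18.09
```

&lt;p&gt;Prompt processing is about three times faster per token than generation here, because prompt tokens are processed as a batch while output tokens are produced one at a time.&lt;/p&gt;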



&lt;p&gt;Let's now run the Llama 3.2 model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./llama.cpp/build/bin/llama-server &lt;span class="nt"&gt;-t&lt;/span&gt; 4 &lt;span class="nt"&gt;-m&lt;/span&gt; models/Llama-3.2-1B-Instruct-Q4_K_M.gguf &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--port&lt;/span&gt; 9090 &lt;span class="nt"&gt;-np&lt;/span&gt; 1 &lt;span class="nt"&gt;--webui-mcp-proxy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello, I'm an artificial intelligence model known as Llama. Llama stands for "Large Language Model Meta AI."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"finish_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Hello, I'm an artificial intelligence model known as Llama. Llama stands for &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Large Language Model Meta AI.&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1780758446&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chatcmpl-kTksGv3J8zKpesmhjNzf7NWRUQTdYYUR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Llama-3.2-1B-Instruct-Q4_K_M.gguf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chat.completion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"system_fingerprint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"b9295-95405ac65"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"timings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"cache_n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1736.759&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_per_second"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;14.394628155086572&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"predicted_per_token_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;69.47036&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1286.379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;38&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_per_second"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;29.540283229126096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_per_token_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;33.85207894736842&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;38&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens_details"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"cached_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"total_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;63&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Benchmark
&lt;/h3&gt;

&lt;p&gt;When running llama.cpp on Android devices, it is important to understand the underlying ARM CPU architecture. An octa-core processor does not mean eight equal-performance cores. Most modern mobile chipsets use a big.LITTLE design that combines a small number of high-performance cores with several power-efficient cores.&lt;/p&gt;

&lt;p&gt;For LLM inference, the highest performance is not always achieved by using all available cores. Additional efficiency cores can introduce scheduling overhead and may contribute less computational power, resulting in lower token generation rates than a configuration that uses only the high-performance cores.&lt;/p&gt;

&lt;p&gt;The best approach is to benchmark your device with different thread counts and compare tokens-per-second performance. Once you identify the optimal configuration, tune llama.cpp accordingly to maximize inference speed and overall efficiency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ &lt;span class="nv"&gt;$ &lt;/span&gt;./llama.cpp/build/bin/llama-bench &lt;span class="nt"&gt;-t&lt;/span&gt; 4,6,8 &lt;span class="nt"&gt;-m&lt;/span&gt; models/Llama-3.2-1B-Instruct-Q4_K_M.gguf  &lt;span class="nt"&gt;-p&lt;/span&gt; 128,256,512 &lt;span class="nt"&gt;-n&lt;/span&gt; 64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;model&lt;/th&gt;
&lt;th&gt;size&lt;/th&gt;
&lt;th&gt;params&lt;/th&gt;
&lt;th&gt;backend&lt;/th&gt;
&lt;th&gt;threads&lt;/th&gt;
&lt;th&gt;test&lt;/th&gt;
&lt;th&gt;t/s&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;pp128&lt;/td&gt;
&lt;td&gt;46.43 ± 0.23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;pp256&lt;/td&gt;
&lt;td&gt;49.40 ± 0.38&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;pp512&lt;/td&gt;
&lt;td&gt;50.47 ± 0.33&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;tg64&lt;/td&gt;
&lt;td&gt;14.35 ± 0.06&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;pp128&lt;/td&gt;
&lt;td&gt;51.29 ± 0.25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;pp256&lt;/td&gt;
&lt;td&gt;54.54 ± 0.55&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;pp512&lt;/td&gt;
&lt;td&gt;55.42 ± 0.21&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;tg64&lt;/td&gt;
&lt;td&gt;14.13 ± 0.04&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;pp128&lt;/td&gt;
&lt;td&gt;50.08 ± 1.49&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;pp256&lt;/td&gt;
&lt;td&gt;57.29 ± 0.37&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;pp512&lt;/td&gt;
&lt;td&gt;58.07 ± 0.48&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;llama 1B Q4_K - Medium&lt;/td&gt;
&lt;td&gt;762.81 MiB&lt;/td&gt;
&lt;td&gt;1.24 B&lt;/td&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;tg64&lt;/td&gt;
&lt;td&gt;14.36 ± 0.15&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
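
&lt;p&gt;To turn the table into a decision, it helps to compare thread counts per test. The sketch below uses the &lt;code&gt;pp512&lt;/code&gt; and &lt;code&gt;tg64&lt;/code&gt; rows copied from the table above; the helper names are illustrative:&lt;/p&gt;

```python
# (threads, test): mean tokens per second, copied from the table above
results = {
    (4, "pp512"): 50.47, (6, "pp512"): 55.42, (8, "pp512"): 58.07,
    (4, "tg64"): 14.35, (6, "tg64"): 14.13, (8, "tg64"): 14.36,
}


def best_threads(results, test):
    # Keep only rows for the requested test and return the fastest thread count.
    rows = {t: tps for (t, name), tps in results.items() if name == test}
    return max(rows, key=rows.get)


def spread_percent(results, test):
    # Gap between fastest and slowest configuration, relative to the slowest.
    values = [tps for (_, name), tps in results.items() if name == test]
    return (max(values) - min(values)) / min(values) * 100


for test in ("pp512", "tg64"):
    print(test, best_threads(results, test), round(spread_percent(results, test), 1))
```

&lt;p&gt;On this device, prompt processing gains about 15% going from 4 to 8 threads, while token generation varies by under 2%, which is close to the measurement noise in the table. For generation-heavy workloads, 4 threads therefore appears to be sufficient here.&lt;/p&gt;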

&lt;h3&gt;
  
  
  Limitations
&lt;/h3&gt;

&lt;p&gt;When llama.cpp runs from the CLI in the Termux emulator on Android, on an edge device, or on an EC2 instance, GPU resources are not available to the C++ code,&lt;br&gt;
so inference uses CPU resources only. On some bare-metal ARM edge devices, however, llama.cpp supports CUDA and can run inference on the&lt;br&gt;
GPU, which should give better token throughput.&lt;/p&gt;

&lt;p&gt;I have tried running llama.cpp with a few models on an Android device: it runs smoothly, and the performance is higher than on ARM-based IoT devices (this is due to the ARM chipset architecture).&lt;br&gt;
However, on computation-intensive tasks you can observe that the CPU is fully utilized, so the screen can flicker during token generation.&lt;br&gt;
Such side effects should go away when you offload the LLM to the GPU; check that your hardware is on the supported list. Another alternative is the &lt;strong&gt;LiteRT-LM&lt;/strong&gt; framework, which is natively supported by Google and can run LLMs on Android devices on both CPU and GPU.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://llama-cpp.com/" rel="noopener noreferrer"&gt;https://llama-cpp.com/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/google-ai-edge/gallery/blob/main/Android/src/app/src/main/assets/skills/text-spinner/SKILL.md" rel="noopener noreferrer"&gt;https://github.com/google-ai-edge/gallery/blob/main/Android/src/app/src/main/assets/skills/text-spinner/SKILL.md&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/google-ai-edge/LiteRT-LM/blob/main/docs/api/kotlin/getting_started.md" rel="noopener noreferrer"&gt;https://github.com/google-ai-edge/LiteRT-LM/blob/main/docs/api/kotlin/getting_started.md&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/vladimirvivien/llm-go/blob/main/litertlm-intro/weather-tool-chat/weather.go" rel="noopener noreferrer"&gt;https://github.com/vladimirvivien/llm-go/blob/main/litertlm-intro/weather-tool-chat/weather.go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://medium.com/@vladimirvivien/litertlm-go-on-device-llm-inference-with-go-and-googles-litert-lm-07241f431a8e" rel="noopener noreferrer"&gt;https://medium.com/@vladimirvivien/litertlm-go-on-device-llm-inference-with-go-and-googles-litert-lm-07241f431a8e&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/api/docs/guides/agents/models" rel="noopener noreferrer"&gt;https://developers.openai.com/api/docs/guides/agents/models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/api/docs/quickstart" rel="noopener noreferrer"&gt;https://developers.openai.com/api/docs/quickstart&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;ARM: &lt;a href="https://learn.arm.com/learning-paths/servers-and-cloud-computing/deepseek-cpu/deepseek-chatbot/" rel="noopener noreferrer"&gt;https://learn.arm.com/learning-paths/servers-and-cloud-computing/deepseek-cpu/deepseek-chatbot/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;android ssh: &lt;a href="https://wiki.termux.com/wiki/Remote_Access" rel="noopener noreferrer"&gt;https://wiki.termux.com/wiki/Remote_Access&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;ollamacpp benchmark: &lt;a href="https://turingpi.com/run-llm-locally-arm-rk3588-ollama-llama-cpp/" rel="noopener noreferrer"&gt;https://turingpi.com/run-llm-locally-arm-rk3588-ollama-llama-cpp/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;ai gallery: &lt;a href="https://github.com/google-ai-edge/gallery" rel="noopener noreferrer"&gt;https://github.com/google-ai-edge/gallery&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;openclaw android &lt;a href="https://github.com/Mohd-Mursaleen/openclaw-android" rel="noopener noreferrer"&gt;https://github.com/Mohd-Mursaleen/openclaw-android&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;llm on android: &lt;a href="https://geekymd.me/blog/running-local-llm-on-android" rel="noopener noreferrer"&gt;https://geekymd.me/blog/running-local-llm-on-android&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;llm on mobile: &lt;a href="https://farmaker47.medium.com/run-gemma-and-vlms-on-mobile-with-llama-cpp-dbb6e1b19a93" rel="noopener noreferrer"&gt;https://farmaker47.medium.com/run-gemma-and-vlms-on-mobile-with-llama-cpp-dbb6e1b19a93&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;arm developer: &lt;a href="https://learn.arm.com/learning-paths/servers-and-cloud-computing/ai-agent-on-cpu/set-up/" rel="noopener noreferrer"&gt;https://learn.arm.com/learning-paths/servers-and-cloud-computing/ai-agent-on-cpu/set-up/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Hugging Face: &lt;a href="https://huggingface.co/models?library=gguf" rel="noopener noreferrer"&gt;https://huggingface.co/models?library=gguf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://alain-airom.medium.com/a-first-experience-with-llama-cpp-cac181c64461" rel="noopener noreferrer"&gt;https://alain-airom.medium.com/a-first-experience-with-llama-cpp-cac181c64461&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;edge AI article: &lt;a href="https://pub.towardsai.net/your-edge-llm-is-memory-bound-trading-compute-for-bandwidth-to-hit-30-tokens-per-second-via-litert-eaaf8523eba1" rel="noopener noreferrer"&gt;https://pub.towardsai.net/your-edge-llm-is-memory-bound-trading-compute-for-bandwidth-to-hit-30-tokens-per-second-via-litert-eaaf8523eba1&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Redis connection monkey patching in Ruby Jungles</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Fri, 10 Apr 2026 12:51:31 +0000</pubDate>
      <link>https://dev.to/aws-builders/redis-connection-monkey-patching-in-ruby-jungles-4k7o</link>
      <guid>https://dev.to/aws-builders/redis-connection-monkey-patching-in-ruby-jungles-4k7o</guid>
      <description>&lt;p&gt;Some programming languages allow developers to “hack” or extend their internals by overriding existing methods in standard libraries, dynamically attaching new behavior to objects, or modifying classes at runtime.&lt;/p&gt;

&lt;p&gt;One of the languages that strongly embraces this flexibility is &lt;strong&gt;Ruby&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This ability is often referred to as &lt;strong&gt;monkey patching&lt;/strong&gt;, and while it should be used with caution, it can be extremely powerful in real-world scenarios—especially when dealing with legacy systems or unavailable source code.&lt;/p&gt;

&lt;h1&gt;
  
  
  Ruby and Runtime Flexibility
&lt;/h1&gt;

&lt;p&gt;Ruby is a highly dynamic, object-oriented language where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Classes can be reopened and modified at any time&lt;/li&gt;
&lt;li&gt;Methods can be overridden or extended dynamically&lt;/li&gt;
&lt;li&gt;Behavior can be injected into existing objects or modules&lt;/li&gt;
&lt;li&gt;Even core classes (like String, Array, etc.) can be modified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes Ruby particularly well-suited for rapid prototyping, metaprogramming, runtime instrumentation, and patching legacy dependencies.&lt;/p&gt;

&lt;p&gt;However, this flexibility comes with responsibility: poorly designed patches can introduce hard-to-debug issues.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;A simple example of extending a built-in class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;patch&lt;/span&gt;
    &lt;span class="s2"&gt;"---"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upcase&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;"---"&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# irb&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"aaa"&lt;/span&gt;.patch
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"---AAA---"&lt;/span&gt;

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"test"&lt;/span&gt;.patch
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"---TEST---"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This demonstrates how easily Ruby allows you to modify even core classes like &lt;code&gt;String&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-world Example: Patching Redis Connection Pool
&lt;/h2&gt;

&lt;p&gt;I encountered a set of legacy Ruby applications that depended on outdated libraries. These dependencies were no longer available in Git repositories, although prebuilt gems were still stored in an internal artifact repository.&lt;/p&gt;

&lt;p&gt;As part of a Redis migration, I needed to identify all polyglot services connecting to Redis instances. The goal was to introduce a &lt;code&gt;CLIENT_NAME&lt;/code&gt; for every Redis client, regardless of the programming language used.&lt;br&gt;
The majority of services followed a more or less similar project structure on the &lt;code&gt;go-lang&lt;/code&gt; stack, but the legacy Ruby services fell outside that landscape.&lt;/p&gt;
&lt;h3&gt;
  
  
  Challenges
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No access to source repositories of dependencies&lt;/li&gt;
&lt;li&gt;No explicit Redis connection URLs&lt;/li&gt;
&lt;li&gt;A proprietary “DIY Redis discovery” mechanism&lt;/li&gt;
&lt;li&gt;Redis connections abstracted behind internal libraries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This made it difficult to instrument Redis clients in a standard way.&lt;/p&gt;
&lt;h2&gt;
  
  
  Solution: Monkey Patching
&lt;/h2&gt;

&lt;p&gt;Fortunately, Ruby’s monkey patching capabilities provided a way forward.&lt;/p&gt;

&lt;p&gt;Even without modifying third-party libraries, I was able to intercept Redis connection creation and inject metadata at runtime.&lt;/p&gt;

&lt;p&gt;The idea was simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;As soon as a Redis connection is established, annotate it with metadata such as service name, Ruby version, and Redis client version.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Original Connection Code (Simplified):
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;RedisConfig&lt;/span&gt;
  &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Connection&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_instance!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;redis&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Patched Implementation
&lt;/h3&gt;

&lt;p&gt;I created a module that overrides the &lt;strong&gt;create_instance!&lt;/strong&gt; method and augments it with additional instrumentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;ServicePatch&lt;/span&gt;
  &lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;RedisMetadataPatch&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_instance!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;blk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="n"&gt;set_open_api_metadata!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;blk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;blk&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="kp"&gt;private&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_open_api_metadata!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;safe_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:setname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'SERVICE_NAME'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;safe_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:setinfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'LIB-NAME'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ruby:&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="no"&gt;RUBY_VERSION&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;safe_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:setinfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'LIB-VER'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Redis&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;safe_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;Redis&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;BaseError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
      &lt;span class="nb"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"[redis metadata] &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;r_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;inspect&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; failed: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="kp"&gt;nil&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;RedisConfig&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;singleton_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prepend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ServicePatch&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;RedisMetadataPatch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using prepend ensures that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The patched method runs before the original implementation&lt;/li&gt;
&lt;li&gt;super correctly delegates to the original method&lt;/li&gt;
&lt;li&gt;The patch is cleanly layered without modifying original code&lt;/li&gt;
&lt;/ul&gt;
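
&lt;p&gt;The layering can be seen in isolation with a minimal sketch. &lt;code&gt;LegacyFactory&lt;/code&gt;, &lt;code&gt;FakeConnection&lt;/code&gt; and &lt;code&gt;MetadataPatch&lt;/code&gt; are hypothetical stand-ins, not the real library or the Redis client, and the sketch assumes the original factory yields the new connection to a block, which is what makes the block passed to super run:&lt;/p&gt;

```ruby
# Hypothetical stand-in for a connection object; not a real Redis client.
FakeConnection = Struct.new(:name, :commands)

class LegacyFactory
  # Class-level factory that yields the new connection when a block is given.
  def self.create_instance!(name)
    conn = FakeConnection.new(name, [])
    yield conn if block_given?
    conn
  end
end

module MetadataPatch
  # Found first during method lookup because the module is prepended;
  # super then reaches the original class method.
  def create_instance!(name)
    super(name) do |conn|
      conn.commands.push([:client, :setname, name])
    end
  end
end

# Class methods live on the singleton class, so that is where we prepend.
LegacyFactory.singleton_class.prepend(MetadataPatch)

conn = LegacyFactory.create_instance!("service-api1")
p LegacyFactory.singleton_class.ancestors.first(2)
p conn.commands
```

&lt;p&gt;The first entry in the ancestors list is the patch module, which is why its method is found before the original one. If the original factory never yielded, the block would simply not run and no metadata would be recorded.&lt;/p&gt;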

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;After deploying this patch, all Redis clients automatically started reporting metadata.&lt;br&gt;
Here is the &lt;strong&gt;Redis&lt;/strong&gt; server-side monitor output, which shows that these Ruby services now set the connection name and library info:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;valkey.xxxx.xx.xxxx.xxx.cache.amazonaws.com:6379&amp;gt; monitor
OK
1774951026.839060 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.95.236:48528] &lt;span class="s2"&gt;"hello"&lt;/span&gt; &lt;span class="s2"&gt;"3"&lt;/span&gt; &lt;span class="s2"&gt;"setname"&lt;/span&gt; &lt;span class="s2"&gt;"service-api1"&lt;/span&gt;
1774951026.839435 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.95.236:48528] &lt;span class="s2"&gt;"client"&lt;/span&gt; &lt;span class="s2"&gt;"setname"&lt;/span&gt; &lt;span class="s2"&gt;"service-api1"&lt;/span&gt;
1774951026.840134 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.95.236:48528] &lt;span class="s2"&gt;"client"&lt;/span&gt; &lt;span class="s2"&gt;"setinfo"&lt;/span&gt; &lt;span class="s2"&gt;"LIB-NAME"&lt;/span&gt; &lt;span class="s2"&gt;"ruby:4.0.1"&lt;/span&gt;
1774951026.840142 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.95.236:48528] &lt;span class="s2"&gt;"client"&lt;/span&gt; &lt;span class="s2"&gt;"setinfo"&lt;/span&gt; &lt;span class="s2"&gt;"LIB-VER"&lt;/span&gt; &lt;span class="s2"&gt;"5.4.1"&lt;/span&gt;
1774951026.840614 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.95.236:48528] &lt;span class="s2"&gt;"ping"&lt;/span&gt;
1774951031.463576 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.70.215:58252] &lt;span class="s2"&gt;"hello"&lt;/span&gt; &lt;span class="s2"&gt;"3"&lt;/span&gt; &lt;span class="s2"&gt;"setname"&lt;/span&gt; &lt;span class="s2"&gt;"service-api2"&lt;/span&gt;
1774951031.464538 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.70.215:58252] &lt;span class="s2"&gt;"client"&lt;/span&gt; &lt;span class="s2"&gt;"setname"&lt;/span&gt; &lt;span class="s2"&gt;"service-api2"&lt;/span&gt;
1774951031.468056 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.70.215:58252] &lt;span class="s2"&gt;"client"&lt;/span&gt; &lt;span class="s2"&gt;"setinfo"&lt;/span&gt; &lt;span class="s2"&gt;"LIB-NAME"&lt;/span&gt; &lt;span class="s2"&gt;"ruby:4.0.1"&lt;/span&gt;
1774951031.468066 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.70.215:58252] &lt;span class="s2"&gt;"client"&lt;/span&gt; &lt;span class="s2"&gt;"setinfo"&lt;/span&gt; &lt;span class="s2"&gt;"LIB-VER"&lt;/span&gt; &lt;span class="s2"&gt;"5.4.1"&lt;/span&gt;
1774951031.468728 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.70.215:58252] &lt;span class="s2"&gt;"ping"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Observability Gains
&lt;/h2&gt;

&lt;p&gt;Once the instrumentation was in place, I was able to use a custom Redis client scanner to analyze traffic to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;identify which services are connected to which Redis instances&lt;/li&gt;
&lt;li&gt;track command usage patterns&lt;/li&gt;
&lt;li&gt;detect idle or misbehaving clients&lt;/li&gt;
&lt;li&gt;correlate activity across polyglot systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────┬──────────────────────┬──────────────────────┬─────────┬───────┬───────┬────────┬────────┬────────┬────────┐
│ Client Addr         │ Name                 │ Lib                  │ Lib Ver │ Age   │ Idle  │    GET │   MGET │    SET │ ZRANGE │
├─────────────────────┼──────────────────────┼──────────────────────┼─────────┼───────┼───────┼────────┼────────┼────────┼────────┤
│ xx.xx.226.123:27613 │ service-api1         │ ruby:4.0.1           │ 5.4.1   │ 27740 │ 14    │      0 │      2 │     12 │      0 │
│ xx.xx.240.240:32031 │ service-api2         │ ruby:4.0.1           │ 5.4.1   │ 89306 │ 1838  │      0 │      8 │     48 │      0 │
│ xx.xx.240.240:41498 │ service-api3         │ ruby:4.0.1           │ 5.4.1   │ 89306 │ 189   │      0 │     13 │     87 │      0 │
│ xx.xx.254.221:58628 │ service-api4         │ ruby:4.0.1           │ 5.4.1   │ 10503 │ 64    │      0 │     11 │     72 │      0 │
│ xx.xx.254.221:9620  │ service-api5         │ ruby:4.0.1           │ 5.4.1   │ 10503 │ 1238  │      0 │      9 │     54 │      0 │
└─────────────────────┴──────────────────────┴──────────────────────┴─────────┴───────┴───────┴────────┴────────┴────────┴────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
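
&lt;p&gt;The name, library, age and idle columns of such a table can be derived from &lt;code&gt;CLIENT LIST&lt;/code&gt; rows, which are space-separated key=value pairs. The sketch below is not the scanner used for the output above: the helper names and the sample rows are made up for illustration, and per-command counters such as GET or SET are not part of &lt;code&gt;CLIENT LIST&lt;/code&gt;, so they are left out:&lt;/p&gt;

```ruby
# Made-up sample rows in the CLIENT LIST "key=value" format; not real traffic.
SAMPLE_CLIENT_LIST = [
  "id=1 addr=10.0.0.1:5000 name=service-api1 age=120 idle=3 cmd=get lib-name=ruby:4.0.1 lib-ver=5.4.1",
  "id=2 addr=10.0.0.2:5001 name= age=900 idle=870 cmd=set lib-name= lib-ver="
].join("\n")

# Parse one row into a Hash; an empty field such as "name=" becomes "".
def parse_client_line(line)
  line.split(" ").to_h { |pair| pair.split("=", 2) }
end

# Parse the whole reply, skipping blank lines.
def parse_client_list(text)
  text.lines.map { |l| l.strip }.reject { |l| l.empty? }.map { |l| parse_client_line(l) }
end

# Clients without a name are the ones that still need to be identified.
def unnamed_clients(clients)
  clients.select { |c| c["name"].empty? }
end

clients = parse_client_list(SAMPLE_CLIENT_LIST)
clients.each do |c|
  label = c["name"].empty? ? "(unnamed)" : c["name"]
  puts format("%-20s %-14s %-12s idle=%ss", c["addr"], label, c["lib-name"], c["idle"])
end
```

&lt;p&gt;Filtering with &lt;code&gt;unnamed_clients&lt;/code&gt; gives the list of connections that still need instrumentation before the migration.&lt;/p&gt;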



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This approach allowed me to instrument legacy Ruby applications without modifying their dependencies or internal logic. By leveraging Ruby’s dynamic capabilities, I was able to introduce observability into a previously opaque system.&lt;/p&gt;

&lt;p&gt;In environments with legacy constraints, such techniques can turn blockers into manageable engineering problems.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Ruby&lt;/code&gt; is also a very straightforward language to write code in, and some of its ideas have made their way into &lt;code&gt;kotlin&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.ruby-lang.org/en/" rel="noopener noreferrer"&gt;https://www.ruby-lang.org/en/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>From Redis to Valkey: pre-migration Reconnaissance — detect all apps &amp; connections in realtime</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Sun, 05 Apr 2026 17:57:21 +0000</pubDate>
      <link>https://dev.to/aws-builders/from-redis-to-valkey-pre-migration-reconnaissance-detect-all-apps-connections-in-realtime-39nj</link>
      <guid>https://dev.to/aws-builders/from-redis-to-valkey-pre-migration-reconnaissance-detect-all-apps-connections-in-realtime-39nj</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Abraham Lincoln&lt;/strong&gt;: "Give me six hours to chop down a tree and I will spend the first four sharpening the axe"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Valkey is getting more popular due to its performance improvements over the classic Redis version, so I'm starting a series of posts&lt;br&gt;
about migrating from one vendor's Redis implementation to &lt;code&gt;AWS Valkey&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To choose the proper migration technique, the most important step is pre-migration reconnaissance. In this post I'll explain how native Redis features can help identify all services that connect to Redis (which is really hard in the distributed environment of an enterprise-level infrastructure built by multiple generations of engineers with a periodically changing stack, languages, and SDKs).&lt;/p&gt;
&lt;h2&gt;
  
  
  Valkey project brief history
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Redis&lt;/strong&gt; was first released in &lt;strong&gt;2009&lt;/strong&gt;. Since then it has grown from a &lt;strong&gt;key-value&lt;/strong&gt; cache into &lt;strong&gt;PubSub&lt;/strong&gt;, &lt;strong&gt;Stream&lt;/strong&gt;, and &lt;strong&gt;DB&lt;/strong&gt; use cases, and it is used in a lot of projects.&lt;/p&gt;

&lt;p&gt;Cloud providers such as &lt;code&gt;AWS&lt;/code&gt;, &lt;code&gt;GCP&lt;/code&gt;, &lt;code&gt;Azure&lt;/code&gt;, and &lt;code&gt;Oracle&lt;/code&gt; therefore started offering Redis as a managed service, allowing engineers to offload cluster management and all the heavy lifting to the provider.&lt;/p&gt;

&lt;p&gt;But the &lt;code&gt;Redis&lt;/code&gt; company has its own cloud, &lt;code&gt;RedisCloud&lt;/code&gt;, with Redis managed services. To bring more clients to RedisCloud and increase monetization, starting from &lt;code&gt;Redis version 7.4&lt;/code&gt; the license was changed so that other cloud providers have to pay for Redis if they offer it as a managed service.&lt;/p&gt;

&lt;p&gt;At that moment a fork of Redis was created at &lt;code&gt;https://github.com/valkey-io/valkey&lt;/code&gt; under the name &lt;code&gt;Valkey&lt;/code&gt;. It is maintained both by the open-source community and by cloud providers, since the fork kept the original open-source license. Redis is written in &lt;code&gt;C&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Today there are 2 different repositories and projects, each having its own release version and code name:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis (&lt;a href="https://github.com/redis/redis" rel="noopener noreferrer"&gt;https://github.com/redis/redis&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Valkey (&lt;a href="https://github.com/valkey-io/valkey" rel="noopener noreferrer"&gt;https://github.com/valkey-io/valkey&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this moment the latest release versions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis v8.6.2&lt;/li&gt;
&lt;li&gt;Valkey v9.0.3 (on top of Redis v7.2.4)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So both projects are implementing different features in parallel, like multithreading that was added recently.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pre-migration Reconnaissance
&lt;/h2&gt;

&lt;p&gt;I'm preparing a migration of Redis instances from Redis Cloud to AWS Valkey, and before the migration I need information about data access patterns and all Redis producers/consumers.&lt;/p&gt;
&lt;h3&gt;
  
  
  Identify clients that are reading/writing to Redis:
&lt;/h3&gt;

&lt;p&gt;There are different techniques that can be used to identify all writers/readers of Redis: tracing and monitoring tools&lt;br&gt;
like &lt;strong&gt;datadog&lt;/strong&gt; or &lt;strong&gt;x-ray&lt;/strong&gt;, and analysis of the &lt;strong&gt;ENV&lt;/strong&gt; variables set for services, with their aliases to the Redis endpoint.&lt;/p&gt;

&lt;p&gt;Things get more complex in real life. Do not be surprised to see in your enterprise-level stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;distributed environment&lt;/li&gt;
&lt;li&gt;AWS multi-account deployments, PrivateLink, established VPC peerings, etc.&lt;/li&gt;
&lt;li&gt;tons of running services written in multiple languages (go, java, ts, ruby)&lt;/li&gt;
&lt;li&gt;lack of documentation, stakeholders, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here I will show a technique that I found extra useful: it is natively supported out of the box by Redis and does&lt;br&gt;
not require installing any third-party agents, monitoring stack, etc.&lt;/p&gt;
&lt;h2&gt;
  
  
  redis-cli
&lt;/h2&gt;

&lt;p&gt;Out of the box, Redis has functionality for gathering information about clients. You need to establish&lt;br&gt;
a connection to your Redis server through the CLI and execute commands.&lt;/p&gt;
&lt;h3&gt;
  
  
  CLIENT LIST
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The &lt;strong&gt;CLIENT LIST&lt;/strong&gt; command returns information and statistics about the client connections server in a mostly human&lt;br&gt;
readable format.&lt;/p&gt;

&lt;p&gt;You can use one of the optional subcommands to filter the list. The TYPE type subcommand filters the list by clients'&lt;br&gt;
type, where type is one of normal, master, replica, and pubsub. Note that clients blocked by the &lt;strong&gt;MONITOR&lt;/strong&gt; command&lt;br&gt;
belong to the normal class.&lt;/p&gt;

&lt;p&gt;The ID filter only returns entries for clients with IDs matching the client-id arguments.&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;redis-cxxxxx.us-east-1.ec2.cloud.xxxx.com:6379&amp;gt; client list
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3066004040000 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.135:30746 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4789 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;18701 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;827 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;get &lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt;go-redis&lt;span class="o"&gt;(&lt;/span&gt;,go1.24.13&lt;span class="o"&gt;)&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;9.17.2
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3081428040000 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.145:31848 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4959 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;538 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;221 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;get &lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2956190040001 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.118:1604 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5117 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;148140 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5529 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;set &lt;/span&gt;&lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3072343040001 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.126:46494 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4048 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11493 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;428 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;get &lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt;go-redis&lt;span class="o"&gt;(&lt;/span&gt;,go1.24.13&lt;span class="o"&gt;)&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;9.17.2
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3078769040001 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.173:9717 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4184 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3714 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;get &lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt;go-redis&lt;span class="o"&gt;(&lt;/span&gt;,go1.24.13&lt;span class="o"&gt;)&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;9.17.2
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3080824040001 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.158:35937 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5035 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1267 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;get &lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt;go-redis&lt;span class="o"&gt;(&lt;/span&gt;,go1.24.13&lt;span class="o"&gt;)&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;9.17.2
&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3081894040001 &lt;span class="nv"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.xx.xx.245:36487 &lt;span class="nv"&gt;laddr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xx.20.4.124:18585 &lt;span class="nv"&gt;fd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3686 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;age&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;26 &lt;span class="nv"&gt;idle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5 &lt;span class="nv"&gt;flags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;N &lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;sub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;psub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;ssub&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;multi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-1&lt;/span&gt; &lt;span class="nv"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;obl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;events&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;r &lt;span class="nv"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;get &lt;span class="nv"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;xxx &lt;span class="nv"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 lib-name&lt;span class="o"&gt;=&lt;/span&gt; lib-ver&lt;span class="o"&gt;=&lt;/span&gt;
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The most interesting data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;id: a unique 64-bit client ID.&lt;/li&gt;
&lt;li&gt;addr: address/port of the client.&lt;/li&gt;
&lt;li&gt;laddr: address/port of the local address the client is connected to (bind address).&lt;/li&gt;
&lt;li&gt;name: the name set by the client with CLIENT SETNAME.&lt;/li&gt;
&lt;li&gt;age: total duration of the connection in seconds.&lt;/li&gt;
&lt;li&gt;idle: idle time of the connection in seconds.&lt;/li&gt;
&lt;li&gt;db: current database ID.&lt;/li&gt;
&lt;li&gt;cmd: last command played.&lt;/li&gt;
&lt;li&gt;user: the authenticated username of the client.&lt;/li&gt;
&lt;li&gt;lib-name: the name of the client library that is being used.&lt;/li&gt;
&lt;li&gt;lib-ver: the version of the client library.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Having this information from a live Redis instance lets us detect all clients, distinguish them by session ID, and track&lt;br&gt;
additional information about each of them:&lt;br&gt;
library version, library name, and client name, along with further details about the session.&lt;/p&gt;
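&lt;p&gt;As a rough illustration of how these fields can be consumed programmatically, here is a minimal Python sketch that splits a &lt;code&gt;CLIENT LIST&lt;/code&gt; reply into dictionaries. The function names &lt;code&gt;parse_client_list_line&lt;/code&gt; and &lt;code&gt;summarize_clients&lt;/code&gt; are hypothetical, and the sketch assumes the space-separated &lt;code&gt;key=value&lt;/code&gt; layout shown in the output above.&lt;/p&gt;

```python
def parse_client_list_line(line):
    """Turn one CLIENT LIST row into a dict of field name to value."""
    fields = {}
    for token in line.strip().split(" "):
        if "=" not in token:
            continue  # skip anything that is not a key=value pair
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def summarize_clients(raw_reply):
    """Keep only the fields discussed above, one dict per connection."""
    wanted = ("id", "addr", "name", "age", "idle", "cmd", "user", "lib-name", "lib-ver")
    clients = []
    for line in raw_reply.splitlines():
        if not line.strip():
            continue
        fields = parse_client_list_line(line)
        clients.append({key: fields.get(key, "") for key in wanted})
    return clients


# Sample row shortened from the masked output above.
sample = (
    "id=3066004040000 addr=xx.xx.xx.135:30746 laddr=xx.20.4.124:18585 fd=4789 name= "
    "age=18701 idle=827 flags=N db=0 cmd=get user=xxx resp=2 "
    "lib-name=go-redis(,go1.24.13) lib-ver=9.17.2\n"
)
for client in summarize_clients(sample):
    print(client["addr"], client["lib-name"], client["lib-ver"])
```

&lt;p&gt;An empty &lt;code&gt;name&lt;/code&gt; value in the result means the connection never called &lt;code&gt;CLIENT SETNAME&lt;/code&gt;, which is exactly the case the rest of this post deals with.&lt;/p&gt;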
&lt;h3&gt;
  
  
  MONITOR
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;MONITOR is a debugging command that streams back every command processed by the Redis server. It can help in&lt;br&gt;
understanding what is happening to the database. This command can both be used via redis-cli and via telnet.&lt;/p&gt;

&lt;p&gt;Because MONITOR streams back all commands, its use comes at a cost. The following (totally unscientific) benchmark&lt;br&gt;
numbers illustrate what the cost of running MONITOR can be.&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;redis-cxxxxx.us-east-1.ec2.cloud.xxxx.com:6379&amp;gt; monitor
OK
1774787120.638084 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.xx.222:40277] &lt;span class="s2"&gt;"get"&lt;/span&gt; &lt;span class="s2"&gt;"prefixkey:namespacea:default_data"&lt;/span&gt;
1774787120.649084 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.xx.243:58512] &lt;span class="s2"&gt;"get"&lt;/span&gt; &lt;span class="s2"&gt;"prefixkey:namespace2:default_data"&lt;/span&gt;
1774787120.652084 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.xx.222:40277] &lt;span class="s2"&gt;"get"&lt;/span&gt; &lt;span class="s2"&gt;"prefixkey:namespace3:default_data"&lt;/span&gt;
1774787120.687083 &lt;span class="o"&gt;[&lt;/span&gt;0 xx.xx.xx.50:64040] &lt;span class="s2"&gt;"zrange"&lt;/span&gt; &lt;span class="s2"&gt;"prefixkey:namespace4:allowed_data"&lt;/span&gt; &lt;span class="s2"&gt;"0"&lt;/span&gt; &lt;span class="s2"&gt;"-1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
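&lt;p&gt;To work with this stream in code, each line can be split into a timestamp, a database number, a client address, and the command with its arguments. The following Python sketch is one possible way to do that. The name &lt;code&gt;parse_monitor_line&lt;/code&gt; is hypothetical, and the parsing assumes simple double-quoted arguments like the ones shown above; arguments with binary escapes would need more careful handling.&lt;/p&gt;

```python
import re
import shlex

# timestamp, then "[db addr]", then the quoted command and its arguments
MONITOR_LINE = re.compile(r"^(\d+\.\d+) \[(\d+) (\S+)\] (.*)$")


def parse_monitor_line(line):
    """Split one MONITOR row into timestamp, db, client address, command and args."""
    match = MONITOR_LINE.match(line.strip())
    if match is None:
        return None  # for example the initial "OK" reply
    timestamp, db, addr, rest = match.groups()
    parts = shlex.split(rest)  # assumption: simple double-quoted arguments
    if not parts:
        return None
    return {
        "timestamp": float(timestamp),
        "db": int(db),
        "addr": addr,
        "command": parts[0].upper(),
        "args": parts[1:],
    }


row = parse_monitor_line(
    '1774787120.687083 [0 xx.xx.xx.50:64040] "zrange" "prefixkey:namespace4:allowed_data" "0" "-1"'
)
print(row["addr"], row["command"], row["args"])
```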



&lt;h2&gt;
  
  
  CLIENTNAME feature
&lt;/h2&gt;

&lt;p&gt;The output of these commands will be enough in 99% of cases. If you are hosting your apps on EC2, EKS,&lt;br&gt;
ECS, or Lambda functions, you can easily &lt;strong&gt;map IP addresses&lt;/strong&gt; to dedicated &lt;strong&gt;pods/container IPs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But I hit a corner case: the source Redis is running in &lt;code&gt;RedisLabs&lt;/code&gt;, a cloud environment that provides&lt;br&gt;
Redis as a service (under the hood it deploys Redis to cloud infrastructure, in this case AWS EC2 instances).&lt;/p&gt;

&lt;p&gt;The consumer apps are running in a &lt;code&gt;kubernetes&lt;/code&gt; cluster, and since RedisLabs is hosted outside of our &lt;strong&gt;AWS account&lt;/strong&gt;, the network&lt;br&gt;
traffic flows through the k8s cluster nodes.&lt;/p&gt;

&lt;p&gt;Every Kubernetes node runs many workloads in separate pods, but when connecting to Redis all of them appear with&lt;br&gt;
the same IP address: the IP address of the cluster node on which they are hosted.&lt;/p&gt;

&lt;p&gt;So this info can narrow down the set of candidate services, but since &lt;code&gt;node affinity&lt;/code&gt; is not in use and the landscape of&lt;br&gt;
services is large, it is hard to identify Redis writers/readers: all pods share the same set of cluster node addresses.&lt;/p&gt;

&lt;p&gt;And here comes &lt;strong&gt;CLIENT SETNAME&lt;/strong&gt;: this command has been available since Redis Open Source 2.6.9 and assigns a name to the&lt;br&gt;
current connection.&lt;/p&gt;

&lt;p&gt;The assigned name is displayed in the output of &lt;code&gt;CLIENT LIST&lt;/code&gt; so that it is possible to identify the client that&lt;br&gt;
performed a given connection.&lt;/p&gt;

&lt;p&gt;However, it is not possible to use spaces in the connection name as this would violate the format of the &lt;code&gt;CLIENT LIST&lt;/code&gt;&lt;br&gt;
reply.&lt;/p&gt;

&lt;p&gt;Every new connection starts without an assigned name.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Setting names to connections is a good way to debug connection leaks due to bugs in the application using Redis.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once clients are instrumented and deployed, using redis-cli we can check the data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Writing a tool to track clients and operations on Redis
&lt;/h2&gt;

&lt;p&gt;To present this information in a table format, I created a Python-based tool that talks to Redis over the &lt;strong&gt;RESP&lt;/strong&gt;&lt;br&gt;
protocol and renders the results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────┬─────────────────┬───────────────────────┬─────────┬─────────┬──────────────────────────────────┬─────┐
│ Client IP      │ Name            │ Lib                   │ Lib Ver │ User    │ Full Key                         │ GET │
├────────────────┼─────────────────┼───────────────────────┼─────────┼─────────┼──────────────────────────────────┼─────┤
│ 10.xx.xx.53    │                 │                       │         │ xxxx    │                                  │   0 │
│ 10.xx.xx.173   │                 │ go-redis(,go1.21.1)   │ 8.0.2   │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│                │                 │                       │         │         │ prefixkey:namespace2:default_data│   1 │
│ 10.100.238.99  │                 │                       │         │ xxxx    │                                  │   0 │
│ 10.100.244.76  │                 │                       │         │ xxxx    │                                  │   0 │
│ 10.100.75.149  │                 │ go-redis(,go1.24.13)  │ 9.17.2  │ default │ prefixkey:namespace1:default_data│   1 │
│                │                 │                       │         │         │ prefixkey:namespace1:default_data│   1 │
│ 10.104.127.44  │                 │                       │         │ default │                                  │   0 │
│ 10.104.136.207 │                 │ go-redis(,go1.24.13)  │ 9.17.2  │ default │                                  │   0 │
│ 10.104.94.31   │                 │                       │         │ default │                                  │   0 │
│ 10.170.3.253   │ monitor-v2-prod │ python-socket-monitor │ 1.0     │ default │                                  │   0 │
└────────────────┴─────────────────┴───────────────────────┴─────────┴─────────┴──────────────────────────────────┴─────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For some period, we can track in real time which operations are performed on the Redis server and who the clients are.&lt;br&gt;
Once all of them are identified and there are no unknown areas left, it's time to analyze the Redis instance and plan the migration.&lt;/p&gt;
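&lt;p&gt;The GET column in the table above boils down to counting commands per client and key. A minimal sketch of that aggregation, assuming events have already been parsed into (address, command, arguments) tuples, could look like this. The function name &lt;code&gt;count_gets&lt;/code&gt; is hypothetical and is not taken from the actual tool.&lt;/p&gt;

```python
from collections import Counter


def count_gets(events):
    """Count GET commands per (client IP, key) from parsed MONITOR events."""
    counts = Counter()
    for addr, command, args in events:
        if command.upper() != "GET" or not args:
            continue
        client_ip = addr.rsplit(":", 1)[0]  # drop the ephemeral source port
        counts[(client_ip, args[0])] += 1
    return counts


# Events taken from the masked MONITOR output shown earlier.
events = [
    ("xx.xx.xx.222:40277", "get", ["prefixkey:namespacea:default_data"]),
    ("xx.xx.xx.243:58512", "get", ["prefixkey:namespace2:default_data"]),
    ("xx.xx.xx.222:40277", "get", ["prefixkey:namespace3:default_data"]),
    ("xx.xx.xx.50:64040", "zrange", ["prefixkey:namespace4:allowed_data", "0", "-1"]),
]
for (client_ip, key), hits in sorted(count_gets(events).items()):
    print(client_ip, key, hits)
```

&lt;p&gt;Grouping by IP without the port matters here, because every new connection from the same node gets a different source port.&lt;/p&gt;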
&lt;h3&gt;
  
  
  Instrument client with clientName
&lt;/h3&gt;

&lt;p&gt;Redis SDKs can be configured with a client name that identifies the application on every connection. Here is a Go code&lt;br&gt;
snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;rdb&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Options&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Addr&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;       &lt;span class="s"&gt;"REDIS_HOST:REDIS_PORT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Password&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;"REDIS_PASS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DB&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;         &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ClientName&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"service1-writer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same can be done with client libraries in any programming language, or even at the socket level by speaking the Redis RESP&lt;br&gt;
protocol directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send a Redis RESP protocol command.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
  &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\r\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;arg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;arg_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# RESP bulk-string lengths are byte counts, so measure the UTF-8 encoding.&lt;/span&gt;
    &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arg_str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\r\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;arg_str&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\r\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
  &lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_client_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Assign a client name to the current Redis connection.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;client_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;
  &lt;span class="nf"&gt;send_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CLIENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SETNAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;read_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;+OK&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ConnectionError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CLIENT SETNAME failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_client_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lib_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lib_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Assign client library metadata to the current Redis connection.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attribute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LIB-NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lib_name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LIB-VER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lib_version&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="nf"&gt;send_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CLIENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SETINFO&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attribute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;read_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Redis &amp;lt; 7.2 may not support CLIENT SETINFO, so ignore failures.
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="k"&gt;continue&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
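&lt;p&gt;The snippet above calls a &lt;code&gt;read_line&lt;/code&gt; helper that is not shown. A minimal sketch of such a helper, together with a byte-accurate command encoder, might look like the following. Both &lt;code&gt;read_line&lt;/code&gt; as written here and &lt;code&gt;encode_command&lt;/code&gt; are assumptions for illustration, not the author's actual implementation, and the demo uses a local socket pair instead of a real Redis server.&lt;/p&gt;

```python
import socket


def read_line(sock):
    """Read one CRLF-terminated RESP line and return it without the CRLF."""
    buffer = bytearray()
    while not buffer.endswith(b"\r\n"):
        chunk = sock.recv(1)  # byte-at-a-time keeps the sketch simple, not fast
        if not chunk:
            raise ConnectionError("socket closed before a full line arrived")
        buffer += chunk
    return buffer[:-2].decode("utf-8")


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings with byte-accurate lengths."""
    out = bytearray(f"*{len(args)}\r\n".encode("utf-8"))
    for arg in args:
        data = str(arg).encode("utf-8")
        out += f"${len(data)}\r\n".encode("utf-8") + data + b"\r\n"
    return bytes(out)


# Local demo without a Redis server: one end plays the client, the other the server.
client, server = socket.socketpair()
client.sendall(encode_command("CLIENT", "SETNAME", "service1-writer"))
print(server.recv(1024))
server.sendall(b"+OK\r\n")
print(read_line(client))
client.close()
server.close()
```

&lt;p&gt;Reading one byte per call is slow but easy to reason about. A real tool would more likely buffer larger reads and split on CRLF.&lt;/p&gt;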



&lt;h2&gt;
  
  
  Challenge 1: too many connections to this Redis DB
&lt;/h2&gt;

&lt;p&gt;With our custom monitor tool we now track all clients in detail, and immediately see that there are too many&lt;br&gt;
clients and keys being accessed.&lt;br&gt;
Since the Redis instance is used by multiple services and we only want to extract keys matching a specific prefix&lt;br&gt;
pattern, we can modify the monitor tool to track access to those prefixes only.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Add a predicate for the key prefix. After that, you will see only the client names, IPs and operations that touch the limited subset&lt;br&gt;
of the keystore that is the subject of extraction.&lt;/p&gt;
&lt;/blockquote&gt;
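&lt;p&gt;Such a predicate can be very small. The sketch below assumes events are (command, arguments) pairs and that the prefixes of interest are known up front. The names &lt;code&gt;TRACKED_PREFIXES&lt;/code&gt; and &lt;code&gt;touches_tracked_key&lt;/code&gt; are hypothetical. As a simplification, it checks every argument rather than only the key positions of each command.&lt;/p&gt;

```python
# Hypothetical prefixes; replace them with the key patterns being extracted.
TRACKED_PREFIXES = ("prefixkey:namespace1:", "prefixkey:namespace2:")


def touches_tracked_key(args, prefixes=TRACKED_PREFIXES):
    """Return True when any command argument starts with one of the prefixes."""
    return any(str(arg).startswith(prefixes) for arg in args)


events = [
    ("GET", ["prefixkey:namespace1:default_data"]),
    ("GET", ["otherservice:session:42"]),
    ("ZRANGE", ["prefixkey:namespace4:allowed_data", "0", "-1"]),
]
kept = [event for event in events if touches_tracked_key(event[1])]
print(kept)
```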
&lt;h2&gt;
  
  
  Challenge 2: client versions are too old and will not support Valkey
&lt;/h2&gt;

&lt;p&gt;It's important to check SDK client versions. Do not be surprised to find that you are running an outdated Redis and outdated&lt;br&gt;
libraries.&lt;br&gt;
In that case, if you provision the latest AWS Valkey and just change the endpoint URL in the apps' config, that&lt;br&gt;
will not work: your apps will get connection errors at the RESP protocol and command level.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Upgrade and align client versions to the latest supported Redis SDK.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Completed redis client versions and clientnames alignment
&lt;/h2&gt;

&lt;p&gt;After upgrading all client code to the latest Go Redis library version (9.18.0 at the time of writing) and adding client&lt;br&gt;
names to Redis sessions, we can review active connections, see which keys are being accessed, and start&lt;br&gt;
planning the actual migration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────┬─────────────────┬───────────────────────┬─────────┬─────────┬──────────────────────────────────┬─────┐
│ Client IP      │ Name            │ Lib                   │ Lib Ver │ User    │ Full Key                         │ GET │
├────────────────┼─────────────────┼───────────────────────┼─────────┼─────────┼──────────────────────────────────┼─────┤
│ 10.xx.xx.53    │                 │                       │         │ xxxx    │                                  │   0 │
│ 10.xx.xx.173   │ service1-writer │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│                │                 │                       │         │         │ prefixkey:namespace2:default_data│   1 │
│ 10.xx.xx.149   │ service2-reader │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│ 10.xx.xx.142   │ service3-reader │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│ 10.xx.xx.141   │ service4-reader │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│ 10.xx.xx.129   │ service5-writer │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│ 10.xx.xx.159   │ service3-reader │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │ prefixkey:namespace1:default_data│   1 │
│ 10.xx.xx.44    │                 │                       │         │ xxxx    │                                  │   0 │
│ 10.xx.xx.207   │ service2-reader │ go-redis-(.go1.24.13) │ 9.18.0  │ xxxx    │                                  │   0 │
│ 10.xx.xx.31    │                 │                       │         │ xxxx    │                                  │   0 │
│ 10.xx.xx.253   │ monitor-v2-prod │ python-socket-monitor │ 1.0     │ xxxx    │                                  │   0 │
└────────────────┴─────────────────┴───────────────────────┴─────────┴─────────┴──────────────────────────────────┴─────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;This tool can be extended with per-client read/write I/O measurements, key access detection, or any other analytic&lt;br&gt;
operations for troubleshooting &amp;amp; refactoring. Its use is not limited to pre-migration analysis.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Reconnaissance step output summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;updated and aligned library versions of all Redis SDKs&lt;/li&gt;
&lt;li&gt;properly trackable client names and all connections&lt;/li&gt;
&lt;li&gt;fully identified all writers/readers that are planned to be migrated to AWS Valkey&lt;/li&gt;
&lt;li&gt;clear understanding of data access patterns (read/write to keystore)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the next post, I’ll dive into Redis topologies and the critical components to consider when preparing and running a migration, as they depend on the chosen topology.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://valkey.io/topics/migration/" rel="noopener noreferrer"&gt;https://valkey.io/topics/migration/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/valkey-io/valkey" rel="noopener noreferrer"&gt;https://github.com/valkey-io/valkey&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>database</category>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>How switching to SQS Batch operations improves Performance and Billing</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Tue, 23 Dec 2025 19:36:59 +0000</pubDate>
      <link>https://dev.to/aws-builders/optimizing-sqs-operations-using-batch-api-14fn</link>
      <guid>https://dev.to/aws-builders/optimizing-sqs-operations-using-batch-api-14fn</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;In this post, we explore how refactoring SQS message processing from individual &lt;strong&gt;SendMessage&lt;/strong&gt; calls to &lt;strong&gt;SendMessageBatch&lt;/strong&gt; operations can&lt;br&gt;
significantly improve application performance and reduce SQS billing costs by lowering the number of API requests.&lt;/p&gt;
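&lt;p&gt;To make the request-count difference concrete, here is a small Python sketch (the post's own service is written in Go) that groups message bodies into batches. It relies on the SQS limit of at most 10 entries per batch call. The helper name &lt;code&gt;to_batches&lt;/code&gt; and the sample bodies are hypothetical, and the total payload size limit per batch is not checked here.&lt;/p&gt;

```python
SQS_MAX_BATCH_SIZE = 10  # a batch call accepts at most 10 entries


def to_batches(bodies, batch_size=SQS_MAX_BATCH_SIZE):
    """Group message bodies into lists of batch entries."""
    batches = []
    for start in range(0, len(bodies), batch_size):
        chunk = bodies[start:start + batch_size]
        # Each entry needs an Id that is unique within its batch.
        batches.append(
            [{"Id": str(index), "MessageBody": body} for index, body in enumerate(chunk)]
        )
    return batches


bodies = [f"message-{n}" for n in range(7)]
batches = to_batches(bodies)
print("individual requests:", len(bodies))
print("batch requests:", len(batches))
```

&lt;p&gt;With an SDK such as boto3, each inner list would typically be passed as the &lt;code&gt;Entries&lt;/code&gt; argument of &lt;code&gt;send_message_batch&lt;/code&gt;, so the number of billed requests follows the number of batches rather than the number of messages.&lt;/p&gt;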


&lt;h2&gt;
  
  
  The idea
&lt;/h2&gt;

&lt;p&gt;When monitoring a Golang application with Datadog, we can measure SQS message sending in detail. By comparing a traditional loop-based send approach with batch sending, we can see clear differences in timing, network calls, and resource usage.&lt;/p&gt;
&lt;h2&gt;
  
  
  Full Datadog tracing of SQS is not supported for all languages:
&lt;/h2&gt;

&lt;p&gt;Set DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING=all and DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING=all on this service to enable complete payload tagging.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.datadoghq.com/tracing/guide/aws_payload_tagging/?tab=nodejs" rel="noopener noreferrer"&gt;https://docs.datadoghq.com/tracing/guide/aws_payload_tagging/?tab=nodejs&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING=all
DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING=all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Golang, you can leverage Datadog attribute tags to inspect payload metadata.&lt;/p&gt;

&lt;h2&gt;
  
  
  Regular SQS message send operations
&lt;/h2&gt;

&lt;p&gt;Sending messages one by one involves multiple network calls and extra overhead.&lt;/p&gt;

&lt;p&gt;The following trace shows the timing when messages are sent one by one in a loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3qs5qv35gorimaiezo6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3qs5qv35gorimaiezo6.png" alt="img2.png" width="800" height="237"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, sending &lt;code&gt;7 messages&lt;/code&gt; individually took &lt;code&gt;175ms&lt;/code&gt;, with &lt;code&gt;7 separate HTTP requests&lt;/code&gt;. The first call typically dominates the timing due to DNS lookup and connection setup.&lt;/p&gt;

&lt;p&gt;Since the service is running in the same K8s cluster, we can assume that the experiment is clean and no additional overhead is present.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sending messages in a Batch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj26jrzxb9vlmtxegi9y8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj26jrzxb9vlmtxegi9y8.png" alt="img3.png" width="800" height="50"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS SQS allows sending up to 10 messages per batch. Sending &lt;code&gt;20 messages&lt;/code&gt; in &lt;code&gt;2 batches&lt;/code&gt; demonstrates significant efficiency gains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sent 3x more messages.&lt;/li&gt;
&lt;li&gt;Made 10x fewer HTTP requests per message (2 requests for 20 messages versus 7 requests for 7 messages).&lt;/li&gt;
&lt;li&gt;Total processing time reduced by ~3x.&lt;/li&gt;
&lt;/ul&gt;
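
&lt;p&gt;The arithmetic behind these numbers can be sketched in a few lines of Go. This is a minimal illustration, not the code from the measured service: &lt;code&gt;chunk&lt;/code&gt; and &lt;code&gt;requestCount&lt;/code&gt; are hypothetical helper names, and messages are assumed to be plain strings. The block has no &lt;code&gt;main&lt;/code&gt; function; the helpers are meant to be called from your own code.&lt;/p&gt;

```go
package main

// maxBatchSize is the per-request entry limit mentioned in the text.
const maxBatchSize = 10

// chunk splits messages into consecutive batches of at most maxBatchSize entries.
// Each batch is capped with a full slice expression so that appending to one
// batch cannot overwrite the next one.
func chunk(messages []string) [][]string {
	var batches [][]string
	for len(messages) > maxBatchSize {
		batches = append(batches, messages[:maxBatchSize:maxBatchSize])
		messages = messages[maxBatchSize:]
	}
	if len(messages) > 0 {
		batches = append(batches, messages)
	}
	return batches
}

// requestCount returns how many batch requests n messages need (ceiling division).
func requestCount(n int) int {
	return (n + maxBatchSize - 1) / maxBatchSize
}
```

&lt;p&gt;With these helpers, &lt;code&gt;chunk&lt;/code&gt; turns 20 messages into 2 batches of 10 and &lt;code&gt;requestCount(20)&lt;/code&gt; returns 2, compared with 20 requests when every message is sent on its own.&lt;/p&gt;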

&lt;h3&gt;
  
  
  Response examples:
&lt;/h3&gt;

&lt;p&gt;When a batch send is performed, the batch response contains a status for each message, including any error. The batch call itself can complete while some messages in it fail; parsing the per-message status lets you efficiently replay those messages or handle them with fallback business logic.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Successful"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"655f3404-fbe4-4c51-8868-b5c604bd5f6d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"daf36653-9abb-490b-b620-608efa24a219"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"93f4dcfd-0500-4076-90f2-3b880b32c943"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"f6c7b079-98f5-4290-b293-2ac6e43ed6f2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2b4a96bc-b4ec-4711-9473-d887dd3213f7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1bd30cd9-f9c1-4b47-8d6d-2e23ce771841"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"6"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8eed75ef-2563-442e-a191-6b3dff29d635"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"c65a36ce-7ce0-444c-9974-96648dcae0ea"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"75379265-52f9-4a60-8c3a-0537cffdaa80"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"9"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"59239903-d4d9-498f-9a08-6d7d7ae8beba"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"9a614c58-113b-487d-a8f1-7509f93b42f9"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"11"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1077de5c-8f0f-4d5b-a0fe-dca45712bfdf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"12"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"MessageID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8b0f5836-0e01-4a88-9793-4bac2a6d879a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Failed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
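
&lt;p&gt;One way to act on such a response is to collect the bodies of the failed entries for a replay. The sketch below assumes result types shaped like the JSON above; &lt;code&gt;retryCandidates&lt;/code&gt; and the &lt;code&gt;sent&lt;/code&gt; map (batch entry ID to message body) are hypothetical names introduced for illustration, not part of the AWS SDK.&lt;/p&gt;

```go
package main

// BatchResultEntry mirrors one element of the response shown above.
type BatchResultEntry struct {
	ID        string
	MessageID string
	Error     error
}

// BatchResult mirrors the top-level response shown above.
type BatchResult struct {
	Successful []BatchResultEntry
	Failed     []BatchResultEntry
}

// retryCandidates returns the bodies of all messages reported as failed,
// in the order they appear in result.Failed. IDs that are not present in
// sent are skipped.
func retryCandidates(sent map[string]string, result BatchResult) []string {
	var retry []string
	for _, failed := range result.Failed {
		if body, ok := sent[failed.ID]; ok {
			retry = append(retry, body)
		}
	}
	return retry
}
```

&lt;p&gt;The returned bodies can be sent again in a new batch or routed to fallback logic, depending on the error attached to each failed entry.&lt;/p&gt;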



&lt;h3&gt;
  
  
  AWS Console Behavior
&lt;/h3&gt;

&lt;p&gt;Batch sending does not change how messages appear in SQS. Each message is stored individually, so consumers don’t need any changes to handle batches.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8qtypwx2rc65tzkfevd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8qtypwx2rc65tzkfevd.png" alt="img4.png" width="800" height="317"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The same messages with the same structure are posted and present in SQS.&lt;/p&gt;

&lt;p&gt;However, there are other optimization techniques on the consumer side, such as tuning the batch size used when polling messages from &lt;code&gt;SQS&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Golang Implementation Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sqs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SendMessageBatchRequestEntry&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;Id&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;          &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="c"&gt;// Unique ID within batch&lt;/span&gt;
  &lt;span class="n"&gt;MessageBody&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;taskConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MessageGroupId&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMessageGroupId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;taskConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MessageGroupId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;taskConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MessageDeduplicationId&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetMessageDeduplicationId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;taskConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MessageDeduplicationId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;taskConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DelaySeconds&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetDelaySeconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;taskConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DelaySeconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
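&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c"&gt;// closes the per-message loop; its header is omitted from this excerpt&lt;/span&gt;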

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;BatchResult&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Successful&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;BatchResultEntry&lt;/span&gt;
    &lt;span class="n"&gt;Failed&lt;/span&gt;     &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;BatchResultEntry&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// BatchResultEntry represents a single entry in a batch result&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;BatchResultEntry&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ID&lt;/span&gt;        &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;MessageID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Error&lt;/span&gt;     &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Send batch&lt;/span&gt;
&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;sqs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SendMessageBatchInput&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;QueueUrl&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;stp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;Entries&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SendMessageBatchWithContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;handleSqsErrors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c"&gt;// Mark all entries in this batch as failed&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Failed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Failed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BatchResultEntry&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;continue&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Dedicated message details
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fohhnij2i5cnxq2bohnm9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fohhnij2i5cnxq2bohnm9.png" alt="img5.png" width="751" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This exact &lt;code&gt;MessageID&lt;/code&gt; was returned in the &lt;code&gt;Successful&lt;/code&gt; section of the batch response.&lt;/p&gt;

&lt;h2&gt;
  
  
  Additional things to check and optimize
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Deduplication technique
&lt;/h3&gt;

&lt;p&gt;Before sending messages, deduplicate them. This reduces the number of SQS requests, lowers processing latency, and reduces the load on the consumer side by avoiding unneeded storage reads, rewrites, and similar work.&lt;/p&gt;
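
&lt;p&gt;A minimal sketch of producer-side deduplication, assuming that two messages count as duplicates when their bodies are identical. &lt;code&gt;dedupe&lt;/code&gt; is a hypothetical helper; a real service may need to compare a business key instead of the whole body.&lt;/p&gt;

```go
package main

// dedupe returns bodies with repeated values removed.
// The first occurrence of each body is kept and the input order is preserved.
func dedupe(bodies []string) []string {
	seen := make(map[string]struct{}, len(bodies))
	unique := make([]string, 0, len(bodies))
	for _, body := range bodies {
		if _, dup := seen[body]; dup {
			continue // already queued for sending
		}
		seen[body] = struct{}{}
		unique = append(unique, body)
	}
	return unique
}
```

&lt;p&gt;Running the deduplication before the messages are split into batches means that fewer batches may be needed in the first place.&lt;/p&gt;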

&lt;h3&gt;
  
  
  Distributed tracing frameworks can consume an SQS batch slot for metainformation
&lt;/h3&gt;

&lt;p&gt;Some distributed tracing frameworks propagate metainformation through async transports like SQS. If you use one, check its integration, because it can affect the maximum batch size. For example, &lt;strong&gt;Datadog&lt;/strong&gt; uses 1 batch element to propagate tracing metainformation, which is consumed on the receiving side and attached as a span to the same trace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;X-Ray&lt;/strong&gt;, being a proprietary &lt;strong&gt;AWS&lt;/strong&gt; technology, does not use any slots in an SQS batch; it submits span/trace info to a UDP server (the X-Ray daemon) instead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;message payload size (1 MiB)&lt;/li&gt;
&lt;li&gt;batch size (10 messages)&lt;/li&gt;
&lt;li&gt;payload serialization (JSON)&lt;/li&gt;
&lt;/ul&gt;
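
&lt;p&gt;The first two limits can be checked on the client before a request is made. The sketch below is an assumption-laden illustration: &lt;code&gt;validateBatch&lt;/code&gt; is a hypothetical helper, the payload limit is read as 1 MiB, and it is applied to the sum of all message bodies in the batch. Message attributes, which also count toward the size in SQS, are ignored here.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxEntries      = 10          // batch size limit from the list above
	maxPayloadBytes = 1024 * 1024 // payload limit from the list above, read as 1 MiB
)

// validateBatch reports an error if the batch is empty, has too many entries,
// or if the combined size of all bodies exceeds the payload limit.
func validateBatch(bodies []string) error {
	if len(bodies) == 0 {
		return errors.New("batch is empty")
	}
	if len(bodies) > maxEntries {
		return fmt.Errorf("batch has %d entries, limit is %d", len(bodies), maxEntries)
	}
	total := 0
	for _, body := range bodies {
		total += len(body) // len counts bytes, not characters
	}
	if total > maxPayloadBytes {
		return fmt.Errorf("batch payload is %d bytes, limit is %d", total, maxPayloadBytes)
	}
	return nil
}
```

&lt;p&gt;Failing fast on the client avoids a network round trip for a request that would be rejected anyway.&lt;/p&gt;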

&lt;h2&gt;
  
  
  Conclusions:
&lt;/h2&gt;

&lt;p&gt;Switching the implementation from sending in a loop to batch sending significantly decreased the overall timing and the number of network round trips and, as a bonus, decreased the SQS billing (API calls are reduced by up to 10x).&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.datadoghq.com/tracing/guide/aws_payload_tagging/?tab=nodejs" rel="noopener noreferrer"&gt;https://docs.datadoghq.com/tracing/guide/aws_payload_tagging/?tab=nodejs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-batch-api-actions.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-batch-api-actions.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>monitoring</category>
      <category>aws</category>
      <category>go</category>
      <category>performance</category>
    </item>
    <item>
      <title>Orchestrating AI multi-agent infrastructure with AWS Bedrock, OpenAI and n8n</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Fri, 26 Sep 2025 17:39:06 +0000</pubDate>
      <link>https://dev.to/aws-builders/orchestrating-ai-multi-agent-infrastructure-with-aws-bedrock-openai-and-n8n-28p6</link>
      <guid>https://dev.to/aws-builders/orchestrating-ai-multi-agent-infrastructure-with-aws-bedrock-openai-and-n8n-28p6</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;This article explores how to build a &lt;code&gt;multi-agent&lt;/code&gt; AI ecosystem using &lt;code&gt;n8n&lt;/code&gt;, &lt;code&gt;AWS Bedrock&lt;/code&gt;, &lt;code&gt;OpenAI&lt;/code&gt;, and &lt;code&gt;MCP&lt;/code&gt; servers—all with a no-code approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  The idea
&lt;/h2&gt;

&lt;p&gt;Each AI agent is designed with its own dedicated model (optimized for its role) and separate memory storage (ensuring context persistence and isolation). By connecting agents to AWS documentation via MCP, custom AWS news feeds via &lt;code&gt;JSON&lt;/code&gt;, and enabling &lt;code&gt;agent-to-agent&lt;/code&gt; communication, we demonstrate how to create a flexible system that interacts directly based on user prompts.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is an AI agent and what are its parts
&lt;/h2&gt;

&lt;p&gt;An AI agent is not just a single model—it’s a structured system made up of several interconnected components. Think of&lt;br&gt;
it as a worker in a digital team, equipped with a brain, memory, and tools.&lt;/p&gt;
&lt;h3&gt;
  
  
  Core Components of an AI Agent
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttfumdldnt4mv01vfrhc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttfumdldnt4mv01vfrhc.png" alt="img.png" width="621" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;LLM Model: The reasoning engine. Can be &lt;code&gt;OpenAI GPT&lt;/code&gt;, &lt;code&gt;Anthropic Claude&lt;/code&gt; (via AWS Bedrock), &lt;code&gt;Amazon Titan&lt;/code&gt;, or any other&lt;br&gt;
large language model.&lt;br&gt;
Responsible for interpreting prompts, generating responses, and orchestrating tool usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Memory: A storage layer where past interactions and context are recorded.&lt;br&gt;
Often implemented as a database table or key/value storage. Ensures continuity—so the agent doesn’t “forget” what was&lt;br&gt;
asked earlier.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tools: Interfaces that extend the agent’s capabilities. Examples: &lt;code&gt;HTTP endpoints&lt;/code&gt;, &lt;code&gt;databases&lt;/code&gt;, &lt;code&gt;MCP services&lt;/code&gt;, or custom&lt;br&gt;
&lt;code&gt;APIs&lt;/code&gt;.&lt;br&gt;
Tools allow the agent to access real-time or domain-specific data beyond the model’s training cutoff.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
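
&lt;p&gt;Although the setup in this article is no-code, the relationship between the three components can be sketched as a small Go type. Everything here is illustrative: &lt;code&gt;Agent&lt;/code&gt;, &lt;code&gt;Tool&lt;/code&gt;, &lt;code&gt;Ask&lt;/code&gt;, and &lt;code&gt;UseTool&lt;/code&gt; are hypothetical names, and the model is a plain function standing in for a real LLM call.&lt;/p&gt;

```go
package main

// Tool is a named capability the agent can call, for example an HTTP endpoint.
type Tool func(input string) string

// Agent groups the three parts described above.
type Agent struct {
	Model  func(prompt string, history []string) string // stand-in for an LLM call
	Memory []string                                     // past prompts and replies
	Tools  map[string]Tool                              // capabilities keyed by name
}

// Ask passes the prompt and the remembered context to the model,
// then records the exchange so that later prompts can build on it.
func (a *Agent) Ask(prompt string) string {
	reply := a.Model(prompt, a.Memory)
	a.Memory = append(a.Memory, prompt, reply)
	return reply
}

// UseTool calls the tool registered under name.
// The boolean is false when no such tool exists.
func (a *Agent) UseTool(name, input string) (string, bool) {
	tool, ok := a.Tools[name]
	if !ok {
		return "", false
	}
	return tool(input), true
}
```

&lt;p&gt;Keeping a separate &lt;code&gt;Memory&lt;/code&gt; per agent is what gives each agent its own isolated context.&lt;/p&gt;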
&lt;h3&gt;
  
  
  Connecting to AWS documentation server as MCP
&lt;/h3&gt;

&lt;p&gt;AWS offers a free streaming service for official documentation, and we can integrate it into our agent through the Model&lt;br&gt;
Context Protocol (MCP).&lt;/p&gt;

&lt;p&gt;By registering MCP client tools, our AI agent gains the ability to search, read, and recommend content directly from AWS&lt;br&gt;
docs.&lt;/p&gt;
&lt;h3&gt;
  
  
  AWS MCP configuration settings:
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Endpoint&lt;/td&gt;
&lt;td&gt;&lt;a href="https://knowledge-mcp.global.api.aws" rel="noopener noreferrer"&gt;https://knowledge-mcp.global.api.aws&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server Transport&lt;/td&gt;
&lt;td&gt;HTTP streamable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tools&lt;/td&gt;
&lt;td&gt;read, recommend, search&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foguk8ga3ym090gbb379g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foguk8ga3ym090gbb379g.png" alt="img_1.png" width="674" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this setup, whenever a user asks something like “How do I configure DynamoDB streams?”, the agent can fetch the&lt;br&gt;
latest instructions directly from the AWS documentation server.&lt;/p&gt;
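
&lt;p&gt;Outside n8n, the same server can be addressed with plain JSON-RPC 2.0 messages, which is what MCP uses on the wire. The sketch below only builds the request bodies and headers that a streamable HTTP client would POST to the endpoint from the table; it does not open a connection. The client name, the tool name &lt;code&gt;search&lt;/code&gt;, and the argument key &lt;code&gt;query&lt;/code&gt; are placeholders based on the short names in the table, so treat them as assumptions and discover the real names and schemas with &lt;code&gt;tools/list&lt;/code&gt;.&lt;/p&gt;

```python
import json

MCP_ENDPOINT = "https://knowledge-mcp.global.api.aws"  # endpoint from the table above

# Headers a streamable HTTP client sends with each POST.
MCP_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}


def jsonrpc_request(request_id, method, params=None):
    """Build one JSON-RPC 2.0 request as a dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def build_session_messages(question):
    """Return the requests of a minimal session, in the order they are sent.

    The "initialized" notification that follows "initialize" is left out.
    """
    initialize = jsonrpc_request(1, "initialize", {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "docs-agent-sketch", "version": "0.1"},  # hypothetical
    })
    list_tools = jsonrpc_request(2, "tools/list")
    call_tool = jsonrpc_request(3, "tools/call", {
        "name": "search",                  # placeholder tool name (assumption)
        "arguments": {"query": question},  # placeholder argument key (assumption)
    })
    return [initialize, list_tools, call_tool]


if __name__ == "__main__":
    for message in build_session_messages("How do I configure DynamoDB streams?"):
        print(json.dumps(message))
```

&lt;p&gt;A real client would also send the &lt;code&gt;notifications/initialized&lt;/code&gt; notification after &lt;code&gt;initialize&lt;/code&gt; and reuse any session header the server returns; n8n's MCP client tool takes care of this handshake for us.&lt;/p&gt;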
&lt;h3&gt;
  
  
  Adding AI agent tools for AWS news
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7z0llai2151vy6afajrt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7z0llai2151vy6afajrt.png" alt="img_2.png" width="800" height="345"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Besides documentation, agents can also consume custom news feeds. I maintain a curated set of AWS news in JSON format, hosted on &lt;code&gt;GitHub Pages&lt;/code&gt;. These feeds cover categories such as architecture, big data, and machine learning.&lt;/p&gt;

&lt;p&gt;By connecting the agent’s HTTP tool to these feeds, we can provide real-time AWS announcements and updates inside the agent’s workflow.&lt;/p&gt;
&lt;h3&gt;
  
  
  REST HTTP tools setup:
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool info&lt;/td&gt;
&lt;td&gt;Makes an HTTP request and returns the latest aws news Training Certification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool endpoint&lt;/td&gt;
&lt;td&gt;&lt;a href="https://tsypuk.github.io/aws-news/news/training_certification.json" rel="noopener noreferrer"&gt;https://tsypuk.github.io/aws-news/news/training_certification.json&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool info&lt;/td&gt;
&lt;td&gt;Makes an HTTP request and returns the latest aws news Architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool endpoint&lt;/td&gt;
&lt;td&gt;&lt;a href="https://tsypuk.github.io/aws-news/news/architecture.json" rel="noopener noreferrer"&gt;https://tsypuk.github.io/aws-news/news/architecture.json&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
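
&lt;p&gt;For readers who want to try a feed outside n8n, here is a rough Python equivalent of one HTTP tool. The feed schema is not shown in this post, so the sketch assumes each feed is a JSON array of objects with a &lt;code&gt;title&lt;/code&gt; key; adjust the key to whatever the real files contain. &lt;code&gt;fetch_feed&lt;/code&gt; and &lt;code&gt;filter_by_keyword&lt;/code&gt; are illustrative names, not part of n8n.&lt;/p&gt;

```python
import json
import urllib.request

FEED_URL = "https://tsypuk.github.io/aws-news/news/architecture.json"


def fetch_feed(url=FEED_URL, timeout=10):
    """Download a feed and parse it as JSON (performs a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def filter_by_keyword(items, keyword):
    """Keep entries whose title mentions the keyword, ignoring case.

    Assumes each entry is a dict with a "title" key (schema is an assumption).
    """
    needle = keyword.lower()
    return [item for item in items if needle in str(item.get("title", "")).lower()]


if __name__ == "__main__":
    # Sample data in the assumed shape, so the demo runs without network access.
    sample = [
        {"title": "Vector search on Amazon S3", "link": "https://example.com/a"},
        {"title": "Scaling DynamoDB streams", "link": "https://example.com/b"},
    ]
    for item in filter_by_keyword(sample, "s3"):
        print(item["title"], item["link"])
```

&lt;p&gt;In the n8n workflow there is no such filtering step: the tool returns the whole feed and the model picks the relevant entries, which is simpler but spends more tokens per call.&lt;/p&gt;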

&lt;p&gt;The full list of feeds is available in my &lt;a href="https://github.com/tsypuk/aws-news?tab=readme-ov-file" rel="noopener noreferrer"&gt;aws-news repo&lt;/a&gt;, and you are welcome to use it.&lt;/p&gt;

&lt;p&gt;With this integration, a user can ask “What are the latest updates in AWS architecture?”, and the agent will pull fresh content directly from the &lt;code&gt;JSON feed&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Full Agent2Agent connection and communication
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfg54ygy68a6u4a7waxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfg54ygy68a6u4a7waxl.png" alt="img_3.png" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once we have an ecosystem of multiple agents (&lt;code&gt;Documentation Agent&lt;/code&gt; + &lt;code&gt;News Agent&lt;/code&gt;), we can introduce a third agent as an orchestrator.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Orchestrator Agent decides which agent to query.&lt;/li&gt;
&lt;li&gt;The Documentation Agent connects via MCP to AWS docs.&lt;/li&gt;
&lt;li&gt;The News Agent pulls JSON feeds with announcements.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Example:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Prompt: “Explain the latest DynamoDB updates and show me how to configure them.”&lt;/li&gt;
&lt;li&gt;Orchestrator → News Agent: get latest DynamoDB announcements.&lt;/li&gt;
&lt;li&gt;Orchestrator → Documentation Agent: fetch relevant setup docs.&lt;/li&gt;
&lt;li&gt;Orchestrator synthesizes the results into a single, user-friendly answer.&lt;/li&gt;
&lt;/ul&gt;
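
&lt;p&gt;The steps above can be sketched in a few lines of Python. In n8n the routing decision is made by the orchestrator's LLM; here a keyword check stands in for it, and both sub-agents are stub functions that return canned text, so every name in the block is hypothetical.&lt;/p&gt;

```python
def news_agent(question):
    """Stub for the News Agent (the real one reads the JSON feeds)."""
    return "[news] announcements for: " + question


def documentation_agent(question):
    """Stub for the Documentation Agent (the real one queries AWS docs via MCP)."""
    return "[docs] setup guide for: " + question


NEWS_HINTS = ("latest", "news", "update", "announce")
DOCS_HINTS = ("how", "configure", "setup", "supported", "what is")


def choose_agents(question):
    """Toy router: pick sub-agents by keyword; an LLM makes this choice in n8n."""
    text = question.lower()
    chosen = []
    if any(hint in text for hint in NEWS_HINTS):
        chosen.append(news_agent)
    if any(hint in text for hint in DOCS_HINTS):
        chosen.append(documentation_agent)
    # Fall back to the documentation agent when no hint matches.
    return chosen or [documentation_agent]


def orchestrate(question):
    """Query every chosen agent and join the answers into one reply."""
    return "\n".join(agent(question) for agent in choose_agents(question))


if __name__ == "__main__":
    print(orchestrate("Explain the latest DynamoDB updates and show me how to configure them."))
```

&lt;p&gt;The example prompt triggers both hint lists, so the reply contains a news part followed by a documentation part, mirroring the two orchestrator calls in the list.&lt;/p&gt;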

&lt;p&gt;The beauty of this configuration is that each agent can be tuned and configured separately. For example, we can set a different model for each agent: a small model is enough for some agents, so there is no need to pay for a more expensive one, while other agents can use a model that performs better in a particular domain or was trained on a dedicated dataset.&lt;/p&gt;

&lt;p&gt;This separation lowers costs while improving domain accuracy.&lt;/p&gt;
&lt;h3&gt;
  
  
  Checking the Memory storage
&lt;/h3&gt;

&lt;p&gt;AI agents persist interaction history in Memory, which is critical for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintaining context across multi-turn conversations.&lt;/li&gt;
&lt;li&gt;Remembering what data was fetched previously.&lt;/li&gt;
&lt;li&gt;Enabling personalization (e.g., user preferences).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Memory storage is organized as a simple table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="n"&gt;n8n_chat_reseach_histories&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;         &lt;span class="nb"&gt;serial&lt;/span&gt;
        &lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt;    &lt;span class="n"&gt;jsonb&lt;/span&gt;        &lt;span class="k"&gt;not&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
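
&lt;p&gt;To get a feel for how such a table behaves, the sketch below mirrors it in an in-memory SQLite database from Python's standard library. This is a stand-in, not the Postgres setup that n8n uses: &lt;code&gt;serial&lt;/code&gt; becomes an autoincrementing integer key and &lt;code&gt;jsonb&lt;/code&gt; becomes a text column holding JSON. The table and helper names are made up for illustration; the &lt;code&gt;type&lt;/code&gt; values &lt;code&gt;human&lt;/code&gt; and &lt;code&gt;ai&lt;/code&gt; match the rows shown later in this post.&lt;/p&gt;

```python
import json
import sqlite3


def open_memory():
    """Create an in-memory table shaped like the Postgres one above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chat_histories ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " session_id VARCHAR(255) NOT NULL,"
        " message TEXT NOT NULL)"
    )
    return conn


def save_message(conn, session_id, role, content):
    """Append one message; role is "human" or "ai"."""
    payload = json.dumps({"type": role, "content": content})
    conn.execute(
        "INSERT INTO chat_histories (session_id, message) VALUES (?, ?)",
        (session_id, payload),
    )


def load_history(conn, session_id):
    """Return (role, content) pairs for one session in insertion order."""
    rows = conn.execute(
        "SELECT message FROM chat_histories WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    messages = [json.loads(row[0]) for row in rows]
    return [(m["type"], m["content"]) for m in messages]


if __name__ == "__main__":
    conn = open_memory()
    print(load_history(conn, "demo-session"))  # first interaction: nothing stored yet
    save_message(conn, "demo-session", "human", "what are the latest news in s3 storage?")
    save_message(conn, "demo-session", "ai", "Here are the latest AWS blog items ...")
    print(load_history(conn, "demo-session"))
```

&lt;p&gt;Filtering by &lt;code&gt;session_id&lt;/code&gt; keeps conversations apart, and ordering by &lt;code&gt;id&lt;/code&gt; returns the turns in the order they were written.&lt;/p&gt;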



&lt;p&gt;Let's check how it is populated with data as requests flow through the different agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Request flow through the AWS news agent:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;what are the latest news in s3 storage?&lt;/p&gt;

&lt;p&gt;Here are the latest AWS blog items related to Amazon S3 storage, with source links:&lt;br&gt;
Amazon SageMaker introduces Amazon S3-based shared storage for enhanced project collaboration. Link: &lt;a href="https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/&lt;/a&gt;&lt;br&gt;
Optimize Amazon EMR runtime for Apache Spark with EMR S3A. Link: &lt;a href="https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/&lt;/a&gt;&lt;br&gt;
A scalable, elastic database and search solution for 1B+ vectors built on LanceDB and Amazon S3. Link: &lt;a href="https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sources:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjev9qnhzg7bd6nai1md0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjev9qnhzg7bd6nai1md0.png" alt="img4.png" width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Param&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;sessionid&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;action&lt;/td&gt;
&lt;td&gt;sendMessage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;chatInput&lt;/td&gt;
&lt;td&gt;what are the latest news in s3 storage?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Next, we see that the agent called &lt;code&gt;Postgres&lt;/code&gt; to load Memory, but since this is the first interaction, the chat history is empty.&lt;br&gt;
We can also trace the agent-to-agent communication.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftx6kbzhfr0hxh8vkw62g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftx6kbzhfr0hxh8vkw62g.png" alt="img5.png" width="800" height="303"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The agent calls the big data and architecture tools to get the latest info about &lt;code&gt;S3&lt;/code&gt;.&lt;br&gt;
Just before the response is returned, the results are persisted in the memory storage:&lt;/p&gt;

&lt;h3&gt;
  
  
  Data from main AI-agent memory
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;session_id&lt;/th&gt;
&lt;th&gt;message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "human", "content": "what are the latest news in s3 storage?. Include links to all used sources.", "additional_kwargs": {}, "response_metadata": {}}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "ai", "content": "Here are the latest AWS blog items related to Amazon S3 storage, with source links:\n\n- Amazon SageMaker introduces Amazon S3-based shared storage for enhanced project collaboration. Link: https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/\n\n- Optimize Amazon EMR runtime for Apache Spark with EMR S3A. Link: https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/\n\n- A scalable, elastic database and search solution for 1B+ vectors built on LanceDB and Amazon S3. Link: https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/\n\nSources:\n- https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/\n- https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/\n- https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/", "tool_calls": [], "additional_kwargs": {}, "response_metadata": {}, "invalid_tool_calls": []}&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Data from latest-news AI-agent memory
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;session_id&lt;/th&gt;
&lt;th&gt;message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;73&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "human", "content": "You are news facts agent. Should provide accurate response based on latest news, in the response add section with links to all sources that you have used.\nPlease provide the latest news related to Amazon S3 storage, with links to all sources.", "additional_kwargs": {}, "response_metadata": {}}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;74&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "ai", "content": "Here are the latest AWS blog items related to Amazon S3 storage:\n\n- Amazon SageMaker introduces Amazon S3 based shared storage for enhanced project collaboration. Link: https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/\n\n- Optimize Amazon EMR runtime for Apache Spark with EMR S3A. Link: https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/\n\n- A scalable, elastic database and search solution for 1B+ vectors built on LanceDB and Amazon S3. Link: https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/\n\nSources\n- https://aws.amazon.com/blogs/big-data/amazon-sagemaker-introduces-amazon-s3-based-shared-storage-for-enhanced-project-collaboration/\n- https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-runtime-for-apache-spark-with-emr-s3a/\n- https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/", "tool_calls": [], "additional_kwargs": {}, "response_metadata": {}, "invalid_tool_calls": []}&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;It is great that n8n stores the history of executions and lets us visualize each flow execution: which path the &lt;code&gt;ai-agent&lt;/code&gt; chose, which tools were invoked, and so on:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fav08fz2rkngm78r086sa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fav08fz2rkngm78r086sa.png" alt="img6.png" width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Request flow through the documentation agent
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;what is the supported knowledge base storages for aws bedrock?&lt;/p&gt;

&lt;p&gt;Here are the storage options AWS Bedrock documents list as supported for Bedrock Knowledge Bases (for Retrieval Augmented Generation), with the official source links. Where possible, I note the exact integration concept and point you to the corresponding AWS docs for details on configuration and fields.&lt;/p&gt;

&lt;p&gt;Amazon OpenSearch Serverless (vector store)&lt;br&gt;
What it is: A vector search collection in OpenSearch Serverless used as Bedrock’s vector store for embeddings.&lt;br&gt;
How it integrates with Bedrock Knowledge Bases: Bedrock connects to an OpenSearch Serverless vector collection and stores embeddings plus text chunks and Bedrock metadata; mappings/fields are defined in the knowledge-base setup flow.&lt;br&gt;
Key docs/fields: Bedrock knowledge-base-setup guidance covers how to configure the OpenSearch Serverless integration.&lt;br&gt;
Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html&lt;/a&gt;&lt;br&gt;
Amazon OpenSearch Service (Managed Clusters) as vector store&lt;br&gt;
...&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Data from main AI-agent memory
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;session_id&lt;/th&gt;
&lt;th&gt;message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "human", "content": "what is the supported knowledge base storages for aws bedrock. Include links to all used sources.", "additional_kwargs": {}, "response_metadata": {}}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "ai", "content": "Here are the storage options AWS Bedrock documents list as supported for Bedrock Knowledge Bases (for Retrieval Augmented Generation), with the official source links. Where possible, I note the exact integration concept and point you to the corresponding AWS docs for details on configuration and fields.\n\n1) Amazon OpenSearch Serverless (vector store)\n- What it is: A vector search collection in OpenSearch Serverless used as Bedrock’s vector store for embeddings.\n- How it integrates with Bedrock Knowledge Bases: Bedrock connects to an OpenSearch Serverless vector collection and stores embeddings plus text chunks and Bedrock metadata; mappings/fields are defined in the knowledge-base setup flow.\n- Key docs/fields: Bedrock knowledge-base-setup guidance covers how to configure the OpenSearch Serverless integration.\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn2%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n2\&lt;/a&gt;) Amazon OpenSearch Service (Managed Clusters) as vector store\n- What it is: A managed OpenSearch Service domain used as a vector store for Bedrock Knowledge Bases.\n- How it integrates with Bedrock Knowledge Bases: Create a vector index on the OpenSearch domain and connect it to the Bedrock KB; mappings include vector index name, vector field, and metadata field, plus Bedrock’s text chunk fields.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (OpenSearch Clusters section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn3%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n3\&lt;/a&gt;) Amazon S3 Vectors (vector store in S3)\n- What it is: Bedrock-supported vector storage that stores embeddings in S3-based vector structures.\n- How it 
integrates with Bedrock Knowledge Bases: Configure an S3 vector bucket/index; Bedrock stores embeddings and metadata and uses defined mappings for fields.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (S3 Vectors section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn4%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n4\&lt;/a&gt;) Amazon Aurora (RDS) as vector store\n- What it is: Aurora database (PostgreSQL-compatible) used as the relational vector store for Bedrock Knowledge Bases.\n- How it integrates with Bedrock Knowledge Bases: Bedrock expects a relational table with embedding/vector, text chunks, and metadata columns; Bedrock maps UI/API fields to the table schema.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (Aurora section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn5%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n5\&lt;/a&gt;) Neptune Analytics graphs (GraphRAG)\n- What it is: Neptune Analytics graphs used with GraphRAG as a Bedrock knowledge base storage option.\n- How it integrates with Bedrock Knowledge Bases: Create a Neptune Analytics graph and connect it to Bedrock KB; configure a vector index on the graph and map Bedrock embeddings/text/metadata to graph fields.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (Neptune Analytics/GraphRAG section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn6%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n6\&lt;/a&gt;) Pinecone (vector store)\n- What it is: Pinecone vector index service used as a Bedrock Knowledge Bases vector store.\n- How it integrates with Bedrock 
Knowledge Bases: Provide Pinecone index name/endpoint and credentials (via Secrets Manager); Bedrock stores embeddings and metadata and maps fields accordingly.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (Pinecone section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn7%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n7\&lt;/a&gt;) Redis Enterprise Cloud (vector store)\n- What it is: Redis Enterprise Cloud used as a Bedrock Knowledge Bases vector store.\n- How it integrates with Bedrock Knowledge Bases: Configure endpoint, vector index, vector field, metadata field; credentials via Secrets Manager and TLS settings as needed.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (Redis Enterprise Cloud section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn8%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n8\&lt;/a&gt;) MongoDB Atlas (vector store)\n- What it is: MongoDB Atlas used as a Bedrock Knowledge Bases vector store.\n- How it integrates with Bedrock Knowledge Bases: Connect to Atlas, specify database/collection and vector index; Bedrock stores embeddings, text chunks, and metadata in the Atlas collection.\n- Key docs/fields: Bedrock knowledge-base-setup guidance (MongoDB Atlas section).\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn9%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n9\&lt;/a&gt;) Kendra GenAI index (special integration path)\n- What it is: Build a Bedrock knowledge base from an Amazon Kendra GenAI index to combine Kendra’s retrieval with Bedrock’s generation.\n- How it integrates with Bedrock Knowledge Bases: Create a 
KB from a Kendra GenAI index to reuse indexed content with Bedrock GenAI.\n- Key docs/fields: Bedrock knowledge-base-build-kendra-genai-index guidance.\n- Source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn%5C%5CnAdditional" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n\\nAdditional&lt;/a&gt; context and related docs\n- Overview and retrieval concepts for Bedrock Knowledge Bases: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html&lt;/a&gt; and &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html\\n-&lt;/a&gt; Prerequisites and setup for vector-store usage: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html\\n-&lt;/a&gt; Build or connect a knowledge base (general flow): &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html\\n-&lt;/a&gt; Kendra GenAI integration path (detailed steps): &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn%5C%5CnWhat%E2%80%99s" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n\\nWhat’s&lt;/a&gt; New / related announcements (contextual confirmations)\n- OpenSearch-based vector storage support for Bedrock Knowledge Bases (GA/availability note): \n  &lt;a 
href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-knowledge-bases-opensearch-cluster-vector-storage/%5C%5Cn-" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-knowledge-bases-opensearch-cluster-vector-storage/\\n-&lt;/a&gt; MongoDB Atlas vector storage support for Bedrock Knowledge Bases:\n  &lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/05/knowledge-bases-amazon-bedrock-mongodb-atlas-vector-storage/%5C%5Cn%5C%5CnIf" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2024/05/knowledge-bases-amazon-bedrock-mongodb-atlas-vector-storage/\\n\\nIf&lt;/a&gt; you’d like, I can distill this into a compact quick-reference table (one line per option with integration point and official doc URL), or provide a per-option summary with exact field names as shown in the Bedrock setup docs. All information above is drawn from official AWS Bedrock documentation and related AWS What’s New posts linked here.", "tool_calls": [], "additional_kwargs": {}, "response_metadata": {}, "invalid_tool_calls": []}&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Data from aws documentation AI-agent memory
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;session_id&lt;/th&gt;
&lt;th&gt;message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "human", "content": "You are research agent. Should provide accurate response based on documentation and statements, in the response add section with links to all sources that you have used.\nPlease provide the official AWS Bedrock documentation on supported knowledge base storages. Specifically, list the storage options (e.g., vector stores or document stores) that Bedrock can use for retrieval-augmented generation, or knowledge bases, including any required formats, integrations (such as S3, Kendra, etc.), and limitations. Include links to all used sources with exact URLs. If multiple AWS services are involved (Kendra, S3, etc.), include their relevant Bedrock integration docs. Provide only verifiable information from AWS docs.", "additional_kwargs": {}, "response_metadata": {}}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "ai", "content": "Below is a consolidated, source-backed view of the official AWS Bedrock documentation on supported knowledge base storages (the vector stores and document stores Bedrock Knowledge Bases can use for Retrieval Augmented Generation, or knowledge bases). For each option, I note what it is, how it integrates with Bedrock Knowledge Bases, any required formats or fields, and any limitations or special notes found in AWS docs. Links to the exact AWS sources are provided at the end.\n\nWhat Bedrock Knowledge Bases can use for storage (summary)\n- Bedrock Knowledge Bases support storing and indexing data in multiple vector stores (and related data stores) to power RAG. Data from your sources is converted into vector embeddings and stored in a chosen vector store; Bedrock then retrieves relevant vectors and uses them to augment generation.\n- You can either connect to a data source directly (unstructured or structured) or use Bedrock’s built-in/managed vector stores; you can also create a knowledge base by connecting to a data source, or build a knowledge base with a Kendra GenAI index.\n- Important notes:\n  - Multimodal data (text plus images, charts, etc.) 
is supported only with Amazon S3 and custom data sources.\n  - Some vector stores support binary embeddings, others only floating-point embeddings; the available options and capabilities depend on the store.\n  - Vector stores require index/collection setup, including mapping fields for embeddings, text chunks, and metadata, as described in each store’s setup.\n  - There are integration previews/GA status and regional availability notes in various sources (see specific entries below).\n- Relevant Bedrock doc sections and integration pages include guides on building and using knowledge bases, setting up vector stores, and special integration options with Kendra GenAI.\n\nDetailed storage options (Bedrock Knowledge Bases) with integration details\n\n1) Amazon OpenSearch Serverless (vector storage)\n- What it is: A vector store option via Amazon OpenSearch Serverless that Bedrock Knowledge Bases can use for vector search indexing of embeddings.\n- How it integrates with Bedrock:\n  - You configure a vector search collection in OpenSearch Serverless and connect it to the Bedrock knowledge base as the vector store.\n  - You must align the vector embedding dimensions with the embeddings model you use.\n  - In Bedrock’s knowledge base setup, you map Bedrock to OpenSearch Serverless, including fields for embeddings, text chunks, and Bedrock-managed metadata.\n- Required formats and fields:\n  - Vector index configuration in OpenSearch Serverless with a vector field (embeddings) and metadata/text fields (e.g., AMAZON_BEDROCK_TEXT_CHUNK, AMAZON_BEDROCK_METADATA) as part of the index mapping.\n  - The embedding space is configured to use the selected embedding model; the vector type is typically a knn_vector with engine faiss and a suitable distance metric (euclidean commonly recommended for floating-point embeddings).\n- Limitations / notes:\n  - Guidance includes specific dimension recommendations and how to map Bedrock’s data into the vector index.\n  - Documentation notes 
the OpenSearch Serverless integration as a supported vector store; see the knowledge base setup materials for exact mapping details.\n- Source:\n  - Build a knowledge base by connecting to a data source (OpenSearch Serverless section) and general knowledge base setup (knowledge-base-setup.html)\n  - Knowledge Base setup text references OpenSearch Serverless as a vector store option and provides detailed steps and field mappings\n  - Knowledge-base-setup URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn2%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n2\&lt;/a&gt;) Amazon OpenSearch Service Managed Clusters (vector storage)\n- What it is: A vector store option via Amazon OpenSearch Service domains (managed OpenSearch clusters) used as a vector store for Bedrock Knowledge Bases.\n- How it integrates with Bedrock:\n  - You create a vector index in the OpenSearch Service domain and connect it to the Bedrock knowledge base.\n  - Bedrock provides mappings for domain ARN, domain endpoint, vector index name, vector field name, and metadata field name.\n- Required formats and fields:\n  - Mappings include vectorIndexName, vectorField (embedding field), and metadataField, plus Bedrock-specific text chunk fields.\n- Limitations / notes:\n  - Prerequisites and permissions for using OpenSearch Managed Clusters are documented; there are guidance notes on permissions, encryption, and indexing requirements.\n  - Similar to OpenSearch Serverless, there are dimensionality requirements and embedding-field configurations you must align with your embeddings model.\n- Source:\n  - Knowledge-base-setup.html (OpenSearch Clusters section)\n  - knowledge-base-setup page content includes both OpenSearch Serverless and OpenSearch Managed Clusters guidance\n  - Knowledge-base-setup URL: &lt;a 
href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn3%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n3\&lt;/a&gt;) Amazon S3 Vectors (vector storage in S3)\n- What it is: A vector store option that stores vector embeddings in S3 Vectors (S3-based vector storage). This is a Bedrock-supported way to keep embeddings in S3 with vector indexing.\n- How it integrates with Bedrock:\n  - You set up an S3 vector bucket and a corresponding vector index; Bedrock stores embeddings and metadata in S3-based vector structures.\n- Required formats and fields:\n  - Vector bucket ARN, vector index ARN, vector index name, and vector field name in Bedrock’s knowledge base setup (as fields to fill when creating the knowledge base).\n  - Metadata fields (e.g., AMZON_BEDROCK_TEXT, AMAZON_BEDROCK_METADATA) and the ability to attach non-filterable metadata.\n  - Dimensions: embedding dimension must be between 1 and 4096; S3 Vectors only supports floating-point embeddings.\n  - The knowledge base builder provides a sample of how to configure the vector index with fields for embeddings and metadata in a JSON-like mapping.\n- Limitations / notes:\n  - S3 Vectors integration is noted as a non-trivial, scalable vector storage solution; it has limits on the vector dimension, supports only floating-point embeddings, and provides metadata handling and filtering.\n  - Metadata limits and filtering behavior are described (e.g., 40 KB per vector metadata with 2 KB filterable portion, etc., in the general guidance). 
See the detailed guidance in the S3 vectors section.\n  - The integration can be in preview or general depending on the time/region; the Bedrock doc references the S3 Vectors integration as a supported store.\n- Source:\n  - Knowledge-base-setup.html (S3 Vectors section)\n  - Knowledge-base-setup HTML content explicitly describes S3 Vectors, vector bucket/ARN, vector index, dimension limits, and metadata handling\n  - Knowledge-base-setup URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn4%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n4\&lt;/a&gt;) Amazon Aurora (RDS) vector store\n- What it is: A Bedrock-supported vector store using Amazon Aurora (PostgreSQL-compatible) as the data store for vector embeddings.\n- How it integrates with Bedrock:\n  - Bedrock expects a relational table to store embeddings, chunks (text), and metadata; you create a table with specific column names for embedding vectors and text chunks, plus a metadata column.\n  - You need to map Bedrock’s UI/API fields to the table’s schema when creating the knowledge base.\n- Required formats and fields:\n  - Columns including embedding (vector), chunks (text), and metadata (Bedrock-managed, plus optional custom metadata as needed).\n  - You must create a DB index on the vector column and text column; optional GIN index on metadata if using custom metadata.\n- Limitations / notes:\n  - The Aurora cluster must reside in the same AWS account as the Bedrock knowledge base.\n  - The table schema is fixed in Bedrock’s guidance; you must provide those fields when creating the knowledge base, and they cannot be updated after creation.\n- Source:\n  - Knowledge-base-setup.html (Aurora section)\n\n5) Neptune Analytics graphs (GraphRAG)\n- What it is: Neptune Analytics graphs used with GraphRAG (a Neptune-based vector-augmented approach) as a Bedrock knowledge base storage 
option.\n- How it integrates with Bedrock:\n  - You create a Neptune Analytics graph and connect it to Bedrock Knowledge Bases; you configure a vector search index on the graph and map Bedrock’s embeddings/text/metadata to the graph’s fields.\n- Required formats and fields:\n  - Graph ARN, vector index dimensions, and Bedrock text/metadata field mappings.\n- Limitations / notes:\n  - The guidance describes how to set up the graph and the vector index, including dimensions matching the embeddings model.\n- Source:\n  - Knowledge-base-setup.html (Neptune Analytics/GraphRAG section)\n\n6) Pinecone\n- What it is: Pinecone as a vector store option for Bedrock Knowledge Bases.\n- How it integrates with Bedrock:\n  - You set up a Pinecone index, provide endpoint URL, and provide credentials (credentials secret ARN) to Bedrock via AWS Secrets Manager.\n- Required formats and fields:\n  - Vector index name, endpoint URL, credentials secret ARN, optional KMS key for decrypting credentials.\n  - Metadata handling: text field for raw chunk text, metadata field for source attribution metadata, optional text search index name, etc.\n  - You must supply a secret in Secrets Manager with the API key for the Pinecone index (and secret ARN for Bedrock to use).\n- Limitations / notes:\n  - Pinecone integration requires providing access credentials securely via Secrets Manager.\n  - You’ll supply metadata/text field mappings for Bedrock to store and retrieve vectors and associated data.\n- Source:\n  - Knowledge-base-setup.html (Pinecone section)\n\n7) Redis Enterprise Cloud\n- What it is: Redis Enterprise Cloud as a vector store option for Bedrock Knowledge Bases.\n- How it integrates with Bedrock:\n  - You configure Redis connection settings via Bedrock (endpoint URL, vector index name, vector field, and metadata field). 
You must provide credentials via Secrets Manager and TLS settings as part of the integration.\n- Required formats and fields:\n  - Endpoint URL, vector index name, vector field, metadata field, and Bedrock-managed metadata naming.\n  - Secrets Manager secret with credentials (username, password, and TLS details, if applicable).\n- Limitations / notes:\n  - TLS and secret configuration requirements are described; you must provide secret values in Secrets Manager for Bedrock to use.\n- Source:\n  - Knowledge-base-setup.html (Redis Enterprise Cloud section)\n\n8) MongoDB Atlas\n- What it is: MongoDB Atlas as a vector store option for Bedrock Knowledge Bases.\n- How it integrates with Bedrock:\n  - You connect to a MongoDB Atlas cluster, configure the database, collection, and vector index; Bedrock will store embeddings, text chunks, and metadata in the Atlas collection.\n- Required formats and fields:\n  - Endpoint URL, database name, collection name, credentials secret ARN for Atlas user, vector index name, vector field name, text field name, metadata field name, optional text search/index fields.\n  - Optional PrivateLink for AWS PrivateLink connectivity.\n- Limitations / notes:\n  - Metadata filtering may require manual configuration in MongoDB Atlas vector index settings; some features require explicit configuration beyond the Bedrock setup.\n- Source:\n  - Knowledge-base-setup.html (MongoDB Atlas section)\n\n9) Other notes on storage options (summary from knowledge base setup)\n- The Bedrock Knowledge Bases setup emphasizes you can connect to various data sources (unstructured or structured). 
It also notes:\n  - You can choose to “set up your own supported vector store” or let Bedrock automatically create a vector store (e.g., via the Console for OpenSearch Serverless).\n  - If you plan to use structured data stores, you can transform queries into structured data language queries (like SQL).\n- OpenAI-style “Kendra GenAI” integration option\n  - Build a Bedrock knowledge base with an Amazon Kendra GenAI index to reuse indexed content and combine Bedrock GenAI with Kendra retrieval capabilities.\n  - This integration is documented as a specific knowledge base-building path.\n- Source:\n  - knowledge-base.html (Overview of knowledge bases)\n  - knowledge-base-build-kendra-genai-index.html (Kendra GenAI integration)\n\nKendra GenAI integration (special case)\n- What it is: A Bedrock knowledge base built from an Amazon Kendra GenAI index to combine Bedrock’s GenAI capabilities with Kendra’s retrieval strengths.\n- How it integrates with Bedrock:\n  - You create a Bedrock knowledge base from a Kendra GenAI index to reuse content across Bedrock applications.\n  - It leverages Kendra’s indexing + Bedrock’s generation capabilities for improved accuracy and retrieval.\n- Source:\n  - knowledge-base-build-kendra-genai-index.html\n\nRelated AWS documentation notes (context and confirmations)\n- OpenSearch-based vector storage integration status and availability:\n  - AWS What’s New: Amazon Bedrock Knowledge Bases now supports Amazon OpenSearch Managed Cluster for vector storage (general availability note).\n  - URL: &lt;a href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-knowledge-bases-opensearch-cluster-vector-storage/%5C%5Cn-" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-knowledge-bases-opensearch-cluster-vector-storage/\\n-&lt;/a&gt; MongoDB Atlas integration for Bedrock Knowledge Bases (vector storage):\n  - AWS What’s New: Knowledge Bases for Amazon Bedrock now supports MongoDB Atlas for 
vector storage (vector storage availability).\n  - URL: &lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/05/knowledge-bases-amazon-bedrock-mongodb-atlas-vector-storage/%5C%5Cn-" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2024/05/knowledge-bases-amazon-bedrock-mongodb-atlas-vector-storage/\\n-&lt;/a&gt; Retrieval Augmented Generation (RAG) guidance and integration options (AWS Prescriptive Guidance):\n  - URL: &lt;a href="https://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/introduction.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/introduction.html\\n-&lt;/a&gt; General Bedrock knowledge base docs (for knowledge bases, setup, build, retrieval):\n  - Knowledge Bases main overview: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html\\n&lt;/a&gt;  - Build a knowledge base by connecting to a data source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html\\n&lt;/a&gt;  - Prerequisites for creating a Bedrock knowledge base with a unstructured data source: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html\\n&lt;/a&gt;  - Retrieving information from data sources using Bedrock Knowledge Bases: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html\\n&lt;/a&gt;  - Installing and using Kendra GenAI index with 
Bedrock Knowledge Bases: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n&lt;/a&gt;  - Prerequisites and setup for using a vector store (OpenSearch, S3 Vectors, etc.): &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5CnSources" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\nSources&lt;/a&gt; (exact URLs used)\n- &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html\\n-&lt;/a&gt; &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html\\n-&lt;/a&gt; &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html\\n-&lt;/a&gt; &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html\\n-&lt;/a&gt; &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n-&lt;/a&gt; &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n-&lt;/a&gt; &lt;a 
href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-knowledge-bases-opensearch-cluster-vector-storage/%5C%5Cn-" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-knowledge-bases-opensearch-cluster-vector-storage/\\n-&lt;/a&gt; &lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/05/knowledge-bases-amazon-bedrock-mongodb-atlas-vector-storage/%5C%5Cn-" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2024/05/knowledge-bases-amazon-bedrock-mongodb-atlas-vector-storage/\\n-&lt;/a&gt; &lt;a href="https://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/introduction.html%5C%5Cn%5C%5CnIf" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/introduction.html\\n\\nIf&lt;/a&gt; you’d like, I can format this into a compact quick-reference with a side-by-side comparison table of each storage option (store name, Bedrock integration points, required Bedrock fields, and notable limitations) and include only the most relevant bullets for quick decision-making.", "tool_calls": [], "additional_kwargs": {}, "response_metadata": {}, "invalid_tool_calls": []}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "human", "content": "You are research agent. Should provide accurate response based on documentation and statements, in the response add section with links to all sources that you have used.\nProvide official AWS Bedrock documentation on supported knowledge base storages. List all storage options Bedrock Knowledge Bases can use for Retrieval Augmented Generation (RAG), including vector stores or document stores, integrations (S3, Kendra GenAI, OpenSearch, etc.), required formats/fields, limitations, and exact URLs to AWS docs. Include separate sources for each option if possible.", "additional_kwargs": {}, "response_metadata": {}}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;15925d170d1e4ce3956667dbb9b172ba&lt;/td&gt;
&lt;td&gt;{"type": "ai", "content": "Official AWS Bedrock documentation: Supported knowledge base storages for Retrieval Augmented Generation (RAG)\n\nThe Bedrock Knowledge Bases support a variety of storage options (vector stores and document stores) that Bedrock can use for RAG. Below is a structured list of each option, what it is, how it integrates with Bedrock Knowledge Bases, the required formats/fields, notable limitations, and the exact AWS docs that describe it. When possible, each item links to separate Bedrock-specific documentation pages; in some cases the same Bedrock page covers multiple options.\n\n1) Amazon OpenSearch Serverless (vector store)\n- What it is\n  - A vector search collection in Amazon OpenSearch Serverless used as the vector store for embedding vectors generated from your data.\n- How it integrates with Bedrock Knowledge Bases\n  - In Bedrock Knowledge Bases, you map to an OpenSearch Serverless vector collection. Bedrock stores embeddings in a vector field and maps accompanying text chunks and Bedrock-managed metadata.\n- Required formats/fields (Bedrock setup guidance)\n  - Create a vector index with:\n    - A vector field (e.g., embeddings) using the faiss engine and an appropriate dimension (dimensions depend on your embedding model; Euclidean distance is recommended for floating-point embeddings).\n    - Metadata fields to pair with vectors (e.g., text chunks and Bedrock metadata).\n  - Mapping examples discuss:\n    - Field for the vector embeddings\n    - Field for the text chunks\n    - Bedrock-managed metadata field\n- Limitations / notes (Bedrock doc context)\n  - OpenSearch Serverless is one of the supported options for vector storage with explicit guidance on how to map Bedrock data into the index.\n  - Requires configuring permissions and collection details in OpenSearch Serverless; Bedrock provides the mapping fields in the knowledge-base setup flow.\n- Bedrock doc source\n  - Knowledge Base setup (OpenSearch Serverless 
section) \n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn2%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n2\&lt;/a&gt;) Amazon OpenSearch Service (Managed Clusters) as vector store\n- What it is\n  - A managed OpenSearch cluster (OpenSearch Service domain) used as a vector store for Bedrock Knowledge Bases.\n- How it integrates with Bedrock Knowledge Bases\n  - You create a vector index on the OpenSearch domain and connect it to the Bedrock knowledge base. Bedrock requires mappings for:\n    - Domain ARN, domain endpoint\n    - Vector index name, vector field, and metadata field\n- Required formats/fields (Bedrock setup guidance)\n  - Mappings include:\n    - vectorIndexName\n    - vectorField (embedding field)\n    - metadataField\n    - Bedrock text chunk and Bedrock metadata fields\n- Limitations / notes\n  - Prerequisites include required IAM permissions and domain configuration. Guidance covers encryption, indexing requirements, and domain capacity considerations.\n  - Dimensionality and embedding-field configurations must align with your embedding model (including K-NN index considerations when supported).\n- Bedrock doc source\n  - Knowledge Base setup (OpenSearch Clusters section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn3%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n3\&lt;/a&gt;) Amazon S3 Vectors (vector store in S3)\n- What it is\n  - Vector storage in Amazon S3 using S3 Vectors (Bedrock-supported) to hold embeddings and related metadata.\n- How it integrates with Bedrock Knowledge Bases\n  - You configure an S3 vector bucket and a corresponding vector index. 
Bedrock stores embeddings and metadata in S3-based vector structures and uses a defined mapping for fields.\n- Required formats/fields\n  - Vector bucket ARN, vector index ARN, vector index name, and vector field name\n  - Metadata fields (Bedrock-managed) and text chunk fields\n  - Embedding dimension constraints (1 to 4096); only floating-point embeddings are supported\n  - Sample mappings show fields for:\n    - embeddings field\n    - text chunk field\n    - metadata field\n- Limitations / notes\n  - S3 Vectors integration is noted as a supported (and scalable) vector store, but described with several constraints:\n    - Preview status (as of documentation) and ongoing availability notes\n    - Dimension limits and floating-point embeddings only\n    - Metadata handling and filtering limitations (e.g., 40 KB per vector metadata with 2 KB filterable portion)\n- Bedrock doc source\n  - Knowledge Base setup (S3 Vectors section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn4%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n4\&lt;/a&gt;) Amazon Aurora (RDS) vector store\n- What it is\n  - A Bedrock-supported vector store using Amazon Aurora (PostgreSQL-compatible) as the data store for embeddings.\n- How it integrates with Bedrock Knowledge Bases\n  - Bedrock expects a relational table with:\n    - An embedding/vector column\n    - A text chunks column\n    - A metadata column (Bedrock-managed, plus optional custom metadata)\n  - Bedrock maps its UI/API fields to the table schema during knowledge base creation.\n- Required formats/fields\n  - Relational table with columns for:\n    - embedding vector\n    - text chunks\n    - metadata\n  - Optional metadata filtering/indexing (e.g., GIN index)\n- Limitations / notes\n  - Aurora cluster must be in the same AWS account as the Bedrock knowledge base.\n  - The table schema is fixed per 
Bedrock guidance and cannot be updated after creation.\n- Bedrock doc source\n  - Knowledge Base setup (Aurora section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn5%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n5\&lt;/a&gt;) Neptune Analytics graphs (GraphRAG)\n- What it is\n  - Neptune Analytics graphs used with GraphRAG as a Bedrock knowledge base storage option.\n- How it integrates with Bedrock Knowledge Bases\n  - Create a Neptune Analytics graph and connect it to Bedrock Knowledge Bases; configure a vector index on the graph and map Bedrock embeddings/text/metadata to the graph’s fields.\n- Required formats/fields\n  - Graph ARN, vector index dimensions, and Bedrock text/metadata field mappings\n- Limitations / notes\n  - Guidance covers graph/vector index setup and dimension matching to embedding models.\n- Bedrock doc source\n  - Knowledge Base setup (Neptune Analytics/GraphRAG section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn6%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n6\&lt;/a&gt;) Pinecone (vector store)\n- What it is\n  - Pinecone as a dedicated vector index service to store and query embeddings for Bedrock Knowledge Bases.\n- How it integrates with Bedrock Knowledge Bases\n  - Bedrock references a Pinecone index (name), endpoint URL, and credentials stored in AWS Secrets Manager (secret ARN; optional KMS key for decryption).\n- Required formats/fields\n  - Vector index name\n  - Endpoint URL\n  - Secrets Manager credentials secret ARN (and optional KMS decryption key)\n  - Metadata/text fields to store the raw chunk text and source metadata\n  - Optional text-search index name\n- Limitations / notes\n  - Credentials must be provided securely via Secrets Manager\n  - 
Metadata/text field mappings must be supplied to Bedrock for storage and retrieval\n- Bedrock doc source\n  - Knowledge Base setup (Pinecone section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn7%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n7\&lt;/a&gt;) Redis Enterprise Cloud (vector store)\n- What it is\n  - Redis Enterprise Cloud as a vector store option for Bedrock Knowledge Bases.\n- How it integrates with Bedrock Knowledge Bases\n  - Bedrock connects to Redis, providing endpoint URL, vector index name, vector field, and metadata field; credentials via Secrets Manager and TLS settings are configured as part of the integration.\n- Required formats/fields\n  - Endpoint URL\n  - Vector index name\n  - Vector field\n  - Metadata field\n  - Bedrock-managed metadata naming\n  - Secrets Manager secret with credentials (and TLS details if applicable)\n- Limitations / notes\n  - TLS and secret configuration requirements; secrets must be provided to Bedrock\n- Bedrock doc source\n  - Knowledge Base setup (Redis Enterprise Cloud section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn8%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n8\&lt;/a&gt;) MongoDB Atlas (vector store)\n- What it is\n  - MongoDB Atlas as a vector store option for Bedrock Knowledge Bases.\n- How it integrates with Bedrock Knowledge Bases\n  - Connect to a MongoDB Atlas cluster, configure database/collection and vector index; Bedrock stores embeddings, text chunks, and metadata in the Atlas collection.\n- Required formats/fields\n  - Endpoint URL\n  - Database name\n  - Collection name\n  - Credentials secret ARN for Atlas user\n  - Vector index name\n  - Vector field name\n  - Text field name\n  - Metadata field name\n  - 
Optional PrivateLink connectivity\n- Limitations / notes\n  - Metadata filtering may require additional Atlas vector index configuration; some features may require explicit setup beyond Bedrock’s default guidance\n- Bedrock doc source\n  - Knowledge Base setup (MongoDB Atlas section)\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn%5C%5Cn9%5C" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n\\n9\&lt;/a&gt;) Kendra GenAI index (special integration path for Bedrock)\n- What it is\n  - Build a Bedrock knowledge base from an Amazon Kendra GenAI index to combine Bedrock GenAI with Kendra’s retrieval capabilities.\n- How it integrates with Bedrock Knowledge Bases\n  - The Bedrock knowledge base reuses content indexed in a Kendra GenAI index; benefits include reusing indexed content and leveraging Bedrock’s generation features with Kendra’s retrieval strength.\n- Required formats/fields\n  - You build a Bedrock KB from a Kendra GenAI index; the process is documented in Bedrock’s KB build flow for Kendra GenAI\n- Limitations / notes\n  - This is a specialized integration path that leverages Kendra GenAI indexing for Bedrock KB creation\n- Bedrock doc source\n  - Knowledge Base build path for Kendra GenAI index\n  - URL: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn%5C%5CnNotes" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n\\nNotes&lt;/a&gt; on broader context (RAG and Bedrock knowledge bases)\n- RAG in Bedrock relies on combining retrieved source content with generation. 
Bedrock provides operations such as Retrieve, RetrieveAndGenerate, and GenerateQuery to support the RAG workflow; see the Bedrock knowledge base retrieval overview for context.\n  - Bedrock retrieval and RAG overview: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html\\n&lt;/a&gt;  - General knowledge base concepts (overview, building, prerequisites, etc.): &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html\\n&lt;/a&gt;  - Prerequisites for unstructured data sources and vector stores: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html\\n&lt;/a&gt;  - Build or connect a knowledge base (data source connections): &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build.html\\n&lt;/a&gt;  - Build a knowledge base with Kendra GenAI index: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn%5C%5CnSources" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n\\nSources&lt;/a&gt; (exact URLs used for the above options)\n- Knowledge Base setup (general, includes multiple storage options)\n  - &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html\\n-&lt;/a&gt; Knowledge Base build: Kendra GenAI index 
path\n  - &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n-&lt;/a&gt; Knowledge Bases overview and retrieval capabilities\n  - &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html%5C%5Cn" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html\\n&lt;/a&gt;  - &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-retrieval.html\\n-&lt;/a&gt; Prerequisites for a Bedrock knowledge base with unstructured data (vector store setup)\n  - &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html%5C%5Cn-" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-prereq.html\\n-&lt;/a&gt; Build a knowledge base with a Kendra GenAI index (detailed steps)\n  - &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html%5C%5Cn%5C%5CnIf" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-build-kendra-genai-index.html\\n\\nIf&lt;/a&gt; you’d like a quick-reference comparison, I can format this into a compact side-by-side table listing: storage option, Bedrock integration point, required Bedrock fields, known limitations, and the exact source URL for each. All information above is drawn directly from the official Bedrock documentation pages cited.", "tool_calls": [], "additional_kwargs": {}, "response_metadata": {}, "invalid_tool_calls": []}&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Checking consumption of tokens and switching between different models
&lt;/h3&gt;

&lt;p&gt;Another practical concern is token usage.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using a large model for every task is wasteful.&lt;/li&gt;
&lt;li&gt;Instead, agents can switch between models dynamically:
&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;Nano GPT-5.1&lt;/code&gt; (cheap, fast) for small lookups.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;AWS Bedrock Claude 3.5 Sonnet&lt;/code&gt; (accurate, large context) for reasoning-heavy prompts.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This hybrid approach optimizes both cost and performance.&lt;/p&gt;
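
&lt;p&gt;As a rough illustration of the switching idea, the routing decision can be expressed as a small function. This is a minimal sketch and not part of the n8n workflow itself: the model identifiers, the four-characters-per-token estimate, and the &lt;code&gt;token_limit&lt;/code&gt; threshold are all assumptions chosen for the example.&lt;/p&gt;

```python
# Hypothetical model identifiers; replace with the models configured in your flow.
SMALL_MODEL = "small-fast-model"
LARGE_MODEL = "large-reasoning-model"


def estimate_tokens(prompt: str) -> int:
    """Very rough heuristic (assumption): about four characters per token."""
    return max(1, len(prompt) // 4)


def pick_model(prompt: str, needs_reasoning: bool, token_limit: int = 200) -> str:
    """Route long or reasoning-heavy prompts to the large model, the rest to the small one."""
    if needs_reasoning or estimate_tokens(prompt) > token_limit:
        return LARGE_MODEL
    return SMALL_MODEL


print(pick_model("What is S3?", needs_reasoning=False))
print(pick_model("Compare three vector stores for RAG", needs_reasoning=True))
```

&lt;p&gt;In a real flow the estimate would more likely come from the token counts that the model provider reports for each call than from a character heuristic.&lt;/p&gt;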

&lt;h2&gt;
  
  
  The flow context data is fully customizable
&lt;/h2&gt;

&lt;p&gt;The extracted data and the accumulated LLM response can be routed to any other flow, custom node, chat output, etc.&lt;br&gt;
Here, for every prompt, besides returning the response to the chat, we also create an &lt;code&gt;rtf&lt;/code&gt; document and persist it on the file system&lt;br&gt;
for further review and use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopkievslyrsyzxbls5p7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopkievslyrsyzxbls5p7.png" alt="img7.png" width="800" height="291"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;What we’ve built is a full end-to-end multi-agent ecosystem—all without writing a single line of code. By leveraging n8n’s no-code orchestration, the system allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Natural interaction: Agents collaborate dynamically based on user prompts, without predefined rigid flows.&lt;/li&gt;
&lt;li&gt;Dedicated integrations: Each agent can connect to specialized MCP servers, custom tools, or even other agents, extending its knowledge far beyond the base LLM.&lt;/li&gt;
&lt;li&gt;Persistent memory: All interactions and context are stored in memory, so agents can build on previous sessions instead of starting from scratch.&lt;/li&gt;
&lt;li&gt;Transparency and control: With UI-based execution dumps, we can inspect how decisions were made, track history, and debug workflows visually.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This setup proves that multi-agent systems don’t have to be locked away in research papers—they can be practical, maintainable, and production-ready, combining Bedrock models, lightweight GPTs, and n8n’s no-code tools into a flexible AI ecosystem that feels less like a chatbot and more like a team of digital experts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/tsypuk/aws-news?tab=readme-ov-file" rel="noopener noreferrer"&gt;https://github.com/tsypuk/aws-news?tab=readme-ov-file&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/discover/what-are-ai-agents" rel="noopener noreferrer"&gt;https://cloud.google.com/discover/what-are-ai-agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/bedrock/" rel="noopener noreferrer"&gt;https://aws.amazon.com/bedrock/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://n8n.io/" rel="noopener noreferrer"&gt;https://n8n.io/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>aws</category>
      <category>automation</category>
    </item>
    <item>
      <title>AWS OpenSearch Full Documents Reindexing: When? Why? How?</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Mon, 22 Sep 2025 06:02:00 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-opensearch-full-documents-reindexing-when-why-how-20c3</link>
      <guid>https://dev.to/aws-builders/aws-opensearch-full-documents-reindexing-when-why-how-20c3</guid>

&lt;h2&gt;
  
  
  OpenSearch Brief History: The Forking of Elasticsearch
&lt;/h2&gt;

&lt;p&gt;Elasticsearch is an open-source search engine developed by Elastic NV. It became incredibly popular due to its scalability, distributed nature, and powerful search capabilities. However, in 2021, Elastic NV changed the licensing model of Elasticsearch from &lt;strong&gt;Apache 2.0&lt;/strong&gt; to a &lt;strong&gt;Server Side Public License (SSPL)&lt;/strong&gt;.&lt;br&gt;
This move was made to prevent cloud providers from offering Elasticsearch as a managed service without contributing back to the open-source community.&lt;/p&gt;

&lt;p&gt;In response, AWS forked the last &lt;strong&gt;Apache 2.0-licensed&lt;/strong&gt; version of Elasticsearch to create the OpenSearch project, and its managed offering became Amazon OpenSearch Service. This fork not only preserved the open-source nature of the software but also allowed AWS to continue offering a managed search service with full control over its development.&lt;/p&gt;
&lt;h2&gt;
  
  
  Licensing
&lt;/h2&gt;

&lt;p&gt;After the license change, Elasticsearch is now under the SSPL, which is not recognized as an open-source license by the Open Source Initiative (OSI). The SSPL imposes restrictions on how the software can be used, particularly for cloud services.&lt;/p&gt;

&lt;p&gt;Elastic offers its own Elastic Cloud, which can be deployed on the resources of any public cloud provider or on-premises.&lt;/p&gt;

&lt;p&gt;AWS OpenSearch: OpenSearch remains under the Apache 2.0 license, which is fully open-source. This means anyone can use, modify, and distribute the software without restrictions, making it more attractive for users who prefer open-source solutions.&lt;/p&gt;
&lt;h3&gt;
  
  
  Understanding AWS Elasticsearch: Internal Implementation of Indices and Reindexing
&lt;/h3&gt;

&lt;p&gt;Amazon Elasticsearch Service (Amazon ES), now known as Amazon OpenSearch Service, is a managed service that makes it easy to deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. It’s&lt;br&gt;
widely used for real-time application monitoring, log analytics, full-text search, and more. One of the core components of Elasticsearch is its indices, which serve as the primary structure for&lt;br&gt;
storing and querying data. This article delves into the internal implementation of indices in AWS Elasticsearch and provides a guide on how to reindex data within this environment.&lt;/p&gt;
&lt;h3&gt;
  
  
  What is an Index in Elasticsearch?
&lt;/h3&gt;

&lt;p&gt;An index in Elasticsearch is akin to a database in traditional relational databases. It contains a collection of documents that are stored and managed together. Each document is a JSON object, and&lt;br&gt;
each field within the document is a data point that can be searched and analyzed.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Components of an Elasticsearch Index:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Shards: An index is divided into smaller pieces called shards. Each shard is a self-contained, fully functional instance of Lucene, the underlying search engine for Elasticsearch.&lt;/li&gt;
&lt;li&gt;Replicas: For high availability and fault tolerance, Elasticsearch allows you to create replica shards. These are copies of the primary shards that can serve search requests in case the primary shard fails.&lt;/li&gt;
&lt;li&gt;Mappings: Mappings define the structure of the documents within an index, including the data types of fields and how they should be indexed and stored.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Internal Implementation of Indices in AWS Elasticsearch
&lt;/h3&gt;

&lt;p&gt;AWS Elasticsearch handles indices similarly to a standard Elasticsearch deployment but with additional layers of management, security, and scaling capabilities provided by AWS.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Aspects of AWS Elasticsearch Indices:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Managed Clusters: AWS Elasticsearch manages the underlying infrastructure, including node provisioning, shard allocation, and index replication.&lt;/li&gt;
&lt;li&gt;Scaling: Indices can be scaled horizontally by adjusting the number of shards or adding more nodes, and vertically by increasing the instance size.&lt;/li&gt;
&lt;li&gt;Security: AWS integrates Elasticsearch with other AWS services like AWS IAM, AWS KMS for encryption, and VPC for network isolation, ensuring that your indices are secure and accessible only to
authorized users.&lt;/li&gt;
&lt;li&gt;Snapshots: AWS Elasticsearch provides automated snapshots for indices, which are stored in Amazon S3 and can be used for backup and recovery purposes.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Why Reindexing is Important
&lt;/h3&gt;

&lt;p&gt;Reindexing in Elasticsearch is the process of copying the data from one index to another. This is often necessary when you need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Change the structure of the index, such as altering mappings (changing types).&lt;/li&gt;
&lt;li&gt;Improve performance by reconfiguring the number of shards or replicas (or by merging or splitting indices).&lt;/li&gt;
&lt;li&gt;Upgrade Elasticsearch versions that might require data format changes.&lt;/li&gt;
&lt;li&gt;Introduce new fields in documents and make them available for search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reindexing can be a resource-intensive operation, and AWS Elasticsearch provides tools and best practices to ensure that it’s done efficiently without disrupting service availability.&lt;/p&gt;
&lt;h2&gt;
  
  
  Steps to Reindex in AWS Elasticsearch
&lt;/h2&gt;

&lt;p&gt;Here’s a step-by-step guide to reindexing an index in AWS Elasticsearch:&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Create the Target Index
&lt;/h3&gt;

&lt;p&gt;Before you start reindexing, you need to create the target index with the desired mappings, settings, and shard configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/new-index&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"settings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"number_of_shards"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"number_of_replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mappings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"field1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"field2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"date"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
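
&lt;p&gt;To make the shard settings concrete: &lt;code&gt;number_of_replicas&lt;/code&gt; applies to every primary shard, so the total number of shard copies the cluster has to place is larger than &lt;code&gt;number_of_shards&lt;/code&gt;. The helper below is only an illustrative sketch of that arithmetic; the function name is hypothetical.&lt;/p&gt;

```python
def shard_copies(number_of_shards: int, number_of_replicas: int) -> dict:
    """Count the shard copies the cluster has to place for one index."""
    primaries = number_of_shards
    # Each primary shard gets number_of_replicas replica copies.
    replicas = number_of_shards * number_of_replicas
    return {"primaries": primaries, "replicas": replicas, "total": primaries + replicas}


# Values taken from the settings in the request above.
print(shard_copies(number_of_shards=3, number_of_replicas=1))
# {'primaries': 3, 'replicas': 3, 'total': 6}
```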



&lt;h3&gt;
  
  
  Step 2: Use the Reindex API
&lt;/h3&gt;

&lt;p&gt;Elasticsearch provides a _reindex API that allows you to copy data from the source index to the target index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_reindex&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"old-index"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new-index"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Monitor the Reindexing Process
&lt;/h3&gt;

&lt;p&gt;Reindexing can take time depending on the size of your data. You can monitor the progress using the Task API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_tasks?detailed=&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="err"&gt;&amp;amp;actions=*reindex&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
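
&lt;p&gt;The task status returned for a reindex contains counters such as &lt;code&gt;total&lt;/code&gt;, &lt;code&gt;created&lt;/code&gt;, &lt;code&gt;updated&lt;/code&gt;, and &lt;code&gt;deleted&lt;/code&gt;, and progress can be estimated by comparing the sum of the last three with &lt;code&gt;total&lt;/code&gt;. A minimal sketch of that calculation follows; the sample status values are made up for illustration, and fetching the response from the cluster is left out.&lt;/p&gt;

```python
def reindex_progress(status: dict) -> float:
    """Estimate percent complete from the status object of a reindex task."""
    total = status.get("total", 0)
    if total == 0:
        return 0.0
    done = status.get("created", 0) + status.get("updated", 0) + status.get("deleted", 0)
    return round(100.0 * done / total, 1)


# Hypothetical status, shaped like the "status" object of a reindex task.
sample_status = {"total": 1000, "created": 250, "updated": 0, "deleted": 0}
print(reindex_progress(sample_status))  # 25.0
```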



&lt;h3&gt;
  
  
  Step 4: Update Aliases (Optional)
&lt;/h3&gt;

&lt;p&gt;Once reindexing is complete, you may want to switch an alias to point to the new index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_aliases&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"actions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"remove"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"old-index"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"alias"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-alias"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"add"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new-index"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"alias"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-alias"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
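
&lt;p&gt;Because both actions are sent in a single &lt;code&gt;_aliases&lt;/code&gt; request, the switch is applied atomically, so clients never see the alias pointing at no index. If the swap is scripted, the request body can be generated instead of written by hand. The helper below is a sketch with a hypothetical function name; it only builds the JSON body and does not call the cluster.&lt;/p&gt;

```python
import json


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST /_aliases that moves an alias to a new index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


print(json.dumps(alias_swap_body("my-alias", "old-index", "new-index"), indent=2))
```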



&lt;h3&gt;
  
  
  Step 5: Delete the Old Index (Optional)
&lt;/h3&gt;

&lt;p&gt;After verifying that the new index is functioning correctly, you can delete the old index to free up resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;DELETE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/old-index&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;During reindexing, it is also possible to modify the documents themselves by adding a script section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;_reindex&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"source"&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"dest"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"destination"&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"script"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"lang"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"painless"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"ctx._source.account.number++"&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Technique without index migration (relevant for smaller clusters)
&lt;/h2&gt;

&lt;p&gt;If the data volume is not high, you can avoid a full reindex. Once the index template is updated, all new indices will be created from the new template (with the new fields available for search).&lt;br&gt;
For all existing indices, we need to update the index mapping so that any documents added or updated in these indices follow the latest schema.&lt;/p&gt;

&lt;p&gt;For older documents, we can run a scripted update with a predicate that selects the documents to patch and sets default values for the new fields:&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Patch the index template
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;_template/&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;index&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...full_body&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;new&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;mappings&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Add the new mapping to each existing index
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;index&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;/_mappings&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"new_field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"null_value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"boolean"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Patch existing documents with a scripted update
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;index&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;/_update_by_query&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"must_not"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"exists"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"field"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new_field"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"script"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ctx._source.new_field = false"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"lang"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"painless"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
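
&lt;p&gt;When several fields need to be backfilled, the same request body can be generated per field. The sketch below is one possible variation and not the exact request used above: it passes the field name and default value through script &lt;code&gt;params&lt;/code&gt; instead of hard-coding them in the Painless source. The function name is hypothetical, and sending the body to &lt;code&gt;_update_by_query&lt;/code&gt; is left out.&lt;/p&gt;

```python
import json


def backfill_body(field: str, default) -> dict:
    """Build an _update_by_query body that sets a default where the field is missing."""
    return {
        # Predicate: only documents that do not have the field yet.
        "query": {"bool": {"must_not": {"exists": {"field": field}}}},
        "script": {
            "lang": "painless",
            # Field name and value are passed as params rather than inlined.
            "source": "ctx._source[params.field] = params.value",
            "params": {"field": field, "value": default},
        },
    }


print(json.dumps(backfill_body("new_field", False), indent=2))
```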



&lt;h2&gt;
  
  
  Best Practices for Reindexing in AWS Elasticsearch
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Snapshot Before Reindexing: Always take a snapshot of your data before starting the reindexing process to safeguard against data loss.&lt;/li&gt;
&lt;li&gt;Monitor Cluster Health: Keep an eye on the cluster’s health during reindexing to avoid overwhelming the system.&lt;/li&gt;
&lt;li&gt;Use Aliases: Aliases can help minimize downtime by allowing you to switch indices without changing your application code.&lt;/li&gt;
&lt;li&gt;Test in Staging: Before reindexing in production, test the process in a staging environment to catch any potential issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Reindexing in AWS Elasticsearch is a crucial operation for maintaining and optimizing your search infrastructure. By understanding the internal implementation of indices and following best practices&lt;br&gt;
for reindexing, you can ensure that your Elasticsearch environment remains robust, scalable, and ready to meet the demands of your applications. AWS Elasticsearch, with its managed capabilities and&lt;br&gt;
tight integration with other AWS services, provides a powerful platform for deploying and managing your search workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  References (Links)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/remote-reindex.html" rel="noopener noreferrer"&gt;Amazon OpenSearch Indices migration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>database</category>
      <category>aws</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS-nuke controlled resources cleanup in multiple aws accounts</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Mon, 22 Sep 2025 06:01:29 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-nuke-controlled-resources-cleanup-in-multiple-aws-accounts-313d</link>
      <guid>https://dev.to/aws-builders/aws-nuke-controlled-resources-cleanup-in-multiple-aws-accounts-313d</guid>
      <description>&lt;p&gt;AWS-Nuke is an open-source tool, designed to automate the removal of AWS resources from an account. It can be used to quickly and safely delete all resources in a given AWS account, or it can be customized to delete specific types of resources based on user-defined rules.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is AWS-Nuke?
&lt;/h3&gt;

&lt;p&gt;Whether you’re spinning up test environments or decommissioning an old project, AWS-Nuke is particularly useful for keeping AWS accounts clean, ensuring that old, unused resources do not pile up, leading to cost overruns, security risks, and unnecessary complexity.&lt;/p&gt;

&lt;p&gt;AWS-Nuke is a command-line tool that supports multiple AWS services and can be run from any machine that has access to AWS credentials.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Use AWS-Nuke?
&lt;/h3&gt;

&lt;p&gt;Before diving into the technical details of AWS-Nuke, it’s important to understand why this tool is useful.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cost Management: One of the biggest challenges in cloud environments is managing costs. AWS charges based on the resources you use, and sometimes resources that are no longer in use are forgotten and left running. By using AWS-Nuke, you can regularly clean up old resources, ensuring that you only pay for what you actively use.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Environment Cleanup: In development and test environments, resources are often created, modified, and left without proper deletion. AWS-Nuke can help by regularly purging these environments of unnecessary resources, making it easier to manage infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Security: Resources left running for extended periods can pose security risks. For instance, unused EC2 instances, security groups, or S3 buckets might expose your environment to attacks. AWS-Nuke helps minimize such risks by cleaning up unused resources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Automation: AWS-Nuke offers automation capabilities, making it easier to integrate into continuous integration (CI) pipelines or scheduled jobs for periodic account cleanups.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simplification: With hundreds of AWS services and resources to manage, AWS-Nuke simplifies the task of cleaning up resources by providing a centralized and automated way to perform deletions.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How AWS-Nuke Works
&lt;/h3&gt;

&lt;p&gt;AWS-Nuke works by scanning all the resources in your AWS account and then deleting them based on a configuration file. You can configure AWS-Nuke to delete specific types of resources or to leave certain critical resources untouched.&lt;/p&gt;

&lt;p&gt;AWS-Nuke uses the AWS SDK to list and delete resources. It can handle a wide range of AWS services, including but not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EC2 Instances&lt;/li&gt;
&lt;li&gt;S3 Buckets&lt;/li&gt;
&lt;li&gt;RDS Instances&lt;/li&gt;
&lt;li&gt;IAM Users and Roles&lt;/li&gt;
&lt;li&gt;VPCs&lt;/li&gt;
&lt;li&gt;Elastic Load Balancers&lt;/li&gt;
&lt;li&gt;CloudWatch Alarms&lt;/li&gt;
&lt;li&gt;CloudFormation Stacks&lt;/li&gt;
&lt;li&gt;ECR Repositories&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Configuration
&lt;/h3&gt;

&lt;p&gt;AWS-Nuke uses a YAML configuration file to specify what resources should or should not be deleted. This configuration file is highly customizable, allowing users to target specific AWS resources while leaving others intact.&lt;/p&gt;

&lt;p&gt;Here’s a simple example of an AWS-Nuke configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;regions:
  - "us-east-1"
  - "us-west-2"

resource-types:
  excludes:
    - EC2::KeyPair
    - IAM::Role
    - IAM::User
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This configuration specifies that AWS-Nuke should only run in the us-east-1 and us-west-2 regions, and it will exclude deleting EC2 Key Pairs, IAM Roles, and IAM Users.&lt;/p&gt;
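
&lt;p&gt;The effect of the exclusion list can be pictured as a simple filter over the resource types the tool knows about. The sketch below only illustrates that rule and is not AWS-Nuke's own code: the config is written as a Python dictionary mirroring the YAML, the type names follow the example above, and the list of known types is made up.&lt;/p&gt;

```python
def types_to_scan(all_types: list, config: dict) -> list:
    """Return the resource types that remain after applying resource-types.excludes."""
    excluded = set(config.get("resource-types", {}).get("excludes", []))
    return [t for t in all_types if t not in excluded]


# Dictionary form of the example configuration.
config = {
    "regions": ["us-east-1", "us-west-2"],
    "resource-types": {"excludes": ["EC2::KeyPair", "IAM::Role", "IAM::User"]},
}

# Hypothetical list of known resource types, for illustration only.
known_types = ["EC2::Instance", "EC2::KeyPair", "S3::Bucket", "IAM::Role", "IAM::User"]
print(types_to_scan(known_types, config))
# ['EC2::Instance', 'S3::Bucket']
```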

&lt;h3&gt;
  
  
  Dry-Run Mode
&lt;/h3&gt;

&lt;p&gt;One of AWS-Nuke's most useful features is the dry-run mode, which shows which resources would be deleted without actually deleting them. This lets you review the results of a cleanup before making any irreversible changes.&lt;/p&gt;

&lt;p&gt;Dry-run is the default mode, so no extra flag is needed; resources are deleted only when &lt;code&gt;--no-dry-run&lt;/code&gt; is passed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws-nuke &lt;span class="nt"&gt;--config&lt;/span&gt; config.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command outputs the list of resources that AWS-Nuke would delete based on the configuration file, without performing the deletions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Running AWS-Nuke
&lt;/h3&gt;

&lt;p&gt;Once you’re confident in your configuration, running AWS-Nuke is simple. First, ensure you have the correct AWS credentials for the account you want to clean up. Then add the &lt;code&gt;--no-dry-run&lt;/code&gt; flag to the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws-nuke &lt;span class="nt"&gt;--config&lt;/span&gt; config.yml &lt;span class="nt"&gt;--no-dry-run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After you confirm the account alias, this command deletes all resources in the account, except for the ones excluded in the configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resource Dependencies
&lt;/h3&gt;

&lt;p&gt;AWS-Nuke copes with resource dependencies by retrying: a deletion that fails because another resource still blocks it is attempted again on a later pass. For example, an S3 bucket can only be removed once the objects in it are gone. This does not always succeed, as the IAM example later in this post shows.&lt;/p&gt;

&lt;p&gt;AWS-Nuke can be installed with Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;aws-nuke
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After checking, the version installed from the brew repo turned out not to accept the &lt;code&gt;-c&lt;/code&gt; flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; aws-nuke &lt;span class="nt"&gt;-c&lt;/span&gt; config/nuke-config.yml &lt;span class="nt"&gt;--profile&lt;/span&gt; technology
Incorrect Usage: flag provided but not defined: &lt;span class="nt"&gt;-c&lt;/span&gt;

NAME:
   aws-nuke - remove everything from an aws account

USAGE:
   aws-nuke &lt;span class="o"&gt;[&lt;/span&gt;global options] &lt;span class="nb"&gt;command&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;command &lt;/span&gt;options]

VERSION:
   3.22.0

AUTHOR:
   Erik Kristensen &amp;lt;erik@erikkristensen.com&amp;gt;

FATA[0000] flag provided but not defined: &lt;span class="nt"&gt;-c&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So I will use the latest version built from sources.&lt;/p&gt;

&lt;h2&gt;
  
  
  Contributing to aws-nuke
&lt;/h2&gt;

&lt;p&gt;When running aws-nuke locally with an AWS profile, I got an error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws_nuke_v2 &lt;span class="nt"&gt;-c&lt;/span&gt; config/nuke-config.yaml &lt;span class="nt"&gt;--profile&lt;/span&gt; personal_profile
Error: failed get &lt;span class="nb"&gt;caller &lt;/span&gt;identity: NoCredentialProviders: no valid providers &lt;span class="k"&gt;in &lt;/span&gt;chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors

Process finished with the &lt;span class="nb"&gt;exit &lt;/span&gt;code 255
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After checking the sources, I found out that the profile parameter is declared for aws-nuke but not implemented.&lt;br&gt;
So I added this feature to aws-nuke:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Profile&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;awsSDKConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LoadDefaultConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TODO&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;awsSDKConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithSharedConfigProfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Profile&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"unable to load SDK config, %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Infof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"using profile %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Profile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c"&gt;/// Extract credentials from the configuration&lt;/span&gt;
            &lt;span class="n"&gt;cred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Credentials&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TODO&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"unable to retrieve credentials, %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AccessKeyID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cred&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AccessKeyID&lt;/span&gt;
            &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SecretAccessKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cred&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SecretAccessKey&lt;/span&gt;
            &lt;span class="n"&gt;defaultRegion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Region&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HasKeys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HasProfile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AccessKeyID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AWS_ACCESS_KEY_ID"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SecretAccessKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AWS_SECRET_ACCESS_KEY"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I also opened a contribution PR to aws-nuke: &lt;a href="https://github.com/rebuy-de/aws-nuke/pull/1263/files" rel="noopener noreferrer"&gt;https://github.com/rebuy-de/aws-nuke/pull/1263/files&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now aws-nuke correctly processes the &lt;code&gt;--profile&lt;/code&gt; CLI argument and is ready to run.&lt;/p&gt;

&lt;h3&gt;
  
  
  Working with implemented profile feature:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;nuke&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;yaml&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="n"&gt;personal_account&lt;/span&gt;

&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="n"&gt;using&lt;/span&gt; &lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="n"&gt;personal_account&lt;/span&gt;
&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;nuke&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="n"&gt;unknown&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;unknown&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;unknown&lt;/span&gt;

&lt;span class="n"&gt;Do&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;really&lt;/span&gt; &lt;span class="n"&gt;want&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;nuke&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;account&lt;/span&gt; &lt;span class="n"&gt;with&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;ID&lt;/span&gt; &lt;span class="m"&gt;123456789012&lt;/span&gt; &lt;span class="n"&gt;and&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;alias&lt;/span&gt; &lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;personal_account&lt;/span&gt;&lt;span class="err"&gt;'?&lt;/span&gt;
&lt;span class="n"&gt;Do&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;want&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="n"&gt;Enter&lt;/span&gt; &lt;span class="n"&gt;account&lt;/span&gt; &lt;span class="n"&gt;alias&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;personal_account&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool queries the existing resources, matches them against the filters configured as exclusions, and performs a dry run.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resources that we want to leave in account
&lt;/h3&gt;

&lt;p&gt;Let's say we do not want to affect IAM users, permissions, and policies, as well as MFA devices and some Route 53 records.&lt;br&gt;
All of these resources should be declared in the aws-nuke config, so that it does not touch them when running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;regions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;eu-east-1&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;global&lt;/span&gt;

&lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="na"&gt;123456789012&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="na"&gt;filters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UserName"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws-cli"&lt;/span&gt;
&lt;span class="na"&gt;IAMRole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OrganizationAccountAccessRole"&lt;/span&gt;
&lt;span class="na"&gt;IAMRolePolicyAttachment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OrganizationAccountAccessRole"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:iam::aws:policy/AdministratorAccess"&lt;/span&gt;
&lt;span class="na"&gt;IAMUser&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws-cli"&lt;/span&gt;
&lt;span class="na"&gt;IAMUserAccessKey&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;UserName&lt;/span&gt;
&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws-cli"&lt;/span&gt;
&lt;span class="na"&gt;IAMUserPolicyAttachment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:iam::aws:policy/AdministratorAccess"&lt;/span&gt;
&lt;span class="na"&gt;IAMVirtualMFADevice&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:iam::123456789012:mfa/to.root.iphone"&lt;/span&gt;
&lt;span class="na"&gt;Route53HostedZone&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xxxxx.xx."&lt;/span&gt;

&lt;span class="na"&gt;account-blocklist&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1234567890&lt;/span&gt;

&lt;span class="na"&gt;resource-types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="na"&gt;excludes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Route53ResourceRecordSet&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Route53HostedZone&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Review resources in output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;...
global - IAMPolicy - arn:aws:iam::123456789012:policy/terraform-20230725102234859300000005 - &lt;span class="o"&gt;[&lt;/span&gt;ARN: &lt;span class="s2"&gt;"arn:aws:iam::123456789012:policy/terraform-20230725102234859300000005"&lt;/span&gt;, Name: &lt;span class="s2"&gt;"terraform-20230725102234859300000005"&lt;/span&gt;, Path: &lt;span class="s2"&gt;"/"&lt;/span&gt;, PolicyID: &lt;span class="s2"&gt;"ANPAVFCI6B4SXG62NJLCB"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; - would remove

global - IAMRole - AmazonGrafanaServiceRole-Skz3Lbeo2 - &lt;span class="o"&gt;[&lt;/span&gt;CreateDate: &lt;span class="s2"&gt;"2023-03-21T16:02:03Z"&lt;/span&gt;, LastUsedDate: &lt;span class="s2"&gt;"2023-03-21T16:02:03Z"&lt;/span&gt;, Name: &lt;span class="s2"&gt;"AmazonGrafanaServiceRole-Skz3Lbeo2"&lt;/span&gt;, Path: &lt;span class="s2"&gt;"/service-role/"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; - would remove

123456789012&lt;span class="s2"&gt;", PolicyName: "&lt;/span&gt;terraform-20230725102234859300000005&lt;span class="s2"&gt;", RoleCreateDate: "&lt;/span&gt;2023-07-25T10:22:27Z&lt;span class="s2"&gt;", RoleLastUsed: "&lt;/span&gt;2023-07-25T10:22:27Z&lt;span class="s2"&gt;", RoleName: "&lt;/span&gt;terraform-20230725102226616700000002&lt;span class="s2"&gt;", RolePath: "&lt;/span&gt;/&lt;span class="s2"&gt;"] - filtered by config
global - IAMRolePolicy - terraform-20230725102226616700000003 -&amp;gt; terraform-20230725102227744300000004 - [PolicyName: "&lt;/span&gt;terraform-20230725102227744300000004&lt;span class="s2"&gt;", role:CreateDate: "&lt;/span&gt;2023-07-25T10:22:27Z&lt;span class="s2"&gt;", role:LastUsed: "&lt;/span&gt;2023-07-25T10:22:56Z&lt;span class="s2"&gt;", role:Path: "&lt;/span&gt;/&lt;span class="s2"&gt;", role:RoleID: "&lt;/span&gt;AROAVFCI6B4SWQNDRGVXF&lt;span class="s2"&gt;", role:RoleName: "&lt;/span&gt;terraform-20230725102226616700000003&lt;span class="s2"&gt;"] - would remove
Scan complete: 44 total, 13 nukeable, 31 filtered.

The above resources would be deleted with the supplied configuration. Provide --no-dry-run to actually destroy resources.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Perform cleanup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go run main.go &lt;span class="nt"&gt;-c&lt;/span&gt; config/nuke-config.yaml &lt;span class="nt"&gt;--profile&lt;/span&gt; personal_account &lt;span class="nt"&gt;--no-dry-run&lt;/span&gt;

INFO[0000] using profile personal_account
aws-nuke version unknown - unknown - unknown

Do you really want to nuke the account with the ID 123456789012 and the &lt;span class="nb"&gt;alias&lt;/span&gt; &lt;span class="s1"&gt;'personal_account'&lt;/span&gt;?
Do you want to &lt;span class="k"&gt;continue&lt;/span&gt;? Enter account &lt;span class="nb"&gt;alias &lt;/span&gt;to &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; personal_account

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, aws-nuke scans the account as in dry-run mode, and then asks you to confirm the deletion.&lt;/p&gt;

&lt;p&gt;The tool has issues deleting IAM resources: it has retry logic, but it does not understand resource dependencies. For instance, it cannot delete a role with attached policies, even when it is supposed to delete those policies as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERRO[0373] DeleteConflict: Cannot delete entity, must detach all policies first.
status code: 409, request id: 9506e3fd-5945-48b3-9031-826d28d2abec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So you have to delete such resources manually from the CLI or the AWS console.&lt;/p&gt;
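
&lt;p&gt;To make the retry behaviour more concrete, here is a minimal sketch of deleting in repeated passes. It is not aws-nuke's actual implementation; &lt;code&gt;nukeWithRetries&lt;/code&gt;, &lt;code&gt;deleteFunc&lt;/code&gt;, &lt;code&gt;errConflict&lt;/code&gt; and the resource names are hypothetical. It illustrates that retries only help when the blocking resource disappears in an earlier pass; anything that keeps failing is left over for manual cleanup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "errors"
    "fmt"
)

// errConflict imitates the DeleteConflict error shown above.
var errConflict = errors.New("DeleteConflict: must detach all policies first")

// deleteFunc tries to delete one resource and returns an error on failure.
type deleteFunc func(name string) error

// nukeWithRetries deletes the given resources in repeated passes.
// It stops when everything is gone or when a full pass makes no progress,
// and returns the resources that could not be deleted.
func nukeWithRetries(pending []string, del deleteFunc) []string {
    for len(pending) != 0 {
        var failed []string
        for _, name := range pending {
            if err := del(name); err != nil {
                failed = append(failed, name)
            }
        }
        if len(failed) == len(pending) {
            return failed // no progress in this pass, give up
        }
        pending = failed
    }
    return nil
}

func main() {
    // Hypothetical account state: the role cannot be deleted while the
    // attachment exists, and "orphan" always fails to delete.
    exists := map[string]bool{"role": true, "attachment": true, "orphan": true}
    del := func(name string) error {
        if name == "role" {
            if exists["attachment"] {
                return errConflict
            }
        }
        if name == "orphan" {
            return errors.New("AccessDenied")
        }
        delete(exists, name)
        return nil
    }
    left := nukeWithRetries([]string{"role", "attachment", "orphan"}, del)
    fmt.Println("could not delete:", left)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this sketch the role is removed on the second pass because the attachment went away on the first one, while the resource that fails every time ends up in the leftover list.&lt;/p&gt;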

&lt;h3&gt;
  
  
  Integration with CI/CD
&lt;/h3&gt;

&lt;p&gt;One of the best use cases for AWS-Nuke is integrating it into a CI/CD pipeline. You can use AWS-Nuke as part of your continuous integration (CI) process to clean up test environments after running tests. This ensures that no unnecessary resources remain after testing is complete, keeping your environment clean and your costs under control.&lt;/p&gt;

&lt;p&gt;Here’s an example of how you might integrate AWS-Nuke into a CI/CD pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run AWS-Nuke in dry-run mode before deploying a new version of your application to ensure there are no unnecessary resources left over from previous tests.&lt;/li&gt;
&lt;li&gt;After the deployment and testing are complete, run AWS-Nuke in full mode to clean up all resources.&lt;/li&gt;
&lt;/ul&gt;
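
&lt;p&gt;The two pipeline steps above differ only in the flags passed to the tool, which a small helper can make explicit. The sketch below is an assumption-laden illustration: &lt;code&gt;nukeArgs&lt;/code&gt; and the &lt;code&gt;ci-sandbox&lt;/code&gt; profile are hypothetical, and it assumes the rebuy-de build used in this post, where &lt;code&gt;--force&lt;/code&gt; replaces the interactive confirmation that a CI job cannot answer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "strings"
)

// nukeArgs builds the aws-nuke argument list for a CI job.
// With destroy=false the tool stays in its default dry-run mode.
func nukeArgs(configPath, profile string, destroy bool) []string {
    args := []string{"-c", configPath}
    if profile != "" {
        args = append(args, "--profile", profile)
    }
    if destroy {
        // --no-dry-run performs the deletions; --force skips the
        // interactive confirmation prompt.
        args = append(args, "--no-dry-run", "--force")
    }
    return args
}

func main() {
    // Step 1: review what would be removed.
    fmt.Println("aws-nuke", strings.Join(nukeArgs("config/nuke-config.yaml", "ci-sandbox", false), " "))
    // Step 2: clean up after the tests have finished.
    fmt.Println("aws-nuke", strings.Join(nukeArgs("config/nuke-config.yaml", "ci-sandbox", true), " "))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because a forced run skips the confirmation, it is safest to point such a job only at a dedicated test account that is never listed for production use.&lt;/p&gt;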

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;AWS-Nuke is a powerful and versatile tool for cleaning up AWS environments. It helps prevent cost overruns, reduces security risks, and simplifies the management of AWS resources. Whether you’re managing a development environment, performing testing, or decommissioning an old project, AWS-Nuke provides an efficient way to automate the cleanup process.&lt;/p&gt;

&lt;p&gt;However, AWS-Nuke also requires careful configuration to avoid deleting critical resources. Always use the dry-run mode to review deletions before making any changes, and ensure that your configuration file accurately reflects your environment.&lt;/p&gt;

&lt;p&gt;In an age where cloud resources can grow exponentially, tools like AWS-Nuke are indispensable for keeping your infrastructure lean and efficient.&lt;/p&gt;

&lt;h3&gt;
  
  
  Links:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/rebuy-de/aws-nuke" rel="noopener noreferrer"&gt;https://github.com/rebuy-de/aws-nuke&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>tooling</category>
      <category>automation</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS CSM Mode: Advanced monitoring of AWS client</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Mon, 22 Sep 2025 06:01:02 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-csm-mode-advanced-monitoring-of-aws-client-ph7</link>
      <guid>https://dev.to/aws-builders/aws-csm-mode-advanced-monitoring-of-aws-client-ph7</guid>
      <description>&lt;p&gt;While exploring the &lt;code&gt;botocore&lt;/code&gt; (the core of boto3) sources on &lt;strong&gt;GitHub&lt;/strong&gt;, I found this interesting &lt;a href="https://github.com/boto/botocore/commit/14e0eab5c1e4aec437c3e558e6899de00fd5e98e" rel="noopener noreferrer"&gt;commit&lt;/a&gt;, which enables a monitoring mode called &lt;strong&gt;CSM&lt;/strong&gt;.&lt;br&gt;
After exploring how it works, I decided to write a post, because this feature is extremely useful, and also to start writing an &lt;code&gt;aws-client-monitor&lt;/code&gt; toolbox on top of it.&lt;/p&gt;


&lt;h2&gt;
  
  
  Introduction to AWS CSM (Client-Side Monitoring) Mode
&lt;/h2&gt;

&lt;p&gt;AWS Client-Side Monitoring (CSM) is a powerful feature designed to track and analyze the performance of your AWS SDK calls. When enabled, it provides detailed metrics on API requests, response times,&lt;br&gt;
and error rates, helping developers gain a deeper understanding of their application's behavior when interacting with AWS services. This is crucial for debugging, optimizing performance, and ensuring&lt;br&gt;
that applications are running efficiently in production environments.&lt;/p&gt;

&lt;p&gt;CSM mode works by capturing information about SDK API calls and sending that data to a local monitoring agent. It helps you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Track API request latencies.&lt;/li&gt;
&lt;li&gt;Identify high failure rates in SDK requests.&lt;/li&gt;
&lt;li&gt;Gain visibility into the most frequently called AWS services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this blog post, we'll explore AWS CSM mode in more detail, look at common use cases, and provide Golang code snippets to demonstrate how to implement it.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Use AWS CSM Mode?
&lt;/h2&gt;

&lt;p&gt;As applications become increasingly reliant on cloud services, monitoring and optimizing the performance of these interactions becomes critical. AWS SDKs are widely used to interface with AWS services&lt;br&gt;
such as S3, DynamoDB, Lambda, and many others. However, managing and tracking these interactions can be challenging, especially when it comes to identifying latency issues or bottlenecks in the&lt;br&gt;
communication between your application and AWS.&lt;/p&gt;

&lt;p&gt;CSM provides a granular view of how SDK requests are performing. It allows you to gather metrics like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Latency: How long each request takes.&lt;/li&gt;
&lt;li&gt;Errors: Which AWS services are returning errors and why.&lt;/li&gt;
&lt;li&gt;Request Frequency: Which services are being called the most.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This data is invaluable for performance tuning, debugging, and capacity planning.&lt;/p&gt;
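
&lt;p&gt;As a rough illustration of how such metrics can be derived, the sketch below aggregates call counts and latency per service operation from JSON events. The field names (&lt;code&gt;Type&lt;/code&gt;, &lt;code&gt;Service&lt;/code&gt;, &lt;code&gt;Api&lt;/code&gt;, &lt;code&gt;Latency&lt;/code&gt;), the &lt;code&gt;ApiCall&lt;/code&gt; event type and the millisecond unit are assumptions about the event format, and &lt;code&gt;csmEvent&lt;/code&gt;, &lt;code&gt;stats&lt;/code&gt; and &lt;code&gt;aggregate&lt;/code&gt; are hypothetical names.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "encoding/json"
    "fmt"
)

// csmEvent holds the assumed subset of fields of one monitoring event.
type csmEvent struct {
    Type    string // assumed: "ApiCall" for a completed call
    Service string
    Api     string
    Latency int // assumed to be milliseconds
}

// stats accumulates the number of calls and their total latency.
type stats struct {
    Calls        int
    TotalLatency int
}

// aggregate groups completed API calls by "Service.Api".
// Malformed events and other event types are skipped.
func aggregate(events []string) map[string]stats {
    out := map[string]stats{}
    for _, raw := range events {
        ev := new(csmEvent)
        if err := json.Unmarshal([]byte(raw), ev); err != nil {
            continue
        }
        if ev.Type != "ApiCall" {
            continue
        }
        key := ev.Service + "." + ev.Api
        st := out[key]
        st.Calls++
        st.TotalLatency += ev.Latency
        out[key] = st
    }
    return out
}

func main() {
    events := []string{
        `{"Type":"ApiCall","Service":"S3","Api":"ListBuckets","Latency":120}`,
        `{"Type":"ApiCall","Service":"S3","Api":"ListBuckets","Latency":80}`,
        `{"Type":"ApiCallAttempt","Service":"S3","Api":"ListBuckets"}`,
    }
    for key, st := range aggregate(events) {
        fmt.Printf("%s: %d calls, average latency %d ms\n", key, st.Calls, st.TotalLatency/st.Calls)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same grouping key can be reused for error counts, which covers the three metrics listed above with one pass over the events.&lt;/p&gt;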
&lt;h2&gt;
  
  
  Common Use Cases for AWS CSM Mode
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Performance Optimization:
&lt;/h3&gt;

&lt;p&gt;By tracking the latency of AWS service calls, you can identify the API requests that are taking the longest to execute. This can help you optimize the application's performance, whether through&lt;br&gt;
caching, retries, or parallelizing requests.&lt;/p&gt;
&lt;h3&gt;
  
  
  Error Tracking:
&lt;/h3&gt;

&lt;p&gt;If your application experiences frequent errors while interacting with AWS services, CSM can help identify the root cause. For example, if a specific AWS service is returning a large number of 5xx&lt;br&gt;
errors, CSM will capture this information, enabling developers to troubleshoot quickly.&lt;/p&gt;
&lt;h3&gt;
  
  
  Capacity Planning:
&lt;/h3&gt;

&lt;p&gt;Monitoring the number of requests made to AWS services can help forecast capacity needs and adjust resources accordingly. For example, if you're making a large number of requests to DynamoDB, it may&lt;br&gt;
be time to scale your read/write capacity.&lt;/p&gt;
&lt;h3&gt;
  
  
  Debugging Production Issues:
&lt;/h3&gt;

&lt;p&gt;When something goes wrong in production, CSM can provide critical insights into which AWS services or API calls are causing issues, allowing for quick resolution.&lt;/p&gt;
&lt;h2&gt;
  
  
  Setting Up AWS CSM in Golang
&lt;/h2&gt;

&lt;p&gt;The AWS SDK for Go provides native support for client-side monitoring. To enable CSM, you need to configure the SDK to send data to the local CSM agent, which processes and forwards it to monitoring&lt;br&gt;
tools like Amazon CloudWatch.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Install the AWS SDK for Go
&lt;/h3&gt;

&lt;p&gt;First, you need to install the AWS SDK for Go, if you haven't already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go get &lt;span class="nt"&gt;-u&lt;/span&gt; github.com/aws/aws-sdk-go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Enable CSM in the AWS SDK
&lt;/h3&gt;

&lt;p&gt;To enable CSM in the AWS SDK for Go, you need to configure the environment variables that control CSM behavior, or you can do this programmatically within your application.&lt;/p&gt;

&lt;p&gt;Here’s an example of how you can enable CSM using environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_CSM_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true
export &lt;/span&gt;&lt;span class="nv"&gt;AWS_CSM_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;127.0.0.1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_CSM_PORT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;31000
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AWS_CSM_CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;my-client-id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;AWS_CSM_ENABLED: Enables or disables CSM.&lt;/li&gt;
&lt;li&gt;AWS_CSM_HOST: The hostname where the CSM agent is running (usually localhost).&lt;/li&gt;
&lt;li&gt;AWS_CSM_PORT: The port where the CSM agent is listening.&lt;/li&gt;
&lt;li&gt;AWS_CSM_CLIENT_ID: A client identifier used to differentiate between clients.&lt;/li&gt;
&lt;/ul&gt;
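
&lt;p&gt;For a tool that sits on the receiving side, the same four variables can be read back to find out where events will be sent. This is a minimal sketch, not SDK code: &lt;code&gt;csmSettings&lt;/code&gt; and &lt;code&gt;loadCSMSettings&lt;/code&gt; are hypothetical names, and the fallback host and port are simply the values from the example above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "os"
    "strconv"
)

// csmSettings mirrors the four environment variables listed above.
type csmSettings struct {
    Enabled  bool
    Host     string
    Port     int
    ClientID string
}

// loadCSMSettings reads the CSM variables through the given lookup
// function, so that it can be fed from os.Getenv or from a test map.
func loadCSMSettings(getenv func(string) string) csmSettings {
    // Fallback values taken from the example above.
    s := csmSettings{Host: "127.0.0.1", Port: 31000}
    // An empty or invalid value leaves CSM disabled.
    s.Enabled, _ = strconv.ParseBool(getenv("AWS_CSM_ENABLED"))
    if host := getenv("AWS_CSM_HOST"); host != "" {
        s.Host = host
    }
    if port, err := strconv.Atoi(getenv("AWS_CSM_PORT")); err == nil {
        s.Port = port
    }
    s.ClientID = getenv("AWS_CSM_CLIENT_ID")
    return s
}

func main() {
    s := loadCSMSettings(os.Getenv)
    fmt.Printf("enabled=%v agent=%s:%d client=%q\n", s.Enabled, s.Host, s.Port, s.ClientID)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passing the lookup function as a parameter keeps the parsing logic independent of the process environment, which makes it easy to check with fixed values.&lt;/p&gt;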

&lt;h3&gt;
  
  
  Step 3: Configure AWS SDK to Use CSM
&lt;/h3&gt;

&lt;p&gt;In Golang, you configure the SDK to use CSM when making AWS API calls. Here’s a simple example where we create an S3 client and make a request with CSM enabled.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s"&gt;"fmt"&lt;/span&gt;
  &lt;span class="s"&gt;"net"&lt;/span&gt;
  &lt;span class="s"&gt;"time"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;listenUDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="k"&gt;chan&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UDPAddr&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Port&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;IP&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseIP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"0.0.0.0"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenUDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"udp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error listening on UDP:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UDPConn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error closing UDP connection:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="n"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadFromUDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error reading from UDP:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
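    &lt;span class="c"&gt;// Copy the datagram before sending, because buffer is reused by the next read&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;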
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;writeToConsole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Received from channel:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;byteChannel&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c"&gt;// Goroutine to listen on UDP and write to the channel&lt;/span&gt;
  &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;listenUDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;31000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;byteChannel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c"&gt;// Goroutines to read from the channel&lt;/span&gt;
  &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;writeToConsole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;byteChannel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c"&gt;// Prevent the main function from exiting&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Invoke an AWS API with the AWS CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;ls&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: aws-client-monitor displays the calls
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;aws-cli&lt;/code&gt; lists the buckets as usual, and at the same time it sends two &lt;code&gt;UDP&lt;/code&gt; datagrams to our server.&lt;/p&gt;

&lt;p&gt;Each AWS API call produces two events, &lt;strong&gt;ApiCallAttempt&lt;/strong&gt; and &lt;strong&gt;ApiCall&lt;/strong&gt;, with the following structure:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ClientId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-client-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ApiCallAttempt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"S3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Api"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ListBuckets"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1728194484982&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AttemptLatency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;266&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Fqdn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s3.eu-west-1.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UserAgent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aws-cli/1.27.92 md/Botocore#1.31.2 ua/2.0 os/macos#21.6.0 md/arch#x86_64 lang/python#3.10.14 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.31.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AccessKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ASIAWFOD4FPxxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"SessionToken"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IQoJb3JpZxxx="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"HttpStatusCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"XAmzRequestId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8K3P9AWACxxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"XAmzId2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vZDGgBpIwz6Jfxxx="&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ClientId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-client-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ApiCall"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"S3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Api"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ListBuckets"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1728194484981&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AttemptCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UserAgent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aws-cli/1.27.92 md/Botocore#1.31.2 ua/2.0 os/macos#21.6.0 md/arch#x86_64 lang/python#3.10.14 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.31.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FinalHttpStatusCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Latency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;267&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"MaxRetriesExceeded"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
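
&lt;p&gt;Since every datagram is a flat JSON object, the server can decode it instead of printing the raw bytes. The following is a minimal sketch, not the code of aws-client-monitor: the &lt;code&gt;CSMEvent&lt;/code&gt; struct and the &lt;code&gt;parseEvent&lt;/code&gt; helper are hypothetical names, and the struct only covers a subset of the fields shown above.&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CSMEvent holds a subset of the fields seen in the datagrams above.
// The struct name and the field selection are illustrative assumptions.
type CSMEvent struct {
	Type                string
	Service             string
	Api                 string
	Region              string
	ClientId            string
	Timestamp           int64
	AttemptLatency      int
	Latency             int
	HttpStatusCode      int
	FinalHttpStatusCode int
}

// parseEvent decodes one UDP payload into a CSMEvent.
// Fields that are absent from the payload keep their zero value.
func parseEvent(payload []byte) (*CSMEvent, error) {
	event := new(CSMEvent)
	if err := json.Unmarshal(payload, event); err != nil {
		return nil, err
	}
	return event, nil
}

func main() {
	// Shortened copy of the ApiCall event shown above.
	sample := []byte(`{"Version":1,"ClientId":"my-client-id","Type":"ApiCall","Service":"S3","Api":"ListBuckets","Timestamp":1728194484981,"AttemptCount":1,"Region":"eu-west-1","FinalHttpStatusCode":200,"Latency":267,"MaxRetriesExceeded":0}`)

	event, err := parseEvent(sample)
	if err != nil {
		fmt.Println("Error parsing event:", err)
		return
	}
	fmt.Printf("%s %s.%s status=%d latency=%dms\n",
		event.Type, event.Service, event.Api, event.FinalHttpStatusCode, event.Latency)
}
```

&lt;p&gt;In the UDP server above, a function like &lt;code&gt;parseEvent&lt;/code&gt; could be called on each message read from the channel before it is printed or stored.&lt;/p&gt;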



&lt;h2&gt;
  
  
  Tracking AWS API errors
&lt;/h2&gt;

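&lt;p&gt;We can also track AWS API errors. Let's try to create a bucket that already exists; because the command below omits a location constraint for &lt;code&gt;eu-west-1&lt;/code&gt;, the call fails with &lt;code&gt;IllegalLocationConstraintException&lt;/code&gt;:&lt;/p&gt;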

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3api create-bucket &lt;span class="nt"&gt;--bucket&lt;/span&gt; existing-bucket

An error occurred &lt;span class="o"&gt;(&lt;/span&gt;IllegalLocationConstraintException&lt;span class="o"&gt;)&lt;/span&gt; when calling the CreateBucket operation: The unspecified location constraint is incompatible &lt;span class="k"&gt;for &lt;/span&gt;the region specific endpoint this request was sent to.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same error message also appears in the &lt;code&gt;aws-client-monitor&lt;/code&gt; output:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ClientId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-client-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ApiCallAttempt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"S3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Api"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CreateBucket"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1728195105316&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AttemptLatency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;215&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Fqdn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"existing-bucket.s3.eu-west-1.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UserAgent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aws-cli/1.27.92 md/Botocore#1.31.2 ua/2.0 os/macos#21.6.0 md/arch#x86_64 lang/python#3.10.14 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.31.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AccessKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ASIAWFODxxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"SessionToken"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IQoJxxx="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"HttpStatusCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"XAmzRequestId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FPWTJWZC7114XQJE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"XAmzId2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"+G9yOxxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AwsException"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IllegalLocationConstraintException"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AwsExceptionMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The unspecified location constraint is incompatible for the region specific endpoint this request was sent to."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ClientId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-client-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ApiCall"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"S3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Api"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CreateBucket"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1728195105312&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"AttemptCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"UserAgent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aws-cli/1.27.92 md/Botocore#1.31.2 ua/2.0 os/macos#21.6.0 md/arch#x86_64 lang/python#3.10.14 md/pyimpl#CPython cfg/retry-mode#legacy botocore/1.31.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FinalHttpStatusCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FinalAwsException"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IllegalLocationConstraintException"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FinalAwsExceptionMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The unspecified location constraint is incompatible for the region specific endpoint this request was sent to."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Latency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;219&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"MaxRetriesExceeded"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Metrics Extraction
&lt;/h2&gt;

&lt;p&gt;The following metrics can be extracted on the UDP server side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API call duration (latency)&lt;/li&gt;
&lt;li&gt;History of app operations with AWS API (service name : operation)&lt;/li&gt;
&lt;li&gt;Success and failure rates&lt;/li&gt;
&lt;li&gt;Error messages&lt;/li&gt;
&lt;li&gt;Region&lt;/li&gt;
&lt;li&gt;User agent&lt;/li&gt;
&lt;li&gt;IP&lt;/li&gt;
&lt;li&gt;Client ID&lt;/li&gt;
&lt;li&gt;Access key&lt;/li&gt;
&lt;li&gt;Request payload size&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can use these insights to fine-tune our application's performance and catch potential bottlenecks or errors.&lt;/p&gt;
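
&lt;p&gt;As a rough illustration of how such metrics could be derived, the sketch below groups decoded &lt;code&gt;ApiCall&lt;/code&gt; events by service and operation and computes call counts, failure counts, and average latency. The &lt;code&gt;CallRecord&lt;/code&gt; and &lt;code&gt;OperationStats&lt;/code&gt; types and the &lt;code&gt;aggregate&lt;/code&gt; function are hypothetical names introduced for this example, and treating every non-2xx status as a failure is an assumption.&lt;/p&gt;

```go
package main

import "fmt"

// CallRecord is a hypothetical, already-decoded ApiCall event;
// only the fields needed for the aggregation are kept.
type CallRecord struct {
	Service             string
	Api                 string
	FinalHttpStatusCode int
	Latency             int // milliseconds
}

// OperationStats accumulates counters for one "service:operation" key.
type OperationStats struct {
	Calls        int
	Failures     int
	TotalLatency int
}

// AverageLatency returns the mean latency in milliseconds,
// or 0 when no calls were recorded.
func (s OperationStats) AverageLatency() float64 {
	if s.Calls == 0 {
		return 0
	}
	return float64(s.TotalLatency) / float64(s.Calls)
}

// aggregate groups records by "service:operation" and counts
// every non-2xx final status as a failure (an assumption of this sketch).
func aggregate(records []CallRecord) map[string]OperationStats {
	stats := make(map[string]OperationStats)
	for _, r := range records {
		key := r.Service + ":" + r.Api
		s := stats[key]
		s.Calls++
		s.TotalLatency += r.Latency
		if r.FinalHttpStatusCode/100 != 2 {
			s.Failures++
		}
		stats[key] = s
	}
	return stats
}

func main() {
	// Values taken from the two ApiCall events shown earlier.
	records := []CallRecord{
		{Service: "S3", Api: "ListBuckets", FinalHttpStatusCode: 200, Latency: 267},
		{Service: "S3", Api: "CreateBucket", FinalHttpStatusCode: 400, Latency: 219},
	}
	for key, s := range aggregate(records) {
		fmt.Printf("%s calls=%d failures=%d avg_latency=%.1fms\n",
			key, s.Calls, s.Failures, s.AverageLatency())
	}
}
```

&lt;p&gt;Keeping the counters in a plain map makes it easy to expose them later through whatever metrics backend is in use; the output order of the loop is not fixed because Go map iteration is unordered.&lt;/p&gt;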

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AWS Client-Side Monitoring (CSM) is a valuable tool for gaining deep insights into how your application interacts with AWS services. It helps you monitor performance, identify errors, and optimize API usage. By integrating CSM with the AWS SDK for Go, developers can track API calls and gather real-time metrics, ultimately improving application performance and reliability.&lt;/p&gt;

&lt;p&gt;If you're working in a production environment where optimizing performance and troubleshooting issues is crucial, CSM is a feature worth enabling. With just a few configuration steps, you can gain access to a wealth of data that will help you fine-tune your AWS-based applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Links:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/boto/botocore/commit/14e0eab5c1e4aec437c3e558e6899de00fd5e98e" rel="noopener noreferrer"&gt;https://github.com/boto/botocore/commit/14e0eab5c1e4aec437c3e558e6899de00fd5e98e&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/sdk-for-go/api/aws/csm/" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/sdk-for-go/api/aws/csm/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/tsypuk/aws-client-monitor/blob/main/main.go" rel="noopener noreferrer"&gt;https://github.com/tsypuk/aws-client-monitor/blob/main/main.go&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>aws</category>
      <category>monitoring</category>
      <category>performance</category>
    </item>
    <item>
      <title>CDC in AWS: Change Data Capture from AWS RDS MySQL into AWS MSK Kafka topic using Debezium</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Mon, 22 Sep 2025 06:00:29 +0000</pubDate>
      <link>https://dev.to/aws-builders/cdc-in-aws-content-data-capture-from-aws-rds-mysql-into-aws-msk-kafka-topic-using-debezium-3kc4</link>
      <guid>https://dev.to/aws-builders/cdc-in-aws-content-data-capture-from-aws-rds-mysql-into-aws-msk-kafka-topic-using-debezium-3kc4</guid>
      <description>&lt;p&gt;Kafka Connect is a powerful open-source platform that enables Change Data Capture (CDC): real-time event streaming from databases like MySQL.&lt;br&gt;
In this post, we'll explore how to set up Debezium, one of the most popular open-source CDC tools, to capture changes from an AWS RDS MySQL instance and publish them to a Kafka topic.&lt;br&gt;
We'll also dive into the Debezium message format and explain the difference between standalone and cluster modes.&lt;/p&gt;


&lt;h2&gt;
  
  
  What problem does it solve
&lt;/h2&gt;

&lt;p&gt;Debezium allows you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stream real-time changes from a MySQL RDS instance.&lt;/li&gt;
&lt;li&gt;Track inserts, updates, and deletes as they occur.&lt;/li&gt;
&lt;li&gt;Publish these changes to Kafka topics for downstream processing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fksdp48wufeuoooj3bing.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fksdp48wufeuoooj3bing.png" alt="infra.png" width="479" height="100"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is particularly useful for building event-driven architectures, data pipelines, and synchronizing databases with other systems.&lt;/p&gt;

&lt;p&gt;Besides Debezium, there are multiple open-source connectors available on the Confluent platform that provide integration points with different &lt;strong&gt;sink&lt;/strong&gt; and &lt;strong&gt;source&lt;/strong&gt; systems such as &lt;strong&gt;AWS S3&lt;/strong&gt;, &lt;strong&gt;Elasticsearch&lt;/strong&gt;, and &lt;strong&gt;MongoDB&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Setting Up Debezium for MySQL RDS
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Enable Binary Logging on MySQL RDS:
&lt;/h3&gt;

&lt;p&gt;Log in to your RDS instance and ensure binary logging is enabled in your parameter group.&lt;br&gt;
Configure the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;binlog_format = ROW&lt;/li&gt;
&lt;li&gt;&lt;p&gt;binlog_row_image = FULL&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure binlog_retention_period is set to a sufficient duration for your use case.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt; &lt;span class="n"&gt;VARIABLES&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'binlog_format'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variable_name&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;binlog_format&lt;/td&gt;
&lt;td&gt;ROW&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;SHOW VARIABLES LIKE 'log_bin%';&lt;/code&gt;&lt;br&gt;
&lt;code&gt;SHOW VARIABLES LIKE 'binlog_format';&lt;/code&gt;&lt;br&gt;
&lt;code&gt;SHOW VARIABLES LIKE 'binlog_row_image';&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="k"&gt;SESSION&lt;/span&gt; &lt;span class="n"&gt;binlog_format&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'ROW'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="k"&gt;GLOBAL&lt;/span&gt; &lt;span class="n"&gt;binlog_format&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'ROW'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Run Debezium with Kafka Connect:
&lt;/h3&gt;

&lt;p&gt;Use Docker Compose to start Debezium and Kafka Connect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;debezium&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;debezium/connect:2.7.3.Final&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8083:8083"&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;curl&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-f&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://localhost:8083/&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;||&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kafka-cluster&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;BOOTSTRAP_SERVERS=b-4.kafka-dev-0.xxxx.c6.kafka.us-east-1.amazonaws.com:9092,b-3.kafka-dev-0.xxxx.c6.kafka.us-east-1.amazonaws.com:9092,b-1.kafka-dev-0.xxxx.c6.kafka.us-east-1.amazonaws.com:9092&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;GROUP_ID=MYSQL_1&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CONFIG_STORAGE_TOPIC=debezium_connect_configs&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OFFSET_STORAGE_TOPIC=debezium_connect_offsets&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;STATUS_STORAGE_TOPIC=debezium_source_connect_statuses&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;CONFIG_STORAGE_REPLICATION_FACTOR=1&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OFFSET_STORAGE_REPLICATION_FACTOR=1&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;STATUS_STORAGE_REPLICATION_FACTOR=1&lt;/span&gt;
&lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kafka-cluster&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



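&lt;p&gt;CONFIG_STORAGE_TOPIC, OFFSET_STORAGE_TOPIC and STATUS_STORAGE_TOPIC name the Kafka topics in which Kafka Connect stores connector configurations, source offsets and connector/task statuses, and through which workers of the same group stay in sync.&lt;br&gt;
CONFIG_STORAGE_REPLICATION_FACTOR, OFFSET_STORAGE_REPLICATION_FACTOR and STATUS_STORAGE_REPLICATION_FACTOR set the replication factor of those topics. They are set to 1 here to keep the setup minimal; the value cannot exceed the number of brokers, and on a multi-broker cluster a higher factor (commonly 3) protects this state from a broker failure.&lt;/p&gt;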
&lt;h3&gt;
  
  
  4. Detect server_id that will be used in connector configuration
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SHOW&lt;/span&gt;
&lt;span class="n"&gt;VARIABLES&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'server_id'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  5. Configure the MySQL Connector
&lt;/h3&gt;

&lt;p&gt;Kafka Connect exposes an HTTP (REST) port, 8083 in the compose file above, for creating and monitoring connectors.&lt;/p&gt;

&lt;p&gt;Create a MySQL source connector by sending a POST request to the Kafka Connect REST API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8083/connectors &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; @../payload.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mysql-moderation-comments-connector"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"connector.class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"io.debezium.connector.mysql.MySqlConnector"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tasks.max"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database.hostname"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"xxxx.us-east-1.rds.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database.port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3306"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database.user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database.password"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"password"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database.server.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"this_is_mysql_server_id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"topic.prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mysql-cdc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database.include.list"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"public"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"table.include.list"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"public.users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"schema.history.internal.kafka.bootstrap.servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"b-4.kafka-dev-0.xxxx.c6.kafka.us-east-1.amazonaws.com:9092,b-3.kafka-dev-0.xxxx.c6.kafka.us-east-1.amazonaws.com:9092,b-1.kafka-dev-0.xxxx.c6.kafka.us-east-1.amazonaws.com:9092"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"schema.history.internal.kafka.topic"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"schema-changes.moderation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"include.schema.changes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"key.converter.schemas.enable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"key.converter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"org.apache.kafka.connect.json.JsonConverter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"value.converter.schemas.enable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"value.converter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"org.apache.kafka.connect.json.JsonConverter"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"schema.history.internal.skip.unparseable.ddl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;connector.class&lt;/td&gt;
&lt;td&gt;Java class name of the connector. It must be available on the Kafka Connect plugin path of the running container. The image ships pre-built connector plugins; you can also add any existing plugin to the container or write your own.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tasks.max&lt;/td&gt;
&lt;td&gt;Only 1 task should run at a time so that binlog events are read and handled in order. In general, Kafka Connect uses connectors to start one or more tasks that do the actual work and distributes the running tasks across the workers of the Kafka Connect cluster. If a worker stops or crashes, its tasks are redistributed to the remaining workers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.hostname&lt;/td&gt;
&lt;td&gt;RDS endpoint IP or DNS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.port&lt;/td&gt;
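&lt;td&gt;MySQL port (3306 in the payload above)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.user&lt;/td&gt;
&lt;td&gt;MySQL user the connector connects as&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.password&lt;/td&gt;
&lt;td&gt;password of that MySQL user&lt;/td&gt;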
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.server.id&lt;/td&gt;
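&lt;td&gt;Numeric ID under which the connector joins the MySQL cluster as a replication client to read the binlog. Per the Debezium documentation it must be unique across all database processes in the cluster, so compare it against the server_id found in step 4.&lt;/td&gt;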
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.server.name&lt;/td&gt;
&lt;td&gt;logical name of the server or cluster, used as the prefix for all Kafka topics. Debezium 2.x replaces it with topic.prefix, which is why the payload above does not set it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;topic.prefix&lt;/td&gt;
&lt;td&gt;"mysql-cdc" in the payload above: prefix of every Kafka topic the connector writes to, which distinguishes them from other existing topics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;database.include.list&lt;/td&gt;
&lt;td&gt;comma-separated list of databases whose changes should be captured&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;schema.history.internal.kafka.topic&lt;/td&gt;
&lt;td&gt;The connector writes all DDL statements to this topic while reading the binlog. On restart it uses them to recover the schema of the DB as it existed at the point in time where reading resumes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;key.converter&lt;/td&gt;
&lt;td&gt;class name of the converter that serializes the event key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;value.converter&lt;/td&gt;
&lt;td&gt;class name of the converter that serializes the event value&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;table.include.list&lt;/td&gt;
&lt;td&gt;comma-separated list of tables to capture, each given as database.table&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
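
&lt;p&gt;The registration call can also be scripted. Below is a minimal Node.js sketch, assuming Node 18+ (global fetch) and a worker reachable on localhost:8083; the helper names are hypothetical and not part of Debezium or Kafka Connect. It uses only the POST /connectors and GET /connectors/{name}/status endpoints of the Kafka Connect REST API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch only: CONNECT_URL and the helper names are assumptions for illustration.
const CONNECT_URL = "http://localhost:8083"; // REST port published in the compose file

// Build the request body; config values are normalised to strings up front.
function buildConnectorPayload(name, config) {
  const stringified = {};
  for (const key of Object.keys(config)) {
    stringified[key] = String(config[key]);
  }
  return { name: name, config: stringified };
}

// POST /connectors creates the connector (same call as the curl command above).
async function registerConnector(payload) {
  const response = await fetch(`${CONNECT_URL}/connectors`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`Kafka Connect returned HTTP ${response.status}`);
  }
  return response.json();
}

// GET /connectors/{name}/status reports the state of the connector and its tasks.
async function connectorStatus(name) {
  const response = await fetch(`${CONNECT_URL}/connectors/${encodeURIComponent(name)}/status`);
  return response.json();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling connectorStatus with the connector name after registration is a quick way to confirm that the connector and its single task are running before looking at the snapshot logs.&lt;/p&gt;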

&lt;h3&gt;
  
  
  6. Export from MySQL table
&lt;/h3&gt;

&lt;p&gt;Once the connector is set up, it takes an initial snapshot of the existing data and ingests it into Kafka. After that, it monitors the binlog for new records and updates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;debezium-1 | INFO   MySQL|mysql-cdc|snapshot    Exported 496538 of 548957 records &lt;span class="k"&gt;for &lt;/span&gt;table &lt;span class="s1"&gt;'public.users'&lt;/span&gt; after 00:41:53.279   &lt;span class="o"&gt;[&lt;/span&gt;io.debezium.relational.RelationalSnapshotChangeEventSource]
debezium-1 | INFO   MySQL|mysql-cdc|snapshot    Exported 605082 of 548957 records &lt;span class="k"&gt;for &lt;/span&gt;table &lt;span class="s1"&gt;'public.users'&lt;/span&gt; after 00:53:07.497   &lt;span class="o"&gt;[&lt;/span&gt;io.debezium.relational.RelationalSnapshotChangeEventSource]
debezium-1 | INFO   MySQL|mysql-cdc|snapshot    Finished exporting 605274 records &lt;span class="k"&gt;for &lt;/span&gt;table &lt;span class="s1"&gt;'public.users'&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;1 of 1 tables&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; total duration &lt;span class="s1"&gt;'00:53:07.522'&lt;/span&gt;   &lt;span class="o"&gt;[&lt;/span&gt;io.debezium.relational.RelationalSnapshotChangeEventSource]
debezium-1 | INFO   MySQL|mysql-cdc|snapshot  Releasing table &lt;span class="nb"&gt;read &lt;/span&gt;lock to &lt;span class="nb"&gt;enable &lt;/span&gt;MySQL writes   &lt;span class="o"&gt;[&lt;/span&gt;io.debezium.connector.binlog.BinlogSnapshotChangeEventSource]
debezium-1 | INFO   MySQL|mysql-cdc|snapshot  Writes to MySQL tables prevented &lt;span class="k"&gt;for &lt;/span&gt;a total of 00:53:13.755   &lt;span class="o"&gt;[&lt;/span&gt;io.debezium.connector.binlog.BinlogSnapshotChangeEventSource]
debezium-1 | INFO   &lt;span class="o"&gt;||&lt;/span&gt;  WorkerSourceTask&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;mysql-moderation-comments-connector-0&lt;span class="o"&gt;}&lt;/span&gt; Committing offsets &lt;span class="k"&gt;for &lt;/span&gt;10242 acknowledged messages   &lt;span class="o"&gt;[&lt;/span&gt;org.apache.kafka.connect.runtime.WorkerSourceTask]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Verify Kafka Topic
&lt;/h3&gt;

&lt;p&gt;Use kafka-console-consumer to check the topic for messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kafka-console-consumer &lt;span class="nt"&gt;--bootstrap-server&lt;/span&gt; localhost:9092 &lt;span class="nt"&gt;--topic&lt;/span&gt; mysql-cdc.your_database.your_table &lt;span class="nt"&gt;--from-beginning&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
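
&lt;p&gt;The topic name in the command above follows the naming scheme of the Debezium MySQL connector: topic prefix, database name and table name joined by dots. A small sketch of that rule (the function name is hypothetical), using the values from the connector payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Data topics are named prefix.database.table by the Debezium MySQL connector.
function changeTopicName(topicPrefix, database, table) {
  return [topicPrefix, database, table].join(".");
}

console.log(changeTopicName("mysql-cdc", "public", "users")); // mysql-cdc.public.users
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;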



&lt;h2&gt;
  
  
  Debezium Data Format
&lt;/h2&gt;

&lt;p&gt;Debezium emits messages to Kafka in a JSON format with two main parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;key (Identifies the specific database row.)&lt;/li&gt;
&lt;li&gt;value (Contains the actual change event, with fields like before, after, op, ts_ms, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;before: State of the row before the change.&lt;/li&gt;
&lt;li&gt;after: State of the row after the change.&lt;/li&gt;
&lt;li&gt;op: Type of operation (c for create, u for update, d for delete, r for a row read during the snapshot).&lt;/li&gt;
&lt;li&gt;source: Metadata about the event source.&lt;/li&gt;
&lt;li&gt;ts_ms: Timestamp of the event.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"op"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"u"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ts_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ts_us"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ts_ns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"userid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bob"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"after"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"userid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alice"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
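
&lt;p&gt;To illustrate how a consumer might use these fields, here is a minimal sketch that replays change events into an in-memory Map. The function name and the key field are assumptions for illustration; a real consumer would take the key from the message key and handle schema changes as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: replay Debezium change events into a Map keyed by primary key.
// keyField ("userid" here) follows the sample event above; adjust for real tables.
function applyChangeEvent(state, event, keyField) {
  const op = event.op;
  if (op === "c" || op === "r" || op === "u") {
    // create, snapshot read and update all carry the new row in "after"
    state.set(event.after[keyField], event.after);
  } else if (op === "d") {
    // delete carries the removed row in "before"
    state.delete(event.before[keyField]);
  }
  return state;
}

const users = new Map();
applyChangeEvent(users, { op: "c", before: null, after: { userid: "1", name: "bob" } }, "userid");
applyChangeEvent(users, { op: "u", before: { userid: "1", name: "bob" }, after: { userid: "1", name: "alice" } }, "userid");
console.log(users.get("1").name); // alice
applyChangeEvent(users, { op: "d", before: { userid: "1", name: "alice" }, after: null }, "userid");
console.log(users.size); // 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;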



&lt;h3&gt;
  
  
  Hide schema from payload
&lt;/h3&gt;

&lt;p&gt;Several configuration options and data formats let you transform events, hide fields you don't need, register custom Single Message Transforms (SMTs), and so on.&lt;/p&gt;

&lt;p&gt;Here we are instructing Debezium to exclude schema from the payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;debezium.source.value.converter&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;org.apache.kafka.connect.json.JsonConverter&lt;/span&gt;
&lt;span class="py"&gt;debezium.source.value.converter.schemas.enable&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
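
&lt;p&gt;The effect of this setting is visible on the consumer side: with schemas enabled, the JsonConverter wraps every message in an object with "schema" and "payload" fields, while with schemas disabled the message is the change event itself. A hedged sketch of a helper (hypothetical name) that accepts both shapes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: return the change event regardless of the schemas.enable setting.
// rawValue is assumed to be the message value as a JSON string.
function extractPayload(rawValue) {
  const message = JSON.parse(rawValue);
  if (message === null || typeof message !== "object") {
    return message; // nothing to unwrap
  }
  const hasSchema = Object.prototype.hasOwnProperty.call(message, "schema");
  const hasPayload = Object.prototype.hasOwnProperty.call(message, "payload");
  if (hasSchema === false || hasPayload === false) {
    return message; // schemas.enable=false: the value is already the event
  }
  return message.payload; // schemas.enable=true: unwrap the envelope
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;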



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Debezium is an excellent choice for capturing data changes in real time from MySQL RDS and streaming them into Kafka.&lt;br&gt;
Its support for schema change tracking, rich message format, and scalability make it ideal for modern data pipelines.&lt;/p&gt;

&lt;p&gt;Whether you choose standalone mode for simplicity or cluster mode for fault tolerance, Debezium provides the flexibility to meet your needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Links:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_LogAccess.MySQL.BinaryFormat.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_LogAccess.MySQL.BinaryFormat.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.confluent.io/platform/current/connect/references/restapi.html" rel="noopener noreferrer"&gt;https://docs.confluent.io/platform/current/connect/references/restapi.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dataengineering</category>
      <category>mysql</category>
      <category>aws</category>
      <category>kafka</category>
    </item>
    <item>
      <title>Remote Debugging/Profiling NodeJS app running in private ECS cluster using AWS SSM and inspect protocol</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Mon, 22 Sep 2025 05:59:41 +0000</pubDate>
      <link>https://dev.to/aws-builders/remote-debuggingprofiling-nodejs-app-running-in-private-ecs-cluster-using-aws-ssm-and-inspect-2e11</link>
      <guid>https://dev.to/aws-builders/remote-debuggingprofiling-nodejs-app-running-in-private-ecs-cluster-using-aws-ssm-and-inspect-2e11</guid>
      <description>&lt;p&gt;Debugging a Node.js application running inside a container within an ECS cluster can be daunting. The challenge becomes especially significant in restricted environments like AWS ECS, where you lack direct control over the running workloads. Traditional debugging methods, such as SSH or exposing ports, are often not viable in these setups.&lt;/p&gt;

&lt;p&gt;AWS provides a secure and efficient solution through AWS Systems Manager (SSM). By leveraging SSM documents and port forwarding, you can securely establish connections to your Node.js containers and expose the debugger port without compromising the security of your infrastructure.&lt;/p&gt;

&lt;p&gt;In this post, we will walk through the steps to set up and perform remote debugging of a Node.js container using SSM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Debugging
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Node.js&lt;/strong&gt; debugging process using the &lt;code&gt;--inspect&lt;/code&gt; flag allows developers to attach a debugging client (such as Chrome DevTools) to the application and analyze its behavior in real-time.&lt;/p&gt;

&lt;p&gt;Node.js uses the Chrome DevTools Protocol (CDP) for debugging.&lt;/p&gt;

&lt;h3&gt;
  
  
  Capabilities of Node.js Inspector:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Breakpoints: Pause execution at specific lines to inspect the state of variables and objects.&lt;/li&gt;
&lt;li&gt;Heap Analysis: Capture and analyze snapshots of memory usage to identify leaks.&lt;/li&gt;
&lt;li&gt;Call Stack: View the sequence of function calls leading to a particular point in execution.&lt;/li&gt;
&lt;li&gt;Performance Profiling: Measure the performance of code execution and detect bottlenecks.&lt;/li&gt;
&lt;li&gt;Step Execution: Step through code one line at a time to observe how the application progresses.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the port-forwarding session from Step 2 starts successfully, its output indicates that the port has been forwarded and that the session is waiting for connections.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS ECS specifics and limitations for debugger setup
&lt;/h2&gt;

&lt;p&gt;The main challenge in remote debugging is enabling port-forwarding between the application runtime and the machine where the debugger (e.g., Chrome DevTools) is running. This process is straightforward in local environments but complex in cloud-managed services like ECS.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key challenges include:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Lack of direct access to the underlying infrastructure: ECS manages the container instances, making traditional SSH-based debugging methods impossible.&lt;/li&gt;
&lt;li&gt;Maintaining security: Exposing debugger ports publicly can introduce vulnerabilities.&lt;/li&gt;
&lt;li&gt;Securely forwarding ports: Ensuring the debugger port is accessible only to authorized clients while retaining private IP configurations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AWS SSM addresses these issues by allowing secure execution of commands and port forwarding on remote workloads, providing a robust solution for debugging applications running in ECS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Prepare the Node.js Application
&lt;/h2&gt;

&lt;p&gt;To debug the Node.js application, it must be started with the --inspect flag enabled. This exposes the debugging interface on a specified port.&lt;/p&gt;

&lt;p&gt;Modify the ECS task definition to include the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;, &lt;span class="s2"&gt;"--inspect=0.0.0.0:9229"&lt;/span&gt;, &lt;span class="s2"&gt;"app.js"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/usr/src/node-red &lt;span class="c"&gt;# node --inspect index.js&lt;/span&gt;
Debugger listening on ws://127.0.0.1:9229/9d82bc40-6fe9-4148-b1e8-8089b3351b9f
For &lt;span class="nb"&gt;help&lt;/span&gt;, see: https://nodejs.org/en/docs/inspector
Data
Starting memory leak simulation...
Memory usage: 3 MB
Memory usage: 3 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a sample application that simulates a memory leak for analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;memoryLeakArray&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createLeak&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;leakObject&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;alloc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// Allocate 1MB of memory&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="nx"&gt;memoryLeakArray&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;leakObject&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;printMemoryUsage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;used&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;memoryUsage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Memory usage: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;used&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;heapUsed&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt; MB`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runLoop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;createLeak&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nf"&gt;printMemoryUsage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// 1 second delay&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Starting memory leak simulation...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nf"&gt;runLoop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
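
&lt;p&gt;Note that the sample keeps printing roughly the same value because &lt;code&gt;heapUsed&lt;/code&gt; covers only the V8 heap, while memory allocated with &lt;code&gt;Buffer.alloc&lt;/code&gt; lives outside it. A possible variant of the logging helper (the function name is hypothetical) that also prints the &lt;code&gt;external&lt;/code&gt; and &lt;code&gt;rss&lt;/code&gt; fields of &lt;code&gt;process.memoryUsage()&lt;/code&gt;, where the growth should become visible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: format the fields of process.memoryUsage() that matter for Buffer leaks.
// Buffer memory is reported under "external" and "rss", not under "heapUsed".
function formatMemoryUsage(usage) {
  const toMb = function (bytes) {
    return Math.round(bytes / 1024 / 1024);
  };
  return `heapUsed: ${toMb(usage.heapUsed)} MB, external: ${toMb(usage.external)} MB, rss: ${toMb(usage.rss)} MB`;
}

console.log(formatMemoryUsage(process.memoryUsage()));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;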



&lt;p&gt;After making this change, redeploy the ECS task.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Use SSM to Connect to the Container
&lt;/h2&gt;

&lt;p&gt;AWS provides several predefined Systems Manager documents (SSM documents) for various tasks, and AWS-StartPortForwardingSession is among them. This document allows secure port forwarding from an AWS resource (such as an ECS container) to your local machine.&lt;/p&gt;

&lt;p&gt;To forward the Node.js debugging port (9229) to your local machine, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ssm start-session &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--target&lt;/span&gt; ecs:&amp;lt;cluster-name&amp;gt;_&amp;lt;task-id&amp;gt;_&amp;lt;container-runtime-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--document-name&lt;/span&gt; AWS-StartPortForwardingSession &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--parameters&lt;/span&gt; &lt;span class="s1"&gt;'{"portNumber":["9229"], "localPortNumber":["9229"]}'&lt;/span&gt;

Starting session with SessionId: xxxx-xxxx
Port 9229 opened &lt;span class="k"&gt;for &lt;/span&gt;sessionId xxxx-xxxx.
Waiting &lt;span class="k"&gt;for &lt;/span&gt;connections...

Connection accepted &lt;span class="k"&gt;for &lt;/span&gt;session &lt;span class="o"&gt;[&lt;/span&gt;xxxx-xxxx]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
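
&lt;p&gt;The target string is easy to get wrong, so it can help to assemble the command programmatically. The sketch below only builds the argument list shown above; the function name and the sample IDs are placeholders. The container runtime ID is assumed to come from the &lt;code&gt;runtimeId&lt;/code&gt; field returned by &lt;code&gt;aws ecs describe-tasks&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: assemble the arguments for "aws ssm start-session" shown above.
// The cluster, task and runtime IDs passed in are placeholders, not real values.
function buildPortForwardArgs(clusterName, taskId, containerRuntimeId, port) {
  const target = `ecs:${clusterName}_${taskId}_${containerRuntimeId}`;
  const parameters = JSON.stringify({
    portNumber: [String(port)],
    localPortNumber: [String(port)],
  });
  return [
    "ssm", "start-session",
    "--target", target,
    "--document-name", "AWS-StartPortForwardingSession",
    "--parameters", parameters,
  ];
}

const args = buildPortForwardArgs("my-cluster", "0123456789abcdef", "runtime-id", 9229);
console.log(["aws"].concat(args).join(" "));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The resulting array can be passed as-is to &lt;code&gt;child_process.spawn("aws", args, { stdio: "inherit" })&lt;/code&gt;, which avoids shell quoting issues with the JSON parameters.&lt;/p&gt;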



&lt;h2&gt;
  
  
  Step 3: Debug the Application
&lt;/h2&gt;

&lt;p&gt;Open Chrome and navigate to chrome://inspect.&lt;/p&gt;

&lt;p&gt;Under “Remote Target,” locate the forwarded debugging session and click “Inspect.”&lt;/p&gt;

&lt;p&gt;By default, Chrome looks for targets on port &lt;code&gt;9229&lt;/code&gt;, but you can configure any custom port you chose for debugging.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwxkpp2m3ffedvlydnts3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwxkpp2m3ffedvlydnts3.png" alt="Screenshot1.png" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;
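
&lt;p&gt;If the target does not show up in Chrome, the forwarded port can be checked directly: the Node.js inspector also serves target metadata over HTTP at &lt;code&gt;/json/list&lt;/code&gt; on the debug port. A minimal sketch, assuming Node 18+ on the local machine and the port forwarded to 127.0.0.1 (function names are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Sketch: pick the WebSocket debugger URL from the /json/list response.
function pickDebuggerUrl(targets) {
  if (Array.isArray(targets) === false || targets.length === 0) {
    return null;
  }
  return targets[0].webSocketDebuggerUrl || null;
}

// Query the inspector through the forwarded local port.
async function findForwardedDebugger(port) {
  const response = await fetch(`http://127.0.0.1:${port}/json/list`);
  return pickDebuggerUrl(await response.json());
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A non-null result from &lt;code&gt;findForwardedDebugger(9229)&lt;/code&gt; suggests the tunnel and the inspector are both working, so any remaining problem is likely in the Chrome target configuration.&lt;/p&gt;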

&lt;p&gt;Use Chrome DevTools to analyze and debug the Node.js process:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9bfke1vfiffkl4qhjo2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9bfke1vfiffkl4qhjo2.png" alt="Screenshot2.png" width="800" height="100"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the same time, a &lt;code&gt;Debugger attached.&lt;/code&gt; message appears in the application's console output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Memory usage: 3 MB
Debugger attached.
Memory usage: 3 MB
Memory usage: 3 MB
Memory usage: 3 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Monitor the event loop:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2rls0h8hlqw0rya46sf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm2rls0h8hlqw0rya46sf.png" alt="Screenshot5.png" width="800" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Analyze heap snapshots to identify memory leaks:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhyg0b0hg261mdrdezop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frhyg0b0hg261mdrdezop.png" alt="Screenshot4.png" width="800" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step through code execution to pinpoint issues:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbssfvbqvi485ql6lwpuq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbssfvbqvi485ql6lwpuq.png" alt="Screenshot3.png" width="800" height="206"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The main strength of this solution is security: the debugging port does not need to be exposed publicly, because the tunnel is established dynamically between the local machine and the container.&lt;br&gt;
Even for a fully private, isolated cluster with no inbound/outbound security group rules, we can still access the workloads.&lt;br&gt;
Accessing such private networks without additional network configuration also reduces operational overhead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Links:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://nodejs.org/en/learn/getting-started/debugging" rel="noopener noreferrer"&gt;https://nodejs.org/en/learn/getting-started/debugging&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>tutorial</category>
      <category>node</category>
      <category>devops</category>
    </item>
    <item>
      <title>Amazon ElastiCache Redis as a Vector Embeddings Storage for Semantic Search in AWS Community Blog posts</title>
      <dc:creator>Roman Tsypuk</dc:creator>
      <pubDate>Thu, 18 Sep 2025 09:08:37 +0000</pubDate>
      <link>https://dev.to/aws-builders/amazon-elasticache-redis-as-a-vector-embeddings-storage-for-semantic-search-in-aws-community-blog-4igj</link>
      <guid>https://dev.to/aws-builders/amazon-elasticache-redis-as-a-vector-embeddings-storage-for-semantic-search-in-aws-community-blog-4igj</guid>
      <description>&lt;h2&gt;
  
  
  Abstract
&lt;/h2&gt;

&lt;p&gt;The AWS Community Builders program has produced an enormous trove of insightful blog content over the years. These&lt;br&gt;
posts, authored by community members across the globe, capture deep technical expertise, innovative use cases, and&lt;br&gt;
countless hours of experience.&lt;br&gt;
But as this body of content grows, finding related information becomes increasingly difficult using traditional keyword&lt;br&gt;
search.&lt;/p&gt;
&lt;h2&gt;
  
  
  Challenge
&lt;/h2&gt;

&lt;p&gt;To tackle this challenge, I experimented with using &lt;strong&gt;Amazon ElastiCache for Redis&lt;/strong&gt; as a vector store to power semantic&lt;br&gt;
search across all blog posts published since 2021.&lt;/p&gt;

&lt;p&gt;This blog post outlines how I vectorized blog content, stored the embeddings in Redis, and leveraged K-Nearest Neighbors (KNN) search to discover content semantically related to a phrase or topic of interest, cluster posts, identify trends, and so on.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Use Redis for Vector Search?
&lt;/h2&gt;

&lt;p&gt;Traditionally, Redis is used for caching and fast key-value operations. However, with the introduction of the Redis&lt;br&gt;
Vector Similarity Search (VSS) module, Redis can now be used as a high-performance vector database. This makes it a&lt;br&gt;
powerful choice when you need millisecond-latency search over large datasets of embeddings, such as blog post vectors.&lt;/p&gt;

&lt;p&gt;Benefits of Redis VSS:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Speed: In-memory performance with extremely low latency.&lt;/li&gt;
&lt;li&gt;Simplicity: Store and retrieve vectors using Redis CLI or SDKs.&lt;/li&gt;
&lt;li&gt;Scalability: Redis clusters on ElastiCache scale with demand.&lt;/li&gt;
&lt;li&gt;Integration: Easy to integrate with Python NLP libraries and AWS services.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  What is an Embedding and How Does it Work?
&lt;/h2&gt;

&lt;p&gt;An embedding is a numerical representation of text that captures its meaning in a multi-dimensional space. Words,&lt;br&gt;
sentences, or even entire documents are transformed into fixed-size vectors of floating-point numbers using models like&lt;br&gt;
BERT or Sentence Transformers.&lt;/p&gt;

&lt;p&gt;These vectors are constructed such that semantically similar content lies close together in vector space. For example,&lt;br&gt;
posts discussing "AWS re:Invent keynote" and "major AWS announcements" would have similar embeddings even if they use&lt;br&gt;
different words.&lt;/p&gt;
&lt;h3&gt;
  
  
  Target Architecture:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fki5zag7xr7vemjw2v0lk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fki5zag7xr7vemjw2v0lk.png" alt="arch.png" width="478" height="217"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each blog post description and content was passed through a transformer-based model.&lt;/li&gt;
&lt;li&gt;The output vector (e.g., 384 dimensions) was stored in Redis.&lt;/li&gt;
&lt;li&gt;When a query is made (e.g., "golden jacket"), it is also converted into an embedding.&lt;/li&gt;
&lt;li&gt;Redis performs a KNN search, returning the closest blog vectors by cosine similarity or Euclidean distance.&lt;/li&gt;
&lt;/ul&gt;
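&lt;p&gt;To make the two distance measures concrete, here is a minimal sketch on toy 3-dimensional vectors standing in for the real 384-dimensional embeddings. The helper names are illustrative and not part of the project code. It also shows a useful property: for unit-length (normalized) vectors, squared Euclidean distance equals 2 - 2 * cosine similarity, so both measures rank neighbors in the same order.&lt;/p&gt;

```typescript
// Illustrative helpers only; none of these names come from the original project.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Scale a vector to unit length.
function normalize(v: number[]): number[] {
  const length = Math.sqrt(dot(v, v));
  return v.map((value) => value / length);
}

function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

function squaredL2(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0);
}

// Toy vectors standing in for embeddings of a query and two posts.
const query = normalize([1, 2, 3]);
const near = normalize([1, 2, 4]);
const far = normalize([-3, 1, 0]);

// For unit vectors the two printed values on each line agree
// (up to floating-point rounding).
console.log(squaredL2(query, near), 2 - 2 * cosineSimilarity(query, near));
console.log(squaredL2(query, far), 2 - 2 * cosineSimilarity(query, far));
```

&lt;p&gt;The vector closer in direction to the query has both the higher cosine similarity and the smaller distance, which is what a KNN search relies on.&lt;/p&gt;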
&lt;h2&gt;
  
  
  Prepare schema in Redis
&lt;/h2&gt;

&lt;p&gt;Once I had collected the &lt;strong&gt;raw&lt;/strong&gt; data, I created an index in Redis. All documents are ingested as &lt;strong&gt;JSON&lt;/strong&gt; documents,&lt;br&gt;
each with a postDescriptionEmbeddings field defined as &lt;strong&gt;SchemaFieldTypes.VECTOR&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This vector is calculated with the same embedding model that is later used for the search queries.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When creating the index, the vector dimension must exactly match the output dimension of the embedding model.&lt;/p&gt;

&lt;p&gt;Since Xenova/all-MiniLM-L6-v2 is used here, the dimension is 384. You can try another model to check whether accuracy improves, but the vector size of an existing index cannot be changed, so a model with a different dimension requires creating a new index and re-ingesting all data.&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/*    (RAW COMMAND)
          FT.DROPINDEX idx:posts

          FT.CREATE idx:posts
          ON JSON
              PREFIX 1 "posts:"
          SCHEMA
          '$.postDescriptionEmbeddings': {
            type: SchemaFieldTypes.VECTOR,
            TYPE: 'FLOAT32',
            ALGORITHM: VectorAlgorithms.FLAT,
            DIM: 384,
            DISTANCE_METRIC: 'L2',
            INITIAL_CAP: 111,
            BLOCK_SIZE: 111,
            AS: 'postDescriptionEmbeddings',
          },
          '$.postDescription': {
              type: SchemaFieldTypes.TEXT,
              NOSTEM: true,
              SORTABLE: true,
              AS: 'postDescription',
          },
          '$.link': {
            type: SchemaFieldTypes.TEXT,
            NOSTEM: true,
            SORTABLE: true,
            AS: 'link',
          },
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Ingesting Data into Vector-based storage
&lt;/h2&gt;

&lt;p&gt;I chose &lt;strong&gt;Xenova/all-MiniLM-L6-v2&lt;/strong&gt; for embedding and vectorization. Once the model is initialized, each blog post in the processing loop is passed to it and augmented with one more field, &lt;strong&gt;postDescriptionEmbeddings&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;modelName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Xenova/all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;pipe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;transformers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feature-extraction&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;modelName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;strippedText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;stripHTML&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;postDescription&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;strippedText&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sentenceEmbedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateSentenceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;strippedText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;postDescriptionEmbeddings&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sentenceEmbedding&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;nodeRedisClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;POSTS_KEY_PREFIX&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateSentenceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_sentence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;vectorOutput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_sentence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;pooling&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mean&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;vectorOutput&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
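&lt;p&gt;Because the index is created with a fixed dimension of 384, it can help to verify the length of each embedding before writing the document to Redis. The guard below is a hypothetical addition, not part of the original ingestion code; &lt;code&gt;INDEX_DIM&lt;/code&gt; and &lt;code&gt;assertEmbeddingDimension&lt;/code&gt; are assumed names that simply mirror the DIM value of the index.&lt;/p&gt;

```typescript
// Hypothetical guard; INDEX_DIM mirrors the DIM value from the index schema.
const INDEX_DIM = 384;

// Returns the embedding unchanged when its length matches the index dimension,
// otherwise throws so that a mismatched vector is never written to Redis.
function assertEmbeddingDimension(
  embedding: number[],
  expected: number = INDEX_DIM,
): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} values, but the index expects ${expected}`,
    );
  }
  return embedding;
}
```

&lt;p&gt;In the ingestion loop it could wrap the result of &lt;code&gt;generateSentenceEmbeddings&lt;/code&gt; before the value is assigned to &lt;code&gt;postDescriptionEmbeddings&lt;/code&gt;, which surfaces a model swap early.&lt;/p&gt;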



&lt;h2&gt;
  
  
  Ingested data in Redis:
&lt;/h2&gt;

&lt;p&gt;Using &lt;code&gt;redis-cli&lt;/code&gt;, we can run &lt;code&gt;JSON.GET&lt;/code&gt; on any document and explore its fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;JSON.GET posts:0a1a9e88-32e3-43af-86ce-c70b27ef9482
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Besides the standard data, it now also contains the &lt;code&gt;postDescriptionEmbeddings&lt;/code&gt; vector (truncated here rather than showing all 384 values):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://dev.classmethod.jp/articles/awsbasics-s3-tables-athena-apache-iceberg/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"createdAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-07-20T02:02:36.821Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postDescription"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"This article introduces AWS S3 Tables, a new service simplifying Apache Iceberg table management on S3. It shows creating tables via console/CLI, CRUD operations with Athena, and highlights automatic optimization, ACID transactions, and seamless integration."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postDescriptionEmbeddings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.05445345863699913&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.03639936074614525&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.09256941080093384&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.02023949660360813&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;0.016897333785891533&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.05080459639430046&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.047063130885362625&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;

    &lt;/span&gt;&lt;span class="mf"&gt;0.05633614584803581&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;0.030490988865494728&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.035038333386182785&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;-0.014955290593206882&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="mf"&gt;0.03450820595026016&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Congratulations! Now we have a digital footprint of all Community Builders blog posts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Algorithms to search vectors
&lt;/h2&gt;

&lt;p&gt;Now, using the same model &lt;code&gt;Xenova/all-MiniLM-L6-v2&lt;/code&gt;, we can convert any search phrase into a vector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;nodeRedisClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getNodeRedisClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;modelName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Xenova/all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;pipe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;transformers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feature-extraction&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;modelName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchTxtVectorArr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateSentenceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;searchTxt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is an example: the search phrase &lt;code&gt;Golden Jacket&lt;/code&gt;, transformed by the embedding model &lt;code&gt;Xenova/all-MiniLM-L6-v2&lt;/code&gt; into a vector of dimension 384, looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
      -0.0519644059240818,    0.08905413746833801,    0.01799578033387661,
     0.049779586493968964,   0.035871658474206924,   0.020033041015267372,
      0.18083901703357697,   -0.05256553739309311,  0.0003853921080008149,
     -0.04665731266140938,    0.01234174519777298,   -0.02134784311056137,
      0.03652122616767883,  0.0066316151060163975,   0.001835306640714407,
      -0.0348852276802063,   0.026533348485827446,  -0.008266004733741283,
     0.023030489683151245,  -0.025720935314893723,   -0.04542113095521927,
   -0.0034457785077393055,   -0.00999429076910019,    0.05410130321979523,
     -0.07294181734323502,    0.04549195244908333,   -0.03297987952828407,
     0.039497267454862595,   -0.05792579427361488,   -0.08336731791496277,
     0.028136223554611206, -0.0004739225551020354,   -0.03645763173699379,
      0.02329356223344803,    -0.1444464474916458,   -0.02707810141146183,
     0.033310066908597946,  0.0071496316231787205,   -0.11640160530805588,
      0.10475143790245056,    -0.0539717860519886,  -0.039327893406152725,
     -0.08680122345685959,   -0.03737214207649231,   -0.02752811089158058,
     0.005382977891713381,  -0.037658315151929855,  -0.009982343763113022,
    -0.014092876575887203,     0.0367547869682312,   -0.07399916648864746,
     0.008599113672971725,   -0.07765820622444153,   0.013821756467223167,
      0.06757687777280807,    0.09657421708106995,  -0.008364601992070675,
     -0.05157032981514931,    -0.0255973469465971,   0.030314497649669647,
    -0.036268994212150574,   0.004683461040258408,   -0.05857793241739273,
     0.009458839893341064,     0.0592721663415432,     0.0403432659804821,
     -0.06441762298345566,    0.05017191544175148,   -0.05505743995308876,
     -0.04220070689916611,    0.05020679160952568,    0.02721044048666954,
  0.000019773140593315475,  0.0006230986909940839, -0.0004238316905684769,
       0.0330372080206871,  -0.040411293506622314,    0.02342337928712368,
      0.09403083473443985,  -0.030885351821780205,    0.03300361707806587,
     -0.06219857931137085,   -0.06710682064294815,    0.06534728407859802,
      0.02729918621480465,  -0.021849708631634712,    0.04849373549222946,
     -0.04060141369700432,   -0.01779020205140114,   0.015026750974357128,
      0.03931884467601776,  -0.016998404636979103,   0.020048541948199272,
      0.07344327867031097,  -0.027000991627573967,  -0.013987813144922256,
      0.07202176004648209,   -0.10249660909175873,   -0.07182241231203079,
        0.177199587225914,
  ... 284 more items
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the search vector computed and all post descriptions vectorized and stored in Redis, we use &lt;strong&gt;KNN&lt;/strong&gt; to find the nearest neighbors, that is, the most similar posts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchQuery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="s2"&gt;`*=&amp;gt;[KNN &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;_resultCount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; @postDescriptionEmbeddings $searchBlob AS score]`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;nodeRedisClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ft&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;POSTS_INDEX_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;searchQuery&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;PARAMS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;searchBlob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;float32Buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;searchTxtVectorArr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;RETURN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;score&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;description&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;createdAt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;region&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;link&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;SORTBY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;BY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;score&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;DIALECT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
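&lt;p&gt;The &lt;code&gt;float32Buffer&lt;/code&gt; helper used in the query above is not shown in this post. A minimal sketch of what such a helper might look like, assuming a Node.js runtime, packs the number array into the raw 32-bit float bytes that a FLOAT32 vector field expects as a query parameter:&lt;/p&gt;

```typescript
// Hypothetical implementation of the float32Buffer helper referenced above.
// Converts a plain number array into a Buffer of 32-bit floats (4 bytes per value).
function float32Buffer(values: number[]): Buffer {
  const floats = new Float32Array(values);
  return Buffer.from(floats.buffer, floats.byteOffset, floats.byteLength);
}

const blob = float32Buffer([0.5, -1, 2]);
console.log(blob.length); // 12, i.e. 3 values * 4 bytes each
```

&lt;p&gt;For the 384-dimensional query vector, the resulting buffer would be 1536 bytes long, matching the DIM and TYPE declared on the index.&lt;/p&gt;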



&lt;h2&gt;
  
  
  Let's find Community Builders posts on a particular topic
&lt;/h2&gt;

&lt;p&gt;Searching for topics relevant to &lt;code&gt;Golden Jacket&lt;/code&gt; produces the following output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;total: the number of documents found, capped by the requested limit&lt;/li&gt;
&lt;li&gt;score: the vector distance between the query and each returned document (with the L2 metric, a lower score means a closer match)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"documents"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"posts:79256de8-00aa-4127-b025-df410fb63414"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.01139831543"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Although I cannot have a AWS 100% fully certified Golden Jacket now, I still buy myself a 25% fully certified Golden Hat 👒 for a little FUN 🤩 Never too late to learn Amazon Web Services (AWS) ! Start your cloud journeys TODAY ! You will become Fully Certified in the future."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.linkedin.com/posts/walterwlee_aws-cloud-future-activity-7020547672751169536-T3r8?utm_source=share&amp;amp;utm_medium=member_desktop"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"posts:e05e61f3-1cd9-4483-98c7-ad3681a1ea00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.03479576111"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Thanks a lot to the Amazon Web Services (AWS) community builder jacket ! 😀 very warm and high quality with YKK zipper 🥰"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.linkedin.com/posts/walterwlee_aws-awscommunity-awscommunitybuilders-activity-7010055038479859712-ahY-?utm_source=share&amp;amp;utm_medium=member_desktop"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"posts:2f63f16d-8080-4e27-baed-f5c582617106"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.07405209541"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A blog post sharing my experience doing the AWS certifications for the Golden Jacket, for others who might be interested in going through the same experience."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://chamila.dev/blog/2024-10-28_completing-11-professional-certifications-in-8-weeks/"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"posts:554df865-951d-4937-abad-5ef84f6e67fe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.12301290035"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"How you can achieve all AWS certifications and get the golden jacket - Part 1 "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://dev.to/electrokat/how-you-can-achieve-all-aws-certifications-and-get-the-golden-jacket-part-1-16dp"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"posts:9a772e6c-f3fe-419a-afe2-b5b0038dbdcb"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.2975268364"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"How you can achieve all AWS certifications and get the golden jacket - Part 2 - How to study "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://dev.to/electrokat/how-you-can-achieve-all-aws-certifications-and-get-the-golden-jacket-part-2-how-to-study-2b4c"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search results for "Golden Jacket":
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Score&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1.01139831543&lt;/td&gt;
&lt;td&gt;Although I cannot have a AWS 100% fully certified Golden Jacket now, I still buy myself a 25% fully certified Golden Hat 👒 for a little FUN 🤩 Never too late to learn Amazon Web Services (AWS) ! Start your cloud journeys TODAY ! You will become Fully Certified in the future.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.03479576111&lt;/td&gt;
&lt;td&gt;Thanks a lot to the Amazon Web Services (AWS) community builder jacket ! 😀 very warm and high quality with YKK zipper 🥰&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.07405209541&lt;/td&gt;
&lt;td&gt;A blog post sharing my experience doing the AWS certifications for the Golden Jacket, for others who might be interested in going through the same experience.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.12301290035&lt;/td&gt;
&lt;td&gt;How you can achieve all AWS certifications and get the golden jacket - Part 1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.2975268364&lt;/td&gt;
&lt;td&gt;How you can achieve all AWS certifications and get the golden jacket - Part 2 - How to study&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Search results for "play games":
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Score&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1.49744784832&lt;/td&gt;
&lt;td&gt;🚀 Vibe coding a simple STEM Game with Amazon Q CLI —so I created this simple game to help make learning easier with the help of Amazon Q CLI 🤖✨.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.54997515678&lt;/td&gt;
&lt;td&gt;Learn DevOps By Doing Projects&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.56706357002&lt;/td&gt;
&lt;td&gt;I wrote a blog post about building a web-based game using only Amazon Q Developer. The post received 199 readers and 26 reactions from the community.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.57633209229&lt;/td&gt;
&lt;td&gt;This article about my personal challenges as a sales and business developer to have interaction with tech. persons.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.61025881767&lt;/td&gt;
&lt;td&gt;Join the AWS GenAI PartyRock hackathon!&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What else can we do with vectors?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Discover similar blog posts and engage with authors.&lt;/li&gt;
&lt;li&gt;Track evolving themes across community content.&lt;/li&gt;
&lt;li&gt;Understand what topics resonate or need more visibility.&lt;/li&gt;
&lt;li&gt;Retrieve relevant document chunks or paragraphs to feed into an LLM (e.g., via Retrieval-Augmented Generation or RAG).&lt;/li&gt;
&lt;li&gt;Document Classification&lt;/li&gt;
&lt;li&gt;Link posts with similar embeddings through a virtual “related-to” edge: after you read and enjoy a post, the vector database can recommend other posts with similar vectors.&lt;/li&gt;
&lt;/ul&gt;
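
&lt;p&gt;The "related-to" recommendation idea can be sketched in a few lines. This is a minimal illustration, not the code used for the search above: the post names, the toy 3-dimensional vectors, and the &lt;code&gt;related_posts&lt;/code&gt; helper are all hypothetical, and cosine distance is an assumed metric.&lt;/p&gt;

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; values near 0.0 mean very similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def related_posts(post_id, embeddings, top_k=2):
    """Rank every other post by distance to post_id, closest first."""
    source = embeddings[post_id]
    scored = [
        (cosine_distance(source, vector), other_id)
        for other_id, vector in embeddings.items()
        if other_id != post_id
    ]
    return [other_id for _, other_id in sorted(scored)[:top_k]]


# Hypothetical toy embeddings; real models produce hundreds of dimensions.
EMBEDDINGS = {
    "golden-jacket-part-1": [0.9, 0.1, 0.0],
    "golden-jacket-part-2": [0.8, 0.2, 0.1],
    "stem-game-with-q-cli": [0.1, 0.9, 0.2],
    "partyrock-hackathon": [0.0, 0.8, 0.6],
}

print(related_posts("golden-jacket-part-1", EMBEDDINGS, top_k=1))
```

&lt;p&gt;In a real setup the ranking would be delegated to the vector store instead of being computed in application code; the sketch only shows why posts with close vectors end up recommended together.&lt;/p&gt;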

&lt;h2&gt;
  
  
  Observations:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Most of the CPU-intensive work happens on the producer, which runs the model and performs the transformation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important caveat is that the model must be instantiated only once, with all data fed through that single instance. Otherwise, if a model is created for every data entry, ingestion takes much longer to finish and memory is allocated needlessly.&lt;/p&gt;
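
&lt;p&gt;One way to enforce a single model instance is to hide construction behind a cached factory. The sketch below assumes nothing about the actual model: &lt;code&gt;FakeEmbeddingModel&lt;/code&gt;, &lt;code&gt;get_model&lt;/code&gt;, and &lt;code&gt;embed_all&lt;/code&gt; are hypothetical names, and the stand-in class only counts how many times it is created.&lt;/p&gt;

```python
from functools import lru_cache


class FakeEmbeddingModel:
    """Stand-in for a real embedding model; counts how often it is created."""

    load_count = 0

    def __init__(self):
        # A real model would load its weights here, which is the slow part.
        FakeEmbeddingModel.load_count += 1

    def encode(self, text):
        # Toy 2-dimensional "embedding": character count and word count.
        return [float(len(text)), float(len(text.split()))]


@lru_cache(maxsize=1)
def get_model():
    """Create the model on first use and return the same instance afterwards."""
    return FakeEmbeddingModel()


def embed_all(texts):
    """Embed every text through the single shared model instance."""
    return [get_model().encode(text) for text in texts]


vectors = embed_all(["Golden Jacket journey", "Learn DevOps By Doing Projects"])
print(FakeEmbeddingModel.load_count)  # the model was created once for both texts
```

&lt;p&gt;Calling &lt;code&gt;FakeEmbeddingModel()&lt;/code&gt; inside the loop instead would create one instance per entry, which is exactly the slowdown described above.&lt;/p&gt;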

&lt;ul&gt;
&lt;li&gt;According to htop, the Redis instance shows no spike during data ingestion or while search queries are performed.&lt;/li&gt;
&lt;li&gt;On data retrieval, the most time-consuming operation (1 sec) was transforming the request into a vector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some parameters let you adjust the output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;return results count: you can return response batches of different sizes or implement a cascaded system that makes a follow-up
batch call with an adjusted threshold when the first batch does not yield enough data&lt;/li&gt;
&lt;li&gt;score threshold: the lower the value, the closer the vectors are. However, this closeness holds for the mathematical model
that was used and does not always match common sense.&lt;/li&gt;
&lt;/ul&gt;
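
&lt;p&gt;The two parameters can be combined into a cascade. The sketch below is one possible reading of that idea, assuming the cascade starts with a strict score threshold and relaxes it until enough results pass. &lt;code&gt;cascaded_search&lt;/code&gt;, the threshold values, and &lt;code&gt;min_hits&lt;/code&gt; are hypothetical; the scores come from the "play games" table above, with shortened descriptions.&lt;/p&gt;

```python
def cascaded_search(results, thresholds, min_hits):
    """Try each score threshold in order until enough results pass.

    results: (score, description) pairs where a lower score means a closer vector.
    thresholds: maximum accepted scores, strictest first (assumed ordering).
    """
    hits = []
    for max_score in thresholds:
        hits = [(score, text) for score, text in results if max_score >= score]
        if len(hits) >= min_hits:
            break
    return sorted(hits)


# Scores from the "play games" query; descriptions shortened for the example.
RESULTS = [
    (1.49744784832, "Vibe coding a simple STEM Game with Amazon Q CLI"),
    (1.54997515678, "Learn DevOps By Doing Projects"),
    (1.56706357002, "Building a web-based game using only Amazon Q Developer"),
    (1.57633209229, "Personal challenges as a sales and business developer"),
    (1.61025881767, "Join the AWS GenAI PartyRock hackathon!"),
]

hits = cascaded_search(RESULTS, thresholds=[1.2, 1.5, 1.6], min_hits=2)
for score, text in hits:
    print(f"{score:.3f}  {text}")
```

&lt;p&gt;With these values the first two thresholds return too few results, so the cascade settles on the third one. In practice each step would be a new query against the vector store, not a filter over an in-memory list.&lt;/p&gt;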

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;Using ElastiCache for Redis as a vector store offers a powerful, low-latency way to bring semantic search to the AWS&lt;br&gt;
Community Builder blog archive. By moving beyond keywords and enabling intelligent discovery, we help amplify the voices&lt;br&gt;
and contributions of community members everywhere.&lt;/p&gt;

&lt;p&gt;This architecture is not limited to AWS community content—it can be adapted for any developer-driven knowledge base. The&lt;br&gt;
future of search is vectorized, and Redis makes that future fast, scalable, and accessible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://redis.io/docs/latest/develop/ai/search-and-query/query/" rel="noopener noreferrer"&gt;https://redis.io/docs/latest/develop/ai/search-and-query/query/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>database</category>
      <category>ai</category>
      <category>aws</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
