<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Maxim</title>
    <description>The latest articles on DEV Community by Maxim (@maximcoding).</description>
    <link>https://dev.to/maximcoding</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F309160%2Fef2b2fcd-5b4a-41e7-89dd-b5191013b166.png</url>
      <title>DEV Community: Maxim</title>
      <link>https://dev.to/maximcoding</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/maximcoding"/>
    <language>en</language>
    <item>
      <title>Kafka, RabbitMQ, or Postgres? The Messaging Tools Comparison (2026)</title>
      <dc:creator>Maxim</dc:creator>
      <pubDate>Wed, 29 Apr 2026 10:28:40 +0000</pubDate>
      <link>https://dev.to/maximcoding/messaging-notification-tools-comparison-42j5</link>
      <guid>https://dev.to/maximcoding/messaging-notification-tools-comparison-42j5</guid>
      <description>&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt;: DB with Transactional Message Queue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MQTT (Mosquitto)&lt;/strong&gt;: Lightweight IoT Messaging Protocol.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS SQS&lt;/strong&gt;: Managed Serverless Message Queue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RabbitMQ&lt;/strong&gt;: Enterprise Message Broker (AMQP).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Streams&lt;/strong&gt;: In-memory Append-only Event Log.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BullMQ (Redis + Node.js)&lt;/strong&gt;: Distributed Task Management System.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NATS (on Go)&lt;/strong&gt;: Ultra-fast Cloud-Native Messaging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ActiveMQ&lt;/strong&gt;: Classic Multi-protocol Enterprise Broker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apache Kafka&lt;/strong&gt;: High-throughput Distributed Event Store (Immutable Log).&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Choosing the right messaging tool depends on your needs and tradeoffs. Here is a breakdown of the most popular options, from database triggers to massive data streams.&lt;/p&gt;

&lt;p&gt;Before we jump into the comparison, I want to mention &lt;strong&gt;Disk Paging&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It's a technique used when RAM is full: the OS moves inactive pages to a "page file" on the Disk (&lt;strong&gt;paging out&lt;/strong&gt;) and brings them back only when needed (&lt;strong&gt;paging in&lt;/strong&gt;). In our case, this is the "safety net" that keeps a broker from crashing when the queue exceeds available memory capacity. It saves your server from an OOM (Out of Memory) crash, but the price is latency—reading from a disk is orders of magnitude slower than reading from RAM.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. PostgreSQL (LISTEN/NOTIFY)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;The simplest option: no setup required.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Max Message Size (Payload):&lt;/strong&gt; 8 KB — don't try to send the "cargo"; send the "tracking number" (the ID). It's built for notifications, not data transport.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Max Queue Capacity (Backlog):&lt;/strong&gt; Very Low (Shared Buffer)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Storage Strategy:&lt;/strong&gt; RAM (Volatile)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scalability Type:&lt;/strong&gt; Vertical — Increasing the power of a single server (more CPU cores, more RAM, faster NVMe SSD).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt; Built into the database, no extra installation. It is strictly transactional: the message is &lt;strong&gt;ONLY&lt;/strong&gt; sent &lt;strong&gt;IF&lt;/strong&gt; your data is actually saved (committed).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Cons:&lt;/strong&gt; No storage and no disk paging. It is a "shout and forget" system: if your application is offline or restarting for even a second, it misses the message forever. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛡️ Security (4/5):&lt;/strong&gt; Very safe. It stays inside your database and uses existing DB users. Safe if your database is locked behind a firewall.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🎯 Use Case:&lt;/strong&gt; Instant cache invalidation, real-time UI updates (like WebSockets or Push Notifications), or internal database triggers.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Resource Usage:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;🧠 RAM:&lt;/strong&gt; Very low. It only keeps a small list of active listeners in memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;💾 Disk:&lt;/strong&gt; 0%. It never writes notifications to the disk. They are not saved in logs or tables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⚙️ CPU:&lt;/strong&gt; Very low. The database just "pokes" the listener when a change happens.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The "Memory-Only" Reality:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speed:&lt;/strong&gt; Incredibly fast because there is no disk "write" time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Volatile:&lt;/strong&gt; If the server restarts, all pending notifications disappear instantly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capacity:&lt;/strong&gt; Designed for "pings" and signals, not for holding high-volume data streams.&lt;/li&gt;
&lt;/ul&gt;
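
&lt;p&gt;To make the "tracking number, not the cargo" pattern concrete, here is a minimal sketch. It only builds the queries and enforces the roughly 8,000-byte payload cap; the &lt;code&gt;order_created&lt;/code&gt; channel name and the &lt;code&gt;orders&lt;/code&gt; table are made up for illustration.&lt;/p&gt;

```javascript
// Sketch: send only an ID through LISTEN/NOTIFY, never the full row.
// PostgreSQL caps a NOTIFY payload at roughly 8000 bytes, so we guard it.
const MAX_PAYLOAD_BYTES = 8000;

function buildNotify(channel, payload) {
  const bytes = Buffer.byteLength(payload, "utf8");
  if (bytes > MAX_PAYLOAD_BYTES) {
    throw new Error("Payload too large for NOTIFY; send an ID instead");
  }
  // pg_notify() composes well with parameterized queries.
  return { text: "SELECT pg_notify($1, $2)", values: [channel, payload] };
}

// Inside one transaction: the NOTIFY fires only if the COMMIT succeeds.
const txn = [
  "BEGIN",
  "INSERT INTO orders (status) VALUES ('new') RETURNING id",
  buildNotify("order_created", "42").text, // tracking number, not cargo
  "COMMIT",
];
```

&lt;p&gt;With a driver such as &lt;code&gt;pg&lt;/code&gt;, you would run these statements in one transaction, so listeners are only poked once the data is actually committed.&lt;/p&gt;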




&lt;h2&gt;
  
  
  2. 💡 MQTT (Mosquitto)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;The King of IoT &amp;amp; Real-time&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Worth mentioning: &lt;strong&gt;MQTT is a protocol&lt;/strong&gt;; Mosquitto is a broker that implements it.&lt;/p&gt;

&lt;p&gt;Standard Mosquitto is single-threaded and does not support native clustering. You scale it by giving the server a faster CPU and more RAM.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you need Horizontal scale, you have to move from Mosquitto to a "clustered" broker like EMQX or HiveMQ.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Max Payload:&lt;/strong&gt; 256 MB (Protocol limit).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max Queue Capacity:&lt;/strong&gt; Thousands (RAM dependent).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage:&lt;/strong&gt; RAM or Disk (Persistence for restarts).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Type:&lt;/strong&gt; Vertical (Single Node).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt; Fast and tiny. Saves battery and data. Great for unstable internet connections.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Cons:&lt;/strong&gt; Not designed for complex data processing. Weak default settings, no disk paging technique.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛡️ Security (2/5):&lt;/strong&gt; Depends 100% on you. Usually "open" by default. You must manually add passwords and SSL/TLS to secure the connection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🎯 Use Case:&lt;/strong&gt; IoT sensors, smart homes, and mobile apps with poor signal.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Small/IoT project?&lt;/strong&gt; Vertical (Mosquitto) is enough.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Global/Fleet project?&lt;/strong&gt; You must go Horizontal (EMQX/HiveMQ).&lt;/li&gt;
&lt;/ul&gt;
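
&lt;p&gt;MQTT routing is driven by topic filters, where &lt;code&gt;+&lt;/code&gt; matches exactly one level and &lt;code&gt;#&lt;/code&gt; matches the rest of the tree. A minimal matcher sketch (the topic names are invented for illustration):&lt;/p&gt;

```javascript
// Sketch of MQTT topic-filter matching: "+" matches exactly one level,
// "#" matches the remainder of the topic (it must be the last level).
function topicMatches(filter, topic) {
  const f = filter.split("/");
  const t = topic.split("/");
  for (let i = 0; i !== f.length; i++) {
    if (f[i] === "#") { return true; }    // swallows the rest of the topic
    if (i >= t.length) { return false; }  // filter is longer than the topic
    if (f[i] === "+") { continue; }       // any single level is fine
    if (f[i] !== t[i]) { return false; }
  }
  return f.length === t.length;
}
```

&lt;p&gt;This is why one subscriber on &lt;code&gt;home/#&lt;/code&gt; can receive every sensor in the house without the broker keeping per-device routing tables.&lt;/p&gt;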




&lt;h2&gt;
  
  
  3. ☁️ AWS SQS
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;The King of Serverless&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consumers go offline for a weekend? These tools won't blink. For ActiveMQ or Kafka, just make sure you have enough disk space. For SQS? &lt;strong&gt;Just hold on to your wallet!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;But it is true simplicity: no management, and it holds your messages safely for up to 14 days. Click a button -&amp;gt; get a URL -&amp;gt; start sending messages. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Max Payload:&lt;/strong&gt; 1 MB (Updated in 2025/26)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max Queue Capacity:&lt;/strong&gt; Virtually Unlimited&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Strategy:&lt;/strong&gt; Managed Cloud&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Type:&lt;/strong&gt; Virtually infinite horizontal scaling — you just send more requests (&lt;strong&gt;the real limit is your wallet&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt; Zero maintenance and no servers to manage; AWS runs the servers. It scales to high message volumes automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Cons:&lt;/strong&gt; Locked into Amazon (Vendor lock-in). Can have higher latency compared to self-hosted tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛡️ Security (5/5):&lt;/strong&gt; Highest safety out-of-the-box. AWS manages security patches and forces strict access rules (IAM).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🎯 Use Case:&lt;/strong&gt; Connecting cloud apps without managing any physical hardware or software installation.&lt;/p&gt;
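
&lt;p&gt;The core SQS consumption loop is receive ⮕ process ⮕ delete, with a visibility timeout hiding in-flight messages from other consumers. A toy in-memory model of that behavior (this is not the AWS SDK; all names are illustrative):&lt;/p&gt;

```javascript
// Toy model of an SQS-style visibility timeout: a received message becomes
// invisible until the timeout passes; deleting it removes it for good.
class ToyQueue {
  constructor(visibilityMs) {
    this.visibilityMs = visibilityMs;
    this.messages = []; // { body, invisibleUntil }
  }
  send(body) {
    this.messages.push({ body, invisibleUntil: 0 });
  }
  receive(now) {
    for (const m of this.messages) {
      if (now >= m.invisibleUntil) {
        m.invisibleUntil = now + this.visibilityMs; // hide it from others
        return m;
      }
    }
    return null;
  }
  delete(msg) {
    this.messages = this.messages.filter((m) => m !== msg);
  }
}
```

&lt;p&gt;If the consumer crashes and never deletes the message, the timeout expires and the message is redelivered — this is what makes the queue safe for workers that may die mid-job.&lt;/p&gt;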




&lt;h2&gt;
  
  
  4. RabbitMQ
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;The King of Routing&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a cluster, all servers are in sync, like a "married" couple: every node knows exactly what is happening on every other node.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Max Payload:&lt;/strong&gt; 128 MB+&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max Queue Capacity:&lt;/strong&gt; Millions (using Lazy Queues)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage:&lt;/strong&gt; RAM + Disk &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Type:&lt;/strong&gt; Horizontal (Clustered)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt; Advanced routing logic (Exchanges). A reliable "mailbox" that waits for your app to be ready.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Cons:&lt;/strong&gt; Uses a lot of RAM in default mode. Horizontal scaling requires careful configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛡️ Security (3/5):&lt;/strong&gt; Good, but you must change the default "guest" password and close the admin portal to the public.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🎯 Use Case:&lt;/strong&gt; Business tasks like processing orders, background jobs, or sending emails.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Two Modes:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Normal Queues (RAM-First):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; Keeps messages in RAM for speed. Moves to Disk only if RAM is full (&lt;strong&gt;Disk Paging technique&lt;/strong&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;✅ Pros:&lt;/strong&gt; Extremely fast.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;❌ Cons:&lt;/strong&gt; High RAM usage. If RAM fills up, the system may lag while moving data to disk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Lazy Queues (Disk-First):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; Stores messages on the Disk immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;✅ Pros:&lt;/strong&gt; Can store millions of messages without crashing the server. Very stable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;❌ Cons:&lt;/strong&gt; Slower because writing to Disk is not as fast as RAM.&lt;/li&gt;
&lt;/ul&gt;
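
&lt;p&gt;The difference between the two modes can be sketched as a RAM buffer that pages overflow out to disk once a threshold is crossed. This is a toy model, not RabbitMQ internals; in real RabbitMQ you choose the behavior when declaring the queue (for classic queues, via the &lt;code&gt;x-queue-mode: lazy&lt;/code&gt; argument).&lt;/p&gt;

```javascript
// Toy model of "RAM-first with paging" vs "disk-first (lazy)": messages
// live in RAM until a threshold, then overflow is paged to a simulated disk.
function makeQueue(ramLimit, lazy) {
  const ram = [];
  const disk = [];
  return {
    publish(msg) {
      if (lazy || ram.length >= ramLimit) {
        disk.push(msg); // lazy queues go disk-first; normal queues page out
      } else {
        ram.push(msg);
      }
    },
    stats() {
      return { inRam: ram.length, onDisk: disk.length };
    },
  };
}
```

&lt;p&gt;The trade-off is exactly the one described above: the RAM-first queue is fast until it starts paging; the lazy queue pays the disk cost on every message but never runs the server out of memory.&lt;/p&gt;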




&lt;h2&gt;
  
  
  5. ⚡ Redis Streams for Events (Kafka Lite)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;The King of Speed&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Chatbot History / Live Feed, Agents, RAG systems 🔥&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The high-performance, in-memory log.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Since Redis 7.0, messaging is no longer a 'toy.' With the introduction of &lt;strong&gt;Sharded Pub/Sub&lt;/strong&gt;, it learned to 'breathe' in massive clusters without wasting bandwidth on redundant data. Redis Streams gives you roughly 80% of Kafka's capabilities, but with millisecond latency and without a massive Java-based infrastructure to manage.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Max Payload:&lt;/strong&gt; 512 MB (The limit of a Redis string).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max Backlog:&lt;/strong&gt; RAM-dependent. Unlike ActiveMQ, Redis lives in your memory. If the backlog grows too large, there are two scenarios: you run out of RAM, or you start evicting old data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Strategy:&lt;/strong&gt; In-Memory. While it has RDB/AOF persistence to survive a restart, it does not do "Disk Paging" in the middle of a live session to save RAM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Type:&lt;/strong&gt; Horizontal (Clustered/ Sharding).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You scale it using Redis Cluster. You shard the data so that different nodes handle different "keys" (Streams).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;🛠 How it Works?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Redis Streams is an append-only log data structure. Unlike a standard Redis List (LPUSH/RPOP), a Stream allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consumer Groups:&lt;/strong&gt; Just like Kafka, multiple consumers can join a group to split the work (this is a unique game-changer for Redis).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Acknowledge (ACK):&lt;/strong&gt; Redis tracks which consumer has processed which message. If a consumer dies, another can "claim" the pending messages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message IDs:&lt;/strong&gt; Every entry gets a unique ID based on a timestamp (e.g., &lt;code&gt;1626451200000-0&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
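
&lt;p&gt;A toy sketch of the stream mechanics listed above: timestamp-based IDs, delivery through a consumer group, and a pending list until ACK. This is a simplified model of the semantics, not the Redis implementation; in real Redis these are the &lt;code&gt;XADD&lt;/code&gt;, &lt;code&gt;XREADGROUP&lt;/code&gt;, and &lt;code&gt;XACK&lt;/code&gt; commands.&lt;/p&gt;

```javascript
// Toy model of a Redis Stream with one consumer group: entries get
// "timestamp-seq" IDs, and reads sit in a pending list until acknowledged.
function makeStream() {
  const entries = [];
  const pending = new Map(); // id -> consumer that holds the message
  let lastTs = 0;
  let seq = 0;
  let cursor = 0;
  return {
    xadd(ts, data) {
      seq = ts === lastTs ? seq + 1 : 0; // same millisecond -> bump sequence
      lastTs = ts;
      const id = ts + "-" + seq;
      entries.push({ id, data });
      return id;
    },
    xreadgroup(consumer) {
      if (cursor >= entries.length) { return null; }
      const entry = entries[cursor];
      cursor += 1;
      pending.set(entry.id, consumer); // delivered but not yet ACKed
      return entry;
    },
    xack(id) {
      return pending.delete(id);
    },
    pendingCount() {
      return pending.size;
    },
  };
}
```

&lt;p&gt;If a consumer dies before ACKing, its entries stay in the pending list, which is exactly what lets another consumer "claim" them.&lt;/p&gt;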




&lt;h2&gt;
  
  
  6. 🐂 BullMQ (Task Queue)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;Best for Tasks Management&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;LLM Inference, RAG Integration, AI Agents (Orchestration)  🔥&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Worth mentioning: it's a Node.js library, not a standalone server.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BullMQ leverages Redis via Lua scripts (Runtime Logic) in RAM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it like this: Node.js says, 'Hey Redis, take this script, run the logic atomically inside your own memory, and just give me the result.'&lt;/p&gt;

&lt;p&gt;This eliminates unnecessary network round-trips and guarantees that a task is never lost—even if a worker crashes mid-process—because the state is managed entirely within Redis.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Need a massive horizontal scale for the storage layer itself? Use BullMQ with Redis Cluster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Max Payload: 512 MB (Redis limit), but best practice is to keep it under 100 KB (pass IDs, not the entire data object).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Max Queue Capacity: Millions of tasks (dependent strictly on your Redis RAM).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Storage: Redis (in-memory) + persistence on disk (RDB/AOF for recovery after restarts). RDB is a snapshot; AOF is a journal of every single write command (like Git history).&lt;br&gt;
Scalability Type: Horizontal (spin up more Workers).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ Pros: Manages task states (Waiting, Active, Completed, Failed, Delayed). Out-of-the-box features: Parent-Child Job Flows, automatic Exponential Backoff Retries (wait 1s the first time, 2s the second, 4s the third), and Task Prioritization.&lt;/p&gt;

&lt;p&gt;❌ Cons: Hard dependency on Redis. If Redis RAM is full and eviction is misconfigured (LRU), the queue will break. Requires careful tuning of &lt;code&gt;lockDuration&lt;/code&gt; for long-running tasks.&lt;/p&gt;

&lt;p&gt;🛡️ Security (4/5): Inherits Redis security features. Supports ACLs (Access Control Lists) and SSL/TLS. Payload data is usually not encrypted at rest.&lt;/p&gt;

&lt;p&gt;🎯 Use Case: LLM integrations (handling long-running API requests), video/image processing, high-volume notification systems, and complex financial workflows.&lt;/p&gt;

&lt;p&gt;Standard background task? BullMQ + Single Redis node will be enough.&lt;/p&gt;

&lt;p&gt;Huge AI Pipeline? BullMQ Flows (Parent-Child) + Redis Cluster.&lt;/p&gt;
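
&lt;p&gt;The retry behavior above follows a simple formula. Here is a sketch of the exponential-backoff calculation, mirroring the 1s ⮕ 2s ⮕ 4s progression from the text; in BullMQ itself you configure this through a job option such as &lt;code&gt;backoff: { type: 'exponential', delay: 1000 }&lt;/code&gt;.&lt;/p&gt;

```javascript
// Exponential backoff: each retry doubles the previous delay.
// attempt 1 -> baseMs, attempt 2 -> 2*baseMs, attempt 3 -> 4*baseMs, ...
function backoffDelay(baseMs, attempt) {
  if (attempt === 0) { throw new Error("attempts are 1-based"); }
  return baseMs * Math.pow(2, attempt - 1);
}

const schedule = [1, 2, 3].map((a) => backoffDelay(1000, a));
// schedule is [1000, 2000, 4000] milliseconds
```

&lt;p&gt;Doubling the delay keeps a flapping downstream service (an LLM API, for example) from being hammered by thousands of simultaneous retries.&lt;/p&gt;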




&lt;h2&gt;
  
  
  7. NATS
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;The King of Edge &amp;amp; Speed&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;History: It started in Ruby, then NATS was re-engineered in Go to maximize concurrency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The connective tissue for modern distributed systems.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To understand NATS in 2026, you have to know it has two "modes":&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Core NATS (The "Fire-and-Forget" mode)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Strategy:&lt;/strong&gt; Pure RAM. It doesn’t even try to save messages to disk. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Catch:&lt;/strong&gt; If a subscriber isn't online, the message is lost forever (remind you of Postgres LISTEN/NOTIFY?). &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed:&lt;/strong&gt; 10+ million messages per second. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Real-time signals, heartbeats, and "I don't care if I miss one" data. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. NATS JetStream (The "Persistence" mode)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Strategy:&lt;/strong&gt; Smart, it adds Disk Persistence, Consumer Groups, and Acknowledgments. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Strategy:&lt;/strong&gt; File or Memory. You can configure it to store messages on disk, acting as your "Disk Paging" safety net. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weekend Outage Test:&lt;/strong&gt; With JetStream enabled, it won't blink if consumers go offline; it just stores the data in a stream.&lt;/li&gt;
&lt;/ul&gt;
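
&lt;p&gt;NATS routes on dot-separated subjects, where &lt;code&gt;*&lt;/code&gt; matches exactly one token and &lt;code&gt;&amp;gt;&lt;/code&gt; matches one or more trailing tokens. A minimal matcher sketch (the subject names are invented for illustration):&lt;/p&gt;

```javascript
// Sketch of NATS subject matching: "*" matches exactly one token,
// ">" matches one or more trailing tokens (it must come last).
function subjectMatches(pattern, subject) {
  const p = pattern.split(".");
  const s = subject.split(".");
  for (let i = 0; i !== p.length; i++) {
    if (p[i] === ">") { return s.length > i; } // needs at least one more token
    if (i >= s.length) { return false; }
    if (p[i] === "*") { continue; }
    if (p[i] !== s[i]) { return false; }
  }
  return p.length === s.length;
}
```

&lt;p&gt;This subject scheme is the same in both modes; JetStream just adds persistence and acknowledgments on top of it.&lt;/p&gt;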




&lt;h2&gt;
  
  
  8. ActiveMQ
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;The King of Enterprise&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't have one big super-server. Instead, you have multiple independent brokers connected by a "bridge" in a &lt;strong&gt;Network of Brokers&lt;/strong&gt;: the servers are "neighbors," each one independent (unlike RabbitMQ's tightly coupled cluster). &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Max Payload:&lt;/strong&gt; Unlimited (Disk bound)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max Queue Capacity:&lt;/strong&gt; Hundreds of Millions (The king)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Strategy:&lt;/strong&gt; Disk Paging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Type:&lt;/strong&gt; Horizontal (Network)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why use "Network" instead of "Cluster"?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Geographic Distribution:&lt;/strong&gt; one broker in New York, one in London.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store-and-Forward logic:&lt;/strong&gt; This allows the system to scale horizontally across different physical locations and unreliable networks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt; Supports many protocols (JMS, AMQP, STOMP). Highly flexible for different programming languages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Cons:&lt;/strong&gt; Slower than newer, lightweight tools. Can feel complex to manage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛡️ Security (4/5):&lt;/strong&gt; Strong potential, but requires expert knowledge in setup because supporting many protocols means more ports to lock down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🎯 Use Case:&lt;/strong&gt; Corporate Java apps and integrating various systems that use a bunch of protocols.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Apache Kafka
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;The King of Throughput&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Massive data &amp;amp; event streaming capabilities.&lt;/strong&gt;&lt;br&gt;
Consumers can go offline for a weekend and Kafka won't blink (same as SQS); it keeps the data until retention expires or the disk is full.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Max Payload:&lt;/strong&gt; 1 MB (Default) / 100 MB+&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max Queue Size:&lt;/strong&gt; Petabytes (Log Retention)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Strategy:&lt;/strong&gt; Disk (Immutable Log)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability Type:&lt;/strong&gt; Native Horizontal&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt; Huge throughput. It is a "distributed log" system that saves everything, so you can "rewind" and re-read old data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Cons:&lt;/strong&gt; Complex and high learning curve. Requires significant infrastructure to run properly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛡️ Security (4/5):&lt;/strong&gt; Can be the most secure, but carries a high risk of human error (&lt;strong&gt;a lot of complex security settings to manage&lt;/strong&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🎯 Use Case:&lt;/strong&gt; Event streaming, Analytics (real-time), Logs aggregation, and fraud detection.&lt;/p&gt;
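
&lt;p&gt;Kafka's horizontal scaling rests on one idea: a keyed message always lands on the same partition, so events for one key stay ordered. A sketch of that routing (Kafka's default partitioner actually uses a murmur2 hash; a simple rolling hash stands in for it here):&lt;/p&gt;

```javascript
// Sketch: keyed messages map deterministically to a partition so that
// all events for one key stay ordered. (Kafka's default partitioner
// uses murmur2; a simple rolling hash stands in for it here.)
function pickPartition(key, numPartitions) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) % 1000000007;
  }
  return hash % numPartitions;
}

const p1 = pickPartition("user-42", 6);
const p2 = pickPartition("user-42", 6);
// p1 === p2: the same key always lands on the same partition
```

&lt;p&gt;Because ordering is only guaranteed per partition, choosing the key (user ID, order ID) is one of the most important design decisions in a Kafka topic.&lt;/p&gt;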




&lt;h3&gt;
  
  
  The Messaging &amp;amp; Queuing Decision Matrix
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Choose PostgreSQL&lt;/strong&gt; for tasks that require absolute data safety and guaranteed database transactions (ACID).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose MQTT&lt;/strong&gt; for IoT devices and mobile apps that need to stay connected over slow or unstable networks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose AWS SQS&lt;/strong&gt; for serverless projects that need an infinitely scalable queue without managing any servers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose RabbitMQ&lt;/strong&gt; for complex business logic that requires smart message routing and flexible delivery rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Redis Streams&lt;/strong&gt; for real-time AI chatbots and RAG systems that need ultra-fast, in-memory event history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose BullMQ&lt;/strong&gt; for Node.js AI pipelines that need to manage complex task states and automatic retries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose NATS&lt;/strong&gt; for ultra-fast microservices that require the lowest possible latency for instant communication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose ActiveMQ&lt;/strong&gt; for large enterprise systems that need to connect old Java apps using different messaging protocols.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Apache Kafka&lt;/strong&gt; for massive data streams where you need a permanent history of every event (immutable log).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🔗 Connect with me: &lt;a href="https://github.com/maximcoding" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, &lt;a href="https://linkedin.com/in/maxim-livshitz" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>fullstack</category>
      <category>programming</category>
      <category>devops</category>
    </item>
    <item>
      <title>From 70s Vectors to Modern AI Agents 🧩</title>
      <dc:creator>Maxim</dc:creator>
      <pubDate>Mon, 27 Apr 2026 13:05:52 +0000</pubDate>
      <link>https://dev.to/maximcoding/-pazl-slozhilsia-ot-viektorov-70-kh-k-sovriemiennym-ai-aghientam-7pb</link>
      <guid>https://dev.to/maximcoding/-pazl-slozhilsia-ot-viektorov-70-kh-k-sovriemiennym-ai-aghientam-7pb</guid>
      <description>&lt;p&gt;Я, как всегда, со своим скрупулезным подходом — решил залезть под капот и разобраться, как это всё работает на самом деле. А там... старый добрый &lt;strong&gt;pipeline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Идеи сходства (similarity) и поиска ближайших соседей (nearest neighbors) — это вообще не ново. Эти концепции десятилетиями использовались в поиске, computer vision, распознавании образов и исследовательских системах еще задолго до публичного AI-бума.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Новое здесь — только обертка:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vector DB:&lt;/strong&gt; Назвали их «новым» типом баз для массового рынка (Pinecone и другие), хотя по сути это просто эффективные индексы для векторов.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API и Cloud services:&lt;/strong&gt; Удобный способ доставки и масштабирования.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG-пайплайны и LLM:&lt;/strong&gt; Просто интерфейс для конечного пользователя.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Векторные представления (Embeddings) были стандартом еще в 70x! «Новизна» лишь в том, что теперь мы можем ворочать огромными объемами данных через API.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. What's Under the Hood? (The Pipeline) ⚙️
&lt;/h3&gt;

&lt;p&gt;When we write a prompt, that very pipeline kicks in:&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Tokenizer&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The text is sliced into chunks (tokens). A word or character becomes an ID. Under the hood it is a simple data structure (a &lt;strong&gt;Map&lt;/strong&gt;) plus a splitting algorithm.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In open source:&lt;/strong&gt; these are the &lt;code&gt;tokenizer.json&lt;/code&gt;, &lt;code&gt;vocab.json&lt;/code&gt;, and &lt;code&gt;merges.txt&lt;/code&gt; files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In a managed API:&lt;/strong&gt; it is hidden inside the provider's runtime.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userQuestion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;How does Vector DB work?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;// Процесс токенизации (упрощенно):&lt;/span&gt;
&lt;span class="c1"&gt;// input  -&amp;gt; ["How", " does", " Vector", " DB", " work", "?"]&lt;/span&gt;
&lt;span class="c1"&gt;// output -&amp;gt; [2437, 857, 12944, 6212, 990, 30]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An elementary mapping: each token gets its own unique ID.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Embedding Layer&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The tokens enter the model and are replaced by vectors (coordinates in a high-dimensional space):&lt;br&gt;
&lt;code&gt;2437 -&amp;gt; [0.12, -0.03, 0.88, ...]&lt;/code&gt;&lt;/p&gt;
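
&lt;p&gt;A minimal sketch of this lookup step (the table values and the tiny 3-dimensional vectors are made up; real models use thousands of dimensions):&lt;/p&gt;

```javascript
// Sketch of an embedding layer: a lookup table from token IDs to
// learned vectors. (Toy 3-dimensional vectors; real models use thousands.)
const embeddingTable = new Map([
  [2437, [0.12, -0.03, 0.88]],
  [12944, [0.54, 0.21, -0.4]],
]);

function embed(tokenIds) {
  return tokenIds.map((id) => embeddingTable.get(id));
}

const vectors = embed([2437, 12944]);
// vectors[0] is the learned vector for token 2437
```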
&lt;h4&gt;
  
  
  &lt;strong&gt;Attention (The Attention Mechanism)&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Here the model computes which tokens in the context are related to each other. In the question &lt;em&gt;"How does Vector DB work?"&lt;/em&gt;, the token &lt;strong&gt;"work"&lt;/strong&gt; has to "pay attention" to &lt;strong&gt;"Vector DB"&lt;/strong&gt; to grasp the meaning. It is almost like keyword search used to determine the intent of the question.&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;strong&gt;Matrix Multiplication&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The core math. The vectors are multiplied by the &lt;strong&gt;weights&lt;/strong&gt; of an already trained model. &lt;br&gt;
&lt;strong&gt;Weights&lt;/strong&gt; work like a weighted average: inputs with a larger weight have a stronger influence on the final result. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important to understand:&lt;/strong&gt; at query time the model is not learning. It simply runs the data through a fixed function:&lt;br&gt;
$$weights \times vectors$$&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;strong&gt;Next Token Prediction&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The output is a probability distribution over the next token. We pick one, append it to the context, and go around again (the &lt;strong&gt;Loop&lt;/strong&gt;):&lt;br&gt;
&lt;code&gt;Context -&amp;gt; Trained Function -&amp;gt; Prediction.&lt;/code&gt;&lt;br&gt;
Token by token, until the answer is assembled.&lt;/p&gt;
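
&lt;p&gt;The loop can be sketched like this (the &lt;code&gt;predictNext&lt;/code&gt; function is a made-up stand-in for the trained model, which really returns probabilities over the whole vocabulary):&lt;/p&gt;

```javascript
// Sketch of the generation loop: pick the most likely next token,
// append it to the context, repeat until done.
function predictNext(context) {
  // Hypothetical stand-in for the trained function: a real model returns
  // probabilities over the vocabulary; here a hard-coded continuation.
  const continuations = { "Vector": "DB", "DB": "works", "works": "." };
  const last = context[context.length - 1];
  return continuations[last] || ".";
}

function generate(context, maxTokens) {
  const out = context.slice();
  for (let i = 0; i !== maxTokens; i++) {
    const next = predictNext(out);
    out.push(next); // the new token becomes part of the next context
    if (next === ".") { break; }
  }
  return out;
}
```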

&lt;blockquote&gt;
&lt;p&gt;"Это мне как раз напомнило курс &lt;strong&gt;Andrew Ng&lt;/strong&gt;. Там параметры шли через функцию, мы считали ошибку (cost) и двигались к минимуму (loss). Только тут prediction — не class label, а следующий токен из словаря."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By the way, Andrew Ng's course came out &lt;strong&gt;12 years ago!&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;And that got me thinking: if these ideas were taught publicly that long ago, how many closed systems were using them even earlier? &lt;/p&gt;

&lt;p&gt;Air-defense systems, satellite reconnaissance, medical expert systems (like MYCIN from the 70s). &lt;br&gt;
All of it has been running on these principles for decades!&lt;/p&gt;


&lt;h3&gt;
  
  
  3. The "Memory" Problem and RAG 🧠
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;context window&lt;/strong&gt; is a hard limit on the number of tokens per request. &lt;br&gt;
The model has not "forgotten" the information; if there is too much data, the application simply trims the oldest part, &lt;strong&gt;FIFO&lt;/strong&gt;-style.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;countTokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;contextLimit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// Старый контекст просто не отправляется в новый запрос&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To use the model effectively, you need to "feed" it fresh data via &lt;strong&gt;RAG (Retrieval-Augmented Generation)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt; User question ⮕ Document search ⮕ Retrieved fragments + question ⮕ Language model ⮕ Answer.&lt;/p&gt;
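
&lt;p&gt;That flow can be sketched end to end. The keyword-overlap scoring and the documents below are invented for illustration; a real system would query a vector index and then call an LLM API with the assembled prompt.&lt;/p&gt;

```javascript
// Sketch of a minimal RAG step: retrieve the best-matching documents,
// then prepend them to the question as context for the model.
const documents = [
  "Redis Streams is an append-only log.",
  "RAG feeds retrieved facts into the prompt.",
];

function search(question, docs, topK) {
  const words = question.toLowerCase().split(" ");
  const scored = docs.map((doc) => {
    const text = doc.toLowerCase();
    let score = 0;
    for (const w of words) {
      if (text.includes(w)) { score += 1; } // naive keyword overlap
    }
    return { doc, score };
  });
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, topK).map((s) => s.doc);
}

function buildPrompt(question) {
  const context = search(question, documents, 1);
  return "Context:\n" + context.join("\n") + "\n\nQuestion: " + question;
}
```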

&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reduces "hallucinations":&lt;/strong&gt; The model leans on facts from the search results rather than on its own "memory".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Freshness:&lt;/strong&gt; RAG lets you use up-to-date data without costly retraining or hoping for magic.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4. Who Drives This Process? (Orchestration) 🎮
&lt;/h3&gt;

&lt;p&gt;The model does not do this itself. It does not decide which chat history to use or which tools to call. That is the job of the &lt;strong&gt;orchestration layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When a plain chatbot was no longer enough, things like &lt;strong&gt;LangGraph&lt;/strong&gt; (a &lt;strong&gt;State Machine&lt;/strong&gt;) appeared, operating around the model. The orchestrator builds a macro flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User request ⮕ Orchestration layer.&lt;/li&gt;
&lt;li&gt;Check state / memory.&lt;/li&gt;
&lt;li&gt;Determine the route.&lt;/li&gt;
&lt;li&gt;Search the Vector DB.&lt;/li&gt;
&lt;li&gt;Call tools.&lt;/li&gt;
&lt;li&gt;Assemble the final context ⮕ Call the LLM.&lt;/li&gt;
&lt;li&gt;Validate the answer ⮕ Finish or start a new cycle.&lt;/li&gt;
&lt;/ol&gt;
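
&lt;p&gt;The macro flow above can be sketched as a tiny state machine. The route names and checks are illustrative; a real orchestrator like LangGraph wires these steps together as graph nodes.&lt;/p&gt;

```javascript
// Sketch of an orchestration layer: route the request, gather context,
// call the model, validate, and loop if the answer fails validation.
function orchestrate(request, deps, maxLoops) {
  for (let attempt = 0; attempt !== maxLoops; attempt++) {
    // Steps 1-3: check state and pick a route
    const route = request.needsFacts ? "sql" : "vector";
    // Steps 4-5: fetch context through the chosen tool
    const context = deps.tools[route](request.question);
    // Step 6: assemble the final context and call the LLM
    const answer = deps.llm(context + " | " + request.question);
    // Step 7: validate, then finish or go around again
    if (deps.validate(answer)) { return { answer, attempts: attempt + 1 }; }
  }
  return { answer: null, attempts: maxLoops };
}
```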

&lt;p&gt;If the LLM is a probabilistic engine, then LangGraph is &lt;strong&gt;imperative control&lt;/strong&gt;. It says: &lt;em&gt;"First go to SQL for the facts; if that is not enough, check the Vector DB; verify the result; if it is garbage, go around the loop again."&lt;/em&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Architecture Instead of Magic 🏗️
&lt;/h3&gt;

&lt;p&gt;A modern AI agent is architecture. Calling it "intelligence" in the full sense of the word is a stretch, but the fact that we are all now living inside a statistical matrix is hard to deny!&lt;/p&gt;




</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>programming</category>
      <category>fullstack</category>
    </item>
    <item>
      <title>From 70s Vectors to Modern AI Agents</title>
      <dc:creator>Maxim</dc:creator>
      <pubDate>Mon, 27 Apr 2026 11:40:27 +0000</pubDate>
      <link>https://dev.to/maximcoding/the-puzzle-is-complete-from-70s-vectors-to-modern-ai-agents-nothing-new-under-the-sun-2ll1</link>
      <guid>https://dev.to/maximcoding/the-puzzle-is-complete-from-70s-vectors-to-modern-ai-agents-nothing-new-under-the-sun-2ll1</guid>
      <description>&lt;h1&gt;
  
  
  🧩 AI Agent Architecture: From 70s Vectors to Modern RAG Pipelines
&lt;/h1&gt;

&lt;h3&gt;
  
  
  1. Nothing New Under the Sun ☀️
&lt;/h3&gt;

&lt;p&gt;As usual, with my meticulous approach, I decided to peek under the hood and figure out how it all actually works. And there it was... the &lt;strong&gt;pipeline&lt;/strong&gt;! Ta-da!&lt;/p&gt;

&lt;p&gt;The ideas of similarity and nearest neighbor search are nothing new. These concepts have been used for decades in search, computer vision, pattern recognition, and research systems long before the public AI boom.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The only "new" thing is the wrapper:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vector DB:&lt;/strong&gt; Marketed as a "new" type of database (&lt;a href="https://www.reddit.com/r/vectordatabase/" rel="noopener noreferrer"&gt;Pinecone&lt;/a&gt; and others), though they are essentially just efficient indices for vectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API and Cloud services:&lt;/strong&gt; A convenient way to deliver and scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.reddit.com/r/Rag/rising/" rel="noopener noreferrer"&gt;RAG pipelines&lt;/a&gt; and LLMs:&lt;/strong&gt; Simply a high-level interface for the end user.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vector representations—or Embeddings—are essentially data mapped as spatial coordinates, much like X, Y, and Z in 3D space. This was already a standard in the 70s, known back then as Vector Space Models. The only real 'novelty' today? We’re doing it in thousands of dimensions and processing massive datasets via an API.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. What's Under the Hood? (The Pipeline) ⚙️
&lt;/h3&gt;

&lt;p&gt;When we write a prompt query, that same pipeline kicks in:&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;The Tokenizer&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Text is sliced into chunks (tokens). A word or symbol becomes an ID. Essentially, it’s a simple data structure (&lt;strong&gt;Map&lt;/strong&gt;) and a slicing algorithm.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In open-source:&lt;/strong&gt; these are &lt;code&gt;tokenizer.json&lt;/code&gt;, &lt;code&gt;vocab.json&lt;/code&gt;, and &lt;code&gt;merges.txt&lt;/code&gt; files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In managed APIs:&lt;/strong&gt; this is hidden inside the provider's runtime.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userQuestion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;How does Vector DB work?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;// Tokenization process (simplified):&lt;/span&gt;
&lt;span class="c1"&gt;// input  -&amp;gt; ["How", " does", " Vector", " DB", " work", "?"]&lt;/span&gt;
&lt;span class="c1"&gt;// output -&amp;gt; [2437, 857, 12944, 6212, 990, 30]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Elementary mapping: each token gets its own unique ID.&lt;/p&gt;
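&lt;p&gt;A toy sketch of that Map-plus-slicing idea; the vocabulary below is invented for illustration, and real tokenizers (BPE, WordPiece) split into subwords rather than whole words:&lt;/p&gt;

```javascript
// Toy word-level tokenizer (sketch). The core really is just a Map
// from piece to ID plus a slicing rule; the IDs here are made up.
const vocab = new Map([
  ['How', 2437], [' does', 857], [' Vector', 12944],
  [' DB', 6212], [' work', 990], ['?', 30],
]);

function tokenize(text) {
  // Split into words (keeping the leading space) and punctuation
  const pieces = text.match(/\s?\w+|[^\w\s]/g) || [];
  return pieces.map((p) => (vocab.has(p) ? vocab.get(p) : -1)); // -1 = unknown
}
```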

&lt;h4&gt;
  
  
  &lt;strong&gt;Embedding Layer&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Tokens enter the model and are replaced by vectors (coordinates in a multi-dimensional space):&lt;br&gt;
&lt;code&gt;2437 -&amp;gt; [0.12, -0.03, 0.88, ...]&lt;/code&gt;&lt;/p&gt;
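&lt;p&gt;In other words, the embedding layer is just a lookup table, one row of numbers per token ID; the vector values below are made up:&lt;/p&gt;

```javascript
// Embedding lookup (sketch): token ID in, coordinate vector out.
// Real models use thousands of dimensions; three are enough to show the idea.
const embeddings = {
  2437: [0.12, -0.03, 0.88],
  857:  [0.40,  0.11, -0.25],
};

const embed = (tokenIds) => tokenIds.map((id) => embeddings[id]);
```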
&lt;h4&gt;
  
  
  &lt;strong&gt;Attention Mechanism&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Here, the model calculates which tokens in the context are related to each other and how strongly. In the question &lt;em&gt;"How does Vector DB work?"&lt;/em&gt;, the token &lt;strong&gt;"work"&lt;/strong&gt; must "attentively" look at &lt;strong&gt;"Vector DB"&lt;/strong&gt; to understand the meaning. It’s almost like a keyword search for the intent of the question.&lt;/p&gt;
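&lt;p&gt;A simplified sketch of the underlying math: dot products between one query vector and the key vectors, pushed through a softmax. Real attention adds learned Q/K/V projections and many heads:&lt;/p&gt;

```javascript
// Scaled dot-product attention weights for ONE query vector (sketch):
// "how strongly does this token look at each other token".
function attentionWeights(query, keys) {
  const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
  const scale = Math.sqrt(query.length);
  const scores = keys.map((k) => dot(query, k) / scale);
  // Softmax turns raw scores into weights that sum to 1
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}
```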
&lt;h4&gt;
  
  
  &lt;strong&gt;Matrix Multiplication&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The fundamental math. Vectors are multiplied by the &lt;strong&gt;weights&lt;/strong&gt; of the pre-trained model. &lt;br&gt;
&lt;strong&gt;Weights&lt;/strong&gt; act like a weighted average. Input data with a higher weight influences the final result more strongly. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It’s important to understand:&lt;/strong&gt; the model does not learn at the moment of the request. It simply runs the data through a pre-set function:&lt;br&gt;
$$weights \times vectors \rightarrow \text{probabilities}$$&lt;/p&gt;
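&lt;p&gt;That pre-set function, stripped to its core, is a matrix-vector multiply followed by a softmax; the numbers are made up, and nothing is learned here:&lt;/p&gt;

```javascript
// weights x vector (sketch): one matrix-vector multiply, the workhorse
// of the forward pass.
function matVec(weights, vector) {
  return weights.map((row) =>
    row.reduce((sum, w, i) => sum + w * vector[i], 0)
  );
}

// Logits become probabilities via softmax
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}
```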
&lt;h4&gt;
  
  
  &lt;strong&gt;Next Token Prediction&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;On the output, we get probabilities for the next token. We choose one, add it to the context, and start a new cycle (&lt;strong&gt;Loop&lt;/strong&gt;):&lt;br&gt;
&lt;code&gt;Context -&amp;gt; Trained Function -&amp;gt; Prediction.&lt;/code&gt;&lt;br&gt;
And so on, token by token, until the answer is complete.&lt;/p&gt;
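&lt;p&gt;The loop itself is tiny; &lt;code&gt;predictNext&lt;/code&gt; below is a hypothetical stand-in for the trained function:&lt;/p&gt;

```javascript
// Autoregressive loop (sketch): pick a token, append it, repeat.
function generate(promptTokens, predictNext, maxNewTokens = 8, stopToken = 0) {
  const context = [...promptTokens];
  for (let i = 0; maxNewTokens > i; i++) {
    const next = predictNext(context); // Context -> Trained Function -> Prediction
    context.push(next);                // the prediction becomes new context
    if (next === stopToken) break;     // until the answer is complete
  }
  return context;
}
```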

&lt;blockquote&gt;
&lt;p&gt;"This reminded me exactly of &lt;strong&gt;Andrew Ng's&lt;/strong&gt; Machine Learning course on &lt;strong&gt;&lt;a href="https://www.coursera.org/specializations/deep-learning?utm_medium=sem&amp;amp;utm_source=gg&amp;amp;utm_campaign=b2c_emea_deep-learning_deeplearning-ai_ftcof_specializations_cx_dr_bau_gg_sem_pr_s1_en_m_hyb_23-12_x&amp;amp;campaignid=20858198821&amp;amp;adgroupid=156245836989&amp;amp;device=c&amp;amp;keyword=andrew%20ng%20deep%20learning&amp;amp;matchtype=p&amp;amp;network=g&amp;amp;devicemodel=&amp;amp;creativeid=684249171964&amp;amp;assetgroupid=&amp;amp;targetid=kwd-337350462138&amp;amp;extensionid=&amp;amp;placement=&amp;amp;gad_source=1&amp;amp;gad_campaignid=20858198821&amp;amp;gbraid=0AAAAADdKX6aIWHSLhYTbjp5gOFtWHdcEw&amp;amp;gclid=Cj0KCQjwkrzPBhCqARIsAJN460mKbKu2tQSWbj-FyUDXxMNRzZ_VLz0bLA4ySmVy1VHRPOEsbBwjuooaAh5REALw_wcB" rel="noopener noreferrer"&gt;Coursera&lt;/a&gt;&lt;/strong&gt;. Parameters went through a function, we calculated the error (cost), and moved toward the minimum. Only here, the prediction is not a class label, but the next token from the vocabulary."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By the way, Andrew Ng's course was released &lt;strong&gt;over 12 years ago!&lt;/strong&gt; And then it really made me think: if these ideas were being taught publicly so long ago, how many closed systems were using them even earlier? &lt;/p&gt;

&lt;p&gt;Air defense systems, satellite intelligence, medical expert systems (like &lt;strong&gt;MYCIN&lt;/strong&gt; from the 70s). All of this has been living on these principles for decades!&lt;/p&gt;


&lt;h3&gt;
  
  
  3. The "Memory" Problem and RAG 🧠
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Context window&lt;/strong&gt; is a hard limit on the number of tokens per request. &lt;br&gt;
The model hasn't "forgotten" information—it's just that if there is too much data, the application cuts off the old stuff based on the &lt;strong&gt;FIFO&lt;/strong&gt; (First In, First Out) principle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;countTokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;contextLimit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;// Old context is simply dropped from the request&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To use the model effectively, you need to "feed" it fresh data using the &lt;strong&gt;&lt;a href="https://www.reddit.com/r/ClaudeCode/comments/1s2kdl9/claude_suddenly_eating_up_your_usage_here_is_what/" rel="noopener noreferrer"&gt;RAG (Retrieval-Augmented Generation)&lt;/a&gt;&lt;/strong&gt; method.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt; User question ⮕ Document search ⮕ Found fragments + Question ⮕ Language model ⮕ Answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reduces "hallucinations":&lt;/strong&gt; The model relies on facts from the search rather than its own internal "memory."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relevance:&lt;/strong&gt; RAG allows the model to use fresh data without expensive retraining or hoping for magic.&lt;/li&gt;
&lt;/ul&gt;
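&lt;p&gt;The "document search" step usually means nearest-neighbor search over embeddings. A brute-force sketch (real Vector DBs do the same thing with smarter indices such as HNSW or IVF):&lt;/p&gt;

```javascript
// Nearest-neighbor retrieval by cosine similarity (sketch).
function cosine(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Score every document against the query vector, keep the best k
function topK(queryVec, docs, k = 2) {
  return docs
    .map((d) => ({ text: d.text, score: cosine(queryVec, d.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```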




&lt;h3&gt;
  
  
  4. Who Runs the Process? (Orchestration) 🎮
&lt;/h3&gt;

&lt;p&gt;The model doesn't manage the flow itself. It doesn't decide which chat history to keep or which tools to call. This is handled by the &lt;strong&gt;orchestration layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When a simple chatbot isn't enough, we use tools like &lt;a href="https://www.reddit.com/r/LangGraph" rel="noopener noreferrer"&gt;&lt;strong&gt;LangGraph&lt;/strong&gt;&lt;/a&gt;—a &lt;strong&gt;State Machine&lt;/strong&gt; that works around the model. The orchestrator creates the macro-flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User query ⮕ Orchestration layer.&lt;/li&gt;
&lt;li&gt;Check state / memory.&lt;/li&gt;
&lt;li&gt;Determine route.&lt;/li&gt;
&lt;li&gt;Search Vector DB.&lt;/li&gt;
&lt;li&gt;Call tools.&lt;/li&gt;
&lt;li&gt;Build final context ⮕ Call LLM.&lt;/li&gt;
&lt;li&gt;Validate / post-process answer.&lt;/li&gt;
&lt;li&gt;Finish or loop again.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is a great reminder of the difference between &lt;a href="https://www.reddit.com/r/programming/comments/5lteo1/imperative_vs_declarative_programming/" rel="noopener noreferrer"&gt;imperative and declarative programming&lt;/a&gt; approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Declarative (The LLM)&lt;/strong&gt;: You describe what you want (the goal), and the model predicts the path.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Imperative (The Orchestrator)&lt;/strong&gt;: You explicitly define how the agent must behave ("If SQL fails, try Vector DB; if that fails, ask for human help").&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The orchestrator says: &lt;em&gt;"First, go to SQL for facts; if that's not enough, look into the Vector DB; check the result; if it's junk, go back for another cycle."&lt;/em&gt;&lt;/p&gt;
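&lt;p&gt;That imperative routing rule can be sketched directly; every step function here is a hypothetical placeholder:&lt;/p&gt;

```javascript
// Imperative routing (sketch): the orchestrator, not the model,
// decides the path.
async function route(query, steps) {
  const facts = await steps.sqlLookup(query);
  if (facts.length > 0) return steps.callLlm(query, facts);
  const chunks = await steps.vectorSearch(query);
  if (chunks.length > 0) return steps.callLlm(query, chunks);
  return steps.askHuman(query); // nothing found: escalate
}
```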




&lt;h3&gt;
  
  
  Architecture Instead of Magic 🏗️
&lt;/h3&gt;

&lt;p&gt;A modern AI agent is architecture. Calling it "intelligence" in the full sense of the word is strange, but one thing is certain: we are all now "vibe-coders" living in a &lt;strong&gt;statistical matrix&lt;/strong&gt;!&lt;/p&gt;


</description>
      <category>fullstack</category>
      <category>aiengineer</category>
      <category>softwareengineering</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The 2026 React Native Blueprint</title>
      <dc:creator>Maxim</dc:creator>
      <pubDate>Sat, 28 Mar 2026 16:50:48 +0000</pubDate>
      <link>https://dev.to/maximcoding/the-2026-react-native-blueprint-3765</link>
      <guid>https://dev.to/maximcoding/the-2026-react-native-blueprint-3765</guid>
      <description>&lt;p&gt;I didn’t write this post to chase trends. I wrote it to make sense of the noise, especially all the AI hype, and get back to what actually helps. Boilerplates still matter. They just need to be built with more care for the realities of modern development.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;we do need boilerplates&lt;/li&gt;
&lt;li&gt;but we need better ones&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At first, it felt fine. It helped us move quickly, and early on, that was exactly what we needed.&lt;/p&gt;

&lt;p&gt;As data grew exponentially, driven by IoT (thousands of events, heavier logging, deeper syncing, more real-world complexity), that early convenience started to show its limits.&lt;br&gt;
The app felt sluggish, and dependency upgrades became something to approach carefully rather than casually.&lt;/p&gt;

&lt;p&gt;Around that same time, I had a couple of months to step back and rethink what I wanted from a modern mobile stack. I explored newer tools, like &lt;strong&gt;Biome&lt;/strong&gt; for a more unified code-quality setup. And because my access to the database and cloud code was limited, I had to be more creative — which is what pushed me toward &lt;strong&gt;TanStack Query&lt;/strong&gt; and its built-in caching, background refetching, persistence support, and offline-friendly patterns.&lt;/p&gt;

&lt;p&gt;Eventually, I came to a simple conclusion:&lt;/p&gt;

&lt;p&gt;I didn’t want to keep extending an old foundation.&lt;br&gt;&lt;br&gt;
I wanted to build one that reflected how I want to develop now.&lt;/p&gt;

&lt;p&gt;That’s why I built &lt;a href="https://github.com/maximcoding/react-native-bare-starter" rel="noopener noreferrer"&gt;react-native-bare-starter&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Storage Shift: MMKV over AsyncStorage
&lt;/h2&gt;

&lt;p&gt;Storage is one of those things users never think about — until they feel it.&lt;/p&gt;

&lt;p&gt;If saving a preference causes a small pause, or if reading persisted data slows down the experience, the app starts to feel less polished than it should.&lt;/p&gt;

&lt;p&gt;That’s why I moved to &lt;strong&gt;MMKV&lt;/strong&gt; as it's widely known for being significantly faster than AsyncStorage, and more importantly, it feels better in the places that matter: startup time, persistence, reads, and those tiny moments where responsiveness shapes trust.&lt;/p&gt;
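&lt;p&gt;For a feel of the API shape, here is an in-memory stand-in with the same synchronous surface; the real &lt;code&gt;react-native-mmkv&lt;/code&gt; package exposes a similar &lt;code&gt;set&lt;/code&gt;/&lt;code&gt;getString&lt;/code&gt; API on an &lt;code&gt;MMKV&lt;/code&gt; instance, backed by native storage:&lt;/p&gt;

```javascript
// In-memory stand-in (sketch) for the synchronous key-value surface
// that react-native-mmkv exposes. No async/await: reads and writes
// return immediately, which is why startup and persistence feel fast.
class MMKVLike {
  constructor() { this.store = new Map(); }
  set(key, value) { this.store.set(key, String(value)); }
  getString(key) { return this.store.has(key) ? this.store.get(key) : undefined; }
  delete(key) { this.store.delete(key); }
}
```

&lt;p&gt;With the real package the call sites read the same: create one storage instance and call &lt;code&gt;set&lt;/code&gt; / &lt;code&gt;getString&lt;/code&gt; directly, with no &lt;code&gt;await&lt;/code&gt; in sight.&lt;/p&gt;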

&lt;p&gt;I also wasn’t trying to build around the most experimental path possible. I chose a version that works well, with a setup that feels practical and stable.&lt;/p&gt;

&lt;p&gt;Because a starter should not be a demo. &lt;br&gt;
It should be a dependable base.&lt;/p&gt;




&lt;h2&gt;
  
  
  Agnostic by Design, Flexible on Purpose
&lt;/h2&gt;

&lt;p&gt;Most apps begin with one backend, but very few stay that simple forever.&lt;/p&gt;

&lt;p&gt;Maybe the first version uses &lt;strong&gt;Firebase&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Maybe the company already has a &lt;strong&gt;REST API&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Maybe the team prefers &lt;strong&gt;GraphQL&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Over time, needs change. Teams evolve. Infrastructure shifts.&lt;/p&gt;

&lt;p&gt;And when that happens, I don’t want the UI layer to absorb the cost.&lt;/p&gt;

&lt;p&gt;So this starter is built around an &lt;strong&gt;agnostic transport layer&lt;/strong&gt;. The goal is straightforward: feature logic should not care where data comes from.&lt;/p&gt;

&lt;p&gt;By using adapters, the app stays flexible without becoming chaotic.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Decoupled logic&lt;/strong&gt; keeps features independent from the underlying transport.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Swappable adapters&lt;/strong&gt; make it easier to move between Firebase, REST, GraphQL, WebSockets, or mock services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimal UI impact&lt;/strong&gt; means backend changes don’t force a rewrite of every screen.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a way to build for the backend you have today without overcommitting the product to it forever.&lt;/p&gt;
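&lt;p&gt;A minimal sketch of what such an adapter can look like; the adapter names and the &lt;code&gt;/users&lt;/code&gt; endpoint are illustrative, not the starter's actual code:&lt;/p&gt;

```javascript
// Transport adapter sketch: feature code depends on this shape only.
const restAdapter = (baseUrl, fetchFn) => ({
  async getUser(id) {
    const res = await fetchFn(`${baseUrl}/users/${id}`);
    return res.json();
  },
});

const mockAdapter = () => ({
  async getUser(id) {
    return { id, name: 'Test User' }; // no network needed in tests
  },
});

// Feature logic never knows which transport sits behind the adapter
async function loadUserName(api, id) {
  const user = await api.getUser(id);
  return user.name;
}
```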




&lt;h2&gt;
  
  
  Server State and Offline-First Thinking
&lt;/h2&gt;

&lt;p&gt;Offline support is easy to postpone when everything works nicely on stable Wi-Fi.&lt;/p&gt;

&lt;p&gt;But mobile apps don’t live in perfect conditions. People use them in elevators, on trains, in parking garages, and in places where connectivity is inconsistent at best.&lt;/p&gt;

&lt;p&gt;So in this starter, &lt;strong&gt;offline-first thinking is part of the foundation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;By combining &lt;strong&gt;TanStack Query&lt;/strong&gt;, &lt;strong&gt;MMKV persistence&lt;/strong&gt;, and &lt;strong&gt;queue/replay logic&lt;/strong&gt;, the app is designed to handle temporary network loss with more grace.&lt;/p&gt;

&lt;p&gt;That matters because reliability on mobile is not just about features.&lt;br&gt;&lt;br&gt;
It’s about behavior when conditions are less than ideal.&lt;/p&gt;
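&lt;p&gt;The queue/replay idea, reduced to a sketch; &lt;code&gt;send&lt;/code&gt; is a hypothetical transport call, and this ignores the persistence and retry backoff a real implementation needs:&lt;/p&gt;

```javascript
// Queue/replay sketch: failed mutations wait in a queue and are
// replayed when connectivity returns.
function createOfflineQueue(send) {
  const pending = [];
  return {
    async mutate(action) {
      try {
        return await send(action);
      } catch (err) {
        pending.push(action); // offline: park it instead of losing it
        return { queued: true };
      }
    },
    async replay() {
      while (pending.length > 0) {
        await send(pending.shift()); // oldest first
      }
    },
    size: () => pending.length,
  };
}
```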




&lt;h2&gt;
  
  
  Permission Mapping, So You Spend Less Time Hunting Docs (A Gift file)
&lt;/h2&gt;

&lt;p&gt;I also included a pre-mapped catalog of native permissions for Bare React Native.&lt;/p&gt;

&lt;p&gt;It’s a small addition, but it saves real time, especially when switching contexts between JavaScript and native configuration. Check the &lt;a href="https://github.com/maximcoding/react-native-bare-starter/blob/master/docs/permissions-bare-rn.md" rel="noopener noreferrer"&gt;repo doc&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Under the Hood (v1.0.2)
&lt;/h2&gt;

&lt;p&gt;This starter currently includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bare React Native 0.82.1&lt;/strong&gt; — full native control, with no managed layer in between&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;React Navigation 7.x&lt;/strong&gt; — stacks, tabs, and modals already wired&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zustand 5.x&lt;/strong&gt; — lightweight global state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MMKV Storage&lt;/strong&gt; — &lt;code&gt;react-native-mmkv&lt;/code&gt; via Nitro Modules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;🔌 Pluggable transport&lt;/strong&gt; — adapters for REST, GraphQL, WebSocket, and Firebase, so you can change backend strategy without rewiring the app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SVG automation&lt;/strong&gt; — scripted icon generation with &lt;code&gt;npm run gen:icons&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Biome 2.x&lt;/strong&gt; — a single source of truth for formatting and linting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Theming&lt;/strong&gt; — system / light / dark&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;i18next 25.x&lt;/strong&gt; — typed translations with a custom &lt;code&gt;useT()&lt;/code&gt; hook&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BootSplash 6.x&lt;/strong&gt; — native splash screen setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As I mentioned above: the goal wasn’t to chase trends.&lt;/p&gt;

&lt;p&gt;It was to assemble a stack that feels durable, understandable, and easier to maintain as the app becomes more complex.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get the Code
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/maximcoding/react-native-bare-starter" rel="noopener noreferrer"&gt;&lt;strong&gt;maximcoding/react-native-bare-starter&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This starter is still being pressure-tested in real work, and I’m planning to carry it into my next few ideas as well. If you’d like to jump in, contribute, and help shape where it goes next, you’re more than welcome.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7kwgjx89uhoumifjt6r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp7kwgjx89uhoumifjt6r.png" alt=" " width="800" height="1739"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>reactnative</category>
    </item>
  </channel>
</rss>
