<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Josiah Liciaga-Silva</title>
    <description>The latest articles on DEV Community by Josiah Liciaga-Silva (@jliciagasilva).</description>
    <link>https://dev.to/jliciagasilva</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F597604%2F274437b7-4c08-4286-a7eb-a6a8078ce054.jpg</url>
      <title>DEV Community: Josiah Liciaga-Silva</title>
      <link>https://dev.to/jliciagasilva</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jliciagasilva"/>
    <language>en</language>
    <item>
      <title>Differential Transformers Explained</title>
      <dc:creator>Josiah Liciaga-Silva</dc:creator>
      <pubDate>Tue, 15 Oct 2024 15:12:48 +0000</pubDate>
      <link>https://dev.to/jliciagasilva/differential-transformers-explained-3h2j</link>
      <guid>https://dev.to/jliciagasilva/differential-transformers-explained-3h2j</guid>
      <description>&lt;h2&gt;
  
  
  The Basics
&lt;/h2&gt;

&lt;p&gt;Before diving into the new Differential Transformer, let's go over how a traditional Transformer works. At its core, Transformers use an attention mechanism to allow a model to focus on specific parts of an input sequence. This attention is computed using a softmax function:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V&lt;/code&gt;&lt;/p&gt;


&lt;p&gt;Where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Q&lt;/code&gt; is the query matrix&lt;/li&gt;
&lt;li&gt;&lt;code&gt;K&lt;/code&gt; is the key matrix&lt;/li&gt;
&lt;li&gt;&lt;code&gt;V&lt;/code&gt; is the value matrix&lt;/li&gt;
&lt;li&gt;&lt;code&gt;d_k&lt;/code&gt; is the dimensionality of the key vectors&lt;/li&gt;
&lt;/ul&gt;
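&lt;p&gt;As a sanity check, the formula above can be sketched in a few lines of NumPy (a toy sketch; the shapes are illustrative):&lt;/p&gt;

```python
import numpy as np

def softmax(x):
    # subtract the row max for numerical stability
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how relevant each key is to each query
    weights = softmax(scores)        # each row sums to 1
    return weights @ V               # weighted average of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```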

&lt;p&gt;This mechanism assigns weights to input tokens based on their relevance. Despite its success, it is easily distracted: the standard softmax tends to over-allocate attention to irrelevant parts of the context. In long sequences the model focuses too broadly, which hurts learning efficiency and degrades in-context learning.&lt;/p&gt;

&lt;p&gt;The Differential Transformer addresses these challenges by introducing a new mechanism. Instead of relying on a single attention map, it calculates two distinct attention maps:&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;Adiff=softmax(A1)−softmax(A2)A_{diff} = softmax(A_1) - softmax(A_2)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;A&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;d&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;ff&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;so&lt;/span&gt;&lt;span class="mord mathnormal"&gt;f&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ma&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;A&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span 
class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;so&lt;/span&gt;&lt;span class="mord mathnormal"&gt;f&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mord mathnormal"&gt;ma&lt;/span&gt;&lt;span class="mord mathnormal"&gt;x&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;A&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;Yes, that's right, it's that simple. (In the paper, the queries and keys are split into two groups to produce the two maps, and the second map is scaled by a learnable scalar λ before the subtraction.) Subtracting the two maps cancels their common noise, promoting sparser and more focused attention. In turn, this prevents over-allocating attention to irrelevant tokens and lets the model better manage long sequences and complex in-context learning scenarios.&lt;/p&gt;

&lt;p&gt;Key Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sparse Attention Patterns: By reducing redundant attention, the model can better focus on critical parts of the input sequence.&lt;/li&gt;
&lt;li&gt;Improved Long-Context Modeling: Differential Attention allows the model to handle longer contexts more effectively, improving tasks like document summarization and question answering.&lt;/li&gt;
&lt;li&gt;In-Context Learning: The differential attention mechanism dynamically adapts based on the input context, enhancing the model's ability to learn from examples within the input.&lt;/li&gt;
&lt;li&gt;Hallucination Mitigation: In generation tasks, the DIFF Transformer reduces hallucinations by focusing more accurately on relevant context, leading to more coherent outputs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The DIFF Transformer has broad applications, particularly in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Handling long texts while focusing on the core information in Text Summarization tasks&lt;/li&gt;
&lt;li&gt;Improved performance in QA systems which require nuanced understanding of context&lt;/li&gt;
&lt;li&gt;Robust Generation, mitigating hallucinations in current models (GPT, Claude, Llama, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;p&gt;Start by modifying the attention mechanism within a Transformer architecture. Instead of computing attention with a single softmax, compute two separate attention maps and subtract them to form a differential attention map.&lt;/p&gt;

&lt;p&gt;Here's a high-level sketch in Python:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;diff_attention&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;     
    &lt;span class="n"&gt;A1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_k&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# first attention map    
&lt;/span&gt;    &lt;span class="n"&gt;A2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_k&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# second attention map    
&lt;/span&gt;    &lt;span class="n"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;A1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;A2&lt;/span&gt;  &lt;span class="c1"&gt;# differential attention    
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach allows you to integrate differential attention into any Transformer-based architecture.&lt;/p&gt;
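&lt;p&gt;For experimentation, here's a self-contained runnable version (a sketch assuming NumPy; the shapes and the fixed &lt;code&gt;lam&lt;/code&gt; value stand in for the paper's learnable λ):&lt;/p&gt;

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def diff_attention(Q1, K1, Q2, K2, V, lam=0.8):
    d_k = K1.shape[-1]
    A1 = softmax(Q1 @ K1.T / np.sqrt(d_k))  # first attention map
    A2 = softmax(Q2 @ K2.T / np.sqrt(d_k))  # second attention map
    return (A1 - lam * A2) @ V              # differential attention

rng = np.random.default_rng(1)
Q1, K1, Q2, K2, V = (rng.standard_normal((4, 8)) for _ in range(5))
print(diff_attention(Q1, K1, Q2, K2, V).shape)  # (4, 8)
```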

&lt;h2&gt;
  
  
  That's a Wrap
&lt;/h2&gt;

&lt;p&gt;DIFF Transformers are a significant leap forward. By refining the attention mechanism, they address key weaknesses we all encounter in the traditional Transformer architecture, leading to more efficient, focused, and context-aware models. Implementing these ideas can enhance the performance of large-scale language models in your applications, from NLP tasks to GenAI.&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2410.05258" rel="noopener noreferrer"&gt;ARXIV - Differential Transformer&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>learning</category>
    </item>
    <item>
      <title>The Basics of Event-Driven Architecture</title>
      <dc:creator>Josiah Liciaga-Silva</dc:creator>
      <pubDate>Thu, 15 Aug 2024 23:46:28 +0000</pubDate>
      <link>https://dev.to/jliciagasilva/the-basics-of-event-driven-architecture-5bf7</link>
      <guid>https://dev.to/jliciagasilva/the-basics-of-event-driven-architecture-5bf7</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;As a company evolves, its back-end system often grows into many small services, called microservices. Keeping them all working together smoothly is a real challenge, and there are many ways to tackle it. One common approach is the fabled Event-Driven Architecture (EDA). Large companies like LinkedIn, Uber, and Amazon use this technique to build near real-time services, providing the smooth user experience we all enjoy today. It's time to FAFO. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is Event-Driven Architecture?
&lt;/h2&gt;

&lt;p&gt;At its core, EDA is a software design pattern. If you are a front-end developer, you already use it: this is exactly how the client works. A user presses a button, which creates an event, and another component of the client handles that event. In other words, it is a system that produces, detects, and consumes (reacts to) events. &lt;/p&gt;

&lt;p&gt;Let's break this down further.&lt;/p&gt;

&lt;p&gt;An event is created by a producer. That event is then transmitted through an event channel. One or more consumers receive the event and react to it. &lt;/p&gt;
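&lt;p&gt;That flow can be sketched in a few lines of Python (an in-process toy, with a plain queue standing in for the event channel):&lt;/p&gt;

```python
from queue import Queue

# a minimal in-process event channel: producer -> channel -> consumer
channel = Queue()

def producer():
    # the producer emits an event without knowing who will handle it
    channel.put({"type": "user_registered", "user_id": 42})

def consumer():
    event = channel.get()  # receive the next event from the channel
    return f"handled {event['type']}"

producer()
print(consumer())  # handled user_registered
```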

&lt;h2&gt;
  
  
  There are a few key components to this architecture
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Event:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;This is a notable occurrence or change in the state of a system. For example: a user registers, an order is placed, your Apple Watch records a sensor reading, the price of a stock fluctuates.&lt;/li&gt;
&lt;li&gt;This typically contains an event type, timestamp, and the relevant data associated with the event. &lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  The Event Producer:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;These are the components of our system that generate the events when something noteworthy occurs. &lt;/li&gt;
&lt;li&gt;These can be user actions, system processes, or external systems performing operations.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Event Broker / Channel:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;This acts as the intermediary for events. It is in charge of routing events between producers and consumers. &lt;/li&gt;
&lt;li&gt;They are often implemented as Message Queues or Event Streams.&lt;/li&gt;
&lt;li&gt;Apache Kafka, RabbitMQ, and Amazon SNS/SQS are a few of the technologies commonly used to fill this role. &lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Event Bus:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;This is a system that handles event distribution across different parts of our system. &lt;/li&gt;
&lt;li&gt;This is what allows us to decouple the Event Producers from the Event Consumers.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Event Consumer / Processor:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;These are the components (services) in our system that listen for and react to specific events. &lt;/li&gt;
&lt;li&gt;This is the action step for our event: a consumer can aggregate, filter, enrich, save, update, or destroy data, and generate new events. &lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Event Store:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;This is the persistent storage for our events. It is what enables us to replay events and reconstruct our system state. &lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  There are a few patterns in EDA
&lt;/h2&gt;

&lt;p&gt;In the wild, you will come across some of these...&lt;/p&gt;

&lt;h3&gt;
  
  
  The fabled Publish-Subscribe (Pub/Sub) Model
&lt;/h3&gt;

&lt;p&gt;Here, publishers emit events without any knowledge of subscribers; subscribers receive the events they're interested in and act on them. GraphQL subscriptions use this model effectively. &lt;/p&gt;
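&lt;p&gt;A toy version of the pattern (names here are illustrative): the bus keeps a list of handlers per event type, and publishers never touch subscribers directly.&lt;/p&gt;

```python
from collections import defaultdict

class EventBus:
    """Toy pub/sub bus: publishers don't know who subscribes."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # fan the event out to every registered handler
        for handler in self.subscribers[event_type]:
            handler(payload)

bus = EventBus()
received = []
bus.subscribe("order_placed", lambda p: received.append(p["order_id"]))
bus.publish("order_placed", {"order_id": 1})
print(received)  # [1]
```

In production, a broker like Kafka or RabbitMQ plays this role, adding the persistence and delivery guarantees the toy lacks.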

&lt;h3&gt;
  
  
  Event Sourcing
&lt;/h3&gt;

&lt;p&gt;This pattern stores the state of the application as a sequence of events. It allows us to replay events to reconstruct system state, and it gives us an audit trail of every change. &lt;/p&gt;
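&lt;p&gt;A minimal sketch of the idea: the event log is the source of truth, and state (here, an account balance) is recomputed by folding over it.&lt;/p&gt;

```python
# event sourcing: state is derived by replaying the event log
events = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrawn", "amount": 30},
    {"type": "deposited", "amount": 5},
]

def replay(events):
    balance = 0
    for e in events:
        if e["type"] == "deposited":
            balance += e["amount"]
        elif e["type"] == "withdrawn":
            balance -= e["amount"]
    return balance

print(replay(events))  # 75
```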

&lt;h3&gt;
  
  
  Command Query Responsibility Segregation (CQRS)
&lt;/h3&gt;

&lt;p&gt;This pattern separates read and write operations for a data store. It is often used in conjunction with Event Sourcing and allows us to improve scalability and performance for complex domains. &lt;/p&gt;

&lt;h3&gt;
  
  
  Event Stream Processing
&lt;/h3&gt;

&lt;p&gt;This pattern allows for continuous processing of event streams in real time. It is used in analytics, monitoring, and reactive systems. &lt;/p&gt;

&lt;h2&gt;
  
  
  There are advantages to using EDA as well as challenges to be overcome...
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Advantages
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;EDA allows us to scale horizontally by adding more Event Consumers. &lt;/li&gt;
&lt;li&gt;We benefit from Dependency Inversion &lt;/li&gt;
&lt;li&gt;EDA can handle high volumes of events concurrently&lt;/li&gt;
&lt;li&gt;It allows for loose coupling, as components interact through events, reducing direct dependencies 

&lt;ul&gt;
&lt;li&gt;This in turn makes it easier to modify, replace, or add new components to the system. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;New Event Consumers can be added without affecting existing ones.

&lt;ul&gt;
&lt;li&gt;This in turn helps the system evolve with changing business requirements with relative ease. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;EDA also enables real-time processing and reaction to events &lt;/li&gt;

&lt;li&gt;EDA also improves our system's ability to adapt to ever-changing conditions (see the loose-coupling point above for why)&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Challenges in EDA
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The propagation of events takes time, which leads to periods of temporary inconsistency, also known as Eventual Consistency. 

&lt;ul&gt;
&lt;li&gt;This adds real complexity: the system needs to be designed to handle out-of-order and duplicate events. &lt;/li&gt;
&lt;li&gt;It also requires additional mechanisms, like sequence numbers or idempotent consumers, to achieve robust processing semantics. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Changes to event structure must be managed carefully; you need to make sure events are forward- and backward-compatible&lt;/li&gt;

&lt;li&gt;Distributed systems make tracing and debugging much harder; you will need robust logging and monitoring to overcome this. &lt;/li&gt;

&lt;/ul&gt;
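&lt;p&gt;As a tiny illustration of the idempotent-consumer idea mentioned above, a consumer can track processed event IDs so redelivered events become no-ops:&lt;/p&gt;

```python
# idempotent consumer: remember processed event IDs so duplicates are no-ops
processed_ids = set()
total = 0

def handle(event):
    global total
    if event["id"] in processed_ids:
        return  # duplicate delivery: ignore it
    processed_ids.add(event["id"])
    total += event["amount"]

# event id 1 is delivered twice, but only counted once
for e in [{"id": 1, "amount": 10}, {"id": 1, "amount": 10}, {"id": 2, "amount": 5}]:
    handle(e)
print(total)  # 15
```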

&lt;h2&gt;
  
  
  Best Practices to follow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Events should be meaningful and atomic; include all the data consumers need: nothing more, nothing less. &lt;/li&gt;
&lt;li&gt;Implement Idempotent Consumers&lt;/li&gt;
&lt;li&gt;Use asynchronous processing, this much should be obvious: DECOUPLE event production from consumption. This improves responsiveness. &lt;/li&gt;
&lt;li&gt;Version each event, plan for backward and forward compatibility. &lt;/li&gt;
&lt;li&gt;Implement Proper Error Handling, design for failure scenarios and network issues, use circuit breakers and retries where appropriate. &lt;/li&gt;
&lt;/ul&gt;
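&lt;p&gt;To illustrate event versioning (the field names here are invented for the example), a consumer can upcast old event shapes before handling them:&lt;/p&gt;

```python
# versioned events: include a schema version so consumers can handle old shapes
def upcast(event):
    """Translate a v1 event to the current v2 shape (hypothetical fields)."""
    if event.get("version", 1) == 1:
        # v1 stored a single "name"; v2 splits it into first and last
        first, _, last = event["name"].partition(" ")
        return {"version": 2, "first_name": first, "last_name": last}
    return event

print(upcast({"version": 1, "name": "Ada Lovelace"}))
# {'version': 2, 'first_name': 'Ada', 'last_name': 'Lovelace'}
```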

&lt;h2&gt;
  
  
  That's a wrap....
&lt;/h2&gt;

&lt;p&gt;EDA allows us to create incredible experiences for our users. In the age of speed and efficiency, this architecture powers real-time applications ranging from Uber, to Spotify, to Battlefield 1 (the goat), to Netflix. Be aware of the challenges and plan ahead. Also, don't start your unicorn side-project with this architecture in place. You won't get far. &lt;/p&gt;

&lt;p&gt;Til next time!&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
