<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Artem Gurtovoi</title>
    <description>The latest articles on DEV Community by Artem Gurtovoi (@temich).</description>
    <link>https://dev.to/temich</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2708971%2Fbac20274-170c-45c5-897f-8098e913d9a2.jpeg</url>
      <title>DEV Community: Artem Gurtovoi</title>
      <link>https://dev.to/temich</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/temich"/>
    <language>en</language>
    <item>
      <title>Log Ordering in Distributed Systems</title>
      <dc:creator>Artem Gurtovoi</dc:creator>
      <pubDate>Tue, 25 Nov 2025 07:16:41 +0000</pubDate>
      <link>https://dev.to/temich/log-ordering-in-distributed-systems-3bg0</link>
      <guid>https://dev.to/temich/log-ordering-in-distributed-systems-3bg0</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Extend trace context with a Lamport clock scoped to an execution branch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem Definition
&lt;/h2&gt;

&lt;p&gt;Ordering log events produced across distributed systems is fundamentally constrained by the nature of independent physical clocks. Wall-clock timestamps cannot provide a reliable global sequence, because each machine maintains its own oscillator with unavoidable drift. Even under NTP or similar protocols, timestamp discrepancies accumulate continuously due to rate differences, network delays, and local scheduling effects.&lt;/p&gt;

&lt;p&gt;Any distributed operation introduces uncertainty in temporal order. Network latency, buffering, batching, and concurrency all contribute to the inability to determine whether two events across services occurred in a particular order when relying solely on physical time. The concept of temporal ordering is meaningful only within a single clock domain; across independent clocks, timestamps cannot be meaningfully compared.&lt;/p&gt;

&lt;p&gt;Attempts to merge logs by wall-clock timestamps therefore yield interleavings that may not reflect causality. The resulting timeline may hide the true execution structure of the system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations of Timestamp-Based Approaches
&lt;/h2&gt;

&lt;p&gt;Conventional log ingestion and aggregation systems typically sort events using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the timestamp recorded by the service, or
&lt;/li&gt;
&lt;li&gt;the timestamp of ingestion by the collector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both approaches suffer inherent limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clock drift is indistinguishable from communication delay.&lt;/strong&gt; No observation can determine whether a later timestamp originated from drift or from slow delivery.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timestamps encode no causal information.&lt;/strong&gt; An event with a greater timestamp may not depend on an event with a smaller one.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collectors alter event order.&lt;/strong&gt; Batching and transport buffering introduce additional nondeterministic reordering unrelated to causality.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sorting heuristics obscure underlying issues.&lt;/strong&gt; When tools reorder logs using timestamps, clock drift becomes invisible; without reordering, the sequence appears chaotic.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because globally correct ordering cannot be derived from wall-clock time, most platforms expose ordering policies to users, implicitly acknowledging the limitations of timestamp-based ordering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution: Logical Clock Propagation
&lt;/h2&gt;

&lt;p&gt;A more robust approach is to derive ordering from &lt;strong&gt;causality&lt;/strong&gt; rather than physical time. This is achieved by propagating two logical values—&lt;code&gt;branch&lt;/code&gt; and &lt;code&gt;sequence&lt;/code&gt;—along the execution path of each trace.&lt;/p&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2cclx6xr52hc1g1krht.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2cclx6xr52hc1g1krht.jpg" alt=" " width="800" height="803"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each request carries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;trace_id&lt;/code&gt; – uniquely identifies the execution.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;branch&lt;/code&gt; – a hierarchical identifier describing the request’s execution path through the system.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sequence&lt;/code&gt; – a monotonically increasing counter local to the current branch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every log entry produced within that request includes &lt;code&gt;(trace_id, branch, sequence)&lt;/code&gt;, allowing deterministic reconstruction of execution order.&lt;/p&gt;
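&lt;p&gt;A minimal Python sketch of such a context object (the class and method names are illustrative, not part of the article):&lt;/p&gt;

```python
import uuid

class LogContext:
    """Carries (trace_id, branch, sequence) along one execution branch."""

    def __init__(self, trace_id=None, branch="/", sequence=0):
        self.trace_id = trace_id or uuid.uuid4().hex
        self.branch = branch
        self.sequence = sequence

    def advance(self):
        # Increment the Lamport counter before each causally
        # significant step (service call, DB access, ...).
        self.sequence += 1

    def annotate(self, message):
        # Attach the ordering metadata to a log entry.
        return {
            "trace_id": self.trace_id,
            "branch": self.branch,
            "sequence": self.sequence,
            "message": message,
        }

ctx = LogContext(trace_id="abc123")
first = ctx.annotate("received request")
ctx.advance()
second = ctx.annotate("calling downstream service")
```

&lt;p&gt;In practice the three fields would travel with the normal trace context (e.g. as extra propagated headers) rather than as an in-process object.&lt;/p&gt;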

&lt;h3&gt;
  
  
  Sequence Increment
&lt;/h3&gt;

&lt;p&gt;The sequence value increases with each causally significant step performed along the same branch: service-to-service calls, internal operations, or downstream interactions such as database access.&lt;/p&gt;

&lt;p&gt;Events within a branch are totally ordered by the sequence field.&lt;/p&gt;

&lt;h3&gt;
  
  
  Branch Creation
&lt;/h3&gt;

&lt;p&gt;When execution diverges into parallel or independent subpaths, a new branch identifier is created by extending the current branch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;root branch: &lt;code&gt;/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;first parallel branch: &lt;code&gt;/0/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;second parallel branch: &lt;code&gt;/1/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each new branch initializes its own sequence counter starting at zero.&lt;/p&gt;

&lt;p&gt;Sibling branches represent concurrent execution and therefore remain unordered relative to each other.&lt;/p&gt;
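&lt;p&gt;Branch derivation can be sketched as a one-line helper (the &lt;code&gt;fork&lt;/code&gt; name is hypothetical; only the resulting identifiers are specified above):&lt;/p&gt;

```python
def fork(branch, child_index):
    # Extend the current branch with the child's ordinal;
    # each child then starts its own sequence counter at 0.
    return f"{branch}{child_index}/"

root = "/"
left = fork(root, 0)      # first parallel branch
right = fork(root, 1)     # second parallel branch
nested = fork(right, 2)   # descends from the second branch
```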

&lt;h2&gt;
  
  
  Observations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Causal Ordering
&lt;/h3&gt;

&lt;p&gt;The branch–sequence mechanism encodes &lt;strong&gt;happens-before&lt;/strong&gt; relations explicitly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Events within the same branch are totally ordered by &lt;code&gt;sequence&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;Branch prefixes encode ancestry (e.g., &lt;code&gt;/1/2/&lt;/code&gt; descends from &lt;code&gt;/1/&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;Sibling branches (e.g., &lt;code&gt;/0/&lt;/code&gt; and &lt;code&gt;/1/&lt;/code&gt;) represent concurrency and therefore remain unordered.&lt;/li&gt;
&lt;/ul&gt;
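&lt;p&gt;The three rules above can be sketched as a comparator (illustrative code, not from the article; as a simplification it treats any ancestor-branch event as earlier, regardless of where on the parent the fork occurred):&lt;/p&gt;

```python
def causal_order(a, b):
    # a and b are (branch, sequence) pairs. Returns -1 if a
    # happens-before b, 1 for the reverse, 0 if equal, and None
    # when the events are concurrent (sibling branches).
    branch_a, seq_a = a
    branch_b, seq_b = b
    if branch_a == branch_b:
        if seq_a == seq_b:
            return 0
        return -1 if seq_b > seq_a else 1
    if branch_b.startswith(branch_a):
        return -1   # a lies on an ancestor branch of b
    if branch_a.startswith(branch_b):
        return 1    # b lies on an ancestor branch of a
    return None     # siblings: no causal order exists
```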

&lt;h3&gt;
  
  
  Practical Properties
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Unaffected by clock drift, NTP adjustments, or physical-time inconsistencies.
&lt;/li&gt;
&lt;li&gt;Represents execution structure explicitly, enabling reliable causal reconstruction.
&lt;/li&gt;
&lt;li&gt;Requires minimal instrumentation: two small metadata fields propagated through normal trace context.
&lt;/li&gt;
&lt;li&gt;Suitable both for operational debugging and post-incident analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By deriving ordering from execution structure instead of timekeeping infrastructure, the branch–sequence method provides deterministic, causally accurate ordering within each distributed trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Considerations
&lt;/h2&gt;

&lt;p&gt;There are important considerations and trade-offs with using logical clocks for log ordering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Causality over chronology:&lt;/strong&gt; The solution intentionally prioritizes causal order (the happens-before relation) over interpretations based on each server’s local clock readings. In diagnostic scenarios, the causal chain of events provides more reliable information than the timestamps generated by individual machines. For instance, if an error in Service C originates from a failure in Service A, the logical ordering will correctly place A’s events before C’s, even if a chronological sort based on local timestamps is distorted by clock drift, transient I/O delays, or other timing uncertainties.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Not a global order:&lt;/strong&gt; This mechanism does not produce a single total order of all events across the entire system in real time – nor is that typically needed. Each trace (request) is ordered internally, but events from different traces remain incomparable except by wall-clock time. This is acceptable, because unrelated requests do not have a causal ordering between them anyway. Even within one trace, if two branches execute in parallel, their events are concurrent and any ordering between them is somewhat arbitrary.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://martinfowler.com/articles/patterns-of-distributed-systems/lamport-clock.html" rel="noopener noreferrer"&gt;Lamport Clock&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Logical_clock" rel="noopener noreferrer"&gt;Logical Clock&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>distributedsystems</category>
      <category>telemetry</category>
    </item>
    <item>
      <title>Distributed Peer Indexing</title>
      <dc:creator>Artem Gurtovoi</dc:creator>
      <pubDate>Wed, 19 Mar 2025 12:41:39 +0000</pubDate>
      <link>https://dev.to/temich/distributed-peer-indexing-5ckb</link>
      <guid>https://dev.to/temich/distributed-peer-indexing-5ckb</guid>
      <description>&lt;h2&gt;
  
  
  Problem
&lt;/h2&gt;

&lt;p&gt;The modulo partitioning algorithm &lt;code&gt;task.id % replicas == index&lt;/code&gt; requires knowing the number of task-processing instances running in the cluster and the index of the current instance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Forces
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Static configuration is not an option (due to dynamic scaling / failover).&lt;/li&gt;
&lt;li&gt;In a distributed system, there is no concept of a &lt;em&gt;global current time&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;An algorithm that emits &lt;code&gt;(index, replicas)&lt;/code&gt; once every &lt;code&gt;interval&lt;/code&gt; seconds, using a shared Redis key and an atomic increment.&lt;/p&gt;

&lt;p&gt;Define the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;name&lt;/code&gt;: the name of the task processor (e.g. &lt;code&gt;mail-sender&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;interval&lt;/code&gt;: indexing interval that is deliberately greater than the expected clock skew among instances&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At the start of each &lt;code&gt;interval&lt;/code&gt; in &lt;a href="https://en.wikipedia.org/wiki/Unix_time" rel="noopener noreferrer"&gt;Unix epoch&lt;/a&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calculate the ordinal number of the current interval: &lt;code&gt;number = ceil(now() / interval)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Compose a &lt;code&gt;key&lt;/code&gt; as &lt;code&gt;{name}:{number}&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Atomically increment a &lt;code&gt;key&lt;/code&gt; in Redis (&lt;a href="https://redis.io/docs/latest/commands/incr/" rel="noopener noreferrer"&gt;INCR&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;index&lt;/code&gt; is defined (set in step 5)

&lt;ul&gt;
&lt;li&gt;Get the value of the previous key &lt;code&gt;{name}:{number-1}&lt;/code&gt; as &lt;code&gt;replicas&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If the &lt;code&gt;replicas&lt;/code&gt; is defined, emit &lt;code&gt;(index, replicas)&lt;/code&gt; &lt;strong&gt;algorithm result&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Store the response from step 3 in &lt;code&gt;index&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
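&lt;p&gt;A runnable sketch of the loop, with a plain dictionary standing in for Redis (helper names and the zero-based index interpretation are assumptions, not from the article):&lt;/p&gt;

```python
import math
from collections import defaultdict

store = defaultdict(int)  # stands in for Redis; INCR is atomic there

def incr(key):
    store[key] += 1
    return store[key]

def tick(state, name, now, interval):
    # One iteration of the loop above for a single instance.
    # `state` holds this instance's previous INCR response as "index".
    number = math.ceil(now / interval)         # step 1
    key = f"{name}:{number}"                   # step 2
    result = None
    if state.get("index") is not None:         # step 4
        replicas = store.get(f"{name}:{number - 1}")
        if replicas is not None:
            # INCR responses are 1-based; emit a zero-based index
            # suitable for task.id % replicas == index.
            result = (state["index"] - 1, replicas)
    state["index"] = incr(key)                 # steps 3 and 5
    return result

a, b = {}, {}
tick(a, "mail-sender", now=10, interval=10)    # first interval: register only
tick(b, "mail-sender", now=10, interval=10)
res_a = tick(a, "mail-sender", now=20, interval=10)
res_b = tick(b, "mail-sender", now=20, interval=10)
```

&lt;p&gt;After the second interval, instance &lt;code&gt;a&lt;/code&gt; learns it is index 0 of 2 replicas and &lt;code&gt;b&lt;/code&gt; index 1 of 2, matching the caveat that the first result arrives only after the first full interval.&lt;/p&gt;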

&lt;h2&gt;
  
  
  Safe index transition
&lt;/h2&gt;

&lt;p&gt;If the &lt;code&gt;index&lt;/code&gt; or &lt;code&gt;replicas&lt;/code&gt; changes, the algorithm consumer must stop consuming new tasks and execute &lt;em&gt;safe index transition&lt;/em&gt;, to prevent task duplication or loss.&lt;/p&gt;

&lt;p&gt;The transition can be implemented in a manner similar to the described algorithm, using a dedicated &lt;code&gt;{name}:transition&lt;/code&gt; key. However, this process is considered outside the scope of this document.&lt;/p&gt;

&lt;h2&gt;
  
  
  Extension
&lt;/h2&gt;

&lt;p&gt;If the system clocks are synchronized too precisely (skew smaller than a round-trip time to Redis), the order in which instances reach Redis becomes effectively random in each interval, which may result in continuous index transitions.&lt;/p&gt;

&lt;p&gt;To mitigate this, the algorithm can be extended with a random delay:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Before starting the algorithm, define a random constant clock skew, significantly smaller than &lt;code&gt;interval&lt;/code&gt;: &lt;code&gt;skew = random() * (interval / 2)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Start the algorithm at each &lt;code&gt;interval + skew&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
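&lt;p&gt;A small sketch of the extension (function names are assumptions):&lt;/p&gt;

```python
import random

def pick_skew(interval):
    # Fixed once at instance startup (step 1); by construction
    # significantly smaller than the interval itself.
    return random.random() * (interval / 2)

def next_run(now, interval, skew):
    # Unix time at which this instance runs its next iteration (step 2).
    number = now // interval + 1
    return number * interval + skew
```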

&lt;h2&gt;
  
  
  Caveats
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The first result becomes known only after between &lt;code&gt;interval&lt;/code&gt; and &lt;code&gt;interval × 2&lt;/code&gt; seconds.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>distributedsystems</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Decentralized Request Throttling</title>
      <dc:creator>Artem Gurtovoi</dc:creator>
      <pubDate>Wed, 19 Mar 2025 12:30:21 +0000</pubDate>
      <link>https://dev.to/temich/decentralized-request-throttling-40b4</link>
      <guid>https://dev.to/temich/decentralized-request-throttling-40b4</guid>
      <description>&lt;h2&gt;
  
  
  Problem
&lt;/h2&gt;

&lt;p&gt;To prevent unnecessary load, API requests should be throttled when they are made maliciously or due to an error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;a href="https://microservices.io/patterns/apigateway.html" rel="noopener noreferrer"&gt;API Gateway&lt;/a&gt; of a distributed system runs in multiple instances, each with an in-memory state that does not allow tracking the current quota usage.&lt;/li&gt;
&lt;li&gt;Precise quota enforcement (per request, per second) is not critical; the goal is to prevent significant overuse.&lt;/li&gt;
&lt;li&gt;Quota configuration is static.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Forces
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;No per-request I/O is allowed (centralized solutions do not fit).&lt;/li&gt;
&lt;li&gt;In a distributed system, there is no concept of a &lt;em&gt;global current time&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Failure to retrieve the quota state should not result in Gateway failure.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;Implement in-memory quotas in each process, periodically synchronizing them asynchronously using Redis.&lt;/p&gt;

&lt;p&gt;Consider a basic throttling rule:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;No more than &lt;code&gt;MAX_REQUESTS&lt;/code&gt; within &lt;code&gt;INTERVAL&lt;/code&gt; time for any API route (&lt;code&gt;KEY&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;If the limit is exceeded, block requests to &lt;code&gt;KEY&lt;/code&gt; for &lt;code&gt;COOLDOWN&lt;/code&gt; seconds.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Concept
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Divide &lt;code&gt;INTERVAL&lt;/code&gt; into &lt;code&gt;N&lt;/code&gt; spans (&lt;code&gt;N &amp;gt;= 2&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;At the end of each span, atomically increment the value in Redis (&lt;a href="https://redis.io/docs/latest/commands/incrby/" rel="noopener noreferrer"&gt;INCRBY&lt;/a&gt;) by the number of requests received for each &lt;code&gt;KEY&lt;/code&gt;, using a key composed of the &lt;code&gt;KEY&lt;/code&gt; and the current ordinal number of &lt;code&gt;INTERVAL&lt;/code&gt;
in &lt;a href="https://en.wikipedia.org/wiki/Unix_time" rel="noopener noreferrer"&gt;Unix epoch&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;If the returned value exceeds &lt;code&gt;MAX_REQUESTS&lt;/code&gt;, block access to &lt;code&gt;KEY&lt;/code&gt;, remove after &lt;code&gt;COOLDOWN&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If the write operation fails and &lt;code&gt;REQUESTS &amp;gt; MAX_REQUESTS / N&lt;/code&gt; (i.e., the quota is exceeded locally), block access to &lt;code&gt;KEY&lt;/code&gt;, remove after &lt;code&gt;COOLDOWN&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
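&lt;p&gt;A runnable sketch of steps 2 and 3, with a dictionary standing in for Redis (names are assumptions; the write-failure fallback of step 4 and COOLDOWN expiry are not modeled):&lt;/p&gt;

```python
from collections import defaultdict

store = defaultdict(int)   # stands in for Redis; INCRBY is atomic there

MAX_REQUESTS = 100
INTERVAL = 60
N = 6                      # spans per INTERVAL

def incrby(key, amount):
    store[key] += amount
    return store[key]

def flush_span(span_counts, now, blocked):
    # End-of-span synchronization for one gateway instance.
    # `span_counts` maps each KEY to requests seen during the span;
    # `blocked` collects throttled KEYs (removed after COOLDOWN).
    number = now // INTERVAL                   # ordinal number of INTERVAL
    for key, requests in span_counts.items():
        total = incrby(f"{key}:{number}", requests)
        if total > MAX_REQUESTS:               # step 3: global quota exceeded
            blocked.add(key)
    span_counts.clear()

blocked = set()
flush_span({"GET /users": 60}, now=120, blocked=blocked)
flush_span({"GET /users": 50}, now=130, blocked=blocked)   # cumulative 110
```

&lt;p&gt;The second flush pushes the shared counter for the same &lt;code&gt;INTERVAL&lt;/code&gt; past &lt;code&gt;MAX_REQUESTS&lt;/code&gt;, so the route is blocked without any per-request I/O.&lt;/p&gt;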

&lt;p&gt;Each API Gateway instance will generate the &lt;code&gt;KEY&lt;/code&gt; based on its own local time, which, in general, will lead to simultaneous writes (from Redis’s local time perspective) to different &lt;code&gt;KEY&lt;/code&gt;s. However, the total contribution of each node to each &lt;code&gt;KEY&lt;/code&gt; will correspond to the actual request rate experienced by that node.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ewt591lb6atkddhe5x3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ewt591lb6atkddhe5x3.jpg" alt="Desynchronization" width="640" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dividing the &lt;code&gt;INTERVAL&lt;/code&gt; into spans smooths the desynchronization effect.&lt;/p&gt;

&lt;h3&gt;
  
  
  Extension
&lt;/h3&gt;

&lt;p&gt;Introduce &lt;code&gt;nodes&lt;/code&gt;, the approximate number of active API Gateway instances, to improve the algorithm.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Initially, &lt;code&gt;nodes&lt;/code&gt; equals &lt;code&gt;1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;During each &lt;code&gt;INTERVAL&lt;/code&gt;, sum the total request count for each &lt;code&gt;KEY&lt;/code&gt;, keeping both the current and the previous value.&lt;/li&gt;
&lt;li&gt;At the end of each interval:

&lt;ul&gt;
&lt;li&gt;Read the value from the &lt;code&gt;KEY&lt;/code&gt; for the previous interval. At this point, it is assumed that all nodes have switched to the next interval.&lt;/li&gt;
&lt;li&gt;Update the &lt;code&gt;nodes&lt;/code&gt; value by dividing the response from Redis by the number of &lt;code&gt;REQUESTS&lt;/code&gt; counted for the previous interval.&lt;/li&gt;
&lt;li&gt;Set the previous value to the current value.&lt;/li&gt;
&lt;li&gt;Set the current value to &lt;code&gt;0&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Update point 4 of the Concept: &lt;code&gt;REQUESTS * nodes &amp;gt; MAX_REQUESTS / N&lt;/code&gt;.
This will increase the precision of the local quota enforcement.&lt;/li&gt;
&lt;li&gt;Add a new step: if &lt;code&gt;REQUESTS * nodes &amp;gt; MAX_REQUESTS&lt;/code&gt;, block access to &lt;code&gt;KEY&lt;/code&gt;, remove after &lt;code&gt;COOLDOWN&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
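&lt;p&gt;The &lt;code&gt;nodes&lt;/code&gt; update can be sketched as (helper name and rounding choice are assumptions):&lt;/p&gt;

```python
def update_nodes(redis_total, local_previous):
    # Estimate the number of active gateway instances: every node
    # contributed its own request count to the shared KEY for the
    # previous interval, so total / local approximates the node count.
    if redis_total is None or not local_previous:
        return 1                   # initial value / no local traffic
    return max(1, round(redis_total / local_previous))

nodes = update_nodes(redis_total=900, local_previous=300)   # about 3 nodes
```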

&lt;p&gt;When nodes are added or removed, the algorithm will adapt in the upcoming intervals.&lt;/p&gt;

&lt;h3&gt;
  
  
  Caveats
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In the worst-case scenario (extremely unlikely), the quota is exceeded by &lt;code&gt;MAX_REQUESTS / N&lt;/code&gt; per span on each node.&lt;/li&gt;
&lt;li&gt;Time desynchronization between nodes should be insignificant for the selected &lt;code&gt;INTERVAL&lt;/code&gt; (i.e., &lt;code&gt;INTERVAL&lt;/code&gt; &amp;gt;&amp;gt; desync). See Extension point 3.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>distributedsystems</category>
      <category>apigateway</category>
    </item>
  </channel>
</rss>
