<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harshit Singhal</title>
    <description>The latest articles on DEV Community by Harshit Singhal (@harshitsinghal13).</description>
    <link>https://dev.to/harshitsinghal13</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3804069%2F146b276a-8438-4d36-89f7-1f3c7a772398.png</url>
      <title>DEV Community: Harshit Singhal</title>
      <link>https://dev.to/harshitsinghal13</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/harshitsinghal13"/>
    <language>en</language>
    <item>
      <title>This Concurrency Bug Stayed Hidden for a Year</title>
      <dc:creator>Harshit Singhal</dc:creator>
      <pubDate>Tue, 14 Apr 2026 19:42:59 +0000</pubDate>
      <link>https://dev.to/harshitsinghal13/this-concurrency-bug-stayed-hidden-for-a-year-2f6j</link>
      <guid>https://dev.to/harshitsinghal13/this-concurrency-bug-stayed-hidden-for-a-year-2f6j</guid>
      <description>&lt;p&gt;We had a background job that processed thousands of records in parallel.&lt;br&gt;
Each batch ran concurrently, and we kept track of total successful and failed records.&lt;/p&gt;

&lt;p&gt;Everything worked perfectly.&lt;/p&gt;

&lt;p&gt;For almost a year.&lt;/p&gt;

&lt;p&gt;Then one day, the totals started coming out… wrong.&lt;/p&gt;

&lt;p&gt;No exceptions.&lt;br&gt;
No crashes.&lt;br&gt;
Just incorrect numbers.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Records processed in chunks&lt;/li&gt;
&lt;li&gt;Multiple chunks running concurrently&lt;/li&gt;
&lt;li&gt;Shared counters tracking totals&lt;/li&gt;
&lt;li&gt;Periodic database updates with progress&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All standard parallel batch processing.&lt;/p&gt;

&lt;p&gt;And yet — totals drifted.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Symptom
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Some runs showed fewer successful records than expected&lt;/li&gt;
&lt;li&gt;Re-running the same data produced different counts&lt;/li&gt;
&lt;li&gt;The issue appeared only in one environment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Classic signs of a concurrency issue.&lt;/p&gt;

&lt;p&gt;But the tricky part?&lt;/p&gt;

&lt;p&gt;We were already using thread-safe collections.&lt;/p&gt;


&lt;h2&gt;
  
  
  What Was Actually Happening
&lt;/h2&gt;

&lt;p&gt;Imagine two workers updating the same counter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Initial total = 10

Worker A reads total (10)
Worker B reads total (10)

Worker A increments → 11
Worker B increments → 11  (overwrites A)

Final total = 11  ❌ (should be 12)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No exception.&lt;br&gt;
No crash.&lt;br&gt;
Just a lost update.&lt;/p&gt;

&lt;p&gt;This is a race condition.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Buggy Code
&lt;/h2&gt;

&lt;p&gt;A simplified version looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;totalSuccess&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ForEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;record&lt;/span&gt; &lt;span class="err"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="err"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;record&lt;/span&gt;&lt;span class="err"&gt;))&lt;/span&gt;
    &lt;span class="err"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;totalSuccess&lt;/span&gt;&lt;span class="p"&gt;++;&lt;/span&gt; &lt;span class="c1"&gt;// not atomic&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;++&lt;/code&gt; is not atomic. It performs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read&lt;/li&gt;
&lt;li&gt;Increment&lt;/li&gt;
&lt;li&gt;Write&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Multiple threads interleaving these steps leads to lost updates.&lt;/p&gt;
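
&lt;p&gt;The lost update is easy to reproduce in isolation. This standalone sketch (not the original job code) races a plain counter against an atomic one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;int unsafeTotal = 0;
int safeTotal = 0;

// A million increments spread across all cores.
Parallel.For(0, 1_000_000, _ =&amp;gt;
{
    unsafeTotal++;                         // read + increment + write, interleavable
    Interlocked.Increment(ref safeTotal);  // single atomic operation
});

Console.WriteLine(unsafeTotal);  // typically falls short of 1,000,000
Console.WriteLine(safeTotal);    // always 1,000,000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;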




&lt;h2&gt;
  
  
  Why &lt;code&gt;volatile&lt;/code&gt; Alone Doesn't Fix It
&lt;/h2&gt;

&lt;p&gt;A common attempt is to use &lt;code&gt;volatile&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;volatile&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;totalSuccess&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures visibility, but &lt;strong&gt;not atomicity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two threads can still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;read the same value&lt;/li&gt;
&lt;li&gt;increment&lt;/li&gt;
&lt;li&gt;overwrite each other&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So &lt;code&gt;volatile&lt;/code&gt; alone does not solve the race.&lt;/p&gt;
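
&lt;p&gt;And when several fields must change together, making each one atomic individually isn't enough either; the compound update needs a &lt;code&gt;lock&lt;/code&gt;. A sketch (the second field is hypothetical, for illustration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;private static readonly object _sync = new object();

// Inside the worker:
lock (_sync)
{
    totalSuccess++;               // safe: only one thread holds the lock at a time
    lastProcessedId = record.Id;  // hypothetical related field, kept consistent with the count
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;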




&lt;h2&gt;
  
  
  Why It Took a Year to Appear
&lt;/h2&gt;

&lt;p&gt;Concurrency bugs are timing dependent.&lt;/p&gt;

&lt;p&gt;The race condition existed from the beginning, but it didn’t surface consistently.&lt;br&gt;
In fact, it only appeared in one environment.&lt;/p&gt;

&lt;p&gt;Subtle runtime differences — thread scheduling, CPU contention, and execution timing — made overlapping updates more likely there, eventually exposing the issue.&lt;/p&gt;

&lt;p&gt;No code changes were required.&lt;br&gt;
Just different timing.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Fix: Atomic Counters
&lt;/h2&gt;

&lt;p&gt;We replaced non-atomic updates with atomic operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;totalSuccess&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ForEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;record&lt;/span&gt; &lt;span class="err"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="err"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;record&lt;/span&gt;&lt;span class="err"&gt;))&lt;/span&gt;
    &lt;span class="err"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Interlocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;totalSuccess&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This guarantees increments are atomic.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real-World Fix: Snapshot-Based Progress Reporting
&lt;/h2&gt;

&lt;p&gt;We also had periodic progress updates.&lt;br&gt;
Multiple workers updated counters while one periodically persisted totals.&lt;/p&gt;

&lt;p&gt;The correct pattern was:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;finished&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Interlocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;completedChunks&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;finished&lt;/span&gt; &lt;span class="p"&gt;%&lt;/span&gt; &lt;span class="n"&gt;maxConcurrency&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;successSnapshot&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Volatile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;totalSuccess&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;failureSnapshot&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Volatile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;totalFailed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TotalSuccessfulRecords&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;successSnapshot&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TotalFailedRecords&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;failureSnapshot&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;UpdateJobProgress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why This Works
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Interlocked&lt;/code&gt; → atomic updates&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Volatile.Read&lt;/code&gt; → latest visible value&lt;/li&gt;
&lt;li&gt;Snapshot → consistent progress reporting&lt;/li&gt;
&lt;li&gt;Batched DB updates → reduced contention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This eliminates inconsistent totals.&lt;/p&gt;




&lt;h2&gt;
  
  
  Additional Improvement: Local Aggregation
&lt;/h2&gt;

&lt;p&gt;To reduce contention further:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ForEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;localSuccess&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;localFailure&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="k"&gt;record&lt;/span&gt; &lt;span class="nc"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;record&lt;/span&gt;&lt;span class="err"&gt;))&lt;/span&gt;
            &lt;span class="nc"&gt;localSuccess&lt;/span&gt;&lt;span class="p"&gt;++;&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="n"&gt;localFailure&lt;/span&gt;&lt;span class="p"&gt;++;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;Interlocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;totalSuccess&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;localSuccess&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;Interlocked&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ref&lt;/span&gt; &lt;span class="n"&gt;totalFailed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;localFailure&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This minimizes shared writes.&lt;/p&gt;
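
&lt;p&gt;&lt;code&gt;Parallel.ForEach&lt;/code&gt; also ships an overload built for exactly this pattern: per-thread state via &lt;code&gt;localInit&lt;/code&gt; and &lt;code&gt;localFinally&lt;/code&gt;, merged once per thread. A sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;Parallel.ForEach(
    records,
    () =&amp;gt; (success: 0, failed: 0),          // localInit: fresh state per thread
    (record, _, local) =&amp;gt; Process(record)
        ? (local.success + 1, local.failed)
        : (local.success, local.failed + 1),
    local =&amp;gt;                                 // localFinally: one merge per thread
    {
        Interlocked.Add(ref totalSuccess, local.success);
        Interlocked.Add(ref totalFailed, local.failed);
    });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;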




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Thread-safe collections ≠ thread-safe logic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;++&lt;/code&gt; is not atomic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;volatile&lt;/code&gt; ensures visibility, not correctness&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;Interlocked&lt;/code&gt; for counters&lt;/li&gt;
&lt;li&gt;Snapshot values using &lt;code&gt;Volatile.Read&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Reduce shared mutable state&lt;/li&gt;
&lt;li&gt;Batch progress updates&lt;/li&gt;
&lt;li&gt;Concurrency bugs are timing dependent&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;If you're running parallel batch jobs and tracking totals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use atomic counters&lt;/li&gt;
&lt;li&gt;Take snapshot reads for reporting&lt;/li&gt;
&lt;li&gt;Avoid frequent shared writes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Otherwise, everything may look fine…&lt;/p&gt;

&lt;p&gt;Until it doesn't.&lt;/p&gt;

</description>
      <category>csharp</category>
      <category>multithreading</category>
      <category>concurrency</category>
    </item>
    <item>
      <title>SSE vs WebSocket for One-Way Push: Runtime and Operational Tradeoffs</title>
      <dc:creator>Harshit Singhal</dc:creator>
      <pubDate>Thu, 05 Mar 2026 05:34:08 +0000</pubDate>
      <link>https://dev.to/harshitsinghal13/sse-vs-websocket-for-one-way-push-runtime-and-operational-tradeoffs-o28</link>
      <guid>https://dev.to/harshitsinghal13/sse-vs-websocket-for-one-way-push-runtime-and-operational-tradeoffs-o28</guid>
      <description>&lt;p&gt;When scaling server-to-client push systems, the choice between Server-Sent Events (SSE) and WebSocket determines your operational burden under production load. This analysis examines runtime behavior, memory patterns, and tail latency tradeoffs for unidirectional real-time delivery.&lt;/p&gt;

&lt;p&gt;For one-way server push, the protocol decision is mostly about failure shape, memory behavior, and latency tail control under runtime limits, not feature parity.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Executive Framing
&lt;/h2&gt;

&lt;p&gt;Teams often pick WebSocket by default, then spend quarters containing emergent behavior at high concurrency: memory ballooning, opaque backpressure, and unstable p99 latency during noisy-neighbor periods. For one-way server-to-client streams, SSE frequently delivers a lower operational burden with more predictable saturation behavior.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The question is not which protocol is more capable. The question is which runtime path remains debuggable when runtimes are CPU-throttled and connection counts are high.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  2. Protocol and Runtime Architecture
&lt;/h2&gt;

&lt;p&gt;In async server stacks, SSE is an HTTP response stream over a long-lived request lifecycle. That keeps load balancer behavior, middleware, and instrumentation aligned with existing HTTP control planes. WebSocket introduces an upgrade lifecycle with explicit connection state, heartbeat policy, and bidirectional framing mechanics.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SSE lifecycle:&lt;/strong&gt; accept request, attach stream producer, flush events, close on client disconnect or server policy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebSocket lifecycle:&lt;/strong&gt; HTTP upgrade, maintain protocol state machine, negotiate keepalive semantics, manage outbound and inbound frame buffers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event loop impact:&lt;/strong&gt; SSE workloads are mostly write-path scheduling; WebSocket adds more state transitions per connection and tighter coupling to heartbeat cadence.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Under pressure, the simpler lifecycle tends to produce clearer failure modes. Complexity is not free when thousands of sockets share one loop.&lt;/p&gt;
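
&lt;p&gt;On the wire, an SSE stream is just a long-lived HTTP response with a specific content type; each event is a &lt;code&gt;data:&lt;/code&gt; block terminated by a blank line (payloads illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache

data: {"type":"heartbeat"}

data: {"type":"tick","seq":42}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;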

&lt;h2&gt;
  
  
  3. Resource Cost Analysis: CPU, Memory, Connection State
&lt;/h2&gt;

&lt;p&gt;CPU and memory costs diverge meaningfully once concurrency rises beyond a modest baseline.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPU:&lt;/strong&gt; WebSocket commonly pays additional protocol-management overhead, especially with active keepalive and bidirectional framing. SSE cost is usually concentrated in serialization and write flush cadence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory:&lt;/strong&gt; WebSocket risk is memory amplification through per-connection buffers and unsignaled backlog growth. SSE can still leak memory if queues are unbounded, but the model is easier to constrain.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State surface:&lt;/strong&gt; More connection state means more edge cases during deploy rollouts, proxy restarts, and intermittent packet loss.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In production environments with resource isolation, these differences become visible quickly because hard memory and CPU limits turn soft inefficiencies into hard failure behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Behavior Under Scale
&lt;/h2&gt;

&lt;p&gt;Throughput can remain stable while user experience degrades. The earliest signal is usually latency tail expansion, not median latency drift.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;p95 often starts rising before alerting thresholds fire on raw request volume.&lt;/li&gt;
&lt;li&gt;p99 is where buffer growth and scheduler contention become obvious.&lt;/li&gt;
&lt;li&gt;When CPU throttling increases, jitter in flush cadence widens, and tail latency becomes noisy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The practical target is graceful degradation: bounded queues, deterministic shedding, and fast recovery when pressure falls.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Operational Risks and When Not to Use This Approach
&lt;/h2&gt;

&lt;p&gt;Prefer SSE only when traffic is truly one-way. Do not force SSE into workflows that require low-latency client-to-server signaling, custom binary framing, or tightly coupled duplex control loops.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do not use SSE for interactive bidirectional sessions where client events are first-class.&lt;/li&gt;
&lt;li&gt;Do not rely on either protocol without explicit bounded buffering per connection.&lt;/li&gt;
&lt;li&gt;Do not ignore intermediary timeout behavior; default idle timers can cause reconnection storms.&lt;/li&gt;
&lt;li&gt;Do not scale solely on CPU for push workloads; memory and queue pressure must participate.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Decision Matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;SSE&lt;/th&gt;
&lt;th&gt;WebSocket&lt;/th&gt;
&lt;th&gt;Preferred&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Server-to-client only feed&lt;/td&gt;
&lt;td&gt;Simple lifecycle, HTTP-native ops&lt;/td&gt;
&lt;td&gt;Works, but higher state surface&lt;/td&gt;
&lt;td&gt;SSE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bidirectional interactive control&lt;/td&gt;
&lt;td&gt;Awkward and fragmented&lt;/td&gt;
&lt;td&gt;Native duplex channel&lt;/td&gt;
&lt;td&gt;WebSocket&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High concurrency under strict memory limits&lt;/td&gt;
&lt;td&gt;Typically easier to bound&lt;/td&gt;
&lt;td&gt;Higher risk of buffer-driven instability&lt;/td&gt;
&lt;td&gt;SSE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protocol/tooling operability in HTTP infra&lt;/td&gt;
&lt;td&gt;Strong alignment&lt;/td&gt;
&lt;td&gt;Requires explicit tuning and observability work&lt;/td&gt;
&lt;td&gt;SSE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  7. Monitoring and SLO Implications
&lt;/h2&gt;

&lt;p&gt;Instrument around saturation mechanics, not only request counters.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency:&lt;/strong&gt; track p50/p95/p99 for end-to-end event delivery, not only server response times.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection health:&lt;/strong&gt; active connections, reconnect rate, disconnect reasons, and reconnection storm detection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpressure:&lt;/strong&gt; per-connection queue depth, dropped-event counters, and write timeout rates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime pressure:&lt;/strong&gt; memory usage, OOM events, CPU throttled time, and GC pause distribution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SLO design should include tail-latency error budgets and controlled degradation policy. If p99 delivery drifts while queue depth climbs, treat it as user-visible failure even if throughput appears healthy.&lt;/p&gt;
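
&lt;p&gt;A minimal sketch of the counters behind those signals, assuming no metrics library (all names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time
from dataclasses import dataclass, field

@dataclass
class PushStats:
    # Per-process counters for a push server; export them however your stack prefers.
    active_connections: int = 0
    reconnects: int = 0
    dropped_events: int = 0
    delivery_latencies: list = field(default_factory=list)

    def record_delivery(self, enqueued_at: float) -&amp;gt; None:
        # End-to-end delivery latency: enqueue time to write time.
        self.delivery_latencies.append(time.monotonic() - enqueued_at)

    def p99(self) -&amp;gt; float:
        # Nearest-rank p99; good enough for a dashboard.
        xs = sorted(self.delivery_latencies)
        return xs[min(int(len(xs) * 0.99), len(xs) - 1)] if xs else 0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;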

&lt;h2&gt;
  
  
  Minimal Dependency-Free Pattern with Bounded Queue
&lt;/h2&gt;

&lt;p&gt;The key property is bounded buffering with a deterministic drop policy. This keeps memory predictable under transient spikes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncIterator&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;event_stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client_disconnected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AsyncIterator&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;heartbeat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;full&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_nowait&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Drop oldest to preserve bounded memory.
&lt;/span&gt;            &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_nowait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;client_disconnected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_set&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="c1"&gt;# SSE frame format: one event per double newline.
&lt;/span&gt;            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  8. Engineering Conclusion
&lt;/h2&gt;

&lt;p&gt;For one-way push, SSE is often the stronger default because it reduces state complexity, constrains memory behavior more naturally, and keeps operational visibility aligned with HTTP tooling. Choose WebSocket when duplex interaction is a hard requirement, not as a reflex.&lt;/p&gt;

&lt;p&gt;The protocol decision should be validated against tail latency, runtime pressure, and recovery behavior under bursty load. Build for graceful failure first. Peak throughput follows from disciplined runtime design.&lt;/p&gt;

</description>
      <category>sse</category>
      <category>websocket</category>
    </item>
  </channel>
</rss>
