<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Krishna Tej Chalamalasetty</title>
    <description>The latest articles on DEV Community by Krishna Tej Chalamalasetty (@chkrishnatej).</description>
    <link>https://dev.to/chkrishnatej</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F78797%2F396fc7c7-c2ab-412e-a5f5-1d373e603a0b.jpg</url>
      <title>DEV Community: Krishna Tej Chalamalasetty</title>
      <link>https://dev.to/chkrishnatej</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/chkrishnatej"/>
    <language>en</language>
    <item>
      <title>The Boring Infrastructure That Breaks AI APIs: A Guide to Billing and Metering</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Sun, 05 Apr 2026 18:59:33 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/the-boring-infrastructure-that-breaks-ai-apis-a-guide-to-billing-and-metering-4hb</link>
      <guid>https://dev.to/chkrishnatej/the-boring-infrastructure-that-breaks-ai-apis-a-guide-to-billing-and-metering-4hb</guid>
      <description>&lt;p&gt;Recently, Anthropic users ran into a frustrating pattern. Usage limits hit faster than expected. Credits appeared late. In some cases, the same request was billed twice. The forums and GitHub issues filled up fast.&lt;br&gt;
But stepping back from the frustration, have you ever thought about what it actually takes to build billing infrastructure for an AI API?&lt;/p&gt;

&lt;p&gt;It sounds simple. Count tokens, charge money. But the moment you add streaming responses, concurrent users, prepaid credits, multiple token types, and an async pipeline underneath, it becomes one of the harder problems a platform team will face. And when it breaks, it breaks visibly. Users notice billing errors faster than almost any other kind of bug.&lt;/p&gt;

&lt;p&gt;This article is about what that system looks like under the hood, where it tends to fail, and what engineers can do about it.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Anatomy of a Billing System
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9pfijl57wa5nizuakqp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9pfijl57wa5nizuakqp.png" alt="Anatomy of billing system" width="800" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If I were to break down how a billing system is structured, I would anchor it around three core layers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Event&lt;/strong&gt; is the starting point. When a request completes, the API server emits a usage event that contains information about who made the request, which model, how many tokens, and a unique ID. That unique ID does quiet but important work. It is the foundation for both auditability and idempotency, which we will get to shortly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Meter&lt;/strong&gt; is the core of billing and probably the hardest part to get right. It takes the stream of raw events and answers one question: how much has this account consumed, and does it have enough balance to continue? Doing this correctly at scale, for millions of concurrent users, across multiple token types, with partial streaming responses in flight, is where most of the interesting failures live.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ledger&lt;/strong&gt; is the authoritative record of every transaction that touched a balance. It is append-only by design. You never edit a past entry. You only add new ones. The current balance is the sum of all entries for that account. This gives you two things a single mutable balance column cannot: a full audit trail for disputes, and a natural place to enforce idempotency.&lt;/p&gt;

&lt;p&gt;These three layers are not just organizational. They represent distinct consistency boundaries. The event layer needs to be fast. The meter layer needs to be correct. The ledger layer needs to be durable. Mixing them into a single system is where things start to go wrong.&lt;/p&gt;
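&lt;p&gt;As a minimal sketch, here is the shape of those three layers in Python. The names, prices, and structure are invented for illustration, not any provider's actual implementation:&lt;/p&gt;

```python
from dataclasses import dataclass

# Event layer: emitted by the API server when a request completes.
@dataclass(frozen=True)
class UsageEvent:
    event_id: str   # stable unique ID: the basis for audit and idempotency
    org_id: str
    model: str
    tokens: int

# Ledger layer: append-only; a balance is the sum of an account's entries.
ledger = [("grant_1", "org_A", 5.00)]   # (event_id, org_id, amount)

def balance(org_id):
    return sum(amount for _, org, amount in ledger if org == org_id)

# Meter layer: how much has this account consumed, and can it continue?
PRICE_PER_TOKEN = 0.00001   # hypothetical flat rate for the sketch

def meter(event):
    cost = event.tokens * PRICE_PER_TOKEN
    if cost > balance(event.org_id):
        return False   # out of balance: reject the request
    ledger.append((event.event_id, event.org_id, -cost))
    return True

meter(UsageEvent("evt_1", "org_A", "model-x", 2000))
print(round(balance("org_A"), 2))   # 4.98
```

The point of the separation is visible even here: the event carries identity, the meter makes the admission decision, and the ledger is never edited, only appended to.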


&lt;h2&gt;
  
  
  Token Counting Is Harder Than It Looks
&lt;/h2&gt;

&lt;p&gt;With the anatomy established, I want to talk about how AI labs actually meter usage. Token counting sounds mechanical. In practice, it has edge cases that are easy to underestimate.&lt;/p&gt;
&lt;h3&gt;
  
  
  Input and output tokens are not symmetric
&lt;/h3&gt;

&lt;p&gt;Token counting happens on both sides of the request. Input tokens are known before the model runs. Output tokens are not. The response is a streaming, non-deterministic outcome. The system does not know how long it will be until it finishes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsf7m4nodx0flgxk84c4p.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsf7m4nodx0flgxk84c4p.webp" alt="Input and output tokens asymmetry" width="800" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One way AI labs handle this is by metering the chunked stream as it is produced. As each chunk arrives, the meter tracks cumulative output tokens. As long as quota remains, the stream continues. Once quota is exhausted, the stream is cut. This means the billing decision and the response generation are happening at the same time, not in sequence.&lt;/p&gt;
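&lt;p&gt;A toy version of that loop, with invented chunk sizes and quota, might look like this:&lt;/p&gt;

```python
def stream_with_metering(chunks, quota_tokens):
    """Meter a chunked stream as it is produced; cut it when quota runs out.

    chunks is an iterable of (text, token_count) pairs; quota_tokens is the
    output-token budget remaining on the account.
    """
    used = 0
    delivered = []
    for text, tokens in chunks:
        if used + tokens > quota_tokens:
            break                  # quota exhausted: cut the stream here
        used += tokens             # the billing decision happens mid-stream,
        delivered.append(text)     # concurrently with response generation
    return delivered, used

chunks = [("Hello", 2), (" world", 2), ("!", 1), (" and more", 3)]
delivered, used = stream_with_metering(chunks, quota_tokens=5)
print(delivered, used)   # ['Hello', ' world', '!'] 5
```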
&lt;h3&gt;
  
  
  Metering is opaque to the client
&lt;/h3&gt;

&lt;p&gt;The token count is finalized on the server. The client cannot see it until the response is done. Given the non-deterministic nature of the output, users have no reliable way to predict what a request will cost before sending it. That opacity is a real trust problem, and it gets worse when users are on prepaid credits with hard limits.&lt;/p&gt;
&lt;h3&gt;
  
  
  Not all tokens are the same
&lt;/h3&gt;

&lt;p&gt;There are input tokens, output tokens, thinking tokens, and cached tokens. Each has a different price and, in some cases, a different metering path. All of them draw from the same pool: the account's available balance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgsrgbf4e9emdi0n2360.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftgsrgbf4e9emdi0n2360.png" alt="Not all tokens are same" width="800" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If a user has multiple parallel sessions running, each one is consuming from that shared pool at the same time. That is a concurrency problem on the balance. At the scale of millions of users, it is a serious one. This is where the next section begins.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Two Hardest Problems: Idempotency &amp;amp; The Credit Race
&lt;/h2&gt;

&lt;p&gt;These two problems sit at the intersection of distributed systems and financial correctness. Getting either one wrong produces charges that are duplicated or inaccurate. Both erode user trust fast.&lt;/p&gt;
&lt;h3&gt;
  
  
  Idempotency
&lt;/h3&gt;

&lt;p&gt;Idempotency means that performing the same operation multiple times produces the same result as performing it once. In billing, this is not a nice-to-have. It is load-bearing.&lt;/p&gt;

&lt;p&gt;Usage events travel through an async pipeline, typically a message queue like Kafka. These queues guarantee at-least-once delivery. That means a consumer will sometimes receive the same event more than once during retries, consumer restarts, or network hiccups.&lt;/p&gt;

&lt;p&gt;Without an idempotency guard on the ledger write, a redelivered event produces a second charge for the same request. This is the pattern behind the Anthropic dual-billing bug. Two write paths, one for API billing and one for prepaid credits, both fired for the same request with no shared dedup check between them.&lt;/p&gt;

&lt;p&gt;The fix at the ledger level is a single SQL statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;ledger&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'evt_abc123'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'org_A'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;03&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'usage_deduct'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONFLICT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="k"&gt;NOTHING&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;event_id&lt;/code&gt; column has a unique constraint, so a duplicate event does nothing. The ledger's append-only structure makes this natural: you are inserting a row, not updating one, and the database enforces uniqueness for you.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;event_id&lt;/code&gt; is the stable unique key carried in every usage event, and it is one of the pillars that makes the pipeline idempotent.&lt;/p&gt;
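&lt;p&gt;The same guard can be simulated end to end. Here SQLite stands in for Postgres (it supports the same &lt;code&gt;ON CONFLICT&lt;/code&gt; upsert syntax since 3.24), and a consumer replays an at-least-once queue that delivers one event twice; the unique constraint keeps the ledger correct:&lt;/p&gt;

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE ledger (
        event_id TEXT PRIMARY KEY,   -- the idempotency guard
        org_id   TEXT,
        amount   REAL,
        type     TEXT
    )
""")

def apply_event(event_id, org_id, amount):
    # A duplicate delivery hits the unique constraint and inserts nothing.
    db.execute(
        "INSERT INTO ledger (event_id, org_id, amount, type) "
        "VALUES (?, ?, ?, 'usage_deduct') "
        "ON CONFLICT (event_id) DO NOTHING",
        (event_id, org_id, amount),
    )

# An at-least-once queue redelivers evt_abc123 after a consumer restart.
for delivery in ["evt_abc123", "evt_def456", "evt_abc123"]:
    apply_event(delivery, "org_A", -0.03)

total = db.execute("SELECT SUM(amount) FROM ledger").fetchone()[0]
print(round(total, 2))   # -0.06, not -0.09: the replay charged nothing
```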

&lt;h3&gt;
  
  
  The Credit Race
&lt;/h3&gt;

&lt;p&gt;The credit race is a concurrency problem. Two requests arrive at the same time for an account with $0.05 remaining. Each request costs $0.03. Both are individually affordable. Together they are not.&lt;/p&gt;

&lt;p&gt;Without an atomic check-and-deduct, both requests read the balance as sufficient, both proceed, and the account ends at -$0.01. That is a silent overdraft.&lt;/p&gt;

&lt;p&gt;A plain Redis &lt;code&gt;DECRBY&lt;/code&gt;, which atomically subtracts a value from a key, does not protect against this. It has no floor. It will go negative without complaint.&lt;/p&gt;
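&lt;p&gt;The overdraft is easy to reproduce deterministically by interleaving the read and the write by hand. Amounts are in cents to avoid float noise:&lt;/p&gt;

```python
balance = {"org_A": 5}   # 5 cents remaining
COST = 3                 # each request costs 3 cents

# Non-atomic check-then-deduct: both requests read before either one writes.
check_1 = balance["org_A"] >= COST   # True: sees 5
check_2 = balance["org_A"] >= COST   # True: also sees 5

if check_1:
    balance["org_A"] -= COST         # 5 -> 2
if check_2:
    balance["org_A"] -= COST         # 2 -> -1: silent overdraft

print(balance["org_A"])   # -1
```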

&lt;p&gt;The correct approach is an atomic check-and-deduct. In Redis, this is done with a Lua script that runs as a single uninterruptible operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'DECRBY'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;  &lt;span class="c1"&gt;-- insufficient balance, reject the request&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no gap between the read and the write. Another thread cannot slip in between them.&lt;/p&gt;

&lt;p&gt;In practice, most AI platforms likely use both: Redis for the fast atomic balance gate on the hot path, and Postgres as the durable ledger for audit and idempotency. They solve different problems. Redis buys speed. Postgres buys correctness.&lt;/p&gt;




&lt;h2&gt;
  
  
  Failure Modes Taxonomy
&lt;/h2&gt;

&lt;p&gt;Most billing failures are not random. They cluster around a small number of root causes. Recognizing the pattern matters more than patching individual bugs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Failure Class&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Real Example&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;What Actually Went Wrong&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Double charge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Anthropic API + credit billed for same request&lt;/td&gt;
&lt;td&gt;Two write paths, no shared idempotency check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ghost block&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Credits exist, API returns 402 anyway&lt;/td&gt;
&lt;td&gt;Meter reads a stale cached balance while the ledger balance is sufficient&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Silent overdraft&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;More tokens consumed than remaining balance&lt;/td&gt;
&lt;td&gt;Non-atomic check-then-deduct under concurrency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Credit destruction&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gift codes wiping each other on redemption&lt;/td&gt;
&lt;td&gt;Stripe proration logic unaware of credit stacking rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Consumption spike&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3–5x faster depletion after a model update&lt;/td&gt;
&lt;td&gt;No baseline drift detection on per-account meter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Measurement loss&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Streaming request interrupted mid-response&lt;/td&gt;
&lt;td&gt;Token count finalized at stream end; partial streams need a separate accounting path&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Four of these six failures share one root cause: billing state living in more than one place without a single source of truth.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Double charge happens when two systems both think they own the write.&lt;/li&gt;
&lt;li&gt;Ghost block happens when the cache disagrees with the ledger.&lt;/li&gt;
&lt;li&gt;Credit destruction happens when Stripe's model and the internal credit model diverge.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix is not patching each bug one by one. It is deciding once which system owns each piece of billing state, and making everything else read from it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design Principles
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Idempotency is load-bearing, not defensive.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Engineers sometimes treat idempotency as a safety net added after the system works. In billing, it belongs in the design before the first line of code. Every ledger write needs a dedup key. Every event needs a stable, unique ID. At-least-once delivery will eventually betray you. The only safe response is to design for it upfront.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Credits are not money. Model them accordingly.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A balance column feels sufficient until you need expiry dates, stacking rules for different grant types, priority ordering between purchased and promotional credits, and shared wallets across teams. At that point, a single number is not enough. Credits need to be a ledger of typed entries, each with an amount, a source, an expiry, and a priority. The Anthropic gift code bug, where redeeming multiple codes destroyed existing credit value, is a direct result of treating credits as a simple balance rather than a structured set of grants.&lt;/p&gt;
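&lt;p&gt;A sketch of credits as typed grants. The field names and the promotional-first ordering are illustrative assumptions, not Anthropic's actual rules:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class CreditGrant:
    grant_id: str
    amount: float      # remaining value on this grant
    source: str        # 'purchased', 'promotional', 'gift_code'
    expires_day: int   # smaller = expires sooner
    priority: int      # smaller = consumed first

def deduct(grants, cost):
    """Consume grants in priority order, then by soonest expiry.

    Each grant keeps its identity; nothing is merged or overwritten,
    which is what prevents one redemption from wiping out another.
    """
    for g in sorted(grants, key=lambda g: (g.priority, g.expires_day)):
        if cost > 0:
            take = min(g.amount, cost)
            g.amount -= take
            cost -= take
    if cost > 0:
        raise ValueError("insufficient credit")

wallet = [
    CreditGrant("g1",  5.0, "promotional", expires_day=30,  priority=0),
    CreditGrant("g2", 20.0, "purchased",   expires_day=365, priority=1),
]
deduct(wallet, 8.0)   # drains the promo grant, then dips into purchased
print([(g.grant_id, g.amount) for g in wallet])   # [('g1', 0.0), ('g2', 17.0)]
```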

&lt;p&gt;&lt;strong&gt;Measure first, charge second.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is easy to say and consistently violated under deadline pressure. The metering pipeline should finalize the token count before any charge is recorded. Charging on estimated or in-flight counts, especially for streaming, creates a class of errors that are nearly impossible to audit after the fact. The accounting path and the response path can run in parallel, but the ledger write should always wait for a confirmed count.&lt;/p&gt;
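&lt;p&gt;One way to honor that ordering, sketched under the assumption that an interrupted stream is still billed for whatever was delivered:&lt;/p&gt;

```python
def finalize_and_charge(chunks, ledger, event_id):
    """Count tokens as the stream runs, but write the ledger exactly once,
    with the finalized count, even if the stream is interrupted."""
    tokens = 0
    try:
        for chunk_tokens in chunks:
            tokens += chunk_tokens   # metering runs alongside the response
    except ConnectionError:
        pass                         # interrupted: bill the partial count
    # The ledger write waits for a confirmed count; nothing was charged
    # on an estimate while the stream was still in flight.
    ledger.append((event_id, tokens))
    return tokens

def flaky_stream():
    yield 40
    yield 25
    raise ConnectionError("client dropped mid-response")

ledger = []
finalize_and_charge(flaky_stream(), ledger, "evt_partial")
print(ledger)   # [('evt_partial', 65)]: one charge, finalized after the fact
```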




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Anthropic billing incidents were not exotic failures. They were textbook distributed systems problems that surfaced in a financial context. Duplicate writes without idempotency guards. Concurrent balance checks without atomic deduction. Credit state split across systems with no clear owner.&lt;/p&gt;

&lt;p&gt;What makes them worth studying is not that mistakes were made. Every system at scale makes mistakes. It is that these failure classes are predictable. They have names. They have known fixes. And they tend to appear in roughly the same order as a billing system grows.&lt;/p&gt;

&lt;p&gt;If you are building or reviewing a billing system, the most useful question to ask is not whether the happy path works. It is whether you have decided which system owns each piece of billing state, and whether everything else actually reads from it.&lt;/p&gt;

&lt;p&gt;Billing infrastructure rarely gets a design doc. It rarely gets a dedicated team until something goes wrong. It sits in the corner of the codebase, quietly doing its job, until one day it doesn't and suddenly it's the only thing anyone is talking about.&lt;/p&gt;

&lt;p&gt;The Anthropic incidents are a good reminder of that dynamic. The failures weren't in the model. They weren't in the API. They were in the plumbing that sits between a user's wallet and a completed request. The part nobody thought was interesting enough to design carefully.&lt;/p&gt;

&lt;p&gt;That's the thing about boring infrastructure. It doesn't announce itself when it's working. But when it breaks, it breaks in the most visible way possible on someone's credit card statement.&lt;/p&gt;

&lt;p&gt;Token counting, idempotency, credit ledgers, atomic deductions, none of this is glamorous work. But it is the work that determines whether users trust your platform. And trust, once lost over a billing error, is genuinely hard to get back.&lt;/p&gt;

&lt;p&gt;Build the boring parts like they matter. Because to your users, they matter more than almost anything else.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>distributedsystems</category>
      <category>architecture</category>
    </item>
    <item>
      <title>What is a Container? The OS-Level Truth Most Engineers Don't Know</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Sat, 04 Apr 2026 01:25:08 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/what-is-a-container-the-os-level-truth-most-engineers-dont-know-3n2l</link>
      <guid>https://dev.to/chkrishnatej/what-is-a-container-the-os-level-truth-most-engineers-dont-know-3n2l</guid>
      <description>&lt;h3&gt;
  
  
  "You Keep Using That Word"
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Dispelling Container Misconceptions at the OS Level
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Before we write a single line of code, we need to kill the buzzword fog.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Container Actually Is
&lt;/h2&gt;

&lt;p&gt;The marketing definition you have heard a hundred times: &lt;strong&gt;"a container is an executable unit of software with its dependencies bundled together."&lt;/strong&gt; That is not wrong, but it tells you nothing useful about what is actually happening on the machine.&lt;/p&gt;

&lt;p&gt;Here is the OS-level truth: &lt;strong&gt;a container is a process (or a tree of processes) that the kernel runs with a restricted view of its own namespaces and a cgroup-enforced ceiling on the resources it can consume.&lt;/strong&gt; That is the entire trick. No hypervisor, no guest kernel, no virtualized hardware. Just a process with a carefully constructed set of constraints.&lt;br&gt;
Everything else in this article is evidence for that single claim.&lt;/p&gt;
&lt;h2&gt;
  
  
  Spin up a simple HTTPD container
&lt;/h2&gt;

&lt;p&gt;I used the following podman command to spin up an HTTPD container with a limited amount of resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;podman run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; my-limited-httpd &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--replace&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;512m &lt;span class="se"&gt;\ &lt;/span&gt;       &lt;span class="c"&gt;# hard memory ceiling&lt;/span&gt;
  &lt;span class="nt"&gt;--memory-swap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;512m &lt;span class="se"&gt;\ &lt;/span&gt;  &lt;span class="c"&gt;# swap ceiling equal to memory = no swap allowed&lt;/span&gt;
  &lt;span class="nt"&gt;--cpus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.5 &lt;span class="se"&gt;\ &lt;/span&gt;          &lt;span class="c"&gt;# CPU quota: 1.5 cores worth of CPU time&lt;/span&gt;
  &lt;span class="nt"&gt;--cpu-shares&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;512 &lt;span class="se"&gt;\ &lt;/span&gt;    &lt;span class="c"&gt;# relative CPU weight during contention&lt;/span&gt;
  &lt;span class="nt"&gt;--pids-limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;100 &lt;span class="se"&gt;\ &lt;/span&gt;    &lt;span class="c"&gt;# max 100 processes/threads in this container&lt;/span&gt;
  &lt;span class="nt"&gt;--blkio-weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;500 &lt;span class="se"&gt;\ &lt;/span&gt;  &lt;span class="c"&gt;# relative block I/O weight&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8081:80 &lt;span class="se"&gt;\&lt;/span&gt;
  registry.access.redhat.com/ubi8/httpd-24
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This starts a container with the given resource limits. You can confirm that the container is just a process tree on the host with &lt;code&gt;pstree&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;krish-local:~ &lt;span class="c"&gt;# pstree -pT&lt;/span&gt;
systemd&lt;span class="o"&gt;(&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;─┬─NetworkManager&lt;span class="o"&gt;(&lt;/span&gt;944&lt;span class="o"&gt;)&lt;/span&gt;
           ├─agetty&lt;span class="o"&gt;(&lt;/span&gt;1011&lt;span class="o"&gt;)&lt;/span&gt;
           ├─agetty&lt;span class="o"&gt;(&lt;/span&gt;1012&lt;span class="o"&gt;)&lt;/span&gt;
           ├─auditd&lt;span class="o"&gt;(&lt;/span&gt;835&lt;span class="o"&gt;)&lt;/span&gt;
           ├─chronyd&lt;span class="o"&gt;(&lt;/span&gt;1007&lt;span class="o"&gt;)&lt;/span&gt;
           &lt;span class="c"&gt;# ConMon is container monitor. HTTPD is the container we ran&lt;/span&gt;
           ├─conmon&lt;span class="o"&gt;(&lt;/span&gt;13669&lt;span class="o"&gt;)&lt;/span&gt;───httpd&lt;span class="o"&gt;(&lt;/span&gt;13684&lt;span class="o"&gt;)&lt;/span&gt;─┬─cat&lt;span class="o"&gt;(&lt;/span&gt;13727&lt;span class="o"&gt;)&lt;/span&gt; 
           │                              ├─cat&lt;span class="o"&gt;(&lt;/span&gt;13728&lt;span class="o"&gt;)&lt;/span&gt;
           │                              ├─cat&lt;span class="o"&gt;(&lt;/span&gt;13729&lt;span class="o"&gt;)&lt;/span&gt;
           │                              ├─cat&lt;span class="o"&gt;(&lt;/span&gt;13730&lt;span class="o"&gt;)&lt;/span&gt;
           │                              ├─httpd&lt;span class="o"&gt;(&lt;/span&gt;13731&lt;span class="o"&gt;)&lt;/span&gt;
           │                              ├─httpd&lt;span class="o"&gt;(&lt;/span&gt;13734&lt;span class="o"&gt;)&lt;/span&gt;
           │                              └─httpd&lt;span class="o"&gt;(&lt;/span&gt;35659&lt;span class="o"&gt;)&lt;/span&gt;
           ├─dbus-broker-lau&lt;span class="o"&gt;(&lt;/span&gt;854&lt;span class="o"&gt;)&lt;/span&gt;───dbus-broker&lt;span class="o"&gt;(&lt;/span&gt;856&lt;span class="o"&gt;)&lt;/span&gt;
           ├─dotd&lt;span class="o"&gt;(&lt;/span&gt;959&lt;span class="o"&gt;)&lt;/span&gt;───sleep&lt;span class="o"&gt;(&lt;/span&gt;3951905&lt;span class="o"&gt;)&lt;/span&gt;
           ├─irqbalance&lt;span class="o"&gt;(&lt;/span&gt;858&lt;span class="o"&gt;)&lt;/span&gt;
           ├─rsyslogd&lt;span class="o"&gt;(&lt;/span&gt;1159&lt;span class="o"&gt;)&lt;/span&gt;
           ├─sshd&lt;span class="o"&gt;(&lt;/span&gt;1562&lt;span class="o"&gt;)&lt;/span&gt;───sshd-session&lt;span class="o"&gt;(&lt;/span&gt;3949831&lt;span class="o"&gt;)&lt;/span&gt;───sshd-session&lt;span class="o"&gt;(&lt;/span&gt;3950094&lt;span class="o"&gt;)&lt;/span&gt;───bash&lt;span class="o"&gt;(&lt;/span&gt;3950096&lt;span class="o"&gt;)&lt;/span&gt;───pstree&lt;span class="o"&gt;(&lt;/span&gt;3954963&lt;span class="o"&gt;)&lt;/span&gt;
           ├─systemd&lt;span class="o"&gt;(&lt;/span&gt;3950077&lt;span class="o"&gt;)&lt;/span&gt;───&lt;span class="o"&gt;(&lt;/span&gt;sd-pam&lt;span class="o"&gt;)(&lt;/span&gt;3950079&lt;span class="o"&gt;)&lt;/span&gt;
           ├─systemd-journal&lt;span class="o"&gt;(&lt;/span&gt;621&lt;span class="o"&gt;)&lt;/span&gt;
           ├─systemd-logind&lt;span class="o"&gt;(&lt;/span&gt;890&lt;span class="o"&gt;)&lt;/span&gt;
           ├─systemd-udevd&lt;span class="o"&gt;(&lt;/span&gt;657&lt;span class="o"&gt;)&lt;/span&gt;
           └─wpa_supplicant&lt;span class="o"&gt;(&lt;/span&gt;891&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  From &lt;code&gt;podman run&lt;/code&gt; to a Process - What Actually Happens
&lt;/h3&gt;

&lt;p&gt;When you ran &lt;code&gt;podman run&lt;/code&gt;, Podman did not start &lt;code&gt;httpd&lt;/code&gt; directly. It handed the work to an OCI runtime. On SLES 16, that runtime is &lt;code&gt;crun&lt;/code&gt; (you can verify with &lt;code&gt;podman info | grep -i runtime&lt;/code&gt;). &lt;code&gt;crun&lt;/code&gt; is the thing that actually calls &lt;code&gt;clone()&lt;/code&gt; with the right namespace flags, writes the cgroup limits into &lt;code&gt;/sys/fs/cgroup&lt;/code&gt;, sets up the root filesystem from the image layers, and then calls &lt;code&gt;execve()&lt;/code&gt; to start your process. &lt;br&gt;
&lt;code&gt;runc&lt;/code&gt;, the original OCI reference runtime, does the same job but is written in Go. &lt;code&gt;crun&lt;/code&gt; is a C rewrite that is lighter and faster, and is now the default on most modern distros.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;conmon&lt;/code&gt; in the &lt;code&gt;pstree&lt;/code&gt; output is the container monitor: a small supervisor process that holds the container's stdio open, watches the container process, and reports its exit code back to Podman. It is not part of your workload. It is bookkeeping infrastructure.&lt;/p&gt;

&lt;p&gt;So the full chain is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;podman → crun → clone() + execve() → your process
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By the time &lt;code&gt;httpd&lt;/code&gt; appears in the process table, &lt;code&gt;crun&lt;/code&gt; has already exited. Its job was setup, not supervision.&lt;/p&gt;
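&lt;p&gt;The setup-then-&lt;code&gt;execve()&lt;/code&gt; pattern is easy to see in miniature. This Python sketch only forks and execs, with no namespace flags and no cgroup writes, so it is an analogy for the shape of the chain rather than a container runtime (Unix-only, since it relies on &lt;code&gt;fork&lt;/code&gt;):&lt;/p&gt;

```python
import os

pid = os.fork()
if pid == 0:
    # Child: a real runtime would unshare namespaces and write cgroup
    # limits here, then replace itself with the workload via exec.
    os.execvp("echo", ["echo", "hello from the exec'd process"])
else:
    # Parent: like crun, its involvement ends once the child is running.
    _, status = os.waitpid(pid, 0)
```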




&lt;h2&gt;
  
  
  Four Words the Industry Uses Interchangeably — and Shouldn't
&lt;/h2&gt;

&lt;p&gt;These four terms get collapsed into each other constantly. The confusion is not accidental. Vendors benefit from the blurring. But it costs engineers clarity when something breaks.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;What it actually means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Kernel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The core software that mediates between hardware and every process running on the machine. It owns CPU scheduling, memory management, syscall handling, and namespace/cgroup enforcement.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Operating System&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The full stack: kernel plus user-space tooling (libc, shell, package manager, init system) that makes the machine usable by humans or services. RHEL is an OS. The kernel alone is not.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Process&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A running instance of a program. The kernel has allocated it a PID, some memory, and file descriptors. It exists in kernel memory; &lt;code&gt;/proc&lt;/code&gt; is a virtual filesystem that &lt;em&gt;exposes&lt;/em&gt; it, not where it lives.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Image&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Explained in full below.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  What an Image Actually Is
&lt;/h3&gt;

&lt;p&gt;An image is a blueprint. A container is one running instance of it. You can start ten containers from the same image simultaneously and each one gets its own thin writable layer on top of the shared read-only filesystem layers underneath. The image itself never runs. It is inert until a runtime turns it into a process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At the OCI spec level:&lt;/strong&gt; an image is a stack of read-only filesystem layers plus a JSON config manifest. When you run a container, the OCI runtime (crun on most modern Linux systems, runc historically) unpacks those layers into a root filesystem, applies the runtime constraints, and hands a process off to the kernel.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Image = what to run.&lt;/li&gt;
&lt;li&gt;Runtime spec = how to run it.&lt;/li&gt;
&lt;li&gt;Container = the process that results.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The PID 1 Myth
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The belief:&lt;/strong&gt; there is a mini Linux inside the container, and PID 1 is its init system — the root of a private process universe.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The reality:&lt;/strong&gt; PID 1 is namespace-relative. The first process spawned inside a new PID namespace gets assigned PID 1 &lt;em&gt;within that namespace&lt;/em&gt;. It is simultaneously visible on the host with a completely different PID. One process, two identities, depending on which namespace you are observing from.&lt;/p&gt;

&lt;p&gt;Let us verify this directly. The &lt;code&gt;nsenter&lt;/code&gt; command lets us step into a process's namespaces and run a command from inside them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stepping into the host's namespaces (PID 1 = systemd):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enter PID 1's PID namespace (-p) and mount namespace (-m, needed so&lt;/span&gt;
&lt;span class="c"&gt;# /proc reflects the target namespace) and run ps&lt;/span&gt;
krish-local:~ &lt;span class="c"&gt;# nsenter -t 1 -p -m ps -ef&lt;/span&gt;
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 Mar29 ?        00:00:05 /usr/lib/systemd/systemd &lt;span class="nt"&gt;--switched-root&lt;/span&gt; &lt;span class="nt"&gt;--system&lt;/span&gt; &lt;span class="nt"&gt;--deserialize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;47
root           2       0  0 Mar29 ?        00:00:00 &lt;span class="o"&gt;[&lt;/span&gt;kthreadd]
root           3       2  0 Mar29 ?        00:00:00 &lt;span class="o"&gt;[&lt;/span&gt;pool_workqueue_release]
root           4       2  0 Mar29 ?        00:00:00 &lt;span class="o"&gt;[&lt;/span&gt;kworker/R-kvfree_rcu_reclaim]
root           5       2  0 Mar29 ?        00:00:00 &lt;span class="o"&gt;[&lt;/span&gt;kworker/R-rcu_gp]
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Stepping into the container's namespaces (container process host PID = 13684):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Same command, but targeting the container's PID namespace.&lt;/span&gt;
&lt;span class="c"&gt;# Inside this namespace, httpd was the first process spawned — so it gets PID 1.&lt;/span&gt;
krish-local:~ &lt;span class="c"&gt;# nsenter -t 13684 -p -m ps -ef&lt;/span&gt;
UID          PID    PPID  C STIME TTY          TIME CMD
default        1       0  0 01:47 ?        00:00:29 httpd &lt;span class="nt"&gt;-D&lt;/span&gt; FOREGROUND
default       38       1  0 01:47 ?        00:00:00 /usr/bin/coreutils &lt;span class="nt"&gt;--coreutils-prog-shebang&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /usr/bin/cat
default       39       1  0 01:47 ?        00:00:00 /usr/bin/coreutils &lt;span class="nt"&gt;--coreutils-prog-shebang&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /usr/bin/cat
default       40       1  0 01:47 ?        00:00:00 /usr/bin/coreutils &lt;span class="nt"&gt;--coreutils-prog-shebang&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /usr/bin/cat
default       41       1  0 01:47 ?        00:00:00 /usr/bin/coreutils &lt;span class="nt"&gt;--coreutils-prog-shebang&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /usr/bin/cat
default       42       1  0 01:47 ?        00:00:29 httpd &lt;span class="nt"&gt;-D&lt;/span&gt; FOREGROUND
default       45       1  0 01:47 ?        00:00:17 httpd &lt;span class="nt"&gt;-D&lt;/span&gt; FOREGROUND
root      615091       0  0 10:11 ?        00:00:00 ps &lt;span class="nt"&gt;-ef&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;httpd&lt;/code&gt; process sitting at PID 1 inside the container namespace? On the host, it is PID 13684. Same process, different lens.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Shared Kernel - &lt;code&gt;uname&lt;/code&gt; Does Not Lie
&lt;/h3&gt;

&lt;p&gt;If there were a mini Linux inside, it would have its own kernel, its own kernel version, its own architecture identity. Run &lt;code&gt;uname&lt;/code&gt; on the host and inside the container and compare:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Host:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;krish-local:~ &lt;span class="c"&gt;# uname -s &amp;amp;&amp;amp; uname -m &amp;amp;&amp;amp; uname -o &amp;amp;&amp;amp; uname -r&lt;/span&gt;
Linux
x86_64
GNU/Linux
6.12.0-160000.9-default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Inside the container:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;krish-local:~ &lt;span class="c"&gt;# podman exec my-limited-httpd `uname -s &amp;amp;&amp;amp; uname -m &amp;amp;&amp;amp; uname -o &amp;amp;&amp;amp; uname -r`&lt;/span&gt;
Linux
x86_64
GNU/Linux
6.12.0-160000.9-default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Identical output. The container is not running a different kernel; it is sharing the host's. The UTS namespace gives it an isolated hostname, but kernel identity is not part of that isolation. There is no guest kernel to find.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resource Isolation: cgroups Are the Real Enforcement Mechanism
&lt;/h2&gt;

&lt;p&gt;Namespaces control &lt;em&gt;what a process can see&lt;/em&gt;. cgroups control &lt;em&gt;what a process can consume&lt;/em&gt;. Together they are the two pillars of container isolation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on cgroup versions:&lt;/strong&gt; The demos below use cgroup v2 paths (&lt;code&gt;/sys/fs/cgroup/&amp;lt;scope&amp;gt;/memory.max&lt;/code&gt;), which is the unified hierarchy used by SLES 16 with kernel 6.12. If you are on an older distro still running cgroup v1, your paths will look different (&lt;code&gt;/sys/fs/cgroup/memory/&amp;lt;scope&amp;gt;/memory.limit_in_bytes&lt;/code&gt;).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now compare cgroup limits for systemd (PID 1 on the host) versus the container process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;systemd - no enforced ceiling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;krish-local:~ &lt;span class="c"&gt;# CGROUP=$(cat /proc/1/cgroup | cut -d: -f3)&lt;/span&gt;
krish-local:~ &lt;span class="c"&gt;# echo $CGROUP&lt;/span&gt;
/init.scope
krish-local:~ &lt;span class="c"&gt;# cat /sys/fs/cgroup${CGROUP}/memory.max&lt;/span&gt;
max
krish-local:~ &lt;span class="c"&gt;# cat /sys/fs/cgroup${CGROUP}/pids.max&lt;/span&gt;
max
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;max&lt;/code&gt; means unlimited. The init process is not constrained.&lt;br&gt;
&lt;strong&gt;The container process — enforced ceiling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;krish-local:~ &lt;span class="c"&gt;# PID=13684  # container's PID on the host&lt;/span&gt;
krish-local:~ &lt;span class="c"&gt;# CGROUP=$(cat /proc/$PID/cgroup | cut -d: -f3)&lt;/span&gt;
krish-local:~ &lt;span class="c"&gt;# echo $CGROUP&lt;/span&gt;
/machine.slice/libpod-a67892f3083285e34c738fd1e75cccd7eaadbda71f5a8c60a522e73546c0d5a2.scope
krish-local:~ &lt;span class="c"&gt;# cat /sys/fs/cgroup${CGROUP}/memory.max&lt;/span&gt;
536870912
krish-local:~ &lt;span class="c"&gt;# cat /sys/fs/cgroup${CGROUP}/pids.max&lt;/span&gt;
100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;536870912&lt;/code&gt; bytes = 512 MiB. Exactly the &lt;code&gt;--memory=512m&lt;/code&gt; flag we passed at startup. The &lt;code&gt;pids.max&lt;/code&gt; of 100 matches &lt;code&gt;--pids-limit=100&lt;/code&gt;. The kernel is enforcing these budgets directly — not Podman, not any container runtime abstraction sitting above the kernel.&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting It Together
&lt;/h2&gt;

&lt;p&gt;A container is a process. It shares the host kernel, and &lt;code&gt;uname&lt;/code&gt; proves it. Its PID 1 is an artifact of namespace isolation, not evidence of a private OS, and &lt;code&gt;nsenter&lt;/code&gt; proves it. Its resource limits are enforced by cgroups in the kernel, not by runtime magic, and &lt;code&gt;/sys/fs/cgroup&lt;/code&gt; proves it.&lt;/p&gt;

&lt;p&gt;The image is the blueprint. &lt;code&gt;crun&lt;/code&gt;/&lt;code&gt;runc&lt;/code&gt; is the assembly line. The running container is just another entry in the host's process table, one that happens to have a restricted worldview and a constrained resource budget.&lt;/p&gt;

&lt;p&gt;That is the mental model. Everything in Part 2 builds on top of it.&lt;/p&gt;

</description>
      <category>containers</category>
      <category>linux</category>
    </item>
    <item>
      <title>Scaling ID Generation with Redis</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Thu, 26 Mar 2026 08:05:32 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/scaling-id-generation-with-redis-3h4e</link>
      <guid>https://dev.to/chkrishnatej/scaling-id-generation-with-redis-3h4e</guid>
      <description>&lt;h2&gt;
  
  
  It Started with a Simple Counter
&lt;/h2&gt;

&lt;p&gt;I work on a cloud-based document management platform used by large construction and engineering firms. Every document uploaded (drawings, RFIs, approvals) gets a unique ID following a tenant-defined schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PROJ-9012-1001
  │      │     │
  │      │     └── Sequence number (auto-incremented)
  │      └──────── Document type identifier
  └──────────────── Project code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An internal microservice called &lt;code&gt;id-generator&lt;/code&gt; handled this. It worked fine for years: a closed system behind our portal, moderate traffic, no drama.&lt;/p&gt;

&lt;p&gt;Then we opened public APIs so customers could automate their workflows. And the first large customer tried to migrate 200,000 documents in a single batch.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;id-generator&lt;/code&gt; was called once per document, sequentially. Each call was a network hop. The migration ran for over six hours. The customer was not pleased.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Sequential Doesn’t Scale: The Math
&lt;/h2&gt;

&lt;p&gt;A single ID generation involves a network round-trip to the id-generator (~100ms) plus the generator’s own processing and sequence counter commit (~50ms). Call it 150ms per ID.&lt;/p&gt;

&lt;p&gt;For 200,000 documents, sequential processing means:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;200,000 × 150ms = 30,000 seconds ≈ 8.3 hours
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"Just add more app instances" doesn’t help. We can spread requests across three instances behind a load balancer, but all three call the &lt;strong&gt;same id-generator&lt;/strong&gt;. The generator processes one request at a time to guarantee sequential numbering. The bottleneck isn’t the app layer, it’s the single-threaded sequence generation.&lt;/p&gt;

&lt;p&gt;Even if I made the id-generator handle &lt;em&gt;10 concurrent requests&lt;/em&gt; (with database-level locking on the counter):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;200,000 ÷ 10 concurrent × 150ms = 3,000 seconds ≈ 50 minutes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Better, but fragile. The id-generator becomes a high-contention hotspot, and any slowdown cascades to every project uploading at the same time.&lt;/p&gt;
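&lt;p&gt;As a sanity check, both estimates can be reproduced in a few lines (a standalone sketch; the class name is illustrative, not code from the service):&lt;/p&gt;

```java
public class MigrationMath {
    public static void main(String[] args) {
        int docs = 200_000;
        int latencyMs = 150;  // ~100ms network round-trip + ~50ms generator work

        long sequentialSec = (long) docs * latencyMs / 1000;
        long concurrentSec = sequentialSec / 10;  // 10 concurrent requests

        System.out.println(sequentialSec);  // 30000 seconds, i.e. about 8.3 hours
        System.out.println(concurrentSec);  // 3000 seconds, i.e. 50 minutes
    }
}
```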

&lt;p&gt;I needed to &lt;strong&gt;decouple ID consumption from ID generation entirely&lt;/strong&gt;: serve IDs without waiting for the generator.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28qd7wxixe7rqdzc07cl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F28qd7wxixe7rqdzc07cl.png" alt="The Bottleneck" width="800" height="633"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;This is a &lt;strong&gt;classic producer-consumer&lt;/strong&gt; problem. Thousands of clients across different projects and organizations fire upload requests through an API gateway. The gateway distributes traffic across multiple app instances. But all instances need IDs from the same sequence space per &lt;code&gt;(project, documentType)&lt;/code&gt;, and a single producer (the id-generator) feeds that space.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The insight:&lt;/strong&gt; if I pre-generate IDs in bulk and stash them in a shared store, the app instances become &lt;strong&gt;consumers&lt;/strong&gt; popping from a ready-made pool, while the id-generator becomes a &lt;strong&gt;background producer&lt;/strong&gt; that refills the pool asynchronously. Consumers never wait for the producer. The pool is the buffer that decouples them.&lt;br&gt;
I chose Redis as that shared store.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcrnblt11vsffj0uus3vr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcrnblt11vsffj0uus3vr.png" alt="Producer Consumer Architecture" width="800" height="338"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The Fix: Pre-populated ID Pools in Redis
&lt;/h2&gt;

&lt;p&gt;The idea is straightforward. Instead of generating an ID when a request arrives, generate IDs ahead of time and stash them in Redis.&lt;/p&gt;

&lt;p&gt;When a request comes in, pop one off the list. No waiting.&lt;br&gt;
Each &lt;code&gt;(project, documentType)&lt;/code&gt; combination gets its own Redis List:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Key:   id-pool:PROJ:9012
Value: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"1001"&lt;/span&gt;, &lt;span class="s2"&gt;"1002"&lt;/span&gt;, &lt;span class="s2"&gt;"1003"&lt;/span&gt;, ..., &lt;span class="s2"&gt;"2000"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;   ← 1000 pre-generated IDs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;LPOP&lt;/code&gt; gives us an atomic, &lt;code&gt;O(1)&lt;/code&gt; retrieval. One pop, one ID, sub-millisecond.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But a pool drains. I needed a way to refill it before it runs dry.&lt;/p&gt;

&lt;h2&gt;
  
  
  Threshold-based Replenishment
&lt;/h2&gt;

&lt;p&gt;I used a watermark pattern, borrowed from stream processing systems like Kafka and Flink. You define a threshold level on a resource, and when usage crosses that mark, you trigger an action &lt;em&gt;before&lt;/em&gt; the resource is exhausted.&lt;/p&gt;

&lt;p&gt;Think of a water tank with a sensor at the 25% mark: when water drops below it, you automatically reorder before the tank runs dry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In my case:&lt;/strong&gt; each time an ID is served, I check the pool size. When 75% of the pool has been consumed (250 or fewer remaining out of 1000), I kick off an async task that calls the &lt;code&gt;id-generator&lt;/code&gt; for a fresh batch and pushes them into the Redis list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;fetchNextId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;poolKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"id-pool:"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;":"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;// Atomic pop - O(1), sub-ms&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;opsForList&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;leftPop&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poolKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Pool empty - synchronous fallback (discussed later)&lt;/span&gt;
        &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idGenerator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateSingle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;checkAndReplenish&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poolKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"-"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"-"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;sequence&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The replenishment runs on a bounded thread pool with &lt;code&gt;CallerRunsPolicy&lt;/code&gt;. If the pool and queue are saturated, the request thread itself does the refill. This applies natural backpressure instead of silently dropping work. Or so I thought.&lt;/p&gt;
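<p>The CallerRunsPolicy behavior can be seen in a self-contained sketch (a toy executor, not the service's actual idGenExecutor): with one worker thread and a one-slot queue, the third submission has nowhere to go, so the policy runs it inline on the submitting thread.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {

    // Returns true if the submitting thread ended up executing a task itself.
    static boolean callerRanATask() throws InterruptedException {
        // 1 worker, queue capacity 1: the third execute() is rejected,
        // and CallerRunsPolicy runs it on the caller instead of dropping it.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue(1),
                new ThreadPoolExecutor.CallerRunsPolicy());

        String caller = Thread.currentThread().getName();
        List ranOn = Collections.synchronizedList(new ArrayList());
        Runnable task = () -> {
            ranOn.add(Thread.currentThread().getName());
            try { Thread.sleep(100); } catch (InterruptedException e) { }
        };

        pool.execute(task);  // picked up directly by the single worker
        pool.execute(task);  // parked in the one-slot queue
        pool.execute(task);  // rejected, so it runs inline on the caller
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ranOn.contains(caller);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(callerRanATask());  // prints "true"
    }
}
```

Under load, that submitting thread is a request-handling thread, so a saturated executor pushes refill work directly onto the request path.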

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryqj8u319nmvgphqh0g4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryqj8u319nmvgphqh0g4.png" alt="Pool Architecture — Happy Path" width="800" height="927"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This worked beautifully in testing. Sub millisecond ID retrieval. Invisible background replenishment. We shipped it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Production Outage: OOM from a Thundering Herd
&lt;/h2&gt;

&lt;p&gt;Within a week of going live with the large customer migration, the service started crashing with &lt;code&gt;java.lang.OutOfMemoryError: Java heap space&lt;/code&gt;. Repeatedly.&lt;/p&gt;

&lt;p&gt;I pulled a heap dump and started analysing it. The heap was full of hundreds of &lt;code&gt;ArrayList&amp;lt;String&amp;gt;&lt;/code&gt; instances, each holding a thousand generated IDs. They were all alive simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The root cause was a race condition in the replenishment logic.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here’s what happened: during a bulk upload, thousands of concurrent requests hit the service. Multiple requests for the &lt;em&gt;same&lt;/em&gt; &lt;code&gt;(project, docType)&lt;/code&gt; would check the pool level at nearly the same instant, all see "below 75%", and all independently trigger &lt;code&gt;replenishAsync()&lt;/code&gt;. Multiply this across hundreds of &lt;code&gt;(project, docType)&lt;/code&gt; combinations, and you get hundreds of async tasks, each generating and holding a 1000 element list in memory.&lt;/p&gt;

&lt;p&gt;The heap couldn’t take it.&lt;/p&gt;

&lt;p&gt;Counterintuitively, the &lt;code&gt;CallerRunsPolicy&lt;/code&gt; I'd chosen for backpressure made things worse. It was supposed to prevent &lt;code&gt;RejectedExecutionException&lt;/code&gt; when the executor's queue filled up. It did, by making the request-handling threads also run generation tasks, adding even more large lists to the heap. The policy solved thread pool rejection but amplified the memory problem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvsq3s2jx9zty6svwkxm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvsq3s2jx9zty6svwkxm.png" alt="The Thundering Herd" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Fix: Redis Distributed Locks
&lt;/h2&gt;

&lt;p&gt;The problem was clear: nothing prevented duplicate replenishment for the same pool key. I needed a mutex, but a Java &lt;code&gt;ReentrantLock&lt;/code&gt; or &lt;code&gt;synchronized&lt;/code&gt; block only protects a single JVM. Our service runs on multiple instances behind a load balancer. Instance A's lock means nothing to Instance B.&lt;/p&gt;

&lt;p&gt;I needed a distributed lock. Redis gives you one with a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="n"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;PROJ&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;9012&lt;/span&gt; &lt;span class="s2"&gt;"a1b2c3-uuid"&lt;/span&gt; &lt;span class="n"&gt;NX&lt;/span&gt; &lt;span class="n"&gt;EX&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;NX&lt;/code&gt; means "only set if it doesn't exist": an atomic check-and-acquire. &lt;code&gt;EX 120&lt;/code&gt; means "auto-expire after 120 seconds", which prevents deadlocks if the holder crashes.&lt;/p&gt;
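<p>The NX semantics have a familiar single-JVM analogy in ConcurrentHashMap.putIfAbsent (a sketch of the check-and-acquire behavior only: it has no TTL and is not distributed; the class name is illustrative):</p>

```java
import java.util.concurrent.ConcurrentHashMap;

public class SetNxSketch {
    public static void main(String[] args) {
        // putIfAbsent mirrors SET ... NX: only the first writer wins,
        // and the check and the set happen as one atomic step.
        ConcurrentHashMap locks = new ConcurrentHashMap();
        Object first  = locks.putIfAbsent("lock:id-pool:PROJ:9012", "uuid-A");
        Object second = locks.putIfAbsent("lock:id-pool:PROJ:9012", "uuid-B");
        System.out.println(first == null);   // true: A acquired the lock
        System.out.println(second == null);  // false: B saw it already held
    }
}
```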

&lt;h2&gt;
  
  
  The Subtle Bug: Safe Lock Release
&lt;/h2&gt;

&lt;p&gt;My first implementation used a simple &lt;code&gt;DELETE&lt;/code&gt; in the &lt;code&gt;finally&lt;/code&gt; block. This has a nasty race condition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;t&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0s    Instance A acquires lock &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;TTL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;60s&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;t&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;65s   Lock auto-expires &lt;span class="o"&gt;(&lt;/span&gt;A&lt;span class="s1"&gt;'s generation was slow)
t=66s   Instance B acquires the lock
t=70s   Instance A finishes → DELETE → deletes B'&lt;/span&gt;s lock
&lt;span class="nv"&gt;t&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;71s   Instance C sees no lock → acquires → duplicate work
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; store a UUID as the lock value, and use a Lua script for atomic compare-and-delete. An instance only deletes the lock if the value still matches its own UUID.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Lua?&lt;/strong&gt; Redis doesn’t offer a native "&lt;strong&gt;delete if value equals X&lt;/strong&gt;" command. A &lt;code&gt;GET&lt;/code&gt; followed by a conditional &lt;code&gt;DELETE&lt;/code&gt; in application code has a race window: another instance could acquire the lock between the &lt;code&gt;GET&lt;/code&gt; and &lt;code&gt;DELETE&lt;/code&gt;. Lua scripts execute atomically inside Redis, eliminating that gap.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Async&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"idGenExecutor"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;replenishAsync&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;lockKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"lock:id-pool:"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;":"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;lockValue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="c1"&gt;// Atomic acquire: SET NX EX&lt;/span&gt;
    &lt;span class="nc"&gt;Boolean&lt;/span&gt; &lt;span class="n"&gt;acquired&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;opsForValue&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setIfAbsent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lockKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lockValue&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Duration&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofSeconds&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Boolean&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;FALSE&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;acquired&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Someone else is handling it&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Double-check pool level under lock - another instance&lt;/span&gt;
        &lt;span class="c1"&gt;// may have already refilled between our check and acquire&lt;/span&gt;
        &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;currentSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;opsForList&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poolKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currentSize&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;POOL_SIZE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Already refilled&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;newIds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idGenerator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateBatch&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;POOL_SIZE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;currentSize&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;intValue&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;opsForList&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;rightPushAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poolKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newIds&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Lua script: delete ONLY if value matches our UUID&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;lua&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
            &lt;span class="s"&gt;"if redis.call('get',KEYS[1]) == ARGV[1] then "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
            &lt;span class="s"&gt;"  return redis.call('del',KEYS[1]) "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
            &lt;span class="s"&gt;"else return 0 end"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DefaultRedisScript&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;lua&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
            &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lockKey&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="n"&gt;lockValue&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The double-check after acquiring the lock is the distributed equivalent of double-checked locking: between the time I decided to replenish and the time I acquired the lock, another instance may have already done the work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A note on Redisson’s RLock:&lt;/strong&gt; it offers automatic lock renewal via a watchdog thread and reentrancy, which sounds appealing. I chose raw &lt;code&gt;SET NX EX&lt;/code&gt; + Lua because my replenishment task has a predictable duration (200-500ms), so a 120-second TTL gives well over 200x headroom. If generation ever takes longer than that, something is seriously wrong with the id-generator and I want the lock to expire. Adding Redisson for a single lock pattern felt like pulling in a heavy dependency for a problem I didn't have.&lt;/p&gt;
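&lt;p&gt;To make the ownership semantics concrete, here is a minimal Python simulation of the &lt;code&gt;SET NX&lt;/code&gt; acquire and the compare-and-delete release. This is an in-memory stand-in for Redis, not the production Spring code; the TTL is omitted because real Redis handles expiry, and all names are illustrative:&lt;/p&gt;

```python
import uuid


class FakeRedis:
    """In-memory stand-in for the two Redis commands the lock needs."""

    def __init__(self):
        self.store = {}

    def set_nx(self, key, value):
        # SET key value NX: succeeds only if the key is absent.
        # (Real Redis would also take an EX ttl; omitted in this sketch.)
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def compare_and_delete(self, key, value):
        # The Lua script: delete ONLY if we still own the lock
        if self.store.get(key) == value:
            del self.store[key]
            return 1
        return 0


def with_lock(redis, lock_key, work):
    token = str(uuid.uuid4())          # unique per acquisition attempt
    if not redis.set_nx(lock_key, token):
        return False                    # another instance holds the lock; skip
    try:
        work()
    finally:
        # Never a blind DEL: if the TTL expired mid-work and another
        # instance acquired the key, deleting unconditionally would
        # release THEIR lock. The token check prevents that.
        redis.compare_and_delete(lock_key, token)
    return True
```

&lt;p&gt;The token check on release is the whole point: a blind &lt;code&gt;DEL&lt;/code&gt; after TTL expiry would release a lock that now belongs to someone else.&lt;/p&gt;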

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flo7bklnpy00cttn4qjm9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flo7bklnpy00cttn4qjm9.png" alt="Distributed Lock Flow" width="800" height="1229"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After this fix, OOM crashes dropped to zero. Each pool key gets exactly one concurrent replenishment, regardless of how many instances or threads are contending.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem I Didn’t See Coming: Lost IDs
&lt;/h2&gt;

&lt;p&gt;With performance solved, I hit a subtler issue during failure testing.&lt;/p&gt;

&lt;p&gt;When a request pops an ID from the pool but the downstream document persistence fails (network timeout, storage error, app exception), that ID is gone. It was consumed from the pool, never attached to a document, and on retry the client gets a different ID. This creates sequence gaps and, worse, potential duplicates if the original write eventually succeeds after a timeout.&lt;/p&gt;

&lt;p&gt;I considered several approaches: client-supplied idempotency keys (requires client cooperation, which is unreliable with external API consumers), a reservation-commit pattern (extra Redis round-trips per request), and pre-assigned batch reservations (good for migrations, overkill for single uploads).&lt;/p&gt;

&lt;h2&gt;
  
  
  My Choice: The Outbox Pattern
&lt;/h2&gt;

&lt;p&gt;I went with the Outbox pattern because it solves the problem entirely on the server side. No API contract changes. No client cooperation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The idea:&lt;/strong&gt; instead of popping an ID and then separately persisting the document, I pop the ID and write it along with the document metadata to an &lt;strong&gt;outbox table&lt;/strong&gt; in a single database transaction. If the transaction fails, neither the ID assignment nor the record exists. The assignment is atomic.&lt;/p&gt;

&lt;p&gt;A separate background processor handles the actual storage. The client gets their ID back immediately; the real persistence happens seconds later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Transactional&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;assignId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                        &lt;span class="nc"&gt;DocumentMetadata&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;docId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idPoolService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fetchNextId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Single transaction: ID + record exist together or not at all&lt;/span&gt;
    &lt;span class="n"&gt;outboxRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OutboxEntry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;documentId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tenantCode&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;entityTypeId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;payload&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;OutboxStatus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;PENDING&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;retryCount&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;docId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Client gets the ID immediately&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Outbox Processor: Retries, Backoff, and Dead Letters
&lt;/h2&gt;

&lt;p&gt;The processor is a &lt;code&gt;@Scheduled&lt;/code&gt; method that polls the outbox table every 2 seconds, picks up &lt;code&gt;PENDING&lt;/code&gt; entries, and tries to persist each document to the target store.&lt;/p&gt;

&lt;p&gt;On the happy path, it persists the document and marks the entry &lt;code&gt;COMMITTED&lt;/code&gt;. But when storage fails (network timeout, S3 returning a 500, disk full), things get interesting.&lt;/p&gt;

&lt;p&gt;A naive approach retries the entry on the next poll, 2 seconds later. If storage is down, I’m hammering it every 2 seconds and blocking the processor from making progress on other entries.&lt;/p&gt;

&lt;p&gt;I used exponential backoff instead. Each outbox entry has a &lt;code&gt;next_retry_at&lt;/code&gt; timestamp:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;next_retry_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;^&lt;/span&gt; &lt;span class="n"&gt;retry_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Retry 1 waits 2 seconds. Retry 2 waits 4. Retry 3 waits 8. Retry 5 waits 32. Failing entries naturally “sink to the bottom” while fresh entries get processed promptly. The processor’s query becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;id_outbox&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'PENDING'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;next_retry_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a maximum number of retries (I used 5), the entry moves to &lt;code&gt;DEAD_LETTER&lt;/code&gt; status. This means the system has given up on automatic recovery: the document payload might be corrupted, the target storage bucket might not exist for this tenant, or there's a permission issue that no amount of retrying will fix.&lt;/p&gt;

&lt;p&gt;Dead-lettered entries become a to-do list for the ops team: investigate, fix the root cause, and either reprocess manually or mark as abandoned.&lt;/p&gt;

&lt;p&gt;Why not retry forever? A poisoned entry (corrupted payload, invalid schema) will never succeed. Infinite retries waste processing capacity and mask the real bug. The dead letter queue acts as a circuit breaker for individual entries.&lt;/p&gt;
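&lt;p&gt;The retry/backoff/dead-letter state machine fits in a few lines. A Python sketch of one processor pass over a single entry; the real processor is a Java &lt;code&gt;@Scheduled&lt;/code&gt; method working against the table, and the dict shape here is purely illustrative:&lt;/p&gt;

```python
import time

MAX_RETRIES = 5


def process_entry(entry, persist, now=None):
    """One processor pass over an outbox entry (a dict with
    status / retry_count / next_retry_at keys in this sketch)."""
    now = now if now is not None else time.time()
    # Skip entries that are not due yet or already resolved
    if entry["status"] != "PENDING" or entry["next_retry_at"] > now:
        return entry
    try:
        persist(entry)                       # write to the target store
        entry["status"] = "COMMITTED"
    except Exception:
        entry["retry_count"] += 1
        if entry["retry_count"] >= MAX_RETRIES:
            entry["status"] = "DEAD_LETTER"  # hand off to the ops to-do list
        else:
            # Exponential backoff: 2, 4, 8, 16 seconds between attempts
            entry["next_retry_at"] = now + 2 ** entry["retry_count"]
    return entry
```

&lt;p&gt;Failing entries push their own &lt;code&gt;next_retry_at&lt;/code&gt; further out each time, which is exactly the "sink to the bottom" behavior the processor query relies on.&lt;/p&gt;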

&lt;p&gt;The outbox table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;id_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;              &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;document_id&lt;/span&gt;     &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tenant_code&lt;/span&gt;     &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;entity_type_id&lt;/span&gt;  &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt;         &lt;span class="n"&gt;JSONB&lt;/span&gt;       &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt;          &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'PENDING'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retry_count&lt;/span&gt;     &lt;span class="nb"&gt;INT&lt;/span&gt;         &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;next_retry_at&lt;/span&gt;   &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;      &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;processed_at&lt;/span&gt;    &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Partial index: only scans PENDING rows ready for processing&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_outbox_pending&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;id_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;next_retry_at&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'PENDING'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiznj8xx5oyzvxtbzq41z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiznj8xx5oyzvxtbzq41z.png" alt="Outbox Pattern Flow" width="800" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tradeoff is &lt;code&gt;eventual consistency&lt;/code&gt;. The document isn’t in the target store the instant the client gets the ID. There’s a 2–5 second delay while the processor runs. For bulk document uploads, this was perfectly acceptable.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Redis Goes Down
&lt;/h2&gt;

&lt;p&gt;My fallback is straightforward: if LPOP fails with a &lt;code&gt;RedisConnectionFailureException&lt;/code&gt;, fall through to synchronous &lt;code&gt;id-generator&lt;/code&gt; calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redisTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;opsForList&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;leftPop&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;poolKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;RedisConnectionFailureException&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idGenerator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;generateSingle&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;projectCode&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;docTypeId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This re-introduces latency during a Redis outage (sub-ms jumps to ~200ms per ID), but the system stays available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One important detail:&lt;/strong&gt; the &lt;code&gt;id-generator's&lt;/code&gt; sequence counter must live &lt;strong&gt;independently of Redis&lt;/strong&gt;. If the generator also relies on Redis for its counter (&lt;code&gt;INCRBY&lt;/code&gt;), then a Redis outage takes down both the pool and the fallback. I backed the counter with a database sequence so the fallback path has no Redis dependency.&lt;/p&gt;

&lt;p&gt;The other concern is &lt;strong&gt;consistency after Redis recovers&lt;/strong&gt;. During the outage, the sync fallback generates IDs using the database counter. The Redis pool still holds stale pre-generated IDs from before the crash. If you naively resume popping from the pool, you could issue duplicates.&lt;/p&gt;

&lt;p&gt;My approach: on Redis reconnection, &lt;strong&gt;invalidate all pool keys and re-seed from the current database counter value&lt;/strong&gt;. Simple but safe.&lt;/p&gt;
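&lt;p&gt;The re-seed step is simple arithmetic: every fresh pool ID must sit strictly above the database counter, because the sync fallback never issued anything beyond it. A small Python sketch of that invariant; the &lt;code&gt;DOC-&lt;/code&gt; format string is a hypothetical example, not the real ID scheme:&lt;/p&gt;

```python
def reseed_pool(counter_value, pool_size, fmt="DOC-{:06d}"):
    """Build a fresh batch of IDs strictly above the DB counter and
    return the batch plus the new counter high-water mark."""
    start = counter_value + 1
    new_counter = counter_value + pool_size
    ids = [fmt.format(n) for n in range(start, new_counter + 1)]
    return ids, new_counter
```

&lt;p&gt;Because every outage-era ID is at or below the counter, the re-seeded pool cannot collide with anything the fallback handed out.&lt;/p&gt;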

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jrm3se43vz5fetbb1wh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jrm3se43vz5fetbb1wh.png" alt="Redis Recovery" width="638" height="1542"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;Remember the back-of-envelope math? With the pool, a single ID retrieval drops from ~150ms (network hop to id-generator) to ~0.5ms (Redis LPOP). Across 3 app instances handling 20 concurrent requests each, the theoretical completion time for 200,000 documents is about 2 minutes. In practice, accounting for API gateway overhead, outbox commits, and occasional sync fallbacks, the observed results:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39r0z2xqryizowt9ad4g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39r0z2xqryizowt9ad4g.png" alt="Results" width="800" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The gap between theoretical (2 min) and observed (~1 hour) is mainly the outbox processor’s polling interval, document storage latency, and the client’s own upload pacing. The 85% improvement is on the end-to-end migration, not just ID generation, but ID generation was the bottleneck that unlocked everything else.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I’d Improve
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Gap Reconciliation
&lt;/h3&gt;

&lt;p&gt;IDs still get lost. An instance can crash between popping from the pool and writing to the outbox. The outbox processor might exhaust retries and send entries to the dead letter queue. These are rare, a handful per million, but they create sequence gaps that confuse customers expecting contiguous numbering.&lt;/p&gt;

&lt;p&gt;I’d address this with a &lt;strong&gt;periodic reconciliation job&lt;/strong&gt; that compares three sources: the &lt;strong&gt;sequence counter&lt;/strong&gt; (highest ID ever generated), the &lt;strong&gt;document table&lt;/strong&gt; (which IDs have committed documents), and the &lt;strong&gt;outbox table&lt;/strong&gt; (which IDs are still pending or dead-lettered). Everything in the counter range that doesn’t appear in any of these is a gap. The job writes these to a lightweight &lt;code&gt;id_gaps&lt;/code&gt; table with the detection timestamp and inferred reason — &lt;code&gt;outbox_dlq&lt;/code&gt; for dead-lettered entries, &lt;code&gt;untracked_loss&lt;/code&gt; for IDs that never reached the outbox at all.&lt;/p&gt;
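&lt;p&gt;The reconciliation itself boils down to a set comparison. A simplified Python sketch, treating each of the three sources as a set of integer sequence numbers; the real job would query the counter, document, and outbox tables:&lt;/p&gt;

```python
def find_gaps(max_counter, committed, outbox_pending, outbox_dead):
    """Classify every sequence number up to the counter high-water mark.
    Inputs are sets of integers (a simplification of the real tables)."""
    gaps = {}
    for n in range(1, max_counter + 1):
        if n in committed or n in outbox_pending:
            continue                      # accounted for, not a gap
        if n in outbox_dead:
            gaps[n] = "outbox_dlq"        # failed after max retries
        else:
            gaps[n] = "untracked_loss"    # popped but never reached the outbox
    return gaps
```

&lt;p&gt;The resulting map is exactly what would feed the &lt;code&gt;id_gaps&lt;/code&gt; table: each missing number with an inferred reason rather than silence.&lt;/p&gt;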

&lt;p&gt;An endpoint like &lt;code&gt;GET /api/v1/ids/gaps?project=PROJ&amp;amp;type=9012&lt;/code&gt; would let customers distinguish "this ID was skipped due to infrastructure" from "this document is actually missing." Gaps become explained rather than mysterious.&lt;/p&gt;

&lt;p&gt;I’d avoid trying to reclaim and reuse lost IDs. Reuse sounds clean but risks collision with late-arriving writes from the original assignment, the kind of bug that’s nearly impossible to reproduce and debug. Better to waste a few numbers and track them than to reintroduce them into circulation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pre-populate, don’t generate on-the-fly.&lt;/strong&gt; When generation is expensive or serial, trade storage for latency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Watermark replenishment, not reactive.&lt;/strong&gt; Refill at 75% consumed, not when empty. No request should ever wait on generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Distributed systems need distributed locks.&lt;/strong&gt; Java’s &lt;code&gt;ReentrantLock&lt;/code&gt; doesn't cross JVM boundaries. Redis &lt;code&gt;SET NX EX&lt;/code&gt; gives you a lightweight mutex.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lock ownership matters.&lt;/strong&gt; UUID lock values + Lua atomic compare-and-delete. Never blindly &lt;code&gt;DELETE&lt;/code&gt; a lock key.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Double-check after acquiring the lock.&lt;/strong&gt; The distributed equivalent of double-checked locking prevents redundant work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Solve idempotency server-side.&lt;/strong&gt; The Outbox pattern gives you atomicity without burdening API clients. Exponential backoff and dead letter queues turn “retry until it works” into something manageable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don’t reuse lost IDs.&lt;/strong&gt; Track gaps, explain them, move on. Reuse introduces collision risks that are far worse than a missing number.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>java</category>
      <category>redis</category>
      <category>software</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Discussion on personal developer websites with large padding on desktops</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Mon, 03 Jun 2019 04:16:01 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/discussion-on-personal-developer-websites-with-large-padding-on-desktops-1i5g</link>
      <guid>https://dev.to/chkrishnatej/discussion-on-personal-developer-websites-with-large-padding-on-desktops-1i5g</guid>
      <description>&lt;p&gt;Hi all,&lt;/p&gt;

&lt;p&gt;I was going through a lot of developers' personal sites. In most of them, I observed that when browsing the site on a desktop it has a lot of padding on both sides, so much that even though I am browsing on a desktop, I feel like I am reading it on a mobile device.&lt;/p&gt;

&lt;p&gt;I understand that the web now largely follows a mobile-first approach. But I wanted to understand whether this is a consequence of following the mobile-first approach or an intentional choice.&lt;/p&gt;

&lt;p&gt;If it is intentional, what could be the rationale behind it?&lt;/p&gt;

&lt;p&gt;Please comment below&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Go Lang Installation</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Wed, 17 Apr 2019 01:32:00 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/go-lang-installation-3b32</link>
      <guid>https://dev.to/chkrishnatej/go-lang-installation-3b32</guid>
      <description>&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;




&lt;p&gt;&lt;code&gt;go&lt;/code&gt; supports a wide range of operating systems. There are two ways &lt;code&gt;go&lt;/code&gt; can be installed on the system.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Binaries&lt;/li&gt;
&lt;li&gt;Building it from the source&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I am installing it on &lt;strong&gt;MacBook Pro with macOS Mojave v10.14.4&lt;/strong&gt; for this tutorial using the Mac binaries.&lt;/p&gt;

&lt;p&gt;You could get the download link &lt;a href="https://golang.org/dl/" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Fdownload_page.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Fdownload_page.png" alt="golang download page" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As shown in the image above, select the relevant package according to the operating system. I have downloaded the one for &lt;code&gt;Apple macOS&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download file
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Fdownload_file.png" alt="golang download page" width="800" height="400"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;




&lt;p&gt;The installation is fairly straightforward. You can go with the defaults, as shown below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_step_1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_step_1.png" alt="golang installation step1" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
 &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_step_2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_step_2.png" alt="golang installation step2" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
 &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_step_3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_step_3.png" alt="golang installation step3" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
 &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_success.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Finstallation_success.png" alt="golang installation success" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Verification
&lt;/h2&gt;



&lt;p&gt;To verify that &lt;code&gt;go&lt;/code&gt; has been installed successfully, you can check from the command line.&lt;/p&gt;

&lt;p&gt;Use the following commands in the terminal to verify.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; which go # Returns path where go is installed/usr/local/go/bin/go&amp;gt;&amp;gt;&amp;gt; go version # Returns the version of the go installedgo version go1.12.4 darwin/amd64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Fverify_installation.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fchkrishnatej.com%2Fimages%2Fgolang%2Fverify_installation.png" alt="golang installation success verification" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the terminal output looks like the above, then &lt;code&gt;go&lt;/code&gt; has been installed successfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up workspace
&lt;/h2&gt;




&lt;p&gt;&lt;code&gt;Go&lt;/code&gt; follows a different approach to managing code, which requires setting up a project workspace. This means all &lt;code&gt;go&lt;/code&gt; projects should be developed and maintained inside the defined workspace.&lt;/p&gt;

&lt;h3&gt;
  
  
  Steps to setup the workspace
&lt;/h3&gt;





&lt;p&gt;Open the shell config, which is located in the &lt;code&gt;HOME&lt;/code&gt; directory.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;NOTE:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.bash_profile&lt;/code&gt; or &lt;code&gt;.bashrc&lt;/code&gt; for &lt;em&gt;BASH&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.zshrc&lt;/code&gt; for &lt;em&gt;Oh my Zsh!&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add the following variables to shell config&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; export GOPATH=$HOME/&amp;lt;go-workspace&amp;gt; # &amp;lt;go-workspace&amp;gt; is a filler. Fill it with proper path&amp;gt;&amp;gt;&amp;gt; export PATH=$PATH:$GOPATH/bin&amp;gt;&amp;gt;&amp;gt; export GOROOT=/usr/local/opt/go/libexec&amp;gt;&amp;gt;&amp;gt; export PATH=$PATH:$GOROOT/bin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;$GOPATH/src&lt;/code&gt; : source code of Go projects&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$GOPATH/pkg&lt;/code&gt; : compiled objects of imported packages&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$GOPATH/bin&lt;/code&gt; : home of the compiled binaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This completes the setup of &lt;code&gt;go&lt;/code&gt;, and the system is ready for some &lt;strong&gt;code&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>go</category>
    </item>
    <item>
      <title>GAE - An opinionated post</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Wed, 22 Aug 2018 13:45:00 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/gae---an-opinionated-post-27ng</link>
      <guid>https://dev.to/chkrishnatej/gae---an-opinionated-post-27ng</guid>
      <description>&lt;p&gt;I have been looking to build a web-app and host it. My options were&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To take a server and deploy it using IaaS like Azure, AWS or GCP&lt;/li&gt;
&lt;li&gt;Deploy it using PaaS like Heroku&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Honestly speaking, I didn't know of any PaaS for deploying Python apps other than Heroku. Before investing my time in understanding the service, I wanted to see whether any other alternatives existed.&lt;/p&gt;

&lt;p&gt;My idea was to understand the different choices I had and learn their pros and cons before committing (never test the depth of a river with both feet).&lt;/p&gt;

&lt;p&gt;My criteria for making the choice were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simplicity&lt;/li&gt;
&lt;li&gt;Focus more on code than deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The choice was &lt;strong&gt;Google App Engine&lt;/strong&gt;. Even though it launched around the same time as Heroku, it didn't do well in the beginning. It was notorious for being hard to use, and it did suck.&lt;/p&gt;

&lt;p&gt;But now, things have changed. It has become a mature product with an excellent CLI. It really lives up to its claim, &lt;em&gt;just focus on your code&lt;/em&gt;: Google takes responsibility for deploying it with minimal setup from the developer.&lt;/p&gt;

&lt;p&gt;It supports both Python 2.7 and 3.x, and has excellent documentation to get you started. You can build web applications and mobile backends with it.&lt;/p&gt;

&lt;p&gt;Google App Engine (GAE) is now part of Google Cloud Platform (GCP). GAE's pricing is simple and more economical than its counterparts' (cloud pricing would be a whole other post). It has free daily quotas, which are more than enough for tinkering with, building, and testing your personal applications. And of course it's scalable; I have complete faith in Google's capabilities when it comes to scalability.&lt;/p&gt;

&lt;p&gt;And there are many other services offered by GCP such as storage solutions, machine learning solutions and much more. All of them fit together beautifully.&lt;/p&gt;

&lt;p&gt;So far it has been a great fit for building my pet projects, and I haven't found any drawbacks as long as I stay inside the Google ecosystem. I believe Google will be a force to reckon with in the cloud space, thanks to its rich cloud ecosystem and very affordable pricing strategy.&lt;/p&gt;

&lt;p&gt;Please comment about your experiences with GAE below.&lt;/p&gt;

</description>
      <category>gae</category>
      <category>cloud</category>
      <category>web</category>
      <category>python</category>
    </item>
    <item>
      <title>Is it normal that devs get stuck to figure out a solution while using external libraries?</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Mon, 09 Jul 2018 05:07:41 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/is-it-normal-that-devs-get-stuck-to-figure-out-a-solution-while-using-external-libraries-23b9</link>
      <guid>https://dev.to/chkrishnatej/is-it-normal-that-devs-get-stuck-to-figure-out-a-solution-while-using-external-libraries-23b9</guid>
      <description>&lt;p&gt;I had this lingering doubt from so long. &lt;/p&gt;

&lt;p&gt;Whenever a problem comes up while developing, I can design a solution, or at least a starting point for solving it. But many times I get stuck figuring out a solution using a library we already use in the project.&lt;/p&gt;

&lt;p&gt;Figuring out a solution with the library takes much more time than writing my own. The downside of my own solution is that I have to write a whole method for functionality the library already provides.&lt;/p&gt;

&lt;p&gt;I often wonder whether this means I am slow at adopting solutions from external libraries. It affects my productivity a lot, as I tend to get stuck on trivial issues for hours.&lt;/p&gt;

&lt;p&gt;Please share your advice and experiences regarding this.&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>productivity</category>
      <category>advice</category>
    </item>
    <item>
      <title>Bulma Showcase Page</title>
      <dc:creator>Krishna Tej Chalamalasetty</dc:creator>
      <pubDate>Wed, 04 Jul 2018 17:33:30 +0000</pubDate>
      <link>https://dev.to/chkrishnatej/bulma-showcase-page-2hd8</link>
      <guid>https://dev.to/chkrishnatej/bulma-showcase-page-2hd8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fha94xwam15k0ukujemg9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fha94xwam15k0ukujemg9.png" alt="Screenshot of Bulma Showcase page" width="800" height="1582"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Dev showcase page with Bulma CSS
&lt;/h1&gt;

&lt;p&gt;A &lt;a href="https://bulmadev.chkrishnatej.com/" rel="noopener noreferrer"&gt;developer showcase page&lt;/a&gt; with their projects and blog.&lt;/p&gt;

&lt;p&gt;Using &lt;a href="https://bulma.io" rel="noopener noreferrer"&gt;Bulma CSS&lt;/a&gt; the showcase page has been designed with vibrant colors. The tiles and cards showcase your work and blog posts.&lt;/p&gt;

&lt;h2&gt;
  
  
  How is it built?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The whole project is built with Bulma CSS and a small piece of JavaScript for the responsive navigation menu bar.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Small fragments of code for a few features are sourced from other developers, all of whom have been given due credit.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to use it?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Clone the repo using this link&lt;br&gt;
&lt;code&gt;https://github.com/chkrishnatej/bulma-dev-page.git&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Update the names and links appropriately.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Test it on your local machine.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deploy it to a server.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Github Pages
&lt;/h2&gt;

&lt;p&gt;GitHub Pages lets you host your static sites for free. To set up GitHub Pages, please follow this &lt;a href="https://pages.github.com/" rel="noopener noreferrer"&gt;link&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can even set up a custom domain for your GitHub Pages site. Follow the guide to &lt;a href="https://help.github.com/articles/using-a-custom-domain-with-github-pages/" rel="noopener noreferrer"&gt;custom domains in GitHub Pages&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;P.S.:&lt;/strong&gt;&lt;br&gt;
All ears for ideas, suggestions, or changes for the page. Please leave your feedback in the comments.&lt;/p&gt;

</description>
      <category>dev</category>
      <category>showcase</category>
      <category>html</category>
    </item>
  </channel>
</rss>
