<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nema Chandra Goswami</title>
    <description>The latest articles on DEV Community by Nema Chandra Goswami (@nym).</description>
    <link>https://dev.to/nym</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3822115%2Fd175b20c-284e-40a7-9264-67f9d4b7cb13.png</url>
      <title>DEV Community: Nema Chandra Goswami</title>
      <link>https://dev.to/nym</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nym"/>
    <language>en</language>
    <item>
      <title>The 18 Minutes That Tested Me</title>
      <dc:creator>Nema Chandra Goswami</dc:creator>
      <pubDate>Tue, 17 Mar 2026 06:18:30 +0000</pubDate>
      <link>https://dev.to/nym/the-18-minutes-that-tested-me-4ac2</link>
      <guid>https://dev.to/nym/the-18-minutes-that-tested-me-4ac2</guid>
      <description>&lt;p&gt;Monday. 10:15 AM. Coffee in hand.&lt;/p&gt;

&lt;p&gt;Life was good.&lt;/p&gt;

&lt;p&gt;Then my phone buzzed.&lt;/p&gt;

&lt;p&gt;It was Jaya from Sales.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“The client portal is down. Customers are calling. What’s happening?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That one sentence changes the atmosphere in seconds.&lt;/p&gt;

&lt;p&gt;I opened the production URL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;502 Bad Gateway.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Silence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Situation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our platform was running on Amazon EC2, served through Nginx.&lt;/p&gt;

&lt;p&gt;It had been stable for months.&lt;/p&gt;

&lt;p&gt;No recent risky deployments. No alerts overnight.&lt;/p&gt;

&lt;p&gt;So why now?&lt;/p&gt;



&lt;p&gt;I immediately looped in:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;My manager, Aman&lt;br&gt;
Jaya from Sales&lt;br&gt;
Backend &amp;amp; DevOps team&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Aman asked the question every leader asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What’s the impact?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Jaya didn’t sugarcoat it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Two new companies were onboarded recently.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;Pressure level? Maximum.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Investigation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;✔ EC2 instance: Running&lt;br&gt;
✔ CPU: Normal&lt;br&gt;
✔ Memory: Stable&lt;br&gt;
✔ Nginx: Active&lt;/p&gt;

&lt;p&gt;Everything looked… fine.&lt;/p&gt;

&lt;p&gt;But production doesn’t lie.&lt;/p&gt;

&lt;p&gt;We checked application logs.&lt;/p&gt;



&lt;p&gt;And there it was.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Database connection failures...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At the same time, Sales had launched a marketing campaign that morning. Traffic spiked. Our connection pool maxed out.&lt;/p&gt;

&lt;p&gt;Success… broke the system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Decision&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We had two options:&lt;/p&gt;

&lt;p&gt;1. Restart everything and pray.&lt;br&gt;
2. Fix it properly.&lt;/p&gt;

&lt;p&gt;Aman asked calmly,&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What do you suggest?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In moments like this, you don’t just answer — you own it.&lt;/p&gt;

&lt;p&gt;I said:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Increase DB connection pool

Vertically scale the EC2 instance

Restart services in sequence

Add monitoring immediately

Plan auto-scaling after recovery
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;We executed.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Clock ticking...&lt;/p&gt;

&lt;p&gt;8 minutes. 12 minutes. 15 minutes.&lt;br&gt;
At 18 minutes...&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;System stable. Portal loading. The new companies were open for business.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Hidden Battle: Communication&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While engineers fixed the backend, I stayed connected with Jaya.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Not panic.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Not excuses.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Just clarity.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“We are experiencing high traffic and scaling infrastructure. ETA: 15 minutes.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That message saved trust.&lt;/p&gt;

&lt;p&gt;Because outages hurt systems.&lt;/p&gt;

&lt;p&gt;But silence hurts relationships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What It Taught Me&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Systems fail, traffic surprises you, and infrastructure has limits.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;But leadership is tested in:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Decision speed &lt;br&gt;
• Clear communication &lt;br&gt;
• Confidence under pressure&lt;/p&gt;

&lt;p&gt;Later, Aman told me:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“You handled it well.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And that meant more than fixing the outage.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
