<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AlertSleep</title>
    <description>The latest articles on DEV Community by AlertSleep (@alertsleep).</description>
    <link>https://dev.to/alertsleep</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3874455%2Fb8fbf33d-5c07-4f7e-a7ee-13beaa969d84.png</url>
      <title>DEV Community: AlertSleep</title>
      <link>https://dev.to/alertsleep</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alertsleep"/>
    <language>en</language>
    <item>
      <title>How We Handle SSL Certificate Expiration Alerts at Scale</title>
      <dc:creator>AlertSleep</dc:creator>
      <pubDate>Fri, 17 Apr 2026 07:10:00 +0000</pubDate>
      <link>https://dev.to/alertsleep/how-we-handle-ssl-certificate-expiration-alerts-at-scale-1lmd</link>
      <guid>https://dev.to/alertsleep/how-we-handle-ssl-certificate-expiration-alerts-at-scale-1lmd</guid>
      <description>&lt;p&gt;It was a Tuesday morning in June 2021. LinkedIn — a platform used daily by hundreds of millions of professionals — went partially down. Not because of a DDoS attack, a bad deploy, or a database failure. Their SSL certificate had expired.&lt;/p&gt;

&lt;p&gt;The issue was resolved within hours, but the damage was done: broken links, frustrated users, and a very public reminder that one of the most preventable failures in infrastructure still catches well-resourced engineering teams off guard. LinkedIn was not alone. Microsoft Teams suffered a similar SSL expiry incident in 2020. Spotify has had certificate-related hiccups. Even government sites regularly show up in breach reports because of expired certs.&lt;/p&gt;

&lt;p&gt;If it can happen to them, it can happen to you.&lt;/p&gt;




&lt;h2&gt;
  
  
  What SSL Certificates Actually Are (and Why They Expire)
&lt;/h2&gt;

&lt;p&gt;An SSL/TLS certificate is a cryptographically signed document that proves your server is who it says it is. It binds your domain name to a public key, and a trusted Certificate Authority (CA) vouches for that binding.&lt;/p&gt;

&lt;p&gt;There are three main validation levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DV (Domain Validation)&lt;/strong&gt; — Cheapest and fastest. The CA only verifies that you control the domain. Used by most personal sites and small services. 90-day Let's Encrypt certs fall here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OV (Organization Validation)&lt;/strong&gt; — The CA also verifies the organization's legal existence. Common for company sites.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EV (Extended Validation)&lt;/strong&gt; — Strictest vetting. Used by banks and payment platforms.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Historically, SSL certificates were issued for 1–2 year terms. Let's Encrypt, which began issuing certificates in 2015, popularized 90-day certificates, arguing that shorter lifespans reduce the damage window if a certificate is compromised. Then, in 2020, Apple, Google, and Mozilla enforced a hard cap of 398 days for certificates trusted in their browsers.&lt;/p&gt;

&lt;p&gt;The result: &lt;strong&gt;certificates expire faster than ever&lt;/strong&gt;, and the margin for error is shrinking.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Manual Tracking Fails
&lt;/h2&gt;

&lt;p&gt;When a team has two or three certificates, the spreadsheet approach works fine. Someone adds a row, sets a calendar reminder, done.&lt;/p&gt;

&lt;p&gt;Then the company grows. Suddenly you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A wildcard cert for &lt;code&gt;*.yourdomain.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;A separate cert for &lt;code&gt;api.yourdomain.com&lt;/code&gt; managed by a different team&lt;/li&gt;
&lt;li&gt;A staging cert someone set up and forgot about&lt;/li&gt;
&lt;li&gt;A cert for a third-party integration endpoint you technically own&lt;/li&gt;
&lt;li&gt;A Let's Encrypt auto-renewal that "should be working" but that nobody has verified in six months&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The spreadsheet becomes stale. Calendar reminders get snoozed. The person who set up the cert leaves the company. Auto-renewal fails silently because the DNS challenge no longer resolves correctly after a migration.&lt;/p&gt;

&lt;p&gt;This is not a people problem. It is a systems problem. Manual tracking does not scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Alert Timeline That Actually Works
&lt;/h2&gt;

&lt;p&gt;After dealing with enough SSL-related incidents, the SRE community has largely converged on a tiered alerting strategy:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Days Until Expiry&lt;/th&gt;
&lt;th&gt;Alert Type&lt;/th&gt;
&lt;th&gt;Who Gets Notified&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;60 days&lt;/td&gt;
&lt;td&gt;Awareness ping&lt;/td&gt;
&lt;td&gt;Primary engineer / infra team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30 days&lt;/td&gt;
&lt;td&gt;Action required&lt;/td&gt;
&lt;td&gt;Team lead + primary engineer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14 days&lt;/td&gt;
&lt;td&gt;Escalation&lt;/td&gt;
&lt;td&gt;Manager + entire team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7 days&lt;/td&gt;
&lt;td&gt;All-hands&lt;/td&gt;
&lt;td&gt;Engineering leadership&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1 day&lt;/td&gt;
&lt;td&gt;Emergency&lt;/td&gt;
&lt;td&gt;PagerDuty / on-call rotation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 60-day notification is intentionally low-urgency. It gives the responsible party time to renew without pressure. By the time you hit 7 days, something has already gone wrong in your process — the earlier alerts were missed or ignored. The 1-day alert should be treated like a production incident.&lt;/p&gt;

&lt;p&gt;The key insight: &lt;strong&gt;alert early enough that the first notification is never urgent.&lt;/strong&gt; If your team is routinely panicking at 7 days or fewer, your alert window is too short.&lt;/p&gt;
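
&lt;p&gt;The table above can be sketched as a small lookup in Python. This is a minimal illustration, not a definitive implementation: it assumes each row means "at or below this many days", and the names &lt;code&gt;ALERT_TIERS&lt;/code&gt; and &lt;code&gt;alert_tier&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
from typing import Optional

# Thresholds from the table, ordered from most to least urgent.
# Assumption: a tier applies once days remaining is at or below its threshold.
ALERT_TIERS = [
    (1, "Emergency", "PagerDuty / on-call rotation"),
    (7, "All-hands", "Engineering leadership"),
    (14, "Escalation", "Manager + entire team"),
    (30, "Action required", "Team lead + primary engineer"),
    (60, "Awareness ping", "Primary engineer / infra team"),
]


def alert_tier(days_remaining: int) -> Optional[tuple]:
    """Return (alert_type, audience) for the most urgent tier reached, or None."""
    for threshold, alert_type, audience in ALERT_TIERS:
        # threshold >= days_remaining means the cert has crossed this tier.
        if threshold >= days_remaining:
            return (alert_type, audience)
    return None  # more than 60 days left: nothing to send


for days in (90, 45, 7, 0):
    print(days, alert_tier(days))
```

&lt;p&gt;Checking the most urgent tier first means a certificate with 7 days left maps to "All-hands" rather than to every tier it has already passed.&lt;/p&gt;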




&lt;h2&gt;
  
  
  Checking SSL Expiry: Code Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Using openssl CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check expiry date for a domain&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; | openssl s_client &lt;span class="nt"&gt;-connect&lt;/span&gt; yourdomain.com:443 &lt;span class="nt"&gt;-servername&lt;/span&gt; yourdomain.com 2&amp;gt;/dev/null &lt;span class="se"&gt;\&lt;/span&gt;
  | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-dates&lt;/span&gt;

&lt;span class="c"&gt;# Output:&lt;/span&gt;
&lt;span class="c"&gt;# notBefore=Jan  1 00:00:00 2025 GMT&lt;/span&gt;
&lt;span class="c"&gt;# notAfter=Mar 31 23:59:59 2025 GMT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To get the number of days remaining:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;EXPIRY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; | openssl s_client &lt;span class="nt"&gt;-connect&lt;/span&gt; yourdomain.com:443 &lt;span class="nt"&gt;-servername&lt;/span&gt; yourdomain.com 2&amp;gt;/dev/null &lt;span class="se"&gt;\&lt;/span&gt;
  | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-enddate&lt;/span&gt; | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nt"&gt;-f2&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;EXPIRY_EPOCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXPIRY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; +%s 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-jf&lt;/span&gt; &lt;span class="s2"&gt;"%b %d %T %Y %Z"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXPIRY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;NOW_EPOCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;DAYS_LEFT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;EXPIRY_EPOCH &lt;span class="o"&gt;-&lt;/span&gt; NOW_EPOCH&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;86400&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Days until expiry: &lt;/span&gt;&lt;span class="nv"&gt;$DAYS_LEFT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using Node.js
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tls&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkSSLExpiry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;socket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;servername&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hostname&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getPeerCertificate&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expiryDate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cert&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;valid_to&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;daysRemaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;expiryDate&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

      &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expiryDate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;daysRemaining&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;checkSSLExpiry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;yourdomain.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;daysRemaining&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; days remaining`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;daysRemaining&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CRITICAL: Certificate expires in 7 days or less!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;daysRemaining&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;WARNING: Certificate expires soon.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ssl&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timezone&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_ssl_expiry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_default_context&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_connection&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wrap_socket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_hostname&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ssock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ssock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getpeercert&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;expiry_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cert&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;notAfter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;expiry_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strptime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expiry_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%b %d %H:%M:%S %Y %Z&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tzinfo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;days_remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expiry_date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tz&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hostname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;days_remaining&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;days_remaining&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;check_ssl_expiry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;yourdomain.com&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hostname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;days_remaining&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; days remaining&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Automating Checks with a Cron Job
&lt;/h2&gt;

&lt;p&gt;A simple cron-based approach for teams managing a small number of domains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# /usr/local/bin/check-ssl-certs.sh&lt;/span&gt;

&lt;span class="nv"&gt;DOMAINS&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"yourdomain.com"&lt;/span&gt; &lt;span class="s2"&gt;"api.yourdomain.com"&lt;/span&gt; &lt;span class="s2"&gt;"dashboard.yourdomain.com"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;ALERT_EMAIL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"infra-team@yourcompany.com"&lt;/span&gt;
&lt;span class="nv"&gt;WARN_DAYS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;30

&lt;span class="k"&gt;for &lt;/span&gt;DOMAIN &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAINS&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;EXPIRY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; | openssl s_client &lt;span class="nt"&gt;-connect&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:443"&lt;/span&gt; &lt;span class="nt"&gt;-servername&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="se"&gt;\&lt;/span&gt;
    | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-enddate&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nt"&gt;-f2&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXPIRY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"ERROR: Could not retrieve cert for &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
      | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"SSL Check Failed: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ALERT_EMAIL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;continue
  fi

  &lt;/span&gt;&lt;span class="nv"&gt;DAYS_LEFT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXPIRY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;86400&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DAYS_LEFT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-le&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WARN_DAYS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SSL cert for &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; expires in &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DAYS_LEFT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; days (&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;EXPIRY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
      | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"SSL Warning: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DOMAIN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; expires in &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;DAYS_LEFT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; days"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$ALERT_EMAIL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;fi
done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add to crontab to run daily at 8 AM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;0 8 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /usr/local/bin/check-ssl-certs.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gets you to a functional baseline. The limitation: it only works when your cron runner is healthy, and it has no concept of alert escalation or historical tracking.&lt;/p&gt;




&lt;h2&gt;
  
  
  External Monitoring as a Safety Net
&lt;/h2&gt;

&lt;p&gt;Self-hosted cron jobs are a good first layer. They are not sufficient on their own. The machine running your cron job could be the same machine whose cert expires. Or the job runs but silently fails because your SMTP relay is down.&lt;/p&gt;

&lt;p&gt;External monitoring services check your SSL certificates from outside your infrastructure, on a schedule, and alert you through independent channels (email, Slack, PagerDuty, SMS). This separation is the point — if your infrastructure has a problem, you still get notified.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://alertsleep.com/features/ssl-monitoring" rel="noopener noreferrer"&gt;AlertSleep&lt;/a&gt; is one example: it monitors SSL certificates continuously, tracks expiry dates across all your domains, and fires alerts at configurable thresholds — without requiring you to manage any infrastructure for the monitoring itself. For teams that want visibility without operational overhead, this kind of external check is a meaningful complement to internal automation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Managing SSL at Scale: 50+ Certificates
&lt;/h2&gt;

&lt;p&gt;When you cross the threshold of managing 50 or more certificates, new problems emerge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build a certificate inventory.&lt;/strong&gt; Know which cert covers which domain, when it was issued, when it expires, who owns renewal, and whether it auto-renews. A simple internal wiki page is better than nothing. A proper certificate management tool is better still.&lt;/p&gt;
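
&lt;p&gt;As a rough sketch of what one inventory row might hold, the Python below models the fields listed above and pulls out the certificates that are coming due. The &lt;code&gt;CertRecord&lt;/code&gt; class, the &lt;code&gt;expiring_within&lt;/code&gt; helper, the owner names, and the dates are all hypothetical and chosen only for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CertRecord:
    """One row of a hypothetical certificate inventory."""
    domain: str
    issued: date
    expires: date
    owner: str
    auto_renews: bool


def expiring_within(inventory, days, today):
    """Return records expiring in `days` days or fewer, soonest first."""
    due = [r for r in inventory if days >= (r.expires - today).days]
    return sorted(due, key=lambda r: r.expires)


# Illustrative data only; real rows would come from your own records.
inventory = [
    CertRecord("*.yourdomain.com", date(2024, 6, 1), date(2025, 6, 30), "infra-team", False),
    CertRecord("api.yourdomain.com", date(2025, 1, 1), date(2025, 3, 31), "platform-team", True),
]

for record in expiring_within(inventory, days=30, today=date(2025, 3, 10)):
    print(record.domain, record.expires, record.owner)
```

&lt;p&gt;Passing &lt;code&gt;today&lt;/code&gt; in explicitly keeps the helper easy to test; a scheduled job would pass &lt;code&gt;date.today()&lt;/code&gt;.&lt;/p&gt;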

&lt;p&gt;&lt;strong&gt;Wildcard certificates need special attention.&lt;/strong&gt; A &lt;code&gt;*.yourdomain.com&lt;/code&gt; wildcard might cover dozens of subdomains. If it expires, all of them break simultaneously. The blast radius of a wildcard expiry is much larger than that of a single-domain cert.&lt;/p&gt;
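
&lt;p&gt;To gauge that blast radius, it can help to list which known hostnames a wildcard actually covers. The Python sketch below assumes the usual matching rule, where &lt;code&gt;*&lt;/code&gt; stands in for exactly one left-most label; &lt;code&gt;covered_by&lt;/code&gt; and the sample hostnames are hypothetical.&lt;/p&gt;

```python
def covered_by(pattern: str, hostname: str) -> bool:
    """Return True if `hostname` matches a certificate name such as '*.yourdomain.com'."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if not pattern.startswith("*."):
        # Not a wildcard: only an exact match counts.
        return pattern == hostname
    suffix = pattern[1:]  # e.g. ".yourdomain.com"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]
    # The wildcard replaces one non-empty label, so it cannot contain a dot.
    return label != "" and "." not in label


hosts = [
    "api.yourdomain.com",
    "yourdomain.com",
    "eu.api.yourdomain.com",
    "dashboard.yourdomain.com",
]
affected = [h for h in hosts if covered_by("*.yourdomain.com", h)]
print(affected)
```

&lt;p&gt;Under that rule the bare domain and deeper subdomains are not covered by the wildcard, so they need their own entries in the inventory.&lt;/p&gt;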

&lt;p&gt;&lt;strong&gt;Treat auto-renewal as a process, not a guarantee.&lt;/strong&gt; Let's Encrypt auto-renewal via certbot or ACME clients is reliable under normal conditions. It fails when DNS records change, when ports 80/443 are firewalled during the renewal window, or when the renewal configuration drifts after infrastructure changes. Verify that auto-renewal is actually succeeding, not just scheduled.&lt;/p&gt;
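
&lt;p&gt;One hedged way to verify this from the outside is to look at the certificate actually being served: if auto-renewal is healthy, the days remaining should never fall far below the renewal point. The Python sketch below assumes your ACME client is configured to renew about 30 days before expiry and allows a few days of grace; both values, and the name &lt;code&gt;renewal_looks_stalled&lt;/code&gt;, are assumptions to adjust for your setup.&lt;/p&gt;

```python
def renewal_looks_stalled(days_remaining: int,
                          renew_before_days: int = 30,
                          grace_days: int = 3) -> bool:
    """True when an auto-renewed cert is still served well past its renewal point.

    renew_before_days and grace_days are assumed values, not client defaults.
    """
    return (renew_before_days - grace_days) > days_remaining


for days in (45, 28, 20):
    status = "STALLED" if renewal_looks_stalled(days) else "ok"
    print(f"{days} days remaining: {status}")
```

&lt;p&gt;Feeding this the days-remaining value from any of the checks shown earlier turns "renewal is scheduled" into "renewal demonstrably happened".&lt;/p&gt;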

&lt;p&gt;&lt;strong&gt;Use centralized alerting.&lt;/strong&gt; Sending expiry alerts directly to individual engineers does not work at scale. Route all SSL alerts to a shared channel (Slack &lt;code&gt;#infra-alerts&lt;/code&gt;) and a ticketing system. Coverage should not depend on any single person being available.&lt;/p&gt;
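
&lt;p&gt;The inventory and tiered-alert ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a production monitor: the threshold values and the function names are assumptions made up for the example, while &lt;code&gt;ssl.cert_time_to_seconds&lt;/code&gt; and &lt;code&gt;getpeercert()&lt;/code&gt; come from the standard library.&lt;/p&gt;

```python
import socket
import ssl
from datetime import datetime, timezone

# Hypothetical alert tiers, in days before expiry (an assumption; tune to taste).
THRESHOLDS = [(7, "critical"), (14, "high"), (30, "medium"), (60, "low")]


def days_until_expiry(not_after, now=None):
    """Whole days left for a cert whose notAfter looks like 'Jun  1 12:00:00 2026 GMT'."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def severity(days_left):
    """Most urgent tier whose window the certificate has entered, else None."""
    for limit, label in THRESHOLDS:
        if limit >= days_left:
            return label
    return None


def fetch_not_after(host, port=443, timeout=5.0):
    """Read notAfter from the certificate a live server presents.

    An already-expired certificate fails verification here and raises
    ssl.SSLCertVerificationError, which is itself worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def sweep(inventory, now=None):
    """inventory maps domain to a notAfter string; returns {domain: (days_left, tier)}."""
    report = {}
    for domain, not_after in inventory.items():
        days = days_until_expiry(not_after, now)
        report[domain] = (days, severity(days))
    return report
```

&lt;p&gt;In practice you would fill the inventory by calling &lt;code&gt;fetch_not_after&lt;/code&gt; for each domain from a machine outside your main infrastructure, then route every entry with a non-empty tier to the shared channel.&lt;/p&gt;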




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;SSL certificate expiration is a solved problem. The tools exist, the alert timelines are well-established, and the failure modes are well-documented. What makes it persistent as an incident cause is the gap between knowing what to do and actually having it in place.&lt;/p&gt;

&lt;p&gt;The LinkedIn outage in 2021 was not a failure of knowledge. It was a failure of process. Somewhere in the chain, a certificate slipped through without the right person getting the right alert at the right time.&lt;/p&gt;

&lt;p&gt;The fix is not complicated: external monitoring as your safety net, tiered alerts with enough lead time to act calmly, and an inventory that does not live in one person's head.&lt;/p&gt;

&lt;p&gt;The goal is to make certificate expiry the most boring part of your infrastructure. An alert fires at 60 days, someone renews, done. No incident, no postmortem, no Tuesday morning scramble.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Setting up SSL monitoring for the first time? &lt;a href="https://alertsleep.com/features/ssl-monitoring" rel="noopener noreferrer"&gt;AlertSleep's SSL monitoring&lt;/a&gt; handles the external check layer and alert routing out of the box — worth a look before you build your own.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>devops</category>
      <category>sre</category>
      <category>security</category>
    </item>
    <item>
      <title>Cron Expression Cheat Sheet: Every Pattern You'll Actually Use</title>
      <dc:creator>AlertSleep</dc:creator>
      <pubDate>Thu, 16 Apr 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/alertsleep/cron-expression-cheat-sheet-every-pattern-youll-actually-use-d0</link>
      <guid>https://dev.to/alertsleep/cron-expression-cheat-sheet-every-pattern-youll-actually-use-d0</guid>
      <description>&lt;p&gt;You've seen it before. You need to schedule a job to run every weekday at 7:30 AM, so you open a tab, search "cron expression weekdays", stare at five cryptic fields, and immediately second-guess yourself.&lt;/p&gt;

&lt;p&gt;Is it &lt;code&gt;30 7 * * 1-5&lt;/code&gt; or &lt;code&gt;30 7 * * MON-FRI&lt;/code&gt;? Does &lt;code&gt;*/15&lt;/code&gt; mean every 15 minutes starting from zero, or every 15th minute? And why does the internet have seventeen different answers?&lt;/p&gt;

&lt;p&gt;Cron expressions are one of those things that look like line noise the first hundred times and suddenly click on the hundred and first. This article skips the theory lecture and goes straight to the patterns you'll actually reach for — explained clearly, with copy-paste syntax ready to go.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Five Fields (Say Them Out Loud Once)
&lt;/h2&gt;

&lt;p&gt;Every standard cron expression is exactly five fields separated by spaces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌───────────── minute        (0–59)
│ ┌─────────── hour          (0–23)
│ │ ┌───────── day of month  (1–31)
│ │ │ ┌─────── month         (1–12)
│ │ │ │ ┌───── day of week   (0–7, where 0 and 7 are both Sunday)
│ │ │ │ │
* * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read it left to right: &lt;strong&gt;minute → hour → day → month → weekday&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Drill that order into your head and half the confusion disappears. The other half disappears once you know what the special characters actually do.&lt;/p&gt;




&lt;h2&gt;
  
  
  Special Characters
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Character&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Every possible value&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;* * * * *&lt;/code&gt; = every minute&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;,&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List of values&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;0,30 * * * *&lt;/code&gt; = at :00 and :30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Range of values&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;0 9-17 * * *&lt;/code&gt; = every hour from 9 AM to 5 PM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Step (every N)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;*/15 * * * *&lt;/code&gt; = every 15 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's it. Four characters. Everything else is just combining them.&lt;/p&gt;
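
&lt;p&gt;If it helps to see the four characters as code, here is a rough Python sketch that expands a single cron field into the values it matches. The function name &lt;code&gt;expand_field&lt;/code&gt; is made up for this example, it does no range validation, and it only applies a step after &lt;code&gt;*&lt;/code&gt; or a range.&lt;/p&gt;

```python
def expand_field(field, low, high):
    """Expand one cron field (e.g. '*/15', '9-17', '0,30') into a sorted list of ints.

    low and high are the bounds of the field, such as 0 and 59 for minutes.
    """
    values = set()
    for part in field.split(","):
        # Split off an optional step after '/'; the step defaults to 1.
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            # A bare number matches only itself.
            start = end = int(base)
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("*/15", 0, 59))  # [0, 15, 30, 45]
print(expand_field("0,30", 0, 59))  # [0, 30]
print(expand_field("9-17", 0, 23))  # [9, 10, 11, 12, 13, 14, 15, 16, 17]
```

&lt;p&gt;Because steps always count from the bottom of the range, &lt;code&gt;*/15&lt;/code&gt; in the minute field starts at zero on every hour.&lt;/p&gt;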




&lt;h2&gt;
  
  
  The Cheat Sheet
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Every minute
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;* * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You probably won't run jobs this frequently in production, but it's great for testing that your scheduler is alive.&lt;/p&gt;




&lt;h3&gt;
  
  
  Every 5 minutes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;*/5 * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



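&lt;p&gt;The &lt;code&gt;/&lt;/code&gt; means "every N units." So &lt;code&gt;*/5&lt;/code&gt; in the minute field = 0, 5, 10, 15 ... 55. Works the same way in any field, with one catch: the count restarts at the bottom of the range each time, so a step that doesn't divide the range evenly (say &lt;code&gt;*/7&lt;/code&gt;) leaves a short gap at the wrap-around (56, then 0).&lt;/p&gt;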




&lt;h3&gt;
  
  
  Every 15 minutes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;*/15 * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Classic for polling jobs, cache warming, or health checks you want more granular than hourly.&lt;/p&gt;




&lt;h3&gt;
  
  
  Every 30 minutes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;*/30 * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or equivalently: &lt;code&gt;0,30 * * * *&lt;/code&gt;. Both hit :00 and :30 of every hour.&lt;/p&gt;




&lt;h3&gt;
  
  
  Every hour (on the hour)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the &lt;code&gt;0&lt;/code&gt; in the minute field. &lt;code&gt;* * * * *&lt;/code&gt; is every minute. &lt;code&gt;0 * * * *&lt;/code&gt; is every hour at minute zero.&lt;/p&gt;




&lt;h3&gt;
  
  
  Every hour at :30
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;30 * * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful when you want to offset from other hourly jobs to spread load.&lt;/p&gt;




&lt;h3&gt;
  
  
  Every 6 hours
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 */6 * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Runs at midnight, 6 AM, noon, 6 PM.&lt;/p&gt;




&lt;h3&gt;
  
  
  Daily at midnight
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 0 * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most common cron pattern in existence. Daily reports, log rotation, database backups.&lt;/p&gt;




&lt;h3&gt;
  
  
  Daily at 2 AM
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 2 * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;2 AM is a popular "quiet hours" slot. Low user traffic, indexes done rebuilding.&lt;/p&gt;




&lt;h3&gt;
  
  
  Daily at 8:30 AM
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;30 8 * * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Minute field first, then hour. Easy to flip — don't.&lt;/p&gt;




&lt;h3&gt;
  
  
  Weekdays only (Monday–Friday)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 9 * * 1-5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;1-5&lt;/code&gt; in the weekday field means Monday through Friday. Run a standup digest, a daily business report, anything that shouldn't fire on weekends.&lt;/p&gt;




&lt;h3&gt;
  
  
  Weekdays at 9 AM and 5 PM
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 9,17 * * 1-5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combining the comma list and the weekday range. Two fires per day, only on business days.&lt;/p&gt;




&lt;h3&gt;
  
  
  Weekends only
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 10 * * 6,7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



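&lt;p&gt;Saturday and Sunday at 10 AM. Useful for batch jobs you want to avoid during business hours. If your cron rejects &lt;code&gt;7&lt;/code&gt; for Sunday, write &lt;code&gt;0,6&lt;/code&gt; instead.&lt;/p&gt;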




&lt;h3&gt;
  
  
  Every Monday at 9 AM
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 9 * * 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Weekly summaries, reports, cleanup jobs. Day &lt;code&gt;1&lt;/code&gt; = Monday.&lt;/p&gt;




&lt;h3&gt;
  
  
  First day of every month at midnight
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 0 1 * *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Monthly invoicing, reports, archiving. The &lt;code&gt;1&lt;/code&gt; is in the day-of-month field (third position).&lt;/p&gt;




&lt;h3&gt;
  
  
  First Monday of the month
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 9 1-7 * 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;1-7&lt;/code&gt; (first 7 days) combined with &lt;code&gt;1&lt;/code&gt; (Monday) fires when both conditions are met — the first Monday of the month. Verify this with your cron implementation, as behavior can vary.&lt;/p&gt;
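
&lt;p&gt;How the day-of-month and day-of-week fields combine is where implementations differ and where most mistakes happen. Below is a small Python sketch of the Vixie-style rule, in which either field matching is enough once both are restricted. It is handy for checking which days an expression such as &lt;code&gt;0 9 1-7 * 1&lt;/code&gt; would really hit. The function names are invented for illustration, and &lt;code&gt;dom&lt;/code&gt; and &lt;code&gt;dow&lt;/code&gt; are plain sets rather than parsed cron fields.&lt;/p&gt;

```python
from datetime import date, timedelta


def fires_on(day, dom=None, dow=None):
    """dom and dow are sets of allowed values, or None for '*'. Sunday is 0, as in cron."""
    dom_hit = dom is None or day.day in dom
    # isoweekday() gives Monday=1 .. Sunday=7; modulo 7 maps Sunday to cron's 0.
    dow_hit = dow is None or (day.isoweekday() % 7) in dow
    if dom is None or dow is None:
        # At most one day field is restricted, so that field alone decides.
        return dom_hit and dow_hit
    # Both restricted: Vixie-style cron fires when EITHER field matches.
    return dom_hit or dow_hit


def matching_days(year, month, dom=None, dow=None):
    """Days of the given month on which the job would run."""
    day = date(year, month, 1)
    hits = []
    while day.month == month:
        if fires_on(day, dom, dow):
            hits.append(day.day)
        day += timedelta(days=1)
    return hits


first_week = set(range(1, 8))
# May 2026 starts on a Friday, so its first Monday is the 4th.
print(matching_days(2026, 5, dom=first_week, dow={1}))
# [1, 2, 3, 4, 5, 6, 7, 11, 18, 25]
```

&lt;p&gt;Restricting only the day of month and filtering the weekday inside the command narrows that list down to the single day you wanted.&lt;/p&gt;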




&lt;h3&gt;
  
  
  Specific date once a year (e.g., January 1st)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 0 1 1 *
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Midnight on the 1st of January. Happy New Year, cron job.&lt;/p&gt;




&lt;h2&gt;
  
  
  Day-of-Week Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Day&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0 or 7&lt;/td&gt;
&lt;td&gt;Sunday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Monday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Tuesday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wednesday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Thursday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Friday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Saturday&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;7&lt;/code&gt; are Sunday — a legacy quirk. Use whichever your tool accepts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Month Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Month&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;January&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;February&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;March&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;...&lt;/td&gt;
&lt;td&gt;...&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;December&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Some implementations accept &lt;code&gt;JAN&lt;/code&gt;, &lt;code&gt;FEB&lt;/code&gt;, &lt;code&gt;MON&lt;/code&gt;, &lt;code&gt;TUE&lt;/code&gt;, etc. as aliases. Check your platform's docs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Test: Can You Read These?
&lt;/h2&gt;

&lt;p&gt;Before looking at the answers, try reading each expression aloud:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;*&lt;/span&gt;/10 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;
0 0 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; 0
0 12 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; 1-5
0 0 1,15 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;
30 23 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every 10 minutes&lt;/li&gt;
&lt;li&gt;Every Sunday at midnight&lt;/li&gt;
&lt;li&gt;Noon on weekdays&lt;/li&gt;
&lt;li&gt;Midnight on the 1st and 15th of every month&lt;/li&gt;
&lt;li&gt;11:30 PM every day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you got those, you know cron.&lt;/p&gt;




&lt;h2&gt;
  
  
  If You Don't Want to Memorize All This
&lt;/h2&gt;

&lt;p&gt;The honest truth: nobody remembers every pattern. Even experienced engineers double-check their expressions before deploying a job that runs monthly.&lt;/p&gt;

&lt;p&gt;If you want a visual way to build and validate cron expressions without trial and error, &lt;a href="https://alertsleep.com/tools/cron-generator" rel="noopener noreferrer"&gt;AlertSleep's cron expression generator&lt;/a&gt; lets you select human-readable options (every weekday, first of the month, etc.) and generates the correct syntax automatically. Useful for the edge cases that are easy to get wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick-Reference Summary Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;th&gt;Cron Expression&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Every minute&lt;/td&gt;
&lt;td&gt;&lt;code&gt;* * * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every 5 minutes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;*/5 * * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every 15 minutes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;*/15 * * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every 30 minutes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;*/30 * * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every hour&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 * * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every hour at :30&lt;/td&gt;
&lt;td&gt;&lt;code&gt;30 * * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every 6 hours&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 */6 * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daily at midnight&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 0 * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daily at 2 AM&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 2 * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daily at 8:30 AM&lt;/td&gt;
&lt;td&gt;&lt;code&gt;30 8 * * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weekdays at 9 AM&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 9 * * 1-5&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every Monday at 9 AM&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 9 * * 1&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First of month at midnight&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 0 1 * *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Every Sunday at midnight&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 0 * * 0&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;January 1st at midnight&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 0 1 1 *&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weekdays at 9 AM and 5 PM&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0 9,17 * * 1-5&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  One Last Thing
&lt;/h2&gt;

&lt;p&gt;When you deploy a cron job, always verify the timezone your scheduler uses. A job set to &lt;code&gt;0 2 * * *&lt;/code&gt; on a UTC server fires at 2 AM UTC — which might be 9 PM, 6 AM, or some other local time depending on where your users are. Always check. Always document it in a comment next to your cron expression.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Runs daily at 2 AM UTC (10 PM ET / 7 PM PT)&lt;/span&gt;
0 2 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /usr/local/bin/run-backup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Future-you will thank present-you.&lt;/p&gt;
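
&lt;p&gt;To double-check the local time before writing that comment, a quick conversion helps. The sketch below uses fixed UTC offsets so it runs anywhere; the helper name &lt;code&gt;local_fire_time&lt;/code&gt; and the two dates are assumptions chosen for the example, and real code would more likely use named zones so daylight-saving changes are handled for you.&lt;/p&gt;

```python
from datetime import date, datetime, timedelta, timezone

# Fixed UTC offsets keep this sketch dependency-free. Real code would more likely
# use the standard zoneinfo module so daylight-saving switches are automatic.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")


def local_fire_time(hour_utc, minute_utc, tz, on_date):
    """Return when a job scheduled at hour:minute UTC fires, expressed in tz."""
    fire_utc = datetime(on_date.year, on_date.month, on_date.day,
                        hour_utc, minute_utc, tzinfo=timezone.utc)
    return fire_utc.astimezone(tz)


summer = local_fire_time(2, 0, EDT, date(2026, 7, 1))
winter = local_fire_time(2, 0, EST, date(2026, 1, 15))
print(summer.strftime("%Y-%m-%d %H:%M %Z"))  # 2026-06-30 22:00 EDT
print(winter.strftime("%Y-%m-%d %H:%M %Z"))  # 2026-01-14 21:00 EST
```

&lt;p&gt;Note that both results land on the previous calendar day, which matters for anything that stamps its output with a date.&lt;/p&gt;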




&lt;p&gt;&lt;em&gt;Found this useful? Save the summary table and never Google "cron expression weekdays" again.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>devops</category>
      <category>beginners</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Building a Status Page From Scratch vs Using a Service: A Cost Analysis</title>
      <dc:creator>AlertSleep</dc:creator>
      <pubDate>Tue, 14 Apr 2026 12:34:59 +0000</pubDate>
      <link>https://dev.to/alertsleep/building-a-status-page-from-scratch-vs-using-a-service-a-cost-analysis-3ad</link>
      <guid>https://dev.to/alertsleep/building-a-status-page-from-scratch-vs-using-a-service-a-cost-analysis-3ad</guid>
      <description>&lt;p&gt;Your users know your app is down before you do.&lt;/p&gt;

&lt;p&gt;They see the spinning loader, the 502 error, the silence where data should be. And they have nowhere to go for answers. So they flood your support inbox, post on Twitter, and quietly decide to check out your competitor.&lt;/p&gt;

&lt;p&gt;A status page changes that dynamic completely. It's not just a "we're working on it" page — it's a trust instrument. It tells users: &lt;em&gt;we see what you see, we're on it, here's what we know.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;But here's the question every engineering team eventually faces: &lt;strong&gt;do you build it, or do you buy it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me break down the real costs of both.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Status Page Actually Needs to Do
&lt;/h2&gt;

&lt;p&gt;Before comparing options, let's align on minimum viable functionality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Show current status of each component (API, dashboard, payments, etc.)&lt;/li&gt;
&lt;li&gt;Display active incidents with live updates&lt;/li&gt;
&lt;li&gt;Historical uptime data (last 30-90 days)&lt;/li&gt;
&lt;li&gt;Subscriber notifications (email, SMS) when incidents are created or resolved&lt;/li&gt;
&lt;li&gt;Maintenance window announcements&lt;/li&gt;
&lt;li&gt;Public URL that stays up even when your app is down&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is critical and often overlooked: &lt;strong&gt;your status page must be hosted independently from your main infrastructure.&lt;/strong&gt; A status page that goes down with your app is worse than no status page at all.&lt;/p&gt;




&lt;h2&gt;
  
  
  Option A: Build It Yourself
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What you're actually building
&lt;/h3&gt;

&lt;p&gt;Most teams underestimate the scope. A status page isn't a static HTML file — it's a small application:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frontend:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Component status grid with color states (operational / degraded / outage)&lt;/li&gt;
&lt;li&gt;Incident timeline with markdown support&lt;/li&gt;
&lt;li&gt;Uptime history graph (requires storing and querying ping data)&lt;/li&gt;
&lt;li&gt;Subscriber signup form&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Backend:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API to update component status&lt;/li&gt;
&lt;li&gt;Incident management CRUD&lt;/li&gt;
&lt;li&gt;Email/SMS notification system (integrate Mailgun, SendGrid, Twilio)&lt;/li&gt;
&lt;li&gt;Webhook receiver (if you want auto-updates from your monitoring tool)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hosted separately from your main stack (different cloud region, different provider)&lt;/li&gt;
&lt;li&gt;Must stay online during your worst outages&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Realistic time estimate
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Hours&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Basic frontend (React/Vue)&lt;/td&gt;
&lt;td&gt;8–16 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend API&lt;/td&gt;
&lt;td&gt;8–12 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email notifications&lt;/td&gt;
&lt;td&gt;4–6 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMS notifications&lt;/td&gt;
&lt;td&gt;3–5 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Historical uptime graph&lt;/td&gt;
&lt;td&gt;6–10 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Separate hosting setup&lt;/td&gt;
&lt;td&gt;2–4 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing &amp;amp; polish&lt;/td&gt;
&lt;td&gt;4–8 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;35–61 hrs&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At a conservative $75/hr developer rate, that's &lt;strong&gt;$2,600 – $4,600&lt;/strong&gt; before the first user sees it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ongoing costs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;~2–4 hours/month maintenance&lt;/li&gt;
&lt;li&gt;Hosting: $5–20/month (Fly.io, Railway, Render)&lt;/li&gt;
&lt;li&gt;Email service: $0–15/month (SendGrid free tier runs out)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total recurring (hosting and email only, excluding maintenance hours): $60–$420/year&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The hidden cost nobody accounts for
&lt;/h3&gt;

&lt;p&gt;Your status page will have its first real test during your worst incident. When your database is on fire and every engineer is in a war room call, someone also has to update the status page.&lt;/p&gt;

&lt;p&gt;If that status page is your own codebase — with its own deployment pipeline, its own bugs, its own "why isn't the email sending" moments — you've just doubled the cognitive load during the exact moment you can least afford it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Option B: Use a Service
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The main players in 2026
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Free Tier&lt;/th&gt;
&lt;th&gt;Paid Starts At&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Atlassian Statuspage&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;$29/mo&lt;/td&gt;
&lt;td&gt;Industry standard, complex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Better Uptime&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;Good UX, integrated monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instatus&lt;/td&gt;
&lt;td&gt;Yes (limited)&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;Clean, fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AlertSleep&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Paid plans available&lt;/td&gt;
&lt;td&gt;Integrated with uptime monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cachet (self-hosted)&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;Hosting costs&lt;/td&gt;
&lt;td&gt;Open source, DIY maintenance&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  What you get immediately
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Status page live in 10 minutes&lt;/li&gt;
&lt;li&gt;Subscriber management handled&lt;/li&gt;
&lt;li&gt;Hosted on separate, reliable infrastructure&lt;/li&gt;
&lt;li&gt;Incident management UI (no code required)&lt;/li&gt;
&lt;li&gt;Uptime history auto-populated from monitoring checks&lt;/li&gt;
&lt;li&gt;Mobile app for on-call updates&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The real cost comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Build&lt;/th&gt;
&lt;th&gt;Buy (mid-tier)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial setup cost&lt;/td&gt;
&lt;td&gt;$2,600–$4,600&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to launch&lt;/td&gt;
&lt;td&gt;1–2 weeks&lt;/td&gt;
&lt;td&gt;10 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly recurring&lt;/td&gt;
&lt;td&gt;$5–35/mo&lt;/td&gt;
&lt;td&gt;$20–29/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Year 1 total&lt;/td&gt;
&lt;td&gt;$2,660–$5,020&lt;/td&gt;
&lt;td&gt;$240–$350&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Each year after&lt;/td&gt;
&lt;td&gt;$60–$420&lt;/td&gt;
&lt;td&gt;$240–$350&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Break-even&lt;/td&gt;
&lt;td&gt;~Year 10 at best&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On hosting and email costs alone, the build option only becomes cheaper around year 10, and only in the best case: the low end of the build range against the most expensive mid-tier plan. At the high end, where $420/year of recurring costs exceeds the service fee, it never breaks even. Neither case accounts for the ongoing maintenance hours, the engineering time diverted from your core product, or the incidents that went poorly because the status page had a bug.&lt;/p&gt;
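
&lt;p&gt;Break-even figures like this are very sensitive to which end of each range you pick. A back-of-the-envelope check in Python, using only the upfront and recurring ranges from the tables above and ignoring maintenance labor entirely (the function name is made up for this sketch):&lt;/p&gt;

```python
def break_even_years(build_upfront, build_per_year, buy_per_year):
    """Years until cumulative build cost matches cumulative buy cost, or None if never."""
    yearly_saving = buy_per_year - build_per_year
    if yearly_saving > 0:
        return build_upfront / yearly_saving
    # Building costs at least as much per year, so it never catches up.
    return None


# Best case for building: cheapest build against the priciest mid-tier plan ($29/mo).
best = break_even_years(build_upfront=2600, build_per_year=60, buy_per_year=29 * 12)
# Worst case for building: priciest build against the cheapest plan ($20/mo).
worst = break_even_years(build_upfront=4600, build_per_year=420, buy_per_year=20 * 12)

print(best)   # roughly 9.03
print(worst)  # None
```

&lt;p&gt;Add even one hour a month of maintenance at $75/hr to &lt;code&gt;build_per_year&lt;/code&gt; and the best case stops breaking even as well.&lt;/p&gt;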




&lt;h2&gt;
  
  
  When Building Makes Sense
&lt;/h2&gt;

&lt;p&gt;There are legitimate reasons to build your own:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're a platform company where the status page is part of your product (think Vercel, Heroku)&lt;/li&gt;
&lt;li&gt;You need deep integration with proprietary internal tooling&lt;/li&gt;
&lt;li&gt;You have dedicated SRE resources with time to maintain it&lt;/li&gt;
&lt;li&gt;You have specific branding/white-label requirements that no service offers&lt;/li&gt;
&lt;li&gt;You're already building a monitoring platform yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Buy if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have fewer than 10 engineers&lt;/li&gt;
&lt;li&gt;You need it working before your next launch&lt;/li&gt;
&lt;li&gt;Your team is already stretched thin&lt;/li&gt;
&lt;li&gt;You've had a public incident and need to restore user trust quickly&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Architecture Decision Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Whether you build or buy, there's one architectural decision that matters more than everything else:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your status page data must come from external monitoring, not internal reporting.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your status page only shows "down" when your own systems detect and report it, you have a problem: your systems might be down in a way that prevents them from self-reporting.&lt;/p&gt;

&lt;p&gt;The right architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;External monitor (different cloud, different region)
    ↓ detects outage
    ↓ triggers alert
    ↓ auto-creates incident on status page
    ↓ notifies subscribers
    ↓ engineers get paged
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your app
    ↓ is down
    ↓ engineer notices 20 minutes later
    ↓ manually logs into status page
    ↓ manually creates incident
    ↓ users have been confused for 20 minutes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is why integrated solutions — where your uptime monitoring and status page share data — tend to work better in practice.&lt;/p&gt;
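
&lt;p&gt;The "right architecture" flow above can be sketched as a tiny in-memory model. Everything here is hypothetical: the &lt;code&gt;StatusPage&lt;/code&gt; class stands in for whatever API your status page exposes, and the rule of waiting for two consecutive failed checks before opening an incident is an assumption added to avoid flapping, not something a particular service is known to do.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class StatusPage:
    """In-memory stand-in for a status page fed by an external monitor."""
    failures: dict = field(default_factory=dict)       # component: consecutive failed checks
    incidents: set = field(default_factory=set)        # components with an open incident
    notifications: list = field(default_factory=list)  # messages sent to subscribers

    def record_check(self, component, is_up, failures_needed=2):
        """Feed in one external check result and open or resolve incidents as needed."""
        if is_up:
            self.failures[component] = 0
            if component in self.incidents:
                self.incidents.discard(component)
                self.notifications.append(f"RESOLVED: {component}")
            return
        self.failures[component] = self.failures.get(component, 0) + 1
        if self.failures[component] >= failures_needed and component not in self.incidents:
            self.incidents.add(component)
            self.notifications.append(f"INCIDENT: {component} is failing external checks")


page = StatusPage()
for result in (False, False, False, True):
    page.record_check("api", result)
print(page.notifications)
# ['INCIDENT: api is failing external checks', 'RESOLVED: api']
```

&lt;p&gt;The point of the sketch is the direction of data flow: the page reacts to check results from outside, so nobody has to remember to update it in the middle of an incident.&lt;/p&gt;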




&lt;h2&gt;
  
  
  The Recommendation
&lt;/h2&gt;

&lt;p&gt;For most teams: &lt;strong&gt;buy, don't build.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not because building is wrong — building is often the right answer for product problems. But a status page is infrastructure, not product. It should be invisible when things are working and bulletproof when things aren't.&lt;/p&gt;

&lt;p&gt;The engineering time you'd spend building a status page is almost certainly better spent on the features that make outages less frequent in the first place.&lt;/p&gt;

&lt;p&gt;Start with a free tier, get it live this week, and revisit when you've outgrown it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your current setup? If you're still manually emailing users during incidents, it's worth spending 10 minutes setting up something better. Tools like &lt;a href="https://alertsleep.com" rel="noopener noreferrer"&gt;AlertSleep&lt;/a&gt; let you connect uptime monitoring directly to a public status page — so when a check fails, the incident is created automatically.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Drop your status page setup in the comments — curious what the dev.to community is using.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>sre</category>
    </item>
    <item>
      <title>What 99.9% Uptime Actually Means: 8.7 Hours of Downtime Per Year</title>
      <dc:creator>AlertSleep</dc:creator>
      <pubDate>Sun, 12 Apr 2026 06:07:48 +0000</pubDate>
      <link>https://dev.to/alertsleep/what-999-uptime-actually-means-87-hours-of-downtime-per-year-33k</link>
      <guid>https://dev.to/alertsleep/what-999-uptime-actually-means-87-hours-of-downtime-per-year-33k</guid>
      <description>&lt;p&gt;You've seen it everywhere. On hosting pages, SaaS pricing tables, cloud provider dashboards:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"99.9% uptime guaranteed"&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sounds impressive. Almost perfect. Like, what's 0.1%?&lt;/p&gt;

&lt;p&gt;A lot, actually. Let me show you the math — and more importantly, what it means for your users, your revenue, and your sleep schedule.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Math Nobody Does
&lt;/h2&gt;

&lt;p&gt;99.9% uptime means your service is &lt;strong&gt;unavailable for 0.1% of the time&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here's what 0.1% looks like across different time windows:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Time Period&lt;/th&gt;
&lt;th&gt;Allowed Downtime&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Per day&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1 minute 26 seconds&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per week&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10 minutes 4 seconds&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per month&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;43 minutes 49 seconds&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per year&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8 hours 45 minutes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last one is the one that should make you pause. &lt;strong&gt;8 hours and 45 minutes of downtime per year&lt;/strong&gt; — and your SLA is technically fine the whole time.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Full SLA Cheat Sheet
&lt;/h2&gt;

&lt;p&gt;Most people only know the "three nines" (99.9%). Here's the complete picture:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;SLA&lt;/th&gt;
&lt;th&gt;Downtime/Year&lt;/th&gt;
&lt;th&gt;Downtime/Month&lt;/th&gt;
&lt;th&gt;Downtime/Day&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;99%&lt;/td&gt;
&lt;td&gt;3 days 15 hrs&lt;/td&gt;
&lt;td&gt;7 hrs 18 min&lt;/td&gt;
&lt;td&gt;14 min 24 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.5%&lt;/td&gt;
&lt;td&gt;1 day 19 hrs&lt;/td&gt;
&lt;td&gt;3 hrs 39 min&lt;/td&gt;
&lt;td&gt;7 min 12 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.9%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8 hrs 45 min&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;43 min 49 sec&lt;/td&gt;
&lt;td&gt;1 min 26 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.95%&lt;/td&gt;
&lt;td&gt;4 hrs 22 min&lt;/td&gt;
&lt;td&gt;21 min 54 sec&lt;/td&gt;
&lt;td&gt;43 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.99%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;52 min 35 sec&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4 min 22 sec&lt;/td&gt;
&lt;td&gt;8.6 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.999%&lt;/td&gt;
&lt;td&gt;5 min 15 sec&lt;/td&gt;
&lt;td&gt;26 sec&lt;/td&gt;
&lt;td&gt;0.86 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The jump from 99.9% to 99.99% — one extra "9" — reduces your annual downtime budget from &lt;strong&gt;8.7 hours to 52 minutes&lt;/strong&gt;. That's a 10x difference.&lt;/p&gt;




&lt;h2&gt;
  
  
  Calculate Your Own Uptime
&lt;/h2&gt;

&lt;p&gt;The formula is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Downtime = Total Time × (1 - Uptime %)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, a year has &lt;code&gt;365.25 × 24 = 8,766 hours&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;At 99.9%:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;8,766 hours × 0.001 = 8.766 hours ≈ 8 hrs 45 min
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or in JavaScript, if you want to build it yourself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;calculateDowntime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uptimePercent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;periodHours&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;downtimeRatio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uptimePercent&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;downtimeHours&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;periodHours&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;downtimeRatio&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;downtimeMinutes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;downtimeHours&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;downtimeSeconds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;downtimeMinutes&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;hours&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;downtimeHours&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;minutes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;downtimeMinutes&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;downtimeSeconds&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// 99.9% uptime over a year (8766 hours)&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;calculateDowntime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;99.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8766&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;// → { hours: 8, minutes: 45, seconds: 46 }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
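
&lt;p&gt;The same formula runs backwards, which is handy when you have an incident log and want to know what uptime you actually delivered. Here is a minimal sketch; the function name and the 90-minute figure are made up for illustration:&lt;/p&gt;

```javascript
// Hypothetical helper: turns total downtime in a period into an uptime percentage.
function achievedUptimePercent(downtimeMinutes, periodHours) {
  const periodMinutes = periodHours * 60;
  return 100 * (1 - downtimeMinutes / periodMinutes);
}

// 90 minutes of downtime in an average month (8766 / 12 = 730.5 hours)
console.log(achievedUptimePercent(90, 730.5).toFixed(3));
// → 99.795
```

&lt;p&gt;That lands below three nines, because the monthly budget at 99.9% is only about 43 minutes.&lt;/p&gt;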



&lt;p&gt;If you'd rather skip the math, tools like &lt;a href="https://alertsleep.com/tools/uptime-calculator" rel="noopener noreferrer"&gt;AlertSleep's uptime calculator&lt;/a&gt; let you punch in any percentage and get the breakdown instantly.&lt;/p&gt;




&lt;h2&gt;
  
  
  "But Our SLA Excludes Planned Maintenance"
&lt;/h2&gt;

&lt;p&gt;This is the clause that quietly turns "99.9%" into "something much lower."&lt;/p&gt;

&lt;p&gt;Many SLAs include language like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Uptime calculations exclude scheduled maintenance windows, force majeure events, and incidents caused by the customer."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In practice, this means a vendor can take their service down for a 4-hour maintenance window every month and still advertise "99.9% uptime" — because those hours simply don't count.&lt;/p&gt;
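
&lt;p&gt;To see how much that exclusion hides, here is a rough sketch that counts the maintenance hours as real downtime. It assumes the vendor also uses up its full SLA budget over the year; the function name and parameters are illustrative, not taken from any particular SLA:&lt;/p&gt;

```javascript
// Hypothetical helper: availability as users experience it, once excluded
// maintenance windows are added to the downtime the SLA already allows.
function effectiveUptimePercent(slaPercent, maintenanceHoursPerMonth, hoursPerYear = 8766) {
  const slaDowntimeHours = hoursPerYear * (1 - slaPercent / 100);
  const maintenanceHours = maintenanceHoursPerMonth * 12;
  const totalDowntimeHours = slaDowntimeHours + maintenanceHours;
  return 100 * (1 - totalDowntimeHours / hoursPerYear);
}

// "99.9%" SLA plus a 4-hour maintenance window every month
console.log(effectiveUptimePercent(99.9, 4).toFixed(2));
// → 99.35
```

&lt;p&gt;Under those assumptions, the advertised three nines look more like 99.35% from the user's side, which is roughly 57 hours of unavailability per year.&lt;/p&gt;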

&lt;p&gt;&lt;strong&gt;Always check:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does the SLA count maintenance windows as downtime?&lt;/li&gt;
&lt;li&gt;How much advance notice is required for scheduled maintenance?&lt;/li&gt;
&lt;li&gt;What's the compensation if they breach the SLA? (Hint: it's usually service credits, not money)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Does Downtime Actually Cost?
&lt;/h2&gt;

&lt;p&gt;Here's where it gets real. Abstract percentages become concrete when you map them to your business.&lt;/p&gt;

&lt;p&gt;A rough back-of-the-envelope formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cost of Downtime = Lost Revenue/hr + Productivity Cost/hr + Reputation Damage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an e-commerce site doing $100k/day in revenue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Revenue per hour = $100,000 / 24 ≈ $4,166/hr

At 99.9% uptime → 8.75 hours of downtime/year
→ Lost revenue: 8.75 × $4,166 ≈ $36,000/year
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
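
&lt;p&gt;If you want to plug in your own numbers, the revenue part of that estimate can be sketched as below. It assumes revenue is spread evenly across all 24 hours and ignores productivity and reputation costs; the function name is hypothetical:&lt;/p&gt;

```javascript
const HOURS_PER_YEAR = 365.25 * 24; // 8766

// Hypothetical helper: revenue lost per year if the whole downtime budget is used.
// Assumes revenue arrives at a constant hourly rate.
function annualLostRevenue(dailyRevenue, uptimePercent) {
  const revenuePerHour = dailyRevenue / 24;
  const downtimeHours = HOURS_PER_YEAR * (1 - uptimePercent / 100);
  return revenuePerHour * downtimeHours;
}

// $100k/day at 99.9% uptime
console.log(Math.round(annualLostRevenue(100000, 99.9)));
// → 36525
```

&lt;p&gt;Real traffic is rarely flat, so an outage during peak hours will cost more than this average suggests.&lt;/p&gt;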



&lt;p&gt;And that's before counting the customer support tickets, the social media complaints, and the users who never come back.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "Five Nines" Problem
&lt;/h2&gt;

&lt;p&gt;You'll sometimes see "five nines" (99.999%) thrown around by cloud providers. It sounds incredible — only &lt;strong&gt;5 minutes of downtime per year&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But here's the uncomfortable truth: &lt;strong&gt;achieving five nines is mostly about architecture, not monitoring.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Five nines requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-region active-active deployments&lt;/li&gt;
&lt;li&gt;Zero-downtime deployments (blue/green or canary)&lt;/li&gt;
&lt;li&gt;Automatic failover with sub-second detection&lt;/li&gt;
&lt;li&gt;Chaos engineering to test failure scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most startups and even mid-size companies realistically operate at &lt;strong&gt;99.5% to 99.95%&lt;/strong&gt;. And that's fine — if you know it and plan for it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Difference Between Measured and Actual Uptime
&lt;/h2&gt;

&lt;p&gt;Here's a subtle but important distinction.&lt;/p&gt;

&lt;p&gt;Your hosting provider might achieve 99.99% uptime at the infrastructure level. But &lt;strong&gt;your application&lt;/strong&gt; might only hit 99.5% because of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory leaks that require weekly restarts&lt;/li&gt;
&lt;li&gt;Slow database queries that cause timeouts (HTTP 504 — is that "downtime"?)&lt;/li&gt;
&lt;li&gt;Third-party API dependencies that go down&lt;/li&gt;
&lt;li&gt;SSL certificate expiry (this kills more sites than you'd think)&lt;/li&gt;
&lt;li&gt;Your own deployment going wrong at 2am&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your uptime is at best as good as the weakest link in the chain, and usually a little worse, because every dependency adds its own failures on top. The only way to know your real uptime, as opposed to your provider's, is to monitor from the outside.&lt;/p&gt;
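
&lt;p&gt;One way to reason about the chain: if a request needs every component to be up, and the components fail independently, the availabilities multiply. That independence is an assumption (real outages are often correlated), and the component figures below are invented for illustration:&lt;/p&gt;

```javascript
// Hypothetical helper: availability of a request path that needs every
// component to be up, assuming the components fail independently.
function compositeUptimePercent(componentPercents) {
  const ratio = componentPercents.reduce((acc, percent) => acc * (percent / 100), 1);
  return ratio * 100;
}

// Hosting at 99.99%, a third-party API at 99.9%, your own app at 99.95%
console.log(compositeUptimePercent([99.99, 99.9, 99.95]).toFixed(2));
// → 99.84
```

&lt;p&gt;The combined figure ends up lower than any single component, which is why a provider's 99.99% says little about what your users see.&lt;/p&gt;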




&lt;h2&gt;
  
  
  What to Actually Monitor
&lt;/h2&gt;

&lt;p&gt;Most developers start monitoring too late and measure too little. Here's a baseline:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minimum viable monitoring:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] HTTP status check every 1-5 minutes&lt;/li&gt;
&lt;li&gt;[ ] Response time tracking (a 503 that takes 30s is worse than a fast 503)&lt;/li&gt;
&lt;li&gt;[ ] SSL certificate expiry alert (set to 30 days before)&lt;/li&gt;
&lt;li&gt;[ ] Domain expiration alert (set to 60 days before)&lt;/li&gt;
&lt;/ul&gt;
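
&lt;p&gt;The two expiry alerts in that list boil down to the same threshold check. Here is a minimal sketch, assuming you already have the expiry dates from somewhere (reading them from the certificate or the registrar is out of scope here); the function names and dates are hypothetical:&lt;/p&gt;

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days left until `expiresAt`, measured from `now` (both Date objects).
function daysUntil(expiresAt, now = new Date()) {
  return Math.floor((expiresAt.getTime() - now.getTime()) / MS_PER_DAY);
}

// True once the remaining time is at or inside the alert threshold.
function shouldAlert(expiresAt, thresholdDays, now = new Date()) {
  return thresholdDays >= daysUntil(expiresAt, now);
}

// Made-up dates for illustration
const now = new Date("2026-04-12T00:00:00Z");
const certExpiry = new Date("2026-05-02T00:00:00Z");   // 20 days out
const domainExpiry = new Date("2026-09-01T00:00:00Z"); // 142 days out

console.log(shouldAlert(certExpiry, 30, now));   // → true
console.log(shouldAlert(domainExpiry, 60, now)); // → false
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; in as a parameter keeps the check easy to test without waiting for a real certificate to get close to expiry.&lt;/p&gt;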

&lt;p&gt;&lt;strong&gt;Level up:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Multi-region checks (your site might be down in just one region, such as US East)&lt;/li&gt;
&lt;li&gt;[ ] API endpoint monitoring (not just the homepage)&lt;/li&gt;
&lt;li&gt;[ ] Port monitoring for non-HTTP services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Alert channels that actually wake you up:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SMS/phone call for critical alerts (email is too easy to miss at 3am)&lt;/li&gt;
&lt;li&gt;Slack/Teams for the team&lt;/li&gt;
&lt;li&gt;Status page for your users so they know you know&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Real Takeaway
&lt;/h2&gt;

&lt;p&gt;99.9% uptime is &lt;strong&gt;not&lt;/strong&gt; "always online." It's a budget — a budget of how much downtime your users are willing to accept before they find an alternative.&lt;/p&gt;

&lt;p&gt;The question isn't "what SLA does my provider offer?" The question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What uptime does your business actually need — and how will you know when you're not hitting it?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The first step is measuring. You can't improve what you can't see.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building something people depend on, set up external uptime monitoring today — not after the first outage. Tools like &lt;a href="https://alertsleep.com" rel="noopener noreferrer"&gt;AlertSleep&lt;/a&gt; start free and take about 2 minutes to configure.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What SLA does your app target? And are you actually measuring it? Drop it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>devops</category>
      <category>sre</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
