DEV Community

JP

Why Your SSL Certificate Will Expire at 3 AM on a Saturday (And How to Stop It)

Let me tell you about the worst Saturday morning of my career.

3:17 AM. My phone explodes with alerts. The main production site is down. API is down. Mobile app can't connect. Customer support is getting flooded.

The culprit? An expired SSL certificate.

Not a server crash. Not a database failure. Not a DDoS attack. A certificate that expired at 3:00 AM on a Saturday morning, taking down a multi-million dollar e-commerce platform.

The certificate had been valid for a year. We knew it would expire. There were emails. There were reminders. But between team changes, inbox filters, and "I'll do it tomorrow" syndrome, it slipped through the cracks.

That incident cost the company six figures in lost revenue and taught me that SSL certificate monitoring isn't optional -- it's survival.

The Problem with SSL Certificates

SSL/TLS certificates are one of those things that work perfectly until they don't. And when they don't, they fail catastrophically:

  • Silent expiry -- The server itself gives no warning. Your site just stops working.
  • Browser warnings -- "This connection is not private" scares away customers
  • API breakage -- Mobile apps and integrations stop working
  • Zero grace period -- The second it expires, everything breaks

Even worse, modern infrastructure has dozens or hundreds of certificates:

  • Main domain (example.com)
  • WWW subdomain (www.example.com)
  • API endpoints (api.example.com)
  • Admin panels (admin.example.com)
  • CDN origins
  • Wildcard certificates (*.example.com)
  • Internal services
  • Load balancers
  • VPNs

Each one is a ticking time bomb.

Why Certificates Expire (And Why That's Good)

Before we dive into monitoring, let's understand why certificates expire:

  1. Security -- Shorter validity periods limit damage if a private key is compromised
  2. Cryptographic aging -- Today's "secure" algorithms become tomorrow's vulnerabilities
  3. Ownership changes -- Domains change hands, organizations restructure

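Browser vendors and the CA/Browser Forum have been steadily reducing the maximum validity period that certificate authorities (CAs) may issue:

  • Before 2015: 5 years maximum (60 months)
  • 2015: 39 months maximum
  • 2018: about 2 years maximum (825 days)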
  • 2020: 1 year maximum (398 days)
  • 2024: Proposals for 90 days or less

Let's Encrypt already issues 90-day certificates by default. The industry is moving toward automated, short-lived certificates -- which means monitoring becomes even more critical.

The Anatomy of an SSL Certificate

To monitor certificates effectively, you need to understand what you're monitoring:

# Check a certificate with OpenSSL
openssl s_client -connect example.com:443 -servername example.com < /dev/null 2>/dev/null | openssl x509 -noout -dates

# Output:
notBefore=Jan 15 00:00:00 2026 GMT
notAfter=Apr 15 23:59:59 2026 GMT

Key fields to monitor:

  • Subject -- Who the cert is issued to (CN=example.com)
  • Issuer -- Who issued it (Let's Encrypt, DigiCert, etc.)
  • Valid From (notBefore)
  • Valid Until (notAfter) -- THE critical field
  • Subject Alternative Names (SANs) -- Additional domains covered
  • Signature Algorithm -- Is it still secure?

The SAN Trap

Here's a gotcha that bit me: Subject Alternative Names (SANs).

Modern certificates often cover multiple domains:

CN=example.com
SAN=example.com, www.example.com, api.example.com

If you only check the primary domain, you can miss a hostname that is served by a different certificate, or one that the certificate doesn't cover at all. Always verify that the certificate returned for each hostname actually lists that hostname in its SANs.
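
A rough way to automate that verification, sketched in Python. It assumes the certificate is a dict in the shape returned by `ssl.SSLSocket.getpeercert()`, where `subjectAltName` is a tuple of (type, value) pairs. The function name is hypothetical, and this version compares exact names only: wildcard entries are skipped on purpose.

```python
def uncovered_hostnames(monitored, cert):
    """Return monitored hostnames that are not listed verbatim in the cert's SANs.

    `cert` is assumed to look like the dict from ssl.SSLSocket.getpeercert().
    Wildcard SAN entries (*.example.com) are ignored in this sketch.
    """
    san_dns = {
        value.lower()
        for kind, value in cert.get("subjectAltName", ())
        if kind == "DNS" and not value.startswith("*.")
    }
    return [h for h in monitored if h.lower() not in san_dns]

# Example using the SAN list from above
cert = {"subjectAltName": (("DNS", "example.com"),
                           ("DNS", "www.example.com"),
                           ("DNS", "api.example.com"))}
print(uncovered_hostnames(["api.example.com", "admin.example.com"], cert))
# ['admin.example.com']
```

Anything this returns is a hostname you monitor but this certificate doesn't name, so it needs its own check.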

Building a Certificate Monitor

The simplest version is a shell script:

#!/bin/bash
DOMAIN="example.com"
PORT=443
ALERT_DAYS=30

# Get expiry date
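EXPIRY=$(echo | openssl s_client -servername "$DOMAIN" -connect "$DOMAIN:$PORT" 2>/dev/null | \
         openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)

# Bail out if the connection or parse failed, so an empty date is never treated as valid
if [ -z "$EXPIRY" ]; then
    echo "Could not read certificate for $DOMAIN" >&2
    exit 1
fi

# Convert to epoch timestamp (date -d is GNU date; BSD/macOS date needs different flags)
EXPIRY_EPOCH=$(date -d "$EXPIRY" +%s)
NOW_EPOCH=$(date +%s)
DAYS_LEFT=$(( (EXPIRY_EPOCH - NOW_EPOCH) / 86400 ))

if [ "$DAYS_LEFT" -lt "$ALERT_DAYS" ]; then
    echo "Certificate for $DOMAIN expires in $DAYS_LEFT days!"
    # Send alert (email, Slack, etc.)
fi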

Run this daily via cron:

0 9 * * * /usr/local/bin/check-cert.sh

But this has problems:

  1. Only checks when the cron runs
  2. No historical data
  3. Doesn't handle multiple domains
  4. No retry logic for network blips
  5. Alert fatigue (you get the same alert every day)
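
Problems 3 and 4 are the easiest to fix. Here's a minimal Python sketch of a multi-domain loop with retries and exponential backoff. `check_all` and its parameters are my own names, and `check` stands for whatever function does the actual certificate lookup. It assumes failures surface as `OSError`, which is the base class for socket and TLS errors in Python.

```python
import time

def check_all(domains, check, retries=3, base_delay=1.0, sleep=time.sleep):
    """Run `check(domain)` for every domain, retrying transient failures.

    `check` is any callable that returns a result or raises OSError.
    """
    results = {}
    for domain in domains:
        for attempt in range(1, retries + 1):
            try:
                results[domain] = {"ok": True, "result": check(domain)}
                break
            except OSError as exc:
                if attempt == retries:
                    # Give up, but record the failure instead of stopping the run
                    results[domain] = {"ok": False, "error": str(exc)}
                else:
                    # Back off before retrying: 1s, 2s, 4s, ...
                    sleep(base_delay * 2 ** (attempt - 1))
    return results
```

One failing domain doesn't abort the run, and a check that fails on every attempt is recorded as its own result. A domain you can't reach is worth alerting on too.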

Let's level up.

A Production-Grade Solution

Here's a Node.js version that handles edge cases:

const tls = require('tls');

async function checkCertificate(hostname, port = 443) {
    return new Promise((resolve, reject) => {
        const options = {
            host: hostname,
            port: port,
            servername: hostname, // SNI support
            rejectUnauthorized: false, // Check even invalid certs
        };

        const socket = tls.connect(options, () => {
            const cert = socket.getPeerCertificate();

            if (!cert || !cert.valid_to) {
                socket.end();
                return reject(new Error('No certificate found'));
            }

            const validTo = new Date(cert.valid_to);
            const validFrom = new Date(cert.valid_from);
            const now = new Date();
            const daysLeft = Math.floor((validTo - now) / (1000 * 60 * 60 * 24));

            const result = {
                hostname: hostname,
                subject: cert.subject,
                issuer: cert.issuer,
                validFrom: validFrom,
                validTo: validTo,
                daysLeft: daysLeft,
                isValid: now >= validFrom && now <= validTo,
                subjectAltNames: cert.subjectaltname,
                serialNumber: cert.serialNumber,
                fingerprint: cert.fingerprint
            };

            socket.end();
            resolve(result);
        });

        socket.on('error', (err) => {
            reject(err);
        });

        socket.setTimeout(10000, () => {
            socket.destroy();
            reject(new Error('Connection timeout'));
        });
    });
}

// Usage
checkCertificate('example.com')
    .then(cert => {
        console.log(`Certificate for ${cert.hostname}:`);
        console.log(`  Expires: ${cert.validTo}`);
        console.log(`  Days left: ${cert.daysLeft}`);
        console.log(`  Issuer: ${cert.issuer.O}`);

        if (cert.daysLeft < 30) {
            sendAlert(cert);
        }
    })
    .catch(err => console.error('Check failed:', err.message));

Python Alternative

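import ssl
import socket
from datetime import datetime, timezone

def check_certificate(hostname, port=443):
    # Note: the default context verifies the certificate, so an already-expired
    # cert raises ssl.SSLCertVerificationError here instead of returning a result.
    context = ssl.create_default_context()

    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as ssock:
            cert = ssock.getpeercert()

            # Parse expiry (notAfter is in GMT, so compare against UTC, not local time)
            not_after = datetime.fromtimestamp(
                ssl.cert_time_to_seconds(cert['notAfter']), tz=timezone.utc)
            days_left = (not_after - datetime.now(timezone.utc)).days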

            return {
                'hostname': hostname,
                'subject': dict(x[0] for x in cert['subject']),
                'issuer': dict(x[0] for x in cert['issuer']),
                'valid_to': not_after,
                'days_left': days_left,
                'serial_number': cert['serialNumber'],
                'san': cert.get('subjectAltName', [])
            }

# Usage
cert = check_certificate('example.com')
print(f"Certificate expires in {cert['days_left']} days")

if cert['days_left'] < 30:
    send_alert(cert)

Alert Thresholds: A Practical Strategy

Don't just alert at "30 days". Use multiple thresholds with escalating urgency:

  • 60 days -- Informational (Slack channel, low priority)
  • 30 days -- Warning (Email to team)
  • 14 days -- Urgent (Email + Slack mention)
  • 7 days -- Critical (SMS/PagerDuty to on-call)
  • 3 days -- Emergency (Wake everyone up)

function getAlertLevel(daysLeft) {
    if (daysLeft <= 3) return 'EMERGENCY';
    if (daysLeft <= 7) return 'CRITICAL';
    if (daysLeft <= 14) return 'URGENT';
    if (daysLeft <= 30) return 'WARNING';
    if (daysLeft <= 60) return 'INFO';
    return 'OK';
}

function sendAlert(cert) {
    const level = getAlertLevel(cert.daysLeft);

    const message = `${level}: Certificate for ${cert.hostname} expires in ${cert.daysLeft} days (${cert.validTo})`;

    switch(level) {
        case 'EMERGENCY':
            sendSMS(message);
            sendEmail(message);
            sendSlack(message);
            break;
        case 'CRITICAL':
            sendSMS(message); // Page the on-call engineer
            sendEmail(message);
            sendSlack(message, '@channel');
            break;
        case 'URGENT':
            sendEmail(message);
            sendSlack(message, '@here'); // Slack mention
            break;
        case 'WARNING':
            sendEmail(message);
            break;
        case 'INFO':
            sendSlack(message); // No email spam
            break;
    }
}
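
Thresholds alone don't cure alert fatigue: a daily check still repeats the same WARNING every morning. One possible approach, sketched in Python, is to alert only when a certificate moves to a more severe level than the last one you sent. The `last_levels` dict stands in for whatever persistent state you keep, and the function names are hypothetical.

```python
LEVELS = ["OK", "INFO", "WARNING", "URGENT", "CRITICAL", "EMERGENCY"]

def alert_level(days_left):
    """Same thresholds as the 60/30/14/7/3 list above."""
    if days_left <= 3:
        return "EMERGENCY"
    if days_left <= 7:
        return "CRITICAL"
    if days_left <= 14:
        return "URGENT"
    if days_left <= 30:
        return "WARNING"
    if days_left <= 60:
        return "INFO"
    return "OK"

def should_alert(hostname, days_left, last_levels):
    """Return the new level if it is more severe than the last one sent, else None.

    `last_levels` maps hostname -> last level seen and is updated in place.
    """
    level = alert_level(days_left)
    previous = last_levels.get(hostname, "OK")
    last_levels[hostname] = level
    if LEVELS.index(level) > LEVELS.index(previous):
        return level
    return None
```

A renewal drops the level back to OK, which resets the state for the next cycle. You may still want the top levels to repeat until someone acknowledges them; that's a policy choice this sketch leaves out.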

Edge Cases That Will Bite You

1. Certificate Chains

A valid server cert with an expired intermediate certificate will fail:

Root CA (trusted)
  Intermediate CA (EXPIRED!)
    Server Cert (valid)

Always check the entire chain:

const cert = socket.getPeerCertificate(true); // true = include chain
const chain = [cert];
let current = cert;

while (current.issuerCertificate && current !== current.issuerCertificate) {
    chain.push(current.issuerCertificate);
    current = current.issuerCertificate;
}

// Check each cert in the chain
chain.forEach((c, i) => {
    const validTo = new Date(c.valid_to);
    const daysLeft = Math.floor((validTo - Date.now()) / 86400000);
    console.log(`  [${i}] ${c.subject.CN} - ${daysLeft} days left`);
});
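
For alerting, the useful number is the earliest expiry anywhere in the chain. Here's a small Python sketch of that idea. It assumes you've already collected the chain as (subject, not_after) pairs with timezone-aware datetimes; how you obtain them is left out, and the function name is my own.

```python
from datetime import datetime, timezone

def effective_expiry(chain, now=None):
    """Return (subject, days_left) for the chain entry that expires first.

    `chain` is a list of (subject, not_after) pairs, leaf certificate first.
    """
    now = now or datetime.now(timezone.utc)
    subject, not_after = min(chain, key=lambda entry: entry[1])
    return subject, (not_after - now).days

chain = [
    ("example.com", datetime(2026, 3, 1, tzinfo=timezone.utc)),
    ("Intermediate CA", datetime(2026, 1, 11, tzinfo=timezone.utc)),
    ("Root CA", datetime(2035, 1, 1, tzinfo=timezone.utc)),
]
print(effective_expiry(chain, now=datetime(2026, 1, 1, tzinfo=timezone.utc)))
# ('Intermediate CA', 10)
```

Feeding this number into your thresholds, instead of the leaf's days left, means an aging intermediate raises the alert too.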

2. Wildcard Certificates

A wildcard cert (*.example.com) covers:

  • api.example.com
  • www.example.com
  • admin.example.com

But NOT:

  • example.com (root domain)
  • api.staging.example.com (nested subdomain)

Monitor both the wildcard AND the root domain separately.
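
The one-label rule is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name is hypothetical, and it only handles the common case of a single leading `*.` label.

```python
def wildcard_matches(pattern, hostname):
    """True if `hostname` is covered by `pattern` (an exact name or *.domain)."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if not pattern.startswith("*."):
        return pattern == hostname
    suffix = pattern[1:]  # "*.example.com" -> ".example.com"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]  # the part the "*" has to stand for
    # The wildcard covers exactly one non-empty label, so no dots allowed
    return bool(label) and "." not in label

for name in ["api.example.com", "example.com", "api.staging.example.com"]:
    print(name, wildcard_matches("*.example.com", name))
# api.example.com True
# example.com False
# api.staging.example.com False
```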

3. SNI (Server Name Indication)

Modern servers host multiple domains on one IP. Without SNI, you might check the wrong certificate:

# Wrong (no SNI)
openssl s_client -connect 192.0.2.1:443

# Right (with SNI)
openssl s_client -connect 192.0.2.1:443 -servername example.com

Always specify -servername or the equivalent in your programming language.

4. Load Balancers and CDNs

You have:

  • Origin server cert (your server)
  • Load balancer cert (AWS ELB, Cloudflare, etc.)
  • CDN edge cert (Cloudflare, Fastly, etc.)

Monitor the certificate that users actually see, not just your origin. Check from outside your infrastructure.

5. Internal Services

Don't forget:

  • Database SSL connections
  • Redis TLS
  • Internal APIs
  • VPNs
  • LDAP/LDAPS
  • SMTP with STARTTLS

These often use self-signed or internal CA certs that expire quietly.

Let's Encrypt: Automation & Monitoring

Let's Encrypt changed the game with free, automated certificates. But automation doesn't mean "set and forget".

ACME Renewal Monitoring

Certbot (the Let's Encrypt client) auto-renews certs. But what if renewal fails?

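# Check recent renewal runs (the timer only triggers certbot.service, which does the work;
# unit names can differ depending on how certbot was installed)
journalctl -u certbot.service --since "7 days ago" | grep -i error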

# Or parse certbot's logs
grep "Renewal failed" /var/log/letsencrypt/letsencrypt.log

Better: Monitor the certificate expiry and renewal success separately:

const fs = require('fs');

function checkLetsEncryptRenewal(domain) {
    const renewalConfig = `/etc/letsencrypt/renewal/${domain}.conf`;

    if (!fs.existsSync(renewalConfig)) {
        return { error: 'Renewal config not found' };
    }

    const stats = fs.statSync(renewalConfig);
    const daysSinceUpdate = (Date.now() - stats.mtimeMs) / 86400000;

    // By default Certbot renews when fewer than 30 days remain,
    // i.e. roughly every 60 days for a 90-day certificate
    if (daysSinceUpdate > 60) {
        return { 
            warning: `Renewal config not updated in ${Math.floor(daysSinceUpdate)} days`
        };
    }

    return { ok: true };
}

Common Let's Encrypt Failures

  1. HTTP-01 challenge blocked -- Firewall rules, CDN config
  2. DNS-01 fails -- API rate limits, DNS propagation
  3. Webroot permissions -- Certbot can't write to .well-known/acme-challenge/
  4. Rate limits hit -- Too many renewals in a week

Monitor renewal logs and alert on failure patterns.

Dashboard: Making Data Actionable

Raw alerts are good. A dashboard is better. Track:

  • Certificates expiring soon (sorted by days left)
  • Recently renewed (confirm automation is working)
  • Issuer distribution (Are you still using that old CA?)
  • Expiry timeline (visual calendar of upcoming expirations)

Simple HTML dashboard:

async function generateDashboard(domains) {
    const checks = await Promise.all(
        domains.map(d => checkCertificate(d).catch(err => ({ 
            hostname: d, 
            error: err.message 
        })))
    );

    // Sort by days left (ascending)
    checks.sort((a, b) => (a.daysLeft ?? 999) - (b.daysLeft ?? 999));

    const html = `
    <h1>SSL Certificate Dashboard</h1>
    <table>
        <tr><th>Domain</th><th>Days Left</th><th>Expires</th><th>Issuer</th></tr>
        ${checks.map(c => `
            <tr class="${getRowClass(c.daysLeft)}">
                <td>${c.hostname}</td>
                <td>${c.daysLeft ?? 'ERROR'}</td>
                <td>${c.validTo?.toLocaleDateString() || '-'}</td>
                <td>${c.issuer?.O || '-'}</td>
            </tr>
        `).join('')}
    </table>
    `;

    return html;
}

function getRowClass(daysLeft) {
    if (daysLeft == null) return 'error'; // 0 days left is critical, not an error
    if (daysLeft < 7) return 'critical';
    if (daysLeft < 14) return 'urgent';
    if (daysLeft < 30) return 'warning';
    return 'ok';
}

Pro Tips from Production

1. Check from Multiple Locations

Your cert might be valid from your office but expired on your CDN edge nodes:

const locations = [
    'us-east.probe.example.com',
    'eu-west.probe.example.com',
    'ap-southeast.probe.example.com'
];

for (const probe of locations) {
    const cert = await checkCertificateViaProxy(probe, 'example.com');
    // Compare results
}
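
The "compare results" step could look something like this Python sketch. It assumes each probe reports the fingerprint of the certificate it saw, or `None` if the probe failed; the function name and input shape are my own.

```python
from collections import defaultdict

def compare_probes(results):
    """Group probe locations by the certificate fingerprint each one saw.

    `results` maps probe name -> fingerprint string, or None if the probe failed.
    Returns (consistent, groups), where consistent is True only if every
    probe succeeded and saw the same certificate.
    """
    groups = defaultdict(list)
    for probe, fingerprint in results.items():
        groups[fingerprint].append(probe)
    consistent = len(groups) == 1 and None not in groups
    return consistent, dict(groups)

ok, groups = compare_probes({
    "us-east": "AA:BB",
    "eu-west": "AA:BB",
    "ap-southeast": "CC:DD",
})
print(ok)      # False
print(groups)  # {'AA:BB': ['us-east', 'eu-west'], 'CC:DD': ['ap-southeast']}
```

A mismatch isn't always a problem (a renewal may still be rolling out across edges), but it tells you which locations to look at first.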

2. Track Certificate Changes

Alert when certificates change unexpectedly (potential security issue):

const lastKnownFingerprint = db.get('cert_fingerprint', domain);
const currentFingerprint = cert.fingerprint;

if (lastKnownFingerprint && lastKnownFingerprint !== currentFingerprint) {
    sendSecurityAlert(`Certificate changed for ${domain}!`);
}

db.set('cert_fingerprint', domain, currentFingerprint);
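
If you don't have a database handy, a JSON file is enough to get started. Here's a minimal Python sketch of the same idea; the function name and file layout are assumptions of mine, not part of any library.

```python
import json
from pathlib import Path

def fingerprint_changed(store_path, domain, fingerprint):
    """Record `fingerprint` for `domain`; return True if it differs from the last one seen.

    The first observation of a domain is stored but never reported as a change.
    """
    path = Path(store_path)
    known = json.loads(path.read_text()) if path.exists() else {}
    previous = known.get(domain)
    known[domain] = fingerprint
    path.write_text(json.dumps(known, indent=2))
    return previous is not None and previous != fingerprint
```

Every legitimate renewal also changes the fingerprint, so in practice you'd cross-check a change against your renewal records before treating it as a security alert.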

3. Test Renewal Before It's Urgent

Don't wait until 7 days before expiry to test renewal. Run a dry-run monthly:

certbot renew --dry-run --quiet || echo "Renewal will fail!"

4. Document Your Renewal Process

When the cert expires at 3 AM and the person who set it up left the company 6 months ago, you need documentation:

# SSL Certificate Renewal - example.com

**Provider:** Let's Encrypt  
**Renewal method:** Certbot with HTTP-01 challenge  
**Auto-renewal:** Yes (certbot.timer systemd timer)  
**Manual renewal:** `sudo certbot renew --force-renewal`  
**Webroot:** `/var/www/example.com/`  
**On-call:** ops@example.com  
**Last manual renewal:** 2026-01-15 by @john

Tools Worth Knowing

  • SSL Labs -- Free online checker (https://www.ssllabs.com/ssltest/)
  • testssl.sh -- Comprehensive SSL/TLS scanner
  • Certbot -- Let's Encrypt client
  • cert-manager -- Kubernetes cert automation
  • AWS Certificate Manager -- Managed certs for AWS

And of course, dedicated monitoring services exist (I'll leave product names out of this, but they're worth Googling).

The Post-Incident Checklist

After that 3 AM disaster, here's what we implemented:

  1. Automated monitoring for all domains (check daily)
  2. Multi-threshold alerts (60/30/14/7/3 days)
  3. Multiple alert channels (email, Slack, SMS)
  4. Dashboard showing all certs at a glance
  5. Documentation for manual renewal
  6. Automated renewal (Let's Encrypt)
  7. Renewal monitoring (separate from expiry monitoring)
  8. On-call runbook for cert emergencies
  9. Monthly dry-run renewal tests
  10. Certificate inventory (every cert, every service)

We've had zero certificate-related outages in the two years since.

Conclusion

SSL certificates will expire. It's not if, it's when. The question is: will you know about it 30 days in advance, or at 3 AM on a Saturday when everything is on fire?

Monitoring isn't hard:

  1. Check expiry dates regularly (daily is fine)
  2. Alert with escalating urgency
  3. Automate renewal where possible
  4. Monitor renewal success separately
  5. Document everything

The 3 AM outage is 100% preventable. Don't let it happen to you.


Have you had a certificate disaster story? Share it in the comments. Misery loves company, and we all learn from each other's mistakes.

Building certificate monitoring? What challenges are you facing? Drop a comment and let's discuss.
