Let me tell you about the worst Saturday morning of my career.
3:17 AM. My phone explodes with alerts. The main production site is down. API is down. Mobile app can't connect. Customer support is getting flooded.
The culprit? An expired SSL certificate.
Not a server crash. Not a database failure. Not a DDoS attack. A certificate that expired at 3:00 AM on a Saturday morning, taking down a multimillion-dollar e-commerce platform.
The certificate had been valid for a year. We knew it would expire. There were emails. There were reminders. But between team changes, inbox filters, and "I'll do it tomorrow" syndrome, it slipped through the cracks.
That incident cost the company six figures in lost revenue and taught me that SSL certificate monitoring isn't optional -- it's survival.
The Problem with SSL Certificates
SSL/TLS certificates are one of those things that work perfectly until they don't. And when they don't, they fail catastrophically:
- Silent expiry -- No warning. Your site just stops working.
- Browser warnings -- "This connection is not private" scares away customers
- API breakage -- Mobile apps and integrations stop working
- Zero grace period -- The second it expires, everything breaks
Even worse, modern infrastructure has dozens or hundreds of certificates:
- Main domain (example.com)
- WWW subdomain (www.example.com)
- API endpoints (api.example.com)
- Admin panels (admin.example.com)
- CDN origins
- Wildcard certificates (*.example.com)
- Internal services
- Load balancers
- VPNs
Each one is a ticking time bomb.
Why Certificates Expire (And Why That's Good)
Before we dive into monitoring, let's understand why certificates expire:
- Security -- Shorter validity periods limit damage if a private key is compromised
- Cryptographic aging -- Today's "secure" algorithms become tomorrow's vulnerabilities
- Ownership changes -- Domains change hands, organizations restructure
Browser vendors and certificate authorities (CAs) have been steadily reducing maximum validity periods:
- 2015: 39 months maximum (down from 5 years)
- 2018: 2 years maximum (825 days)
- 2020: 1 year maximum (398 days)
- 2024: Proposals for 90 days or less
Let's Encrypt already issues 90-day certificates by default. The industry is moving toward automated, short-lived certificates -- which means monitoring becomes even more critical.
The Anatomy of an SSL Certificate
To monitor certificates effectively, you need to understand what you're monitoring:
# Check a certificate with OpenSSL
openssl s_client -connect example.com:443 -servername example.com < /dev/null 2>/dev/null | openssl x509 -noout -dates
# Output:
notBefore=Jan 15 00:00:00 2026 GMT
notAfter=Apr 15 23:59:59 2026 GMT
Key fields to monitor:
- Subject -- Who the cert is issued to (CN=example.com)
- Issuer -- Who issued it (Let's Encrypt, DigiCert, etc.)
- Valid From (notBefore)
- Valid Until (notAfter) -- THE critical field
- Subject Alternative Names (SANs) -- Additional domains covered
- Signature Algorithm -- Is it still secure?
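To make those fields concrete, here is a minimal Python sketch that pulls most of them out of the dict shape returned by `ssl.SSLSocket.getpeercert()`. The `summarize_cert` name and the `SAMPLE` dict are my own illustrations (the sample is hand-written, not a real certificate), and the signature algorithm is left out because that dict does not expose it.

```python
import ssl
from datetime import datetime, timezone


def summarize_cert(cert: dict) -> dict:
    """Pull the monitoring-relevant fields out of a getpeercert()-style dict."""
    # subject and issuer are tuples of RDNs, each a tuple of (key, value) pairs.
    subject = {k: v for rdn in cert.get("subject", ()) for k, v in rdn}
    issuer = {k: v for rdn in cert.get("issuer", ()) for k, v in rdn}

    def parse(ts: str) -> datetime:
        # cert_time_to_seconds understands the "Jan 15 00:00:00 2026 GMT" form.
        return datetime.fromtimestamp(ssl.cert_time_to_seconds(ts), tz=timezone.utc)

    return {
        "subject_cn": subject.get("commonName"),
        "issuer_org": issuer.get("organizationName"),
        "not_before": parse(cert["notBefore"]),
        "not_after": parse(cert["notAfter"]),
        "dns_names": [v for k, v in cert.get("subjectAltName", ()) if k == "DNS"],
    }


# Hand-written sample in the shape getpeercert() returns (not a real certificate).
SAMPLE = {
    "subject": ((("commonName", "example.com"),),),
    "issuer": ((("countryName", "US"),), (("organizationName", "Let's Encrypt"),)),
    "notBefore": "Jan 15 00:00:00 2026 GMT",
    "notAfter": "Apr 15 23:59:59 2026 GMT",
    "subjectAltName": (("DNS", "example.com"), ("DNS", "www.example.com")),
}

summary = summarize_cert(SAMPLE)
print(summary["subject_cn"], summary["issuer_org"], summary["not_after"].isoformat())
```

Parsing into timezone-aware datetimes up front means every later comparison happens in UTC, which is the timezone the certificate dates are expressed in.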
The SAN Trap
Here's a gotcha that bit me: Subject Alternative Names (SANs).
Modern certificates often cover multiple domains:
CN=example.com
SAN=example.com, www.example.com, api.example.com
If your monitor only checks the primary domain, it can miss hostnames that are served by a different certificate, or that the certificate does not cover at all. Always verify that the certificate you get back actually lists the specific hostname you're checking.
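One way to catch this is to compare the hostnames you care about against the SAN list. The sketch below is an assumption-laden illustration, not a full RFC 6125 matcher: `name_matches` and `uncovered_hostnames` are hypothetical helper names, and the wildcard rule is simplified to "a leading `*.` covers exactly one label".

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Match one SAN entry against a hostname.

    A leading "*." covers exactly one left-most label, so *.example.com
    matches api.example.com but not example.com or a.b.example.com.
    """
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        return pattern == hostname
    head, _, rest = hostname.partition(".")
    return bool(head) and rest == pattern[2:]


def uncovered_hostnames(san_names: list[str], monitored: list[str]) -> list[str]:
    """Return the monitored hostnames that no SAN entry covers."""
    return [h for h in monitored if not any(name_matches(p, h) for p in san_names)]


san = ["example.com", "www.example.com", "api.example.com"]
print(uncovered_hostnames(san, ["www.example.com", "admin.example.com"]))
```

Anything this returns is a hostname your users can hit that the certificate will not vouch for, regardless of how many days are left on it.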
Building a Certificate Monitor
The simplest version is a shell script:
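#!/bin/bash
DOMAIN="example.com"
PORT=443
ALERT_DAYS=30

# Get expiry date (-servername sends SNI so the right certificate comes back)
EXPIRY=$(echo | openssl s_client -servername "$DOMAIN" -connect "$DOMAIN:$PORT" 2>/dev/null | \
  openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)

# Stop if the connection or parse failed, rather than doing math on an empty date
if [ -z "$EXPIRY" ]; then
  echo "Could not read certificate for $DOMAIN" >&2
  exit 1
fi

# Convert to epoch timestamp (date -d is GNU date; BSD/macOS date needs different flags)
EXPIRY_EPOCH=$(date -d "$EXPIRY" +%s)
NOW_EPOCH=$(date +%s)
DAYS_LEFT=$(( (EXPIRY_EPOCH - NOW_EPOCH) / 86400 ))

if [ "$DAYS_LEFT" -lt "$ALERT_DAYS" ]; then
  echo "Certificate for $DOMAIN expires in $DAYS_LEFT days!"
  # Send alert (email, Slack, etc.)
fi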
Run this daily via cron:
0 9 * * * /usr/local/bin/check-cert.sh
But this has problems:
- Only checks when the cron runs
- No historical data
- Doesn't handle multiple domains
- No retry logic for network blips
- Alert fatigue (you get the same alert every day)
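Two of those gaps, multiple domains and retries, are easy to sketch. The Python below is a rough illustration rather than a finished tool: `check_with_retry`, `check_all`, and the `fake_check` stand-in are hypothetical names, and the real `check` callable would be whatever function you use to fetch days-left for a domain.

```python
import time


def check_with_retry(check, domain, attempts=3, delay=2.0):
    """Call check(domain); on OSError wait, double the delay, and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return check(domain)
        except OSError:
            # Socket timeouts, refused connections and ssl.SSLError are all OSError.
            if attempt == attempts:
                raise
            time.sleep(delay)
            delay *= 2


def check_all(check, domains, **retry_opts):
    """Check every domain; one failure never stops the rest of the run."""
    results = {}
    for domain in domains:
        try:
            results[domain] = check_with_retry(check, domain, **retry_opts)
        except OSError as exc:
            results[domain] = exc  # keep the error so it can be reported
    return results


# Demo with a stand-in check that fails once for example.com, then succeeds.
failures = {"example.com": 1}


def fake_check(domain):
    if failures.get(domain, 0) > 0:
        failures[domain] -= 1
        raise ConnectionResetError("network blip")
    return 42  # pretend days left


print(check_all(fake_check, ["example.com", "api.example.com"], delay=0))
```

Storing the exception instead of raising it means a single unreachable host shows up as one bad row in the results, not as a failed monitoring run.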
Let's level up.
A Production-Grade Solution
Here's a Node.js version that handles edge cases:
const tls = require('tls');

function checkCertificate(hostname, port = 443) {
  return new Promise((resolve, reject) => {
    const options = {
      host: hostname,
      port: port,
      servername: hostname, // SNI support
      rejectUnauthorized: false, // Check even invalid certs
    };

    const socket = tls.connect(options, () => {
      const cert = socket.getPeerCertificate();
      if (!cert || !cert.valid_to) {
        socket.end();
        return reject(new Error('No certificate found'));
      }

      const validTo = new Date(cert.valid_to);
      const validFrom = new Date(cert.valid_from);
      const now = new Date();
      const daysLeft = Math.floor((validTo - now) / (1000 * 60 * 60 * 24));

      const result = {
        hostname: hostname,
        subject: cert.subject,
        issuer: cert.issuer,
        validFrom: validFrom,
        validTo: validTo,
        daysLeft: daysLeft, // negative once the cert has expired
        isValid: now >= validFrom && now <= validTo,
        subjectAltNames: cert.subjectaltname,
        serialNumber: cert.serialNumber,
        fingerprint: cert.fingerprint
      };

      socket.end();
      resolve(result);
    });

    socket.on('error', (err) => {
      reject(err);
    });

    socket.setTimeout(10000, () => {
      socket.destroy();
      reject(new Error('Connection timeout'));
    });
  });
}

// Usage (sendAlert is defined in the alert thresholds section)
checkCertificate('example.com')
  .then(cert => {
    console.log(`Certificate for ${cert.hostname}:`);
    console.log(`  Expires: ${cert.validTo}`);
    console.log(`  Days left: ${cert.daysLeft}`);
    console.log(`  Issuer: ${cert.issuer.O}`);
    if (cert.daysLeft < 30) {
      sendAlert(cert);
    }
  })
  .catch(err => console.error('Check failed:', err.message));
Python Alternative
import socket
import ssl
from datetime import datetime, timezone

def check_certificate(hostname, port=443):
    # The default context verifies the certificate, so an already-expired
    # cert raises ssl.SSLCertVerificationError here instead of returning data
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as ssock:
            cert = ssock.getpeercert()

    # Parse expiry; notAfter is in GMT, so compare against UTC, not local time
    not_after = datetime.strptime(
        cert['notAfter'], '%b %d %H:%M:%S %Y %Z'
    ).replace(tzinfo=timezone.utc)
    days_left = (not_after - datetime.now(timezone.utc)).days

    return {
        'hostname': hostname,
        'subject': dict(x[0] for x in cert['subject']),
        'issuer': dict(x[0] for x in cert['issuer']),
        'valid_to': not_after,
        'days_left': days_left,
        'serial_number': cert['serialNumber'],
        'san': cert.get('subjectAltName', [])
    }

# Usage (send_alert is a placeholder for your own notification code)
cert = check_certificate('example.com')
print(f"Certificate expires in {cert['days_left']} days")
if cert['days_left'] < 30:
    send_alert(cert)
Alert Thresholds: A Practical Strategy
Don't just alert at "30 days". Use multiple thresholds with escalating urgency:
- 60 days -- Informational (Slack channel, low priority)
- 30 days -- Warning (Email to team)
- 14 days -- Urgent (Email + Slack mention)
- 7 days -- Critical (SMS/PagerDuty to on-call)
- 3 days -- Emergency (Wake everyone up)
function getAlertLevel(daysLeft) {
  if (daysLeft <= 3) return 'EMERGENCY';
  if (daysLeft <= 7) return 'CRITICAL';
  if (daysLeft <= 14) return 'URGENT';
  if (daysLeft <= 30) return 'WARNING';
  if (daysLeft <= 60) return 'INFO';
  return 'OK';
}

// sendSMS, sendEmail and sendSlack are placeholders for your own integrations
function sendAlert(cert) {
  const level = getAlertLevel(cert.daysLeft);
  const message = `${level}: Certificate for ${cert.hostname} expires in ${cert.daysLeft} days (${cert.validTo})`;

  switch (level) {
    case 'EMERGENCY':
      sendSMS(message);
      sendEmail(message);
      sendSlack(message, '@channel');
      break;
    case 'CRITICAL':
      sendSMS(message); // Page the on-call
      sendEmail(message);
      sendSlack(message, '@channel');
      break;
    case 'URGENT':
      sendEmail(message);
      sendSlack(message, '@channel'); // Slack mention
      break;
    case 'WARNING':
      sendEmail(message);
      break;
    case 'INFO':
      sendSlack(message); // No email spam
      break;
  }
}
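Thresholds alone don't fix the "same alert every day" problem from the cron version. One possible approach, sketched here in Python, is to remember the last level sent per hostname and only notify when the level escalates. The `should_notify` helper and the in-memory `last_sent` dict are assumptions for illustration; a real monitor would persist that state somewhere.

```python
LEVELS = ["OK", "INFO", "WARNING", "URGENT", "CRITICAL", "EMERGENCY"]
THRESHOLDS = [(3, "EMERGENCY"), (7, "CRITICAL"), (14, "URGENT"),
              (30, "WARNING"), (60, "INFO")]


def alert_level(days_left: int) -> str:
    """Same mapping as the thresholds above: the first limit reached wins."""
    for limit, level in THRESHOLDS:
        if days_left <= limit:
            return level
    return "OK"


def should_notify(last_sent: dict, hostname: str, days_left: int) -> bool:
    """Return True only when the level is more severe than the last one recorded."""
    level = alert_level(days_left)
    previous = last_sent.get(hostname, "OK")
    last_sent[hostname] = level  # a renewal drops this back to OK
    return LEVELS.index(level) > LEVELS.index(previous)


state = {}
for days in (45, 44, 30, 29, 89, 2):
    print(days, alert_level(days), should_notify(state, "example.com", days))
```

This sends one message per threshold crossing. If you would rather keep nagging daily once a cert reaches CRITICAL, that is a one-line change in `should_notify`.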
Edge Cases That Will Bite You
1. Certificate Chains
A valid server cert with an expired intermediate certificate will fail:
Root CA (trusted)
  Intermediate CA (EXPIRED!)
    Server Cert (valid)
Always check the entire chain:
// socket is the connected TLS socket, as in checkCertificate above
const cert = socket.getPeerCertificate(true); // true = include chain
const chain = [cert];
let current = cert;

// A self-signed root lists itself as its own issuer, which ends the walk
while (current.issuerCertificate && current !== current.issuerCertificate) {
  chain.push(current.issuerCertificate);
  current = current.issuerCertificate;
}

// Check each cert in the chain
chain.forEach((c, i) => {
  const validTo = new Date(c.valid_to);
  const daysLeft = Math.floor((validTo - Date.now()) / 86400000);
  console.log(`  [${i}] ${c.subject.CN} - ${daysLeft} days left`);
});
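The practical consequence is that the number your monitor should alert on is the earliest expiry anywhere in the chain, not the leaf's. Here is a small Python sketch of that idea; `chain_days_left` is a hypothetical helper, the dates are made up, and it assumes you have already parsed each certificate's expiry into a timezone-aware datetime.

```python
from datetime import datetime, timezone


def chain_days_left(chain, now=None):
    """Return (label, days_left) for the chain link that expires first.

    chain is a list of (label, not_after) pairs with timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    label, not_after = min(chain, key=lambda link: link[1])
    return label, (not_after - now).days


# Made-up dates for illustration only.
utc = timezone.utc
chain = [
    ("Server Cert", datetime(2026, 4, 15, 23, 59, 59, tzinfo=utc)),
    ("Intermediate CA", datetime(2026, 3, 10, 0, 0, 0, tzinfo=utc)),
    ("Root CA", datetime(2035, 1, 1, 0, 0, 0, tzinfo=utc)),
]
print(chain_days_left(chain, now=datetime(2026, 3, 1, 0, 0, 0, tzinfo=utc)))
```

Feeding that single number into your alert thresholds means an aging intermediate escalates exactly like an aging server cert would.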
2. Wildcard Certificates
A wildcard cert (*.example.com) covers:
- api.example.com
- www.example.com
- admin.example.com
But NOT:
- example.com (root domain)
- api.staging.example.com (nested subdomain)
Monitor both the wildcard AND the root domain separately.
3. SNI (Server Name Indication)
Modern servers host multiple domains on one IP. Without SNI, you might check the wrong certificate:
# Wrong (no SNI)
openssl s_client -connect 192.0.2.1:443
# Right (with SNI)
openssl s_client -connect 192.0.2.1:443 -servername example.com
Always specify -servername or the equivalent in your programming language.
4. Load Balancers and CDNs
You have:
- Origin server cert (your server)
- Load balancer cert (AWS ELB, Cloudflare, etc.)
- CDN edge cert (Cloudflare, Fastly, etc.)
Monitor the certificate that users actually see, not just your origin. Check from outside your infrastructure.
5. Internal Services
Don't forget:
- Database SSL connections
- Redis TLS
- Internal APIs
- VPNs
- LDAP/LDAPS
- SMTP with STARTTLS
These often use self-signed or internal CA certs that expire quietly.
Let's Encrypt: Automation & Monitoring
Let's Encrypt changed the game with free, automated certificates. But automation doesn't mean "set and forget".
ACME Renewal Monitoring
Certbot (the Let's Encrypt client) auto-renews certs. But what if renewal fails?
# Check recent renewal runs (certbot.timer triggers certbot.service, which does the logging)
journalctl -u certbot.service --since "7 days ago" | grep -i error
# Or parse certbot's logs
grep -i "fail" /var/log/letsencrypt/letsencrypt.log
Better: Monitor the certificate expiry and renewal success separately:
const fs = require('fs');

function checkLetsEncryptRenewal(domain) {
  const renewalConfig = `/etc/letsencrypt/renewal/${domain}.conf`;

  if (!fs.existsSync(renewalConfig)) {
    return { error: 'Renewal config not found' };
  }

  const stats = fs.statSync(renewalConfig);
  const daysSinceUpdate = (Date.now() - stats.mtimeMs) / 86400000;

  // Certbot renews 90-day certs at about 60 days old, so a config
  // untouched for longer than that suggests renewal is not running
  if (daysSinceUpdate > 60) {
    return {
      warning: `Renewal config not updated in ${Math.floor(daysSinceUpdate)} days`
    };
  }

  return { ok: true };
}
Common Let's Encrypt Failures
- HTTP-01 challenge blocked -- Firewall rules, CDN config
- DNS-01 fails -- API rate limits, DNS propagation
- Webroot permissions -- Certbot can't write to .well-known/acme-challenge/
- Rate limits hit -- Too many renewals in a week
Monitor renewal logs and alert on failure patterns.
Dashboard: Making Data Actionable
Raw alerts are good. A dashboard is better. Track:
- Certificates expiring soon (sorted by days left)
- Recently renewed (confirm automation is working)
- Issuer distribution (Are you still using that old CA?)
- Expiry timeline (visual calendar of upcoming expirations)
Simple HTML dashboard:
async function generateDashboard(domains) {
  const checks = await Promise.all(
    domains.map(d => checkCertificate(d).catch(err => ({
      hostname: d,
      error: err.message
    })))
  );

  // Sort by days left (ascending). Failed checks have no daysLeft and sort
  // first; ?? (not ||) keeps a cert with 0 days left from being treated as missing
  const sortKey = c => c.daysLeft ?? Number.MIN_SAFE_INTEGER;
  checks.sort((a, b) => sortKey(a) - sortKey(b));

  const html = `
    <h1>SSL Certificate Dashboard</h1>
    <table>
      <tr><th>Domain</th><th>Days Left</th><th>Expires</th><th>Issuer</th></tr>
      ${checks.map(c => `
        <tr class="${getRowClass(c.daysLeft)}">
          <td>${c.hostname}</td>
          <td>${c.daysLeft ?? 'ERROR'}</td>
          <td>${c.validTo?.toLocaleDateString() || '-'}</td>
          <td>${c.issuer?.O || '-'}</td>
        </tr>
      `).join('')}
    </table>
  `;
  return html;
}

function getRowClass(daysLeft) {
  if (daysLeft == null) return 'error'; // check failed, no data
  if (daysLeft < 7) return 'critical';  // includes 0 and negative (expired)
  if (daysLeft < 14) return 'urgent';
  if (daysLeft < 30) return 'warning';
  return 'ok';
}
Pro Tips from Production
1. Check from Multiple Locations
Your cert might be valid from your office but expired on your CDN edge nodes:
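const locations = [
  'us-east.probe.example.com',
  'eu-west.probe.example.com',
  'ap-southeast.probe.example.com'
];

// checkCertificateViaProxy is a placeholder for however you run a check
// from each probe; this loop needs to run inside an async function
for (const probe of locations) {
  const cert = await checkCertificateViaProxy(probe, 'example.com');
  // Compare results
}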
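The "compare results" step can be as simple as finding the probes that disagree with the majority. A minimal Python sketch, assuming each probe reports a fingerprint string for the certificate it saw (the `outlier_locations` name and the short fingerprints are made up for the example):

```python
from collections import Counter


def outlier_locations(results: dict[str, str]) -> list[str]:
    """Return the probe locations whose fingerprint differs from the most common one."""
    if not results:
        return []
    majority, _ = Counter(results.values()).most_common(1)[0]
    return sorted(loc for loc, fp in results.items() if fp != majority)


# Made-up probe results: location -> fingerprint of the cert it was served.
results = {
    "us-east": "AA:11",
    "eu-west": "AA:11",
    "ap-southeast": "BB:22",
}
print(outlier_locations(results))
```

An outlier is not always a problem (a rollout may simply be in progress), but it tells you exactly which edge to check for a stale or expired certificate.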
2. Track Certificate Changes
Alert when certificates change unexpectedly (potential security issue):
// db is a placeholder for whatever key-value store you use
const lastKnownFingerprint = db.get('cert_fingerprint', domain);
const currentFingerprint = cert.fingerprint;

if (lastKnownFingerprint && lastKnownFingerprint !== currentFingerprint) {
  sendSecurityAlert(`Certificate changed for ${domain}!`);
}

db.set('cert_fingerprint', domain, currentFingerprint);
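If you are doing this from Python, you need to compute the fingerprint yourself. The sketch below hashes the DER-encoded certificate (what `ssock.getpeercert(binary_form=True)` returns) with SHA-256; the `sha256_fingerprint` and `fingerprint_changed` names and the plain dict standing in for a database are assumptions for illustration.

```python
import hashlib


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Colon-separated, upper-case SHA-256 hex digest of a DER-encoded certificate."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_changed(store: dict, domain: str, der_bytes: bytes) -> bool:
    """Record the current fingerprint; return True if it differs from the last one seen."""
    current = sha256_fingerprint(der_bytes)
    previous = store.get(domain)
    store[domain] = current
    return previous is not None and previous != current


# Stand-in bytes; real input would be the DER certificate from the TLS socket.
store = {}
print(fingerprint_changed(store, "example.com", b"old-cert-bytes"))
print(fingerprint_changed(store, "example.com", b"new-cert-bytes"))
```

Keep in mind that every scheduled renewal changes the fingerprint too, so this alert is most useful when you can cross-check it against your renewal log before treating it as a security event.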
3. Test Renewal Before It's Urgent
Don't wait until 7 days before expiry to test renewal. Run a dry-run monthly:
certbot renew --dry-run --quiet || echo "Renewal will fail!"
4. Document Your Renewal Process
When the cert expires at 3 AM and the person who set it up left the company 6 months ago, you need documentation:
# SSL Certificate Renewal - example.com
**Provider:** Let's Encrypt
**Renewal method:** Certbot with HTTP-01 challenge
**Auto-renewal:** Yes (certbot.timer systemd service)
**Manual renewal:** `sudo certbot renew --force-renewal`
**Webroot:** `/var/www/example.com/`
**On-call:** ops@example.com
**Last manual renewal:** 2026-01-15 by @john
Tools Worth Knowing
- SSL Labs -- Free online checker (https://www.ssllabs.com/ssltest/)
- testssl.sh -- Comprehensive SSL/TLS scanner
- Certbot -- Let's Encrypt client
- cert-manager -- Kubernetes cert automation
- AWS Certificate Manager -- Managed certs for AWS
And of course, dedicated monitoring services exist (I'll leave product names out of this, but they're worth Googling).
The Post-Incident Checklist
After that 3 AM disaster, here's what we implemented:
- Automated monitoring for all domains (check daily)
- Multi-threshold alerts (60/30/14/7/3 days)
- Multiple alert channels (email, Slack, SMS)
- Dashboard showing all certs at a glance
- Documentation for manual renewal
- Automated renewal (Let's Encrypt)
- Renewal monitoring (separate from expiry monitoring)
- On-call runbook for cert emergencies
- Monthly dry-run renewal tests
- Certificate inventory (every cert, every service)
We've had zero certificate-related outages in the two years since.
Conclusion
SSL certificates will expire. It's not if, it's when. The question is: will you know about it 30 days in advance, or at 3 AM on a Saturday when everything is on fire?
Monitoring isn't hard:
- Check expiry dates regularly (daily is fine)
- Alert with escalating urgency
- Automate renewal where possible
- Monitor renewal success separately
- Document everything
The 3 AM outage is 100% preventable. Don't let it happen to you.
Have you had a certificate disaster story? Share it in the comments. Misery loves company, and we all learn from each other's mistakes.
Building certificate monitoring? What challenges are you facing? Drop a comment and let's discuss.