
Posted on Apr 22 • Originally published at certpulse.dev

TLS Certificate Expiry: Detection, Renewal, and the 47-Day Future

#certificate #expiry #validity #period

The cert expired on a Saturday at 02:14 UTC — not a Tuesday, not during business hours. That's when I learned our paging rotation had a gap between outgoing and incoming on-calls. By the time we deployed a valid cert, we'd lost 97 minutes of checkout traffic and two SOC 2 evidence items. TLS certificate expiry is the most predictable outage in production infrastructure, and it's about to get 8x noisier as validity periods drop from 398 days to 47 by 2029.

This guide covers what expiry means at the X.509 level, how the CA/Browser Forum's phased reduction changes day-to-day operations, and the detection, renewal, and inventory practices that survive when you're renewing every 31 days instead of every year.

What TLS Certificate Expiry Actually Means

A TLS certificate expires when the current time passes the notAfter field in its X.509 structure. After that point, compliant clients refuse the handshake with errors like NET::ERR_CERT_DATE_INVALID (Chromium) or SEC_ERROR_EXPIRED_CERTIFICATE (Firefox). The cert isn't revoked — just outside its validity window. In my experience post-morteming outages over the last three years, roughly 1 in 5 traces back to this single field.

The notBefore and notAfter fields

Every X.509 certificate carries a Validity sequence with two UTCTime or GeneralizedTime values: notBefore and notAfter (RFC 5280, section 4.1.2.5). Read them yourself:

openssl x509 -in cert.pem -noout -dates
notBefore=Feb 14 00:00:00 2026 GMT
notAfter=May 15 23:59:59 2026 GMT

Against a live endpoint:

echo | openssl s_client -servername example.com -connect example.com:443 \
  2>/dev/null | openssl x509 -noout -dates

That's the authoritative source. Everything else — the dashboard, the monitoring alert, the spreadsheet your predecessor left behind — is a derivative view of those two byte sequences, and any of them can drift.
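If you want to turn those two lines into a days-remaining number inside a script, one possible sketch is below. It assumes the exact output format shown above; parse_openssl_dates and days_remaining are hypothetical helper names, and the fixed "now" is there only to make the example repeatable.

```python
import ssl
from datetime import datetime, timezone


def parse_openssl_dates(output: str) -> tuple[datetime, datetime]:
    """Parse the notBefore/notAfter lines printed by `openssl x509 -noout -dates`."""
    fields = dict(line.split("=", 1) for line in output.strip().splitlines())

    def to_utc(text: str) -> datetime:
        # ssl.cert_time_to_seconds parses the "Mon DD HH:MM:SS YYYY GMT" form.
        return datetime.fromtimestamp(ssl.cert_time_to_seconds(text), tz=timezone.utc)

    return to_utc(fields["notBefore"]), to_utc(fields["notAfter"])


def days_remaining(not_after: datetime, now: datetime) -> float:
    """Days until notAfter; negative once the cert has expired."""
    return (not_after - now).total_seconds() / 86400


sample = "notBefore=Feb 14 00:00:00 2026 GMT\nnotAfter=May 15 23:59:59 2026 GMT\n"
not_before, not_after = parse_openssl_dates(sample)
now = datetime(2026, 5, 1, tzinfo=timezone.utc)  # fixed "now" so the example is repeatable
print(f"{days_remaining(not_after, now):.1f} days left")
```

In a real check you would pass datetime.now(timezone.utc) instead of the fixed value.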

What browsers and clients do at expiry

The moment wall-clock time crosses notAfter, compliant TLS clients fail the handshake. Specific behaviors:

Chrome: throws NET::ERR_CERT_DATE_INVALID; since 2017 treats expiry as hard fail with no click-through on HSTS sites
curl: returns exit code 60 with "certificate has expired"
Go crypto/tls: returns x509: certificate has expired or is not yet valid; there is no "ignore expiry" flag, so skipping the check means setting InsecureSkipVerify and re-implementing verification in a custom VerifyConnection callback
Clock-skewed clients: hit the error early or late — I've seen a Windows Server with 47 minutes of NTP drift take down an mTLS link that was still technically valid
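The clock-skew case is easy to reproduce on paper. The sketch below is an illustration only: the timestamps are made up, is_within_validity is a hypothetical helper, and the inclusive comparison follows RFC 5280's definition of the validity period.

```python
from datetime import datetime, timedelta, timezone


def is_within_validity(not_before: datetime, not_after: datetime, client_now: datetime) -> bool:
    """RFC 5280 treats the validity period as inclusive at both ends."""
    return not_before <= client_now <= not_after


not_before = datetime(2026, 2, 14, 0, 0, 0, tzinfo=timezone.utc)
not_after = datetime(2026, 5, 15, 23, 59, 59, tzinfo=timezone.utc)

true_time = datetime(2026, 5, 15, 23, 30, 0, tzinfo=timezone.utc)  # cert still valid
fast_clock = true_time + timedelta(minutes=47)                      # client running 47 min fast

print(is_within_validity(not_before, not_after, true_time))
print(is_within_validity(not_before, not_after, fast_clock))
```

The same comparison fails in the other direction for a client whose clock runs slow right after issuance, which is where the "not yet valid" half of the Go error message comes from.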

Why expiry exists (and why it's getting shorter)

Bounded validity is a defense-in-depth measure against two realities: keys get compromised, and revocation checking is unreliable. CRLs are too large, OCSP stapling is broken on roughly half of production endpoints, and soft-fail revocation means a determined attacker can often suppress the check. Short lifetimes cap the damage window when the revocation channel fails.

The tradeoff is operational burden. The industry is choosing a burden it can automate over a compromise it can't detect, which puts TLS certificate validity on a one-way trip downward.

The Shrinking Validity Timeline: 398 to 47 Days

Under CA/Browser Forum ballot SC-081 (adopted April 2025), public TLS certificate maximum lifetimes drop on this schedule:

Date	Max Validity	DCV Reuse
Before March 15, 2026	398 days	398 days
March 15, 2026	200 days	200 days
March 15, 2027	100 days	100 days
March 15, 2029	47 days	10 days

The DCV reduction breaks more workflows than the lifetime reduction does.
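For pipelines that need to know which ceiling applies to a given issuance date, the table can be encoded directly. This is a minimal sketch: SC081_SCHEDULE and limits_for are hypothetical names, and the values are copied from the table above rather than from any CA's API.

```python
from datetime import date

# (effective date, max validity in days, max DCV reuse in days), newest first.
SC081_SCHEDULE = [
    (date(2029, 3, 15), 47, 10),
    (date(2027, 3, 15), 100, 100),
    (date(2026, 3, 15), 200, 200),
]
BEFORE_SCHEDULE = (398, 398)


def limits_for(issued_on: date) -> tuple[int, int]:
    """Return (max validity days, max DCV reuse days) for a cert issued on this date."""
    for effective, max_validity, dcv_reuse in SC081_SCHEDULE:
        if issued_on >= effective:
            return max_validity, dcv_reuse
    return BEFORE_SCHEDULE


for d in (date(2026, 3, 14), date(2026, 3, 15), date(2028, 1, 1), date(2029, 3, 15)):
    print(d.isoformat(), limits_for(d))
```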

March 2026: 200 days

Starting March 15, 2026, newly issued public TLS certificates can have a maximum validity of 200 days. DCV reuse drops to 200 days as well. This is the warm-up: most semi-automated pipelines that renew at 60-90 days won't notice the ceiling. Shops still renewing annually will.

March 2027: 100 days

March 15, 2027 drops the ceiling to 100 days and DCV reuse to 100 days. This is where manual renewal stops being viable for any fleet above about 20 certs. Annual calendar reminders fail. Quarterly isn't frequent enough. You're issuing at least four times per year per cert (five to six with a 2/3 renewal trigger), and any gap in your process becomes an incident.

March 2029: 47 days

March 15, 2029 finalizes the ceiling at 47 days and DCV reuse at 10 days. At 47-day validity, the industry is effectively mandating what Let's Encrypt has done since 2015. There's no longer a meaningful distinction between "automated" and "not your problem." If you're reading about the 47-day certificate timeline in 2028, you are already late.

Why domain validation reuse drops to 10 days

DCV reuse drops from 398 days to 10 days by 2029, and it is the change most coverage underplays. Under the pre-2026 rules, once you pass domain control validation (via HTTP-01, DNS-01, or TLS-ALPN-01), your CA can reuse that validation for 398 days, issuing new certs without re-checking. After SC-081 fully rolls out, you get 10 days.

Practically: if your renewal flow depends on a human updating a DNS TXT record once a year, you now need to automate DNS updates or switch to HTTP-01 with a stable webroot. Any automation that "works" because it reuses cached DCV starts failing every 11th day. Enterprise PKI workflows that treat DCV as a quarterly ticket will break first.
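To see why cached validation stops helping, it is enough to count how many renewals fall outside the reuse window. The sketch below is a simplified model, not any CA's actual logic: needs_fresh_dcv and fresh_validations are hypothetical helpers, and it assumes a validation that is exactly reuse_days old can still be reused.

```python
from datetime import date, timedelta


def needs_fresh_dcv(validated_on: date, issuing_on: date, reuse_days: int) -> bool:
    """True when the cached validation is older than the reuse window."""
    return (issuing_on - validated_on).days > reuse_days


def fresh_validations(first: date, renew_every_days: int, renewals: int, reuse_days: int) -> int:
    """Count renewals that cannot reuse the most recent validation."""
    last_validated = first
    count = 0
    for n in range(1, renewals + 1):
        issuing_on = first + timedelta(days=n * renew_every_days)
        if needs_fresh_dcv(last_validated, issuing_on, reuse_days):
            count += 1
            last_validated = issuing_on
    return count


start = date(2029, 4, 1)
print(fresh_validations(start, 31, 12, 10))    # 47-day certs renewed every 31 days, 10-day reuse
print(fresh_validations(start, 265, 3, 398))   # 398-day certs renewed at 2/3, 398-day reuse
```

Under the 10-day window every single renewal needs a fresh validation, which is why DCV has to be part of the automated path rather than a separate ticket.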

How to Check TLS Certificate Expiration at Scale

Detection requires three layers: single-endpoint openssl checks for debugging, fleet-wide scanning for public surface area, and explicit discovery for internal PKI, mTLS endpoints, and certs embedded in container images or IoT firmware. In my experience running this across enterprise fleets, layer three is where 80% of the surprise expiries live — the outage always comes from the cert nobody was watching.

Command-line checks (openssl, curl, nmap)

For one-off debugging, openssl is authoritative. To check TLS certificate expiration on a live host:

openssl s_client -servername api.example.com -connect api.example.com:443 \
  </dev/null 2>/dev/null \
  | openssl x509 -noout -enddate -subject -issuer

For a pass/fail days-remaining check:

openssl s_client -connect api.example.com:443 </dev/null 2>/dev/null \
  | openssl x509 -noout -checkend $((30*86400)) \
  && echo "OK" || echo "EXPIRES WITHIN 30 DAYS"

nmap is useful for sweeping several ports on a host when you don't know which ones speak TLS: nmap --script ssl-cert -p 443,8443,4443,6443 api.example.com.

Monitoring at scale

A one-liner per cert doesn't scale past 50 endpoints. The pattern that works:

Weekly discovery of new endpoints from CT logs and cloud APIs
Daily expiry check against the full inventory
Per-cert metrics labelled with owner, service, and CA
Prometheus + blackbox exporter handles the expiry check natively via probe_ssl_earliest_cert_expiry
Alert at 30/14/7/1 days, page around the clock only on the 1-day alert

For deeper context on failure modes past "is it expired," the SSL Certificate Checker guide covers the rest.

The endpoints you forget

Every cert incident I've post-mortemed came from an endpoint that wasn't in the inventory. The usual suspects:

Internal PKI endpoints signed by a private root, typically on non-443 ports
mTLS client certs embedded in service-mesh sidecars
Certs baked into container images (expired at build time, discovered at runtime)
Load balancer listener certs (present in ACM, invisible to external scans)
Certs on appliances: network gear, storage arrays, IPMI controllers
Signing certs for code, JWTs, and SAML assertions (not TLS, same expiry pattern)

Renewal Strategies That Survive 47-Day Validity

At 47-day validity with a 2/3 renewal trigger, you're renewing every ~31 days. For a fleet of 500 certs that's ~16 renewals a day, seven days a week. Manual renewal is dead. Cron plus certbot works up to a few dozen certs; beyond that you need orchestration, staggering, and retry logic. Certificate renewal automation stops being a nice-to-have the day your ops team hits its first all-day cert renewal sprint.
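The arithmetic behind those numbers, as a small sketch you can rerun for your own fleet size. renewal_load is a hypothetical helper, and it assumes renewals are spread evenly over time, which is the best case.

```python
def renewal_load(fleet_size: int, validity_days: int, trigger_fraction: float = 2 / 3) -> tuple[float, float]:
    """Return (days between renewals per cert, average fleet renewals per day)."""
    interval_days = validity_days * trigger_fraction
    return interval_days, fleet_size / interval_days


for validity in (398, 200, 100, 47):
    interval, per_day = renewal_load(500, validity)
    print(f"{validity:>3}-day certs: renew every {interval:5.1f} days, {per_day:4.1f} renewals/day")
```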

ACME and full automation

For CAs that support it, the ACME protocol is the only approach that scales. Let's Encrypt, ZeroSSL, Google Trust Services, and Sectigo all support ACME for public TLS. cert-manager handles Kubernetes, Caddy handles edge, and certbot with deploy hooks covers everything else.

The two pitfalls I see most often:

Clients that don't retry on CA rate limits — Let's Encrypt caps at 300 new orders per account per 3 hours
Renewal jobs that succeed but never deploy the new cert — the renewal-deployment gap causes more outages than failed renewals themselves
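One way to close the renewal-deployment gap is to compare what the renewal produced with what the endpoint actually serves. The following is a sketch under assumptions: fingerprint, fetch_served_pem, and is_deployed are hypothetical helper names, and ssl.get_server_certificate may not send SNI on every Python version, so verify that it returns the cert you expect for name-based virtual hosts.

```python
import hashlib
import ssl


def fingerprint(pem_cert: str) -> str:
    """SHA-256 fingerprint of a PEM certificate's DER bytes."""
    return hashlib.sha256(ssl.PEM_cert_to_DER_cert(pem_cert)).hexdigest()


def fetch_served_pem(host: str, port: int = 443) -> str:
    """Fetch the leaf certificate an endpoint is serving, without validating it."""
    return ssl.get_server_certificate((host, port))


def is_deployed(renewed_pem: str, served_pem: str) -> bool:
    """True when the endpoint serves the same certificate the renewal produced."""
    return fingerprint(renewed_pem) == fingerprint(served_pem)
```

A deploy hook could call is_deployed(renewed_pem, fetch_served_pem("api.example.com")) after reloading the server and fail the job when it returns False.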

Dealing with non-ACME CAs

Enterprise CAs like DigiCert, Entrust, and GlobalSign offer REST APIs but rarely full ACME. You end up writing glue. The honest answer: budget a week of engineering time per CA to build and test the automation, then re-budget quarterly as the CA changes their API. Or move workloads that don't need EV/OV certs to a CA that supports ACME.

Handling pinned certificates and embedded devices

Pinning a leaf certificate and 47-day validity are incompatible. Your three options:

Remove the pin (preferred)
Pin to a long-lived intermediate or root instead of the leaf
Run your own internal CA with a multi-year leaf for that specific client

Embedded devices with hard-coded CAs in firmware don't have a clean answer; plan fleet firmware updates as part of your cert strategy.

Staging renewals at 2/3 of validity

The industry rule of thumb is to renew at 2/3 of validity: around day 60 of a 90-day cert (30 days before expiry) and around day 31 of a 47-day cert (about 16 days before expiry). This leaves a retry window of one third of the lifetime, enough to catch two failed renewal attempts before the danger zone.

At 47-day validity, stagger renewals across the week. Every cert renewing at the same 03:00 UTC will thundering-herd your CA and hit DCV rate limits.
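A simple way to stagger without coordination is to derive each cert's renewal slot from a hash of its name. This is one possible scheme, not a standard: renewal_slot and describe are hypothetical helpers, and the hour-of-week granularity is an arbitrary choice.

```python
import hashlib

HOURS_PER_WEEK = 7 * 24


def renewal_slot(cert_name: str, slots: int = HOURS_PER_WEEK) -> int:
    """Map a cert name to a stable hour-of-week slot (0 = Monday 00:00 UTC)."""
    digest = hashlib.sha256(cert_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % slots


def describe(slot: int) -> str:
    """Render an hour-of-week slot as a weekday and hour."""
    days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    return f"{days[slot // 24]} {slot % 24:02d}:00 UTC"


for name in ("api.example.com", "checkout.example.com", "mesh-sidecar-17"):
    print(name, "->", describe(renewal_slot(name)))
```

Because the slot depends only on the name, every scheduler instance computes the same answer, and adding or removing certs does not reshuffle the rest of the fleet.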

The Real Cost of a Missed Renewal

An expired SSL certificate cost Microsoft Teams a ~3 hour global outage in February 2020, LinkedIn hours of cert warnings in November 2021, and Starlink ~5 hours of network downtime in April 2023. Incident cost roughly follows MTTR × revenue-per-hour, plus reputational decay. Expired-cert MTTR is typically longer than normal outage MTTR because the fix requires CA issuance and sometimes DNS propagation.

Customer-facing outages

Named examples from the public record:

Microsoft Teams, February 3, 2020: auth service cert expired, ~3 hour global outage
LinkedIn, November 2021: multiple subdomains served expired certs for several hours
Starlink, April 2023: expired ground-segment cert took the network offline ~5 hours globally
Ericsson, December 2018: expired cert in SGSN-MME software knocked O2 UK and SoftBank offline for most of a day, affecting ~32M users

For an e-commerce site doing $2M/day at 3 hours MTTR, direct revenue loss alone is roughly $250K. The reputational tail is worse than a normal outage because the browser literally tells the user "NOT SECURE" in red text.
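The estimate is plain proportional arithmetic, sketched here so you can plug in your own figures. direct_revenue_loss is a hypothetical helper, and it assumes revenue is spread evenly across the day, which understates a peak-hour outage.

```python
def direct_revenue_loss(revenue_per_day: float, outage_hours: float) -> float:
    """Revenue lost during the outage, assuming a flat hourly rate."""
    return revenue_per_day * outage_hours / 24


print(f"${direct_revenue_loss(2_000_000, 3):,.0f}")
```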

Internal service failures and cascading timeouts

Internal mTLS expiry is the quieter sibling. When a mesh cert expires, the first symptom is handshake failures; the second is retries building queue depth upstream. I've watched an expired cert in a payment service cause cascading timeouts in checkout, inventory, and notifications over 40 minutes before the on-call traced it to the SSL certificate expiry date on one sidecar.

SOC 2 and compliance implications

Most SOC 2 Type 2 audits include a control around encryption in transit. An expired cert in production is a finding. Auditors want to see monitoring evidence, renewal runbooks, and incident records. "We got lucky" is not a control.

Building a Certificate Inventory You Actually Trust

Most orgs have 15-30% more certs than their inventory knows about. Build a trustworthy inventory from three sources: Certificate Transparency logs for public certs, internal port scans for private ones, and cloud-provider APIs (AWS ACM, Azure Key Vault, GCP Certificate Manager) for managed surfaces. Ownership mapping is the part that always slips.

Discovery: CT logs, internal scans, cloud APIs

Since 2018, every publicly trusted cert gets logged to a Certificate Transparency log. Query crt.sh or parse the logs directly. Diff CT issuance against your inventory weekly; the delta is shadow IT, forgotten projects, or an attacker. All three are worth knowing about.
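The weekly diff itself is a set operation once both sides are reduced to certificate fingerprints. The sketch below assumes you have already exported fingerprints from CT and from your inventory; diff_inventory and the sample values are hypothetical.

```python
def diff_inventory(ct_fingerprints: set[str], inventory_fingerprints: set[str]) -> dict[str, set[str]]:
    """Compare certs seen in CT logs against certs the inventory knows about."""
    return {
        # Issued for your domains but untracked: shadow IT, forgotten projects, or an attacker.
        "unknown_to_inventory": ct_fingerprints - inventory_fingerprints,
        # Tracked but never seen in CT: private-PKI certs or stale inventory rows.
        "missing_from_ct": inventory_fingerprints - ct_fingerprints,
    }


ct = {"aa11", "bb22", "cc33"}
inventory = {"aa11", "bb22", "dd44"}
for bucket, items in diff_inventory(ct, inventory).items():
    print(bucket, sorted(items))
```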

For internal, run nmap on common TLS ports (443, 8443, 4443, 5671, 6443) across your CIDR blocks on a schedule. For cloud-managed certs:

AWS: aws acm list-certificates --region <region> across every region in every account
Azure: az keyvault certificate list --vault-name <vault>
GCP: gcloud certificate-manager certificates list

Ownership mapping

The technical part is easy. The organizational part is where inventories rot. Every cert needs a current human owner and a current team. Without it, alerts go to the void. The pattern that works: encode owner in a cert tag at issuance, require the tag in the renewal pipeline, re-verify ownership quarterly with a script that checks whether the owner still exists in your IdP.
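The quarterly re-verification can be as small as the sketch below. It assumes you can export the set of active usernames from your IdP by some means of your own; CertRecord and orphaned_certs are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CertRecord:
    name: str
    owner: Optional[str]  # value of the owner tag set at issuance, if any


def orphaned_certs(certs: list[CertRecord], active_users: set[str]) -> list[CertRecord]:
    """Certs with no owner tag, or whose owner no longer exists in the IdP export."""
    return [c for c in certs if not c.owner or c.owner not in active_users]


fleet = [
    CertRecord("api.example.com", "alice"),
    CertRecord("legacy.example.com", "bob"),   # bob has left the company
    CertRecord("mesh-sidecar-17", None),       # never tagged
]
for cert in orphaned_certs(fleet, active_users={"alice", "carol"}):
    print("needs an owner:", cert.name)
```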

Alerting thresholds that aren't noise

Alert at 30, 14, 7, and 1 days remaining, with escalating severity:

30 days: ticket to owner's queue, no page
14 days: ticket plus Slack to team channel, no page
7 days: page during business hours
1 day: page 24/7, wake someone up

If the first alert anyone acts on fires with less than a day left, you're relying on someone reading email on a weekend.
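The escalation ladder above maps cleanly onto a single function. This is a sketch of the routing decision only; alert_action and its return labels are hypothetical, and wiring them to a ticketing or paging system is left out.

```python
def alert_action(days_left: float) -> str:
    """Pick the escalation tier for a cert with this many days until expiry."""
    if days_left <= 1:
        return "page-24x7"
    if days_left <= 7:
        return "page-business-hours"
    if days_left <= 14:
        return "ticket-and-slack"
    if days_left <= 30:
        return "ticket"
    return "none"


for days in (45, 30, 14, 7, 1, -2):
    print(f"{days:>3} days left -> {alert_action(days)}")
```

Already-expired certs have negative days_left and fall into the top tier, which is usually what you want.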

FAQ

How do I check when a TLS certificate expires?

Run echo | openssl s_client -servername example.com -connect example.com:443 2>/dev/null | openssl x509 -noout -enddate. The output shows the notAfter field, which is the expiry timestamp. For a pass/fail check against a threshold, add -checkend $((days*86400)): exit code 0 means still valid, 1 means expires within the window.

What happens when a TLS certificate expires?

Compliant TLS clients refuse the handshake and return errors like NET::ERR_CERT_DATE_INVALID (Chrome), SEC_ERROR_EXPIRED_CERTIFICATE (Firefox), or exit code 60 from curl. The connection fails before any application data transfers. On HSTS-enabled origins, browsers offer no click-through override; the site is unreachable until a valid cert is deployed.

Can I use an expired certificate?

Only where you fully control the client and can disable certificate validation, such as internal testing with curl -k or Go's InsecureSkipVerify. Both switches turn off all certificate verification, not just the expiry check. Never in production. Public clients, CDNs, and load balancers enforce expiry as a hard failure, and regulatory frameworks like SOC 2 and PCI-DSS flag expired certs as control failures.

How long are TLS certificates valid in 2026?

As of March 15, 2026, publicly trusted TLS certificates have a maximum validity of 200 days under CA/Browser Forum ballot SC-081. This drops to 100 days in March 2027 and 47 days in March 2029. Domain control validation reuse shrinks in parallel, reaching 10 days by 2029.

Do Let's Encrypt certificates expire faster?

Let's Encrypt has issued 90-day certificates since its 2015 launch, shorter than the pre-2026 398-day public ceiling but longer than the post-2029 47-day ceiling. Let's Encrypt also offers 6-day short-lived certificates as of 2025 for advanced automation users. Most Let's Encrypt clients trigger renewal at about 30 days remaining, which is day 60 of a 90-day cert.

Closing thoughts

TLS certificate expiry is the most predictable outage in production infrastructure, and it's about to happen 8 times as often. The teams that survive the 47-day future treat renewal as a deployment pipeline instead of a calendar reminder: inventory built from CT logs and cloud APIs, alerting at 30/14/7/1 days, automation that handles the DCV-reuse reduction, and ownership that actually maps to a human. If you're managing more than a hundred certs and still babysitting renewals, the next four years are going to hurt. CertPulse monitors TLS certificates and delivers the inventory and alerting layer without writing the discovery pipeline yourself — but the operational discipline is on you either way.

DEV Community