quietpulse

Posted on Apr 2 • Originally published at quietpulse.xyz

How to Detect if a Cron Job Is Not Running (Before It Becomes a Real Problem)

#saas #monitoring #cron #devops

Your backup script was supposed to run every night. Your data import should have triggered at 6 AM. But nobody checked if they actually did. Here's how to catch a cron job not running before the damage is done.

The crontab entry is there. You set it up months ago, and since it didn't throw errors, you assumed it's been running fine ever since.

Three weeks later you realize your backups haven't been created since February. Your scheduled email digests stopped. Your database cleanup script quietly stopped running.

A cron job that isn't running is one of those problems that hides in plain sight. Unlike a server that crashes loudly, a silent cron job just doesn't happen. No error, no alert, no drama. Just a slow accumulation of missing data, unsent emails, and uncleaned state.

This article will show you exactly how to detect it and why heartbeat monitoring is the most reliable approach.

The Problem: Cron Jobs That Simply Don't Fire

Cron looks reliable on paper. You define a schedule, save it to crontab, and it "just works." But in production, things quietly break:

The server was rebooted, and the cron service never restarted
A package update changed the path to an executable
The cron daemon crashed under memory pressure
A deployment changed the script's permissions (for example, dropping the execute bit)
The machine image was replaced during a migration, and the crontab entry was lost

None of these raise an alert on their own. At best, a line lands in a log file or a local mailbox that nobody reads. Cron doesn't announce "I forgot to run this task." It just skips it.

If you're not actively checking, a cron job that isn't running can go unnoticed for weeks or months.

Why This Happens

The core issue is fundamental: cron is fire-and-forget. It executes a command at a scheduled time and provides no built-in mechanism to confirm that the command actually ran—or succeeded.

There's no callback, no heartbeat, no "I ran at 3:00 AM as scheduled" signal. The only feedback is a local log file that nobody reads.

Consider this common crontab entry:

0 3 * * * /opt/scripts/backup-db.sh > /dev/null 2>&1

That > /dev/null 2>&1 part? It discards everything the script prints, stdout and stderr alike, so error messages vanish too. You couldn't see a failure if you tried.

And even without output suppression, there's a difference between:

The cron daemon attempted to run the job (but the script failed instantly)
The cron daemon never triggered the job at all
The cron service itself stopped running

These are three different failure modes, and none of them alert you by default.
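A first pass at telling these cases apart can be done from the shell. The sketch below is only illustrative: the function names are hypothetical helpers, and it assumes the daemon process is called cron (Debian/Ubuntu) or crond (RHEL-family) and that pgrep is installed.

```shell
#!/bin/sh
# Hypothetical helpers for a quick manual diagnosis; not part of cron itself.

# Succeeds if a cron daemon process exists.
# Assumption: the process is named "cron" or "crond", depending on the distro.
cron_daemon_running() {
  pgrep -x cron >/dev/null 2>&1 || pgrep -x crond >/dev/null 2>&1
}

# Reads crontab text on stdin and succeeds if a non-comment line mentions $1.
has_active_entry() {
  grep -v '^[[:space:]]*#' | grep -F -- "$1" >/dev/null
}

# Usage (run as the user who owns the crontab):
#   cron_daemon_running || echo "cron daemon is not running"
#   crontab -l 2>/dev/null | has_active_entry backup-db.sh || echo "entry missing"
```

If the daemon is up and the entry exists, the remaining suspect is the script itself failing right after it starts.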

Why It's Dangerous

When a cron job that isn't running goes undetected, the consequences compound over time:

Data loss: Database backups stop, and when you need to restore, the most recent backup is weeks old
Revenue impact: Scheduled billing or invoicing scripts don't fire—customers don't get charged
Security gaps: Certificate renewal scripts, security scans, and log rotation scripts stop working
User-facing failures: Automated emails, notifications, and reports stop without anyone noticing
Compliance violations: Audit log exports and data retention policies aren't enforced

The worst part? You won't find out during the failure. You'll find out during the crisis.

How to Detect a Cron Job Not Running

The most reliable way to detect a cron job not running is to use the heartbeat pattern—also known as a dead man's switch.

Here's how it works:

Your script "phones home" each time it runs by sending an HTTP request to a monitoring endpoint
The monitoring service expects to receive these signals on a schedule
If the signal doesn't arrive on time, the service alerts you

This gives you a simple but powerful guarantee: if no heartbeat arrives by the expected time, you know something is wrong. The job didn't run, didn't finish, or couldn't reach the monitor. In every case you're told about it instead of left to assume.
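On the monitoring side, the check boils down to comparing the time of the last ping against a deadline. Here is a minimal sketch of that comparison, assuming timestamps in epoch seconds and an expected interval plus a grace period. The function name and parameters are made up for illustration and do not describe how any particular service implements it.

```shell
#!/bin/sh
# heartbeat_state LAST_PING NOW INTERVAL GRACE
# All values are in seconds; LAST_PING and NOW are Unix epoch times.
# Prints "ok" while NOW is within the deadline, "late" once it has passed.
heartbeat_state() {
  hs_last_ping=$1
  hs_now=$2
  hs_interval=$3
  hs_grace=$4
  # The next ping is due one interval after the last one, plus the grace period.
  hs_deadline=$((hs_last_ping + hs_interval + hs_grace))
  if [ "$hs_now" -le "$hs_deadline" ]; then
    echo ok
  else
    echo late
  fi
}

# Usage: a daily job (86400 s) with a one-hour grace period (3600 s):
#   heartbeat_state "$last_ping_epoch" "$(date +%s)" 86400 3600
```

The grace period absorbs normal variation in how long the job takes, so a backup that finishes a few minutes later than usual doesn't trigger a false alarm.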

Basic Example: curl + Monitoring Endpoint

Here's the minimal implementation:

0 3 * * * /opt/scripts/backup-db.sh && curl -fsS --retry 3 https://monitor.yourdomain.com/heartbeat/abc123

Or, if you want to confirm the script ran regardless of success or failure:

0 3 * * * ( /opt/scripts/backup-db.sh; curl -fsS --retry 3 https://monitor.yourdomain.com/heartbeat/abc123 )

The monitoring endpoint receives a hit every time the job executes. If the expected heartbeat is late or missing, you get an alert.
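If several jobs need the same treatment, the one-liners above can be folded into a small reusable function. This is a sketch under assumptions: run_with_heartbeat and send_ping are hypothetical names, and the 10-second timeout is an arbitrary choice. It pings only when the command succeeds and always hands back the command's own exit code.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the timeout value are illustrative.

# Send one heartbeat. Kept separate so it can be replaced in tests.
send_ping() {
  curl -fsS --max-time 10 --retry 3 -o /dev/null "$1"
}

# run_with_heartbeat URL COMMAND [ARGS...]
# Runs COMMAND, pings URL only if it exited 0, and returns COMMAND's exit code.
run_with_heartbeat() {
  hb_url=$1
  shift
  hb_status=0
  "$@" || hb_status=$?
  if [ "$hb_status" -eq 0 ]; then
    # A failed ping is reported on stderr but does not change the job's result.
    send_ping "$hb_url" || echo "heartbeat ping failed for $hb_url" >&2
  fi
  return "$hb_status"
}

# Usage:
#   run_with_heartbeat https://monitor.yourdomain.com/heartbeat/abc123 /opt/scripts/backup-db.sh
```

Discarding the response body with -o /dev/null also keeps cron from mailing the monitor's reply after every successful run.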

A Simple Solution That Actually Works

You could build your own monitoring endpoint—a simple web app that records the last-seen timestamp per job and checks for stale entries. But there are easier options.

Using a Dedicated Heartbeat Monitoring Service

Tools like QuietPulse are built specifically for this. You create a job, get a unique heartbeat URL, and add a single curl command to your cron script.

Example script:

#!/bin/bash
# backup-db.sh

# Stop at the first failed command so the heartbeat is only sent on success
set -euo pipefail

# Execute the actual task
pg_dump mydb > "/backups/db-$(date +%Y%m%d).sql"

# Send heartbeat confirmation
curl -fsS --retry 3 https://quietpulse.xyz/ping/your-unique-token

That's it. Two lines of monitoring for a cron job that could otherwise go silent forever.

With QuietPulse, you configure:

Minimum run interval (how often you expect the job)
Grace period (how long to wait before alerting)
Alert channel (Telegram notifications, for example)

When the job doesn't ping on time, you get notified before you even realize something's wrong.

Common Mistakes When Trying to Detect Cron Failures

Here are the traps people keep falling into:

1. Relying on exit codes alone

Exit codes tell you if the script failed, but they don't tell you if the script never ran. A job that never starts produces no exit code at all.

2. Checking log files manually

Manual log checks don't scale and depend on someone remembering to look. By definition, a process you "forgot about" won't have someone checking its logs.

3. Using uptime monitoring as a proxy

Uptime monitors check if your server is online. They can't verify if your specific scheduled task actually executed. Your server can have 99.9% uptime while your cron job fails 100% of the time.

4. Alerting only on errors, not on silence

A missing event is fundamentally different from a failed event. You need a monitoring system that understands the difference between "the job ran and errored" and "the job didn't run at all."

5. Not testing the monitoring itself

If your monitoring endpoint goes down, you have a blind spot: a job that stops running during the outage goes unnoticed. Test your monitoring setup periodically, for example by pausing a job on purpose and confirming that the alert arrives.

Alternative Approaches

Wrapper Scripts

Wrap every cron job in a shell script that logs start/end times and writes to a status file:

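#!/bin/bash
echo "$(date) started" >> /var/log/backup.status
# Record the real outcome instead of always logging "completed"
if /opt/scripts/backup-db.sh; then
  echo "$(date) completed" >> /var/log/backup.status
else
  echo "$(date) failed with exit code $?" >> /var/log/backup.status
fi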

Then add a separate cron job that checks if the last entry is recent enough. This is essentially what a heartbeat service does, but built in-house.
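The checking side might look like the sketch below. It assumes the status file format used by the wrapper, where the last line contains the word "completed" after a successful run, and it uses the file's modification time as the freshness signal. The function name and the 1500-minute threshold are illustrative choices.

```shell
#!/bin/sh
# status_is_fresh FILE MAX_AGE_MINUTES
# Succeeds only if FILE exists, its last line records a completed run,
# and it was modified less than MAX_AGE_MINUTES minutes ago.
status_is_fresh() {
  sf_file=$1
  sf_max_age=$2
  [ -f "$sf_file" ] || return 1
  # A trailing "started" line means the last run never finished.
  tail -n 1 "$sf_file" | grep 'completed' >/dev/null || return 1
  # find prints the path only when the mtime is recent enough.
  [ -n "$(find "$sf_file" -mmin "-$sf_max_age")" ]
}

# Usage, e.g. from a second cron job, for a daily backup with an hour of slack:
#   status_is_fresh /var/log/backup.status 1500 || echo "backup is stale or unfinished"
```

Keep in mind that this checker is itself a cron job on the same machine, so it goes silent along with everything else if the cron daemon stops. That is the main argument for sending the signal to something outside the server.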

Systemd Timers

Systemd timers can replace cron on Linux and provide better logging, restart policies, and dependency management. They won't eliminate silent failures, but they give you more observability.

Email Notifications from Cron

You can set MAILTO in crontab to receive emails on output. This helps with crashes but won't catch a cron job not running—the cron daemon must execute the job to generate any email.

FAQ

How do I know if my cron jobs are actually running?

The most reliable method is heartbeat monitoring: have each job send a signal (HTTP request, webhook) to a monitoring service when it executes. If the signal is missing at the expected time, you'll know immediately.

Can I detect a cron job not running without modifying the script?

Limited options exist. You can check system logs for execution records (/var/log/syslog or journalctl -u cron on Debian and Ubuntu; /var/log/cron or journalctl -u crond on RHEL-family systems), check if output files change, or use filesystem monitoring (inotify) on files the script modifies. But these are indirect and unreliable.
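As a rough illustration of the log-based approach, the sketch below pulls the most recent execution record for a command out of a syslog-style file. It assumes the Debian-style format, where cron logs each launch as a line containing CMD (...); other distributions and logging setups format this differently, and the function name is hypothetical.

```shell
#!/bin/sh
# last_cron_run LOGFILE PATTERN
# Prints the most recent cron execution line in LOGFILE that mentions PATTERN,
# or nothing if there is none. Assumes Debian-style lines such as:
#   Apr  2 03:00:01 host CRON[1234]: (root) CMD (/opt/scripts/backup-db.sh)
last_cron_run() {
  grep -F 'CMD (' "$1" | grep -F -- "$2" | tail -n 1
}

# Usage:
#   last_cron_run /var/log/syslog backup-db.sh
```

An empty result only means no launch was recorded in that file. Log rotation can remove older entries, which is one reason this method is unreliable on its own.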

What's the difference between a cron job failing and not running?

A failed job started but encountered an error (non-zero exit code). A job that isn't running never started at all: cron didn't trigger it, or the cron daemon itself stopped. Heartbeat monitoring catches both, as long as the ping is sent only after the job succeeds.

How often should I check if my cron jobs ran?

Your check interval should be shorter than your job's run interval. If a job runs every hour, check within 15-30 minutes of the scheduled time. For daily jobs, check within a few hours.

What happens if the monitoring service itself goes down?

This is why some teams run redundant monitoring (e.g., both a SaaS tool and a local wrapper script). In practice, a dedicated monitoring service is usually more available than any single server running your jobs, so a job that silently stops is the more likely failure than an outage of the monitoring itself.

Conclusion

A cron job not running is the kind of failure that doesn't announce itself. The absence of activity is much harder to detect than a loud error. But it's also much more preventable.

A simple heartbeat ping after each execution—combined with a monitoring service that alerts you when the ping is late—gives you early warning that something's wrong, before the missing work becomes a real crisis.

Two lines of code. One monitoring endpoint. No more guessing.

Originally published at quietpulse.xyz/blog/how-to-detect-if-a-cron-job-is-not-running
