DEV Community

Jasper Brookers

Cron Job Monitoring for Backups: What Actually Goes Wrong

Backups are the most trusted cron jobs in any system.
Once they are set up, everyone assumes:
“Backup is running daily, no worries.”

But in reality, backup cron jobs fail more dangerously than most other jobs, because they often look successful until the day you actually need them.

Let’s talk honestly about what actually goes wrong with cron-based backups, and how to monitor them properly.

Why Backup Cron Jobs Are Special (And Risky)

Unlike many other cron jobs:

  • Backups are rarely checked
  • Failures are discovered only during recovery
  • A “successful run” does not mean a usable backup

This makes backup monitoring critical.

Common Backup Failures You Will Eventually Face

1. Backup Runs, But File Is Empty

This is very common.

What happens:

  • Database connection fails
  • Dump command exits early
  • Permissions issue

Result:

  • Backup file exists
  • File size is 0 bytes

Cron thinks the job succeeded.

Your restore will not.

Why this is dangerous:

Most people only check if a file exists, not whether it contains data.
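
A size check closes this gap. Here is a minimal sketch, assuming a hypothetical threshold `min_bytes` that you would set well below the size of your smallest healthy backup. The function name is illustrative, not part of any tool.

```python
import os


def backup_has_data(path: str, min_bytes: int = 1024) -> bool:
    """Return True only if the file exists and holds at least min_bytes.

    min_bytes is an assumed threshold; tune it to your own data.
    """
    try:
        return os.path.getsize(path) >= min_bytes
    except OSError:
        # A missing or unreadable file counts as a failed backup.
        return False
```

Run it right after the dump step and exit non-zero when it returns False, so the failure is visible to whatever watches the job.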

2. Backup Stops Mid-Way

What goes wrong:

  • Network drops
  • Disk fills up
  • Database connection resets

Result:

  • Partial backup file
  • Script exits without clear error
  • No alert sent

This failure creates a false sense of security.
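
One way to avoid leaving partial files around is to write to a temporary name and publish the file only when the dump command exits cleanly. The sketch below assumes your dump tool writes to stdout and reports failure through a non-zero exit code. The `.partial` suffix and the function name are made up for illustration.

```python
import os
import subprocess
import sys


def dump_atomically(cmd: list[str], final_path: str) -> bool:
    """Run cmd with stdout sent to a temp file; publish it only on exit code 0."""
    tmp_path = final_path + ".partial"
    with open(tmp_path, "wb") as out:
        result = subprocess.run(cmd, stdout=out)
    if result.returncode != 0:
        os.remove(tmp_path)  # never leave a partial file behind
        print(f"backup failed with exit code {result.returncode}", file=sys.stderr)
        return False
    os.replace(tmp_path, final_path)  # atomic rename on the same filesystem
    return True
```

With this pattern, a file at `final_path` always comes from a run that exited with code 0. It still says nothing about whether the content is valid.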

3. Backup Never Runs at All

This is the worst one.

Reasons:

  • Server rebooted and cron didn’t start
  • Cron daemon stopped
  • Crontab overwritten during deployment
  • Timezone misconfiguration

Result:

  • No backup for days or weeks
  • No logs
  • No alerts

4. Backup Runs, But Upload Fails

Very common with cloud backups.

What goes wrong:

  • S3 credentials expire
  • Network timeout
  • Storage quota exceeded

Result:

  • Local backup exists
  • Remote backup missing
  • Nobody notices

You think you have offsite backups. You don’t.

5. Backup Is Too Old (But Nobody Notices)

Backup runs fine, but:

  • Retention job fails
  • Rotation logic breaks
  • Recent backups are silently deleted instead of old ones

Result:

  • Only very old backups remain
  • Recent data is gone
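
A freshness check on the backup directory can catch this. The sketch below assumes backups live in one directory and match a glob pattern. Both `directory` and the `*.gz` default are placeholders for your own layout.

```python
import glob
import os
import time
from typing import Optional


def newest_backup_age_hours(directory: str, pattern: str = "*.gz") -> Optional[float]:
    """Return the age in hours of the most recently modified match, or None if there is none."""
    paths = glob.glob(os.path.join(directory, pattern))
    if not paths:
        return None
    newest_mtime = max(os.path.getmtime(p) for p in paths)
    return (time.time() - newest_mtime) / 3600.0
```

For a daily backup, alerting when the result is None or above roughly 26 hours is a reasonable starting point.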

6. Backup Takes Longer and Longer Over Time

As data grows:

  • Backup duration increases
  • Job overlaps with next run
  • Server load increases

Result:

  • Slowdowns
  • Failed backups
  • Corrupted files

Cron does not warn you about runtime drift.
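
Overlapping runs can be prevented with a lock file. This is a sketch for Unix-like systems using `fcntl.flock`. The lock path is whatever you choose, and the function name is illustrative.

```python
import fcntl


def try_acquire_lock(lock_path: str):
    """Return an open, locked file object, or None if another run holds the lock."""
    handle = open(lock_path, "w")
    try:
        # Non-blocking exclusive lock: fails immediately if already held.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle  # keep this object open for the whole backup run
```

If the function returns None, the previous run is still going. That is worth an alert in itself, because it means the backup now takes longer than its schedule interval.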

Why Logs Don’t Save You

Many teams rely on logs to monitor backups.

But logs fail when:

  • Disk is full
  • Job never starts
  • Script hangs before logging
  • Nobody checks logs regularly

A backup that fails silently is worse than no backup at all.

What You Should Actually Monitor for Backups

Monitoring backups is not about checking that cron ran.

It’s about verifying outcomes.

Here’s what actually matters:

1. Did the Backup Job Run?

If it didn’t run:

  • Something is fundamentally broken
  • You need to know immediately

Use execution confirmation with alerting.
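
The core of execution confirmation is a "dead man's switch": the job records each success, and a separate checker raises an alert when the record goes stale. Below is a minimal sketch of the checker's decision. The `interval` and `grace` values are assumptions you would set from your own schedule.

```python
from datetime import datetime, timedelta


def is_overdue(last_success: datetime, interval: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """True when no success has been recorded within interval + grace."""
    return now - last_success > interval + grace
```

The checker should run somewhere other than the backup server. If the server or its cron daemon is down, a check on the same machine will be down too.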

2. Did the Backup Job Finish?

A started backup is not a finished backup.

Monitor:

  • Start vs completion
  • Maximum expected duration

Alert if the job hangs or runs too long.
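
A maximum duration can be enforced in the wrapper that launches the backup. The sketch below uses the standard `subprocess.run` timeout. The status strings and the function name are illustrative.

```python
import subprocess


def run_with_deadline(cmd: list[str], max_seconds: float) -> str:
    """Run cmd and classify the outcome as 'ok', 'failed', or 'timeout'."""
    try:
        result = subprocess.run(cmd, timeout=max_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"
```

Anything other than "ok" should trigger an alert. Set `max_seconds` comfortably above your normal runtime so ordinary variation does not page anyone.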

3. Did It Produce a Valid Backup?

Don’t just check existence.

Monitor:

  • File size
  • Timestamp freshness
  • Basic integrity checks

Empty or tiny backups are failures.
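
For a basic integrity check, one option is to decompress the whole archive and see whether it reads cleanly to the end. This sketch assumes the backup is gzip-compressed, which may not match your setup. Other formats need their own check.

```python
import gzip
import zlib


def gzip_is_intact(path: str, chunk_size: int = 1024 * 1024) -> bool:
    """Decompress the whole file in chunks; truncation or corruption returns False."""
    try:
        with gzip.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass  # discard data; we only care that it decodes
        return True
    except (OSError, EOFError, zlib.error):
        return False
```

This catches truncated and corrupted archives, but a zero-byte file may still pass, so pair it with a size check. It also does not prove the dump can be restored. Only a test restore does that.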

4. Did the Backup Reach Safe Storage?

A local backup is not enough.

Monitor:

  • Upload success
  • Remote storage presence
  • Storage quota issues

If the offsite copy fails, the backup is incomplete.
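
Remote presence can be confirmed by comparing checksums instead of trusting the upload command's exit code. In the sketch below, `fetch_remote_sha256` is a hypothetical callable that you would implement against your storage provider. It is assumed to return the remote object's SHA-256 hex digest, or None when the object is missing.

```python
import hashlib
from typing import Callable, Optional


def sha256_of_file(path: str) -> str:
    """Hash the file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def offsite_copy_matches(local_path: str,
                         fetch_remote_sha256: Callable[[], Optional[str]]) -> bool:
    """True only if the remote copy exists and has the same SHA-256 as the local file."""
    remote = fetch_remote_sha256()
    return remote is not None and remote == sha256_of_file(local_path)
```

Keeping the provider lookup behind a callable keeps the check independent of any one storage API.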

5. Is Backup Health Degrading Over Time?

Watch trends:

  • Increasing runtime
  • Increasing failures
  • Increasing storage usage

These are early warning signals.
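
Runtime drift can be flagged by comparing the latest run against recent history. This is a rough sketch. The `factor` of 1.5 and `min_history` of 5 are arbitrary starting values, not recommendations from any tool.

```python
from statistics import median


def runtime_is_drifting(durations: list[float], factor: float = 1.5,
                        min_history: int = 5) -> bool:
    """True when the latest run took more than factor times the median of earlier runs.

    durations is ordered oldest to newest, in any consistent unit.
    """
    if len(durations) < min_history + 1:
        return False  # not enough history to judge
    *history, latest = durations
    return latest > factor * median(history)
```

Using the median keeps one unusually slow night from skewing the baseline.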

Heartbeat vs Workflow Monitoring for Backups

Backups are not simple jobs. They need more than a single “ping”.

Heartbeat Monitoring

Good for:

  • “Did the job run?”

Bad for:

  • Partial backups
  • Hung uploads
  • Multi-step failures

Workflow Monitoring (Recommended)

Better approach:

  • Signal when backup starts
  • Signal when backup completes
  • Signal when upload completes
  • Alert if any step fails or times out

This gives you real confidence, not hope.
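
The step signals above can be sketched as a small runner. Here `notify` is a hypothetical callable standing in for whatever sends the signal (an HTTP ping, a message, a log line). The step names and status strings are illustrative.

```python
from typing import Callable


def run_workflow(steps: list[tuple[str, Callable[[], None]]],
                 notify: Callable[[str, str], None]) -> bool:
    """Run named steps in order, reporting 'start', then 'ok' or 'fail', for each."""
    for name, action in steps:
        notify(name, "start")
        try:
            action()
        except Exception as exc:
            notify(name, f"fail: {exc}")
            return False  # stop here: later steps depend on earlier ones
        notify(name, "ok")
    return True
```

Timeouts are detected on the receiving side: a "start" with no matching "ok" or "fail" within the expected window means the step hung.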

Tools That Help Monitor Backup Cron Jobs

Tool Detect Missed Backups Detect Hung Backups Track Duration Workflow Steps Backup-Friendly
Cronbee Very Good
Cronitor ⚠️ (limited) Good
Custom Scripts Powerful but risky
Dead Man’s Snitch Basic
Healthchecks.io ⚠️ (timeouts) ⚠️ (basic) Good (simple)

One Hard Truth About Backups

A backup you don’t monitor is not a backup.

If nobody knows it failed, then when the day comes:

  • Data is already lost
  • The incident already happened

Final Thoughts

Cron is good at running backup commands.

It is terrible at telling you whether backups are usable.

The solution is not replacing cron — it’s adding visibility:

  • Did it run?
  • Did it finish?
  • Did it produce valid data?
  • Did it reach safe storage?

Answer these questions automatically, and your backups will finally be reliable.
