Last month, my backup cron job failed at 3 AM on a Saturday. I didn't notice until Monday morning when I needed to restore data.
Three days of backups — gone.
The job had been failing with a disk-space error, but cron doesn't act on exit codes by default. It just runs the command and moves on (it only emails output if MAILTO is configured and a working mailer exists — which most servers don't have).
The Silent Killer
Here's what most cron setups look like:
# crontab
0 3 * * * /path/to/backup.sh
0 6 * * * /path/to/report.sh
0 */2 * * * /path/to/cleanup.sh
No monitoring. No alerts. No logging. If any of these fail, you won't know until the damage is done.
The Fix: Wrap Every Job
I built a simple Python wrapper that:
- Logs start/end time and exit code
- Sends a Telegram/Slack alert on failure
- Detects missed runs
import subprocess
import json
from datetime import datetime
from pathlib import Path
import urllib.request

# State file tracking every job's last run
DB = Path.home() / '.cron-monitor.json'

def load_db():
    return json.loads(DB.read_text()) if DB.exists() else {'jobs': {}}

def save_db(db):
    DB.write_text(json.dumps(db, indent=2, default=str))

def alert(message, bot_token, chat_id):
    """Send a failure notification via the Telegram Bot API."""
    url = f'https://api.telegram.org/bot{bot_token}/sendMessage'
    data = json.dumps({'chat_id': chat_id, 'text': message}).encode()
    req = urllib.request.Request(url, data=data,
                                 headers={'Content-Type': 'application/json'})
    urllib.request.urlopen(req, timeout=10)

def run_job(name, command, bot_token=None, chat_id=None):
    """Run a command, record the result, and alert on non-zero exit."""
    db = load_db()
    start = datetime.now()
    result = subprocess.run(command, capture_output=True, text=True)
    db['jobs'][name] = {
        'last_run': start.isoformat(),
        'duration': (datetime.now() - start).total_seconds(),
        'exit_code': result.returncode,
        'status': 'success' if result.returncode == 0 else 'failed'
    }
    save_db(db)
    if result.returncode != 0 and bot_token:
        alert(
            f"🔴 CRON FAILED: {name}\n"
            f"Exit code: {result.returncode}\n"
            f"Error: {result.stderr[:200]}",
            bot_token, chat_id
        )

# Usage
run_job("daily-backup", ["bash", "/path/to/backup.sh"],
        bot_token="YOUR_BOT_TOKEN", chat_id="YOUR_CHAT_ID")
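The wrapper only records runs that actually happen — catching a job that never started needs a separate check against the state file. Here's a minimal sketch of how that could work (the `is_missed` name and `expected_interval_hours` parameter are mine, not necessarily what's in the full version):

```python
from datetime import datetime, timedelta

def is_missed(job, expected_interval_hours, now=None):
    """True if the job's last recorded run is older than the expected
    interval plus a small grace period for slow starts."""
    if job is None:
        return True  # never ran at all
    now = now or datetime.now()
    last_run = datetime.fromisoformat(job['last_run'])
    grace = timedelta(minutes=10)
    return now - last_run > timedelta(hours=expected_interval_hours) + grace
```

A separate cron entry can call this for each known job and fire the same `alert()` when it returns True — which also covers the case where the wrapper itself never got to run.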
Updated Crontab
# Before (silent failures)
0 3 * * * /path/to/backup.sh
# After (monitored)
0 3 * * * python3 /path/to/monitor.py --name "daily-backup" -- bash /path/to/backup.sh
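That crontab line assumes monitor.py exposes a small CLI with a `--name` flag and a `--` separator before the wrapped command. This is a sketch of how I'd wire that up with argparse (the exact flags in the GitHub version may differ):

```python
import argparse

def parse_cli(argv):
    """Split 'monitor.py --name X -- cmd args...' into (name, command)."""
    # Split on '--' manually so the wrapped command's own flags
    # never get interpreted as monitor.py options.
    if '--' in argv:
        split = argv.index('--')
        own, command = argv[:split], argv[split + 1:]
    else:
        own, command = argv, []
    parser = argparse.ArgumentParser(description='Run a command and record the result')
    parser.add_argument('--name', required=True, help='job name used as the state-file key')
    args = parser.parse_args(own)
    return args.name, command
```

The parsed pieces then go straight into `run_job(name, command)`.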
What You Get
Every job is now tracked:
=== Cron Job Monitor ===
Job: daily-backup
Last run: 2026-03-25 03:00:01
Status: ✅ Success (exit code 0)
Duration: 4m 23s
Job: db-cleanup
Last run: 2026-03-25 02:00:00
Status: 🔴 Failed (exit code 1)
Error: "connection refused"
Alert sent: Telegram ✅
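That report is just a formatted dump of the JSON state file. A minimal printer could look like this (the duration formatting is my guess at how the full version renders it):

```python
def format_duration(seconds):
    """Render seconds in the 'Xm Ys' style shown in the report."""
    minutes, secs = divmod(int(seconds), 60)
    return f'{minutes}m {secs}s'

def render_report(db):
    """Format the state-file dict as the human-readable status report."""
    lines = ['=== Cron Job Monitor ===']
    for name, job in db['jobs'].items():
        icon = '✅' if job['status'] == 'success' else '🔴'
        lines.append(f'Job: {name}')
        lines.append(f"Last run: {job['last_run']}")
        lines.append(f"Status: {icon} {job['status'].capitalize()} (exit code {job['exit_code']})")
        lines.append(f"Duration: {format_duration(job['duration'])}")
    return '\n'.join(lines)
```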
Why Not Use Existing Tools?
- Healthchecks.io — great service, but it's external. I want self-hosted.
- Cronitor — $20/month. This is free.
- systemd timers — powerful but complex to set up.
- Dead Man's Snitch — SaaS, costs money.
My solution: 50 lines of Python, zero dependencies, instant setup.
3 Bonus Tips for Cron
1. Always redirect output
0 3 * * * /path/to/backup.sh >> /var/log/backup.log 2>&1
2. Use flock to prevent overlapping runs
0 3 * * * flock -n /tmp/backup.lock /path/to/backup.sh
3. Set PATH explicitly
PATH=/usr/local/bin:/usr/bin:/bin
0 3 * * * /path/to/backup.sh
Cron runs with a minimal PATH. Your script works in the terminal but fails under cron? This is usually why.
The full monitor with Telegram, Slack, timeout detection, and missed run alerts is on GitHub.
What's the worst cron failure you've had? I know I'm not the only one who lost backups.
Follow for more DevOps and automation content.