# Monitoring GitHub Actions scheduled workflows: a practical guide
GitHub Actions is a surprisingly capable cron scheduler. Schedule a workflow, let it run nightly, forget about it.
Until it stops running. And you don't notice for two weeks.
Scheduled workflows in GitHub Actions are quietly unreliable. GitHub delays them, skips them during high load, and — most importantly — gives you no built-in alerting when they fail silently. Adding external monitoring takes about 5 minutes and saves you from that two-week discovery.
## The basic setup
Here's a minimal scheduled workflow with monitoring:
```yaml
name: Nightly export

on:
  schedule:
    - cron: '0 2 * * *'  # 2am UTC every day
  workflow_dispatch:     # allows manual triggering for testing

jobs:
  export:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run export
        run: python scripts/export.py
      - name: Ping DeadManCheck
        if: success()
        run: curl -fsS https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }} > /dev/null
```
The last step pings DeadManCheck only if all previous steps succeeded (`if: success()`). If the export script fails, the ping doesn't fire, and you get alerted after your configured grace period.

Set up the monitor with a 25-hour interval (giving a 1-hour buffer on the 24-hour schedule). Store your token in GitHub: Settings → Secrets and variables → Actions → New repository secret named `DEADMANCHECK_TOKEN`.
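The arithmetic behind that interval is trivial but worth writing down once — a sketch, with variable names of my own invention (DeadManCheck just needs the final value, however its dashboard expects it):

```shell
# 24-hour schedule plus a 1-hour buffer for GitHub's scheduling jitter.
# Variable names are illustrative, not part of any DeadManCheck API.
SCHEDULE_HOURS=24
JITTER_BUFFER_HOURS=1
INTERVAL_SECONDS=$(( (SCHEDULE_HOURS + JITTER_BUFFER_HOURS) * 3600 ))
echo "monitor interval: ${INTERVAL_SECONDS}s"   # 90000s = 25 hours
```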
## Adding start/end pings for longer jobs
For jobs that run more than a few minutes, use the start/end pattern. This catches jobs that hang:
```yaml
steps:
  - uses: actions/checkout@v4
  - name: Ping start
    run: curl -fsS https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }}/start > /dev/null || true
  - name: Run ETL
    id: etl
    run: |
      python scripts/run_etl.py
      echo "rows=$(cat /tmp/etl_row_count.txt)" >> $GITHUB_OUTPUT
  - name: Ping done
    if: success()
    run: |
      curl -fsS \
        "https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }}?count=${{ steps.etl.outputs.rows }}" \
        > /dev/null || true
  - name: Ping fail
    if: failure()
    run: curl -fsS https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }}/fail > /dev/null || true
```
Your ETL script writes the row count to `/tmp/etl_row_count.txt`. The monitoring step picks it up and includes it in the ping — so your monitor can alert on zero-output runs, not just missed runs.
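The handoff relies on the `$GITHUB_OUTPUT` file that Actions provides to each step. You can simulate the mechanism locally — the literal row count here is made up; on a real runner, Actions sets `GITHUB_OUTPUT` for you and parses the file after the step:

```shell
# Local simulation of the row-count handoff. On a runner, Actions points
# GITHUB_OUTPUT at a file it reads afterwards to populate steps.<id>.outputs.
GITHUB_OUTPUT=$(mktemp)
echo "12345" > /tmp/etl_row_count.txt            # what run_etl.py would write
echo "rows=$(cat /tmp/etl_row_count.txt)" >> "$GITHUB_OUTPUT"
cat "$GITHUB_OUTPUT"                             # rows=12345
```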
## The gotchas
### GitHub delays scheduled workflows
This is the big one. GitHub's docs admit that scheduled workflows may be delayed during high load. A workflow scheduled for 2:00am UTC might run at 2:23am or 2:51am. During busy periods, delays of 30–60 minutes aren't unusual.
Don't set your DeadManCheck interval to exactly 24 hours. Set it to 25 hours. That buffer absorbs GitHub's scheduling jitter without letting real failures go undetected.
### Scheduled workflows stop on inactive repos
If a repository has no commits in 60 days, GitHub disables scheduled workflows. You'll get an email warning. If you miss it, the job silently stops running — and your external monitor will catch it where GitHub's notification didn't reach you.
### Test with `workflow_dispatch` before trusting the schedule

Always add `workflow_dispatch` as a trigger (it's in all examples above). You can trigger the workflow manually from the Actions tab or via the CLI:

```shell
gh workflow run nightly-export.yml
```
Test your monitoring integration before the first scheduled run. Confirm the ping appears in your DeadManCheck dashboard with the correct count.
### Secrets aren't available in forks

If your repo is public and someone forks it, `secrets.DEADMANCHECK_TOKEN` will be empty in their fork. The curl will fail silently. This is fine — you don't want random forks pinging your monitor — but be aware of it when debugging.
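If you'd rather make that case explicit than let curl fail on a token-less URL, a guard can skip the ping when the secret is empty. A sketch — in a real workflow you'd expose the secret to the step via `env:`, and `ping_status` is a name I've made up for illustration:

```shell
# In a fork, the secret expands to an empty string. Skip the ping explicitly
# instead of letting curl hit a URL with no token.
DEADMANCHECK_TOKEN=""   # simulates a fork where the secret is unset
if [ -z "$DEADMANCHECK_TOKEN" ]; then
  ping_status="skipped"
  echo "no monitoring token configured; skipping ping"
else
  curl -fsS "https://deadmancheck.io/ping/$DEADMANCHECK_TOKEN" > /dev/null || true
  ping_status="sent"
fi
```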
## Full production example
```yaml
name: Nightly database backup

on:
  schedule:
    - cron: '0 2 * * *'
  workflow_dispatch:

jobs:
  backup:
    runs-on: ubuntu-latest
    timeout-minutes: 30  # hard limit — prevent hung jobs accumulating
    steps:
      - uses: actions/checkout@v4
      - name: Ping start
        run: |
          curl -fsS \
            "https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }}/start" \
            > /dev/null || true  # don't fail if monitoring is down
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - name: Run backup
        id: backup
        run: |
          python scripts/backup.py
          echo "rows=$(cat /tmp/backup_row_count.txt)" >> $GITHUB_OUTPUT
      - name: Upload to S3
        run: aws s3 cp /backups/latest.dump s3://my-backups/
      - name: Ping done
        if: success()
        run: |
          curl -fsS \
            "https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }}?count=${{ steps.backup.outputs.rows }}" \
            > /dev/null || true
      - name: Ping fail
        if: failure()
        run: |
          curl -fsS \
            "https://deadmancheck.io/ping/${{ secrets.DEADMANCHECK_TOKEN }}/fail" \
            > /dev/null || true
```
A few things worth noting:

- `timeout-minutes: 30` is a hard ceiling. Without it, a hung job can sit there for 6 hours consuming a runner.
- `|| true` on the monitoring pings means a DeadManCheck outage won't cause your backup job to report failed.
- The row count flows from the backup step through `$GITHUB_OUTPUT` to the ping step.
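The `|| true` behavior is easy to verify locally: Actions runs `run:` steps under bash with `-e`, so a failing ping without `|| true` would fail the whole step. A minimal demonstration, with `false` standing in for a curl that can't reach the monitoring service:

```shell
set -e                   # Actions runs `run:` steps with bash -e
false || true            # the failed "ping" is swallowed; execution continues
msg="job step still succeeds"
echo "$msg"
```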
## After deploying
Trigger the workflow manually and confirm:
- The workflow runs end-to-end without errors
- DeadManCheck shows a recent ping on your monitor dashboard
- The count looks correct for what the job processed
Wait for the first scheduled run and verify again. Two successful data points before you trust it.
Scheduled workflows are one of those things that feel reliable until the day they aren't. External monitoring is the difference between finding out immediately and finding out when someone asks why the weekly report is missing.