GitHub Actions scheduled workflows have a well-documented reliability problem: they can be delayed by 30 minutes or more during peak load, silently dropped when GitHub's infrastructure is busy, and automatically disabled when a repository has no activity for 60 days — with no notification beyond an easy-to-miss email.
If your scheduled workflow runs a nightly data sync, a backup, a report generation job, or any task where missing a run has real consequences, relying on GitHub to tell you when something goes wrong is not enough.
Here's the failure mode that catches most teams: the workflow runs successfully 95% of the time. Then one night it doesn't. GitHub doesn't send an alert for a skipped scheduled run — it only sends notifications for failures. A skipped run is invisible.
Why GitHub Actions schedules are unreliable
GitHub documents this explicitly. Scheduled workflows are deprioritised during high load:
Schedules that run during periods of high demand may be delayed or, if load is sufficiently high, potentially dropped.
Additionally, scheduled workflows are automatically disabled when a repository has had no activity for 60 days (GitHub's documentation states this rule for public repositories). "Activity" means commits, issues, or pull requests; a workflow successfully running on schedule does not count. A stable repository that hasn't had a commit in two months will therefore silently stop running its scheduled workflows.
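One way to notice a disabled workflow is to read its `state` field from the GitHub REST API. The sketch below assumes the `gh` CLI is installed and authenticated; `OWNER/REPO` and `nightly-sync.yml` are placeholders, and the helper names `needs_reenable` and `check_workflow` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: flag a workflow that GitHub has disabled.
# Assumes the gh CLI is installed and authenticated; repo and file names are placeholders.

# Succeeds (exit 0) when the reported state is anything other than "active",
# for example "disabled_inactivity" or "disabled_manually".
needs_reenable() {
  [ "$1" != "active" ]
}

# Fetches the workflow state and prints a hint.
# Not called automatically; invoke it yourself.
check_workflow() {
  local repo="$1" workflow="$2" state
  state=$(gh api "repos/${repo}/actions/workflows/${workflow}" --jq '.state') || return 2
  if needs_reenable "$state"; then
    echo "Workflow ${workflow} is ${state}; re-enable with: gh workflow enable ${workflow} --repo ${repo}"
    return 1
  fi
  echo "Workflow ${workflow} is active"
}

# Example: check_workflow OWNER/REPO nightly-sync.yml
```

Running a check like this from somewhere other than the affected repository (a laptop cron job, another CI system) avoids depending on the very scheduler being checked.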
There is also no timezone support. All schedules run in UTC. If you need a workflow to fire at 9am in a specific timezone, you calculate the offset manually and update it when daylight saving changes.
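A common workaround, sketched here as an assumption rather than anything GitHub documents, is to schedule the workflow at both candidate UTC hours (13:00 and 14:00 UTC for 9am in New York) and let a first step decide whether the local hour matches. The timezone, the target hour, and the function name `is_target_hour` are illustrative; the optional epoch argument relies on GNU `date`, which is what Ubuntu runners ship.

```shell
#!/usr/bin/env bash
# Sketch: decide whether "now" falls in the wanted local hour of a timezone.

# is_target_hour ZONE HOUR [EPOCH]
# Succeeds when the local hour in ZONE equals HOUR (two digits, e.g. 09).
# EPOCH is optional and defaults to the current time; passing it needs GNU date.
is_target_hour() {
  local zone="$1" want="$2" epoch="${3:-}" hour
  if [ -n "$epoch" ]; then
    hour=$(TZ="$zone" date -d "@${epoch}" +%H)
  else
    hour=$(TZ="$zone" date +%H)
  fi
  [ "$hour" = "$want" ]
}

# In a workflow step, record the decision as a step output and gate later
# steps on it with an `if:` condition:
#   if is_target_hour America/New_York 09; then
#     echo "run=true" >> "$GITHUB_OUTPUT"
#   fi
```

Because scheduled runs can start late, a run delayed past the top of the next hour would fail this check, so the guard trades manual cron edits for a small risk of skipping a badly delayed run.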
The fix: external dead man's switch monitoring
The correct solution is to make your scheduled workflow report its own health to an external service after each run. If the ping doesn't arrive when expected, the external service fires an alert.
Add a ping step at the end of every scheduled workflow:
name: Nightly sync
on:
  schedule:
    - cron: '0 2 * * *'
  workflow_dispatch: # always add this for manual testing
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Run sync
        run: ./scripts/sync.sh
      - name: Ping Crontify
        if: success()
        run: |
          curl -fsS -X POST \
            https://api.crontify.com/api/v1/ping/${{ secrets.CRONTIFY_MONITOR_ID }}/success \
            -H "X-API-Key: ${{ secrets.CRONTIFY_API_KEY }}"
Store CRONTIFY_MONITOR_ID and CRONTIFY_API_KEY as repository secrets under Settings → Secrets and variables → Actions.
The if: success() condition means the ping only fires when all previous steps complete without error. If any step fails, no ping is sent. Crontify waits for the expected ping and fires an alert when it doesn't arrive — whether because the workflow failed, was delayed, or was silently skipped.
Adding start and fail pings for complete coverage
The pattern above only sends a success ping. For complete coverage — detecting hung workflows and attaching failure context — add start and fail pings too:
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - name: Ping start
        run: |
          curl -fsS -X POST \
            https://api.crontify.com/api/v1/ping/${{ secrets.CRONTIFY_MONITOR_ID }}/start \
            -H "X-API-Key: ${{ secrets.CRONTIFY_API_KEY }}"
      - name: Checkout
        uses: actions/checkout@v4
      - name: Run sync
        id: sync
        run: ./scripts/sync.sh
      - name: Ping success
        if: success()
        run: |
          curl -fsS -X POST \
            https://api.crontify.com/api/v1/ping/${{ secrets.CRONTIFY_MONITOR_ID }}/success \
            -H "X-API-Key: ${{ secrets.CRONTIFY_API_KEY }}"
      - name: Ping fail
        if: failure()
        run: |
          curl -fsS -X POST \
            https://api.crontify.com/api/v1/ping/${{ secrets.CRONTIFY_MONITOR_ID }}/fail \
            -H "X-API-Key: ${{ secrets.CRONTIFY_API_KEY }}" \
            -H "Content-Type: application/json" \
            -d '{"message": "Workflow step failed"}'
With start and success pings, Crontify can detect workflows that start but never complete — the case where a workflow hangs on a long-running step.
Attaching output metadata
If your workflow produces meaningful output — records processed, files synced, rows exported — attach it to the success ping so you can define alert rules on the output:
- name: Run sync
  id: sync
  run: |
    RESULT=$(./scripts/sync.sh --json)
    # Quote expansions so the JSON is not word-split or glob-expanded
    echo "records=$(echo "$RESULT" | jq .records)" >> "$GITHUB_OUTPUT"
    echo "errors=$(echo "$RESULT" | jq .errors)" >> "$GITHUB_OUTPUT"
- name: Ping success with metadata
  if: success()
  run: |
    curl -fsS -X POST \
      https://api.crontify.com/api/v1/ping/${{ secrets.CRONTIFY_MONITOR_ID }}/success \
      -H "X-API-Key: ${{ secrets.CRONTIFY_API_KEY }}" \
      -H "Content-Type: application/json" \
      -d '{
        "meta": {
          "records_synced": ${{ steps.sync.outputs.records }},
          "errors": ${{ steps.sync.outputs.errors }}
        }
      }'
In Crontify's dashboard, define a rule: records_synced eq 0 → fire alert. This catches the case where the workflow ran and completed but produced no output — the silent failure that GitHub has no mechanism to detect.
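Because the step outputs are interpolated straight into the JSON body, an empty or non-numeric value (for instance `null` when the script omits a field) would produce an invalid or misleading payload. A small guard can validate the values first. This is a sketch with a hypothetical function name; the `meta` field names mirror the example above.

```shell
#!/usr/bin/env bash
# Sketch: build the metadata payload only from validated integers.

# build_meta_payload RECORDS ERRORS
# Prints a JSON body, or fails if either value is not a non-negative
# integer without leading zeros (which JSON does not allow).
build_meta_payload() {
  local records="$1" errors="$2" v
  for v in "$records" "$errors"; do
    case "$v" in
      ''|*[!0-9]*|0?*)
        echo "not a valid integer: '$v'" >&2
        return 1
        ;;
    esac
  done
  printf '{"meta":{"records_synced":%s,"errors":%s}}\n' "$records" "$errors"
}

# Example: body=$(build_meta_payload 120 0) && curl ... -d "$body"
```

Failing the step when validation fails is deliberate in this sketch: no success ping is sent, so the missing or malformed output surfaces as an alert instead of being reported as a healthy run.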
Configuring the monitor
In Crontify's dashboard, create a monitor with:
- Expected schedule: the same cron expression as your workflow (0 2 * * *)
- Grace period: 45–60 minutes, because GitHub schedules can be delayed significantly
- Alert channels: Slack, email, or webhook
The generous grace period is important. A GitHub scheduled workflow that fires 35 minutes late should not trigger a false alarm. The monitoring goal is detecting skipped runs and failures — not penalising GitHub's inherent schedule drift.
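The grace period amounts to a simple deadline check. The following is an illustrative sketch of that idea, not Crontify's actual implementation; the function name and the epoch-second arguments are assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a dead man's switch deadline check (illustrative only).

# run_is_missed EXPECTED_EPOCH GRACE_MINUTES NOW_EPOCH LAST_PING_EPOCH
# Succeeds (exit 0) when the grace window has closed and no ping has
# arrived since the expected start time.
run_is_missed() {
  local expected="$1" grace_min="$2" now="$3" last_ping="$4"
  local deadline=$(( expected + grace_min * 60 ))
  [ "$now" -gt "$deadline" ] && [ "$last_ping" -lt "$expected" ]
}

# With a run expected at 02:00 UTC and a 60-minute grace period:
#   a ping at 02:35 is late but accepted;
#   no ping by 03:01 counts as a missed run.
```

A shorter grace period shrinks the time to detection but turns ordinary schedule drift into alerts, which is the trade-off the 45–60 minute recommendation is balancing.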
Frequently asked questions
Does this work for workflows on private repositories?
Yes. The ping is an outbound HTTP request from the GitHub Actions runner. Repository visibility doesn't affect it.
What if the ping step itself fails?
Use curl with --retry 3 to handle transient failures. If the success ping still fails, the step is marked as failed, so any fail ping guarded by if: failure() will run, though it is likely to hit the same network problem. Either way, Crontify will still detect a missed run because no success ping arrived within the grace period.
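A minimal sketch of a retrying ping helper follows. `--retry`, `--retry-delay`, and `--max-time` are standard curl options; the function name `crontify_ping`, the `CURL_BIN` dry-run hook, and the environment variable names are assumptions introduced here, and the URL shape is copied from the workflow examples in this post.

```shell
#!/usr/bin/env bash
# Sketch: ping helper with retries.
# CURL_BIN is a hypothetical hook that lets you swap curl for echo in a dry run.

# crontify_ping EVENT   (EVENT is start, success, or fail)
# Reads CRONTIFY_MONITOR_ID and CRONTIFY_API_KEY from the environment.
crontify_ping() {
  local event="$1"
  case "$event" in
    start|success|fail) ;;
    *) echo "unknown event: $event" >&2; return 64 ;;
  esac
  # Retry up to 3 times, 5 seconds apart, capping each attempt at 10 seconds.
  "${CURL_BIN:-curl}" -fsS -X POST \
    --retry 3 --retry-delay 5 --max-time 10 \
    -H "X-API-Key: ${CRONTIFY_API_KEY:?not set}" \
    "https://api.crontify.com/api/v1/ping/${CRONTIFY_MONITOR_ID:?not set}/${event}"
}

# Dry run: CURL_BIN=echo crontify_ping success
```

In a workflow, the two variables would be supplied through the step's `env:` block from the repository secrets rather than interpolated into the script.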
Can I monitor multiple workflows with one Crontify account?
Yes. Create a separate monitor for each workflow. The free tier covers 5 monitors.
Crontify is free for up to 5 monitors — no credit card required.
If you're running scheduled GitHub Actions workflows for anything important and haven't added external monitoring, you're flying blind on one of the least reliable scheduling mechanisms in common use.