Your Django app just crashed at 2 AM and you found out when a customer emailed you the next morning. Sound familiar?
Most Django tutorials skip monitoring entirely. By the end of this article you'll have HTTP uptime monitoring, heartbeat checks for your scheduled tasks, Slack/Discord alerts, and a public status page — all in under 30 minutes, free.
The silent failure problem
Django apps have two common failure modes that go undetected without proper monitoring:
Endpoint outages — your /api/ or /health/ URL starts returning 500s. Users see errors, but you don't know until they complain.
Silent task failures — your Celery beat tasks or manage.py commands stop running. The database stops getting updated, emails stop sending, reports stop generating. No errors are thrown. Nothing is logged. You just... stop getting results.
Both problems are easy to detect. You just need a tool that's actively checking.
Step 1: Add a health check endpoint
Install django-health-check:
pip install django-health-check
Add it to INSTALLED_APPS and wire up the URL:
# settings.py
INSTALLED_APPS = [
    # ... your other apps
    'health_check',
    'health_check.db',       # checks database connectivity
    'health_check.cache',    # checks cache backend
    'health_check.storage',  # checks file storage
]
# urls.py
from django.urls import path, include

urlpatterns = [
    # ... your other URLs
    path('health/', include('health_check.urls')),
]
Now GET /health/ returns 200 OK when everything is healthy and 500 when something is wrong. One endpoint now covers the database, cache, and file storage checks you enabled in INSTALLED_APPS.
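To make the 200-versus-500 behaviour concrete, here is a minimal, framework-free sketch of the idea behind such an endpoint. It is not django-health-check's actual implementation; the function names (run_checks, database_ok, cache_down) and the report wording are assumptions made up for illustration.

```python
from typing import Callable, Dict, Tuple


def run_checks(checks: Dict[str, Callable[[], None]]) -> Tuple[int, Dict[str, str]]:
    """Run every check; a check signals failure by raising an exception."""
    report = {}
    for name, check in checks.items():
        try:
            check()
            report[name] = "working"
        except Exception as exc:  # any failure marks the whole endpoint unhealthy
            report[name] = f"unavailable: {exc}"
    status = 200 if all(value == "working" for value in report.values()) else 500
    return status, report


def database_ok() -> None:
    pass  # hypothetical check, e.g. run "SELECT 1" against the database


def cache_down() -> None:
    raise ConnectionError("cache unreachable")  # simulated failure


if __name__ == "__main__":
    print(run_checks({"db": database_ok}))
    print(run_checks({"db": database_ok, "cache": cache_down}))
```

The point of the design is that a single failing dependency turns the whole response into a 500, which is what lets an external monitor treat the endpoint as one pass/fail signal.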
Step 2: Set up HTTP uptime monitoring
With your health endpoint live, point Vigilmon at it:
- Sign up for a free account at vigilmon.online
- Click New Monitor → HTTP
- Enter https://yourdomain.com/health/
- Set check interval to 5 minutes (free tier)
- Save
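Vigilmon will now check your endpoint every 5 minutes from multiple locations. If a check fails, you get an alert as soon as the failure is detected, which on a 5-minute interval means within about 5 minutes of the outage starting and often before your users notice.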
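If you are curious what a single uptime check amounts to, the sketch below performs one HTTP GET and classifies the result. It only illustrates the general technique using the standard library; it says nothing about how Vigilmon is implemented, and the function name probe is an assumption.

```python
import urllib.error
import urllib.request
from typing import Tuple


def probe(url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Return (is_up, detail) for one HTTP GET against url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx and 5xx responses
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar network errors
        return False, f"unreachable: {exc}"


if __name__ == "__main__":
    print(probe("https://yourdomain.com/health/"))
```

Running this by hand against your own /health/ URL is a quick way to confirm the endpoint behaves as expected before pointing a monitor at it.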
You can also monitor your main API routes directly:
- https://yourdomain.com/api/ (confirms the API is reachable)
- https://yourdomain.com/ (confirms the frontend is serving)
Step 3: Heartbeat monitoring for Django management commands
HTTP monitoring catches server outages. But what about tasks that run on a schedule and stop silently?
The heartbeat pattern: your task pings a URL at the end of every successful run. If Vigilmon stops receiving that ping, it knows the task failed or stopped running.
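The monitor's side of the heartbeat pattern boils down to one comparison: has too much time passed since the last ping? The sketch below shows that check. The grace period and the name is_overdue are assumptions for illustration; Vigilmon's real rules may differ.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping: datetime, interval: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """True when the expected ping has not arrived within interval + grace."""
    return now - last_ping > interval + grace


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
    daily = timedelta(hours=24)
    grace = timedelta(minutes=30)  # assumed slack for slow or late runs
    print(is_overdue(last, daily, grace, last + timedelta(hours=24, minutes=10)))
    print(is_overdue(last, daily, grace, last + timedelta(hours=25)))
```

Because the check depends only on the absence of a ping, it catches both a task that crashed and a scheduler that never started it.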
Create a management command that includes a heartbeat ping:
# myapp/management/commands/send_daily_digest.py
import requests
from django.core.management.base import BaseCommand
from django.conf import settings


class Command(BaseCommand):
    help = 'Send daily digest emails'

    def handle(self, *args, **options):
        try:
            # Your actual task logic here
            self._send_digest_emails()
            # Ping the heartbeat URL on success
            heartbeat_url = getattr(settings, 'DIGEST_HEARTBEAT_URL', None)
            if heartbeat_url:
                requests.get(heartbeat_url, timeout=5)
            self.stdout.write(self.style.SUCCESS('Digest sent successfully'))
        except Exception as e:
            self.stderr.write(f'Error: {e}')
            raise

    def _send_digest_emails(self):
        # your logic here
        pass
In your settings:
# settings.py
DIGEST_HEARTBEAT_URL = 'https://vigilmon.online/heartbeats/your-unique-token'
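Since the ping URL contains a unique token, you may prefer to keep it out of version control. One possible variant, assuming you set an environment variable with the same name as the setting (that variable name is an assumption, not something Vigilmon requires):

```python
# settings.py
import os

# Hypothetical environment variable; when it is unset the value is None,
# and the command's "if heartbeat_url:" guard simply skips the ping.
DIGEST_HEARTBEAT_URL = os.environ.get('DIGEST_HEARTBEAT_URL')
```

This also keeps local development runs from pinging the production heartbeat by accident.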
In Vigilmon, create a Heartbeat Monitor:
- Click New Monitor → Heartbeat
- Set the expected interval (e.g. every 24 hours)
- Copy the unique ping URL
- Paste it into your settings as DIGEST_HEARTBEAT_URL
Now if your command fails midway — or your cron job stops firing entirely — Vigilmon will alert you within one missed interval.
Works with Celery too
The same pattern applies to Celery tasks:
# tasks.py
import logging

import requests
from celery import shared_task
from django.conf import settings

logger = logging.getLogger(__name__)


def _do_processing():
    # your logic here
    pass


@shared_task
def process_pending_orders():
    try:
        # Your task logic
        _do_processing()
        # Ping on success
        url = getattr(settings, 'ORDER_PROCESSING_HEARTBEAT_URL', None)
        if url:
            requests.get(url, timeout=5)
    except Exception as e:
        logger.error(f'Order processing failed: {e}')
        raise
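Once several tasks need a heartbeat, repeating the ping code in each one gets tedious. A possible refactoring is a small decorator, sketched below. The names with_heartbeat and _default_ping are hypothetical, and the sketch makes one design choice that differs from the examples above: a failed ping is logged rather than re-raised, so a network hiccup does not mark a successful task as failed.

```python
import functools
import logging
import urllib.request
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def _default_ping(url: str) -> None:
    # Standard-library GET; requests.get(url, timeout=5) would work equally well
    with urllib.request.urlopen(url, timeout=5):
        pass


def with_heartbeat(url: Optional[str], ping: Callable[[str], None] = _default_ping):
    """Ping url after the wrapped function returns without raising."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)  # an exception here skips the ping
            if url:
                try:
                    ping(url)
                except Exception:
                    # A failed ping is logged but does not fail the task itself
                    logger.warning("Heartbeat ping to %s failed", url, exc_info=True)
            return result
        return wrapper
    return decorator
```

With Celery, this would sit directly under @shared_task, so that @shared_task stays the outermost decorator; functools.wraps preserves the function's name and module, which should keep the task registered under its usual name. The ping parameter exists mainly so the behaviour can be tested without network access.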
Step 4: Webhook alerts to Slack or Discord
Set up alert delivery in Vigilmon:
For Slack:
- Create an incoming webhook in your Slack workspace
- In Vigilmon, go to Notifications → New Channel → Slack
- Paste the webhook URL
- Enable it on your monitors
For Discord:
- In your Discord server, go to channel settings → Integrations → Webhooks
- Create a new webhook, copy the URL
- In Vigilmon, go to Notifications → New Channel → Discord
- Paste the Discord webhook URL
You'll get alerts like:
DOWN: yourdomain.com/health/ is DOWN
Checked from: US-East, EU-West
Status: 500 Internal Server Error
Started: 2 minutes ago
And when it recovers:
RESOLVED: yourdomain.com/health/ is back UP
Downtime: 8 minutes
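Vigilmon delivers these messages for you, but it can help to know what a webhook message looks like, for example to test a freshly created webhook URL by hand. Slack incoming webhooks accept a JSON body with a "text" field and Discord webhooks accept one with a "content" field. The helper names below are assumptions for illustration, and the sketch only builds the request without sending it.

```python
import json
import urllib.request


def build_payload(service: str, message: str) -> dict:
    """Slack incoming webhooks read "text"; Discord webhooks read "content"."""
    if service == "slack":
        return {"text": message}
    if service == "discord":
        return {"content": message}
    raise ValueError(f"unsupported service: {service}")


def build_request(webhook_url: str, service: str, message: str) -> urllib.request.Request:
    """Prepare (but do not send) the JSON POST for the webhook."""
    body = json.dumps(build_payload(service, message)).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send a test message, pass the returned request to urllib.request.urlopen(request, timeout=5) with your real webhook URL.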
Step 5: Public status page
A public status page builds trust with your users and reduces support load during incidents — they can self-serve the "is it just me?" question.
In Vigilmon:
- Go to Status Pages → New Status Page
- Give it a name and choose which monitors to display
- Share the public URL (e.g. status.yourdomain.com)
You can embed it in your docs or link to it from your site footer. Users check it during outages so your support inbox doesn't get flooded.
What you've built
In under 30 minutes you've set up:
| What | How |
|---|---|
| HTTP uptime monitoring | django-health-check + Vigilmon HTTP monitor |
| Database/cache health | health_check.db and health_check.cache plugins |
| Scheduled task monitoring | Heartbeat ping at end of each management command |
| Instant alerts | Slack/Discord webhook notifications |
| Public status page | Vigilmon status page |
The full setup costs $0 on the free tier and takes less time than debugging a silent failure that's been going on for three days.
Next steps
- Add health_check.contrib.celery to monitor your Celery workers directly
- Set up response time monitoring to catch performance regressions before they become outages
- Create a heartbeat for each critical scheduled task — treat them like uptime monitors
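As a rough idea of what response time monitoring involves, the sketch below times an arbitrary call and compares it to a threshold. The names timed and is_slow and the 1-second default are assumptions; a hosted monitor would measure from outside your network instead.

```python
import time
from typing import Callable, Tuple


def timed(call: Callable[[], object]) -> Tuple[object, float]:
    """Run call and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


def is_slow(elapsed_seconds: float, threshold_seconds: float = 1.0) -> bool:
    """True when the measured time exceeds the (assumed) threshold."""
    return elapsed_seconds > threshold_seconds
```

For example, timed(lambda: requests.get(url, timeout=5)) would give you the response and how long it took, and is_slow on the elapsed time tells you whether to flag it.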
Get started free at vigilmon.online.