Vigilmon

Posted on Jun 27

How to monitor your Django app with uptime checks and heartbeat monitoring (free)

#django #python #monitoring #devops

Your Django app just crashed at 2 AM and you found out when a customer emailed you the next morning. Sound familiar?

Most Django tutorials skip monitoring entirely. By the end of this article you'll have HTTP uptime monitoring, heartbeat checks for your scheduled tasks, Slack/Discord alerts, and a public status page — all in under 30 minutes, free.

The silent failure problem

Django apps have two common failure modes that go undetected without proper monitoring:

Endpoint outages: your /api/ or /health/ URL starts returning 500s. Users see errors, but you don't know until they complain.

Silent task failures: your Celery beat tasks or manage.py commands stop running. The database stops getting updated, emails stop sending, reports stop generating. Because a task that never starts runs no code, there is often no exception and no log line to find. You just... stop getting results.

Both problems are easy to detect. You just need a tool that's actively checking.

Step 1: Add a health check endpoint

Install django-health-check:

pip install django-health-check

Add it to INSTALLED_APPS and wire up the URL:

# settings.py
INSTALLED_APPS = [
    # ... your other apps
    'health_check',
    'health_check.db',          # checks database connectivity
    'health_check.cache',       # checks cache backend
    'health_check.storage',     # checks file storage
]

# urls.py
from django.urls import path, include

urlpatterns = [
    # ... your other URLs
    path('health/', include('health_check.urls')),
]

Now GET /health/ returns 200 OK when every registered check passes and 500 when any of them fails, so a single URL reflects the state of your database, cache, and file storage.
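Before pointing an external monitor at the endpoint, it helps to see what a monitor actually does with it. The sketch below is a rough stand-in for that check using only the standard library: one GET, where anything other than a 200 counts as unhealthy. The function name `probe` and the injectable `opener` parameter are illustrative choices, not part of django-health-check or Vigilmon.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def probe(url: str, timeout: float = 5.0,
          opener=urllib.request.urlopen) -> Tuple[bool, Optional[int]]:
    """Return (healthy, status_code).

    status_code is None when no HTTP response arrived at all.
    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200, response.status
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx; the status code is on the exception.
        return False, exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return False, None
```

With the development server running, `probe("http://localhost:8000/health/")` should report healthy, and stopping your database should flip it to unhealthy with a 500.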

Step 2: Set up HTTP uptime monitoring

With your health endpoint live, point Vigilmon at it:

Sign up for a free account at vigilmon.online
Click New Monitor → HTTP
Enter https://yourdomain.com/health/
Set check interval to 5 minutes (free tier)
Save

Vigilmon will now check your endpoint every 5 minutes from multiple locations. If a check fails, you get an alert, so on the free tier an outage is detected within roughly one check interval, often before users report it.

You can also monitor your main API routes directly:

https://yourdomain.com/api/ — confirms the API is reachable
https://yourdomain.com/ — confirms the frontend is serving

Step 3: Heartbeat monitoring for Django management commands

HTTP monitoring catches server outages. But what about tasks that run on a schedule and stop silently?

The heartbeat pattern: your task pings a URL at the end of every successful run. If Vigilmon stops receiving that ping, it knows the task failed or stopped running.
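To make the pattern concrete, here is a minimal sketch of the decision the monitoring side has to make. It is an assumption about how such a check could work, not Vigilmon's actual implementation: the function name and the grace period are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_is_overdue(last_ping: datetime, now: datetime,
                         interval: timedelta,
                         grace: timedelta = timedelta(minutes=5)) -> bool:
    """True when more than interval + grace has passed since the last ping.

    The grace period (a hypothetical default of 5 minutes) absorbs normal
    variation in how long the task takes to run.
    """
    return now - last_ping > interval + grace


# Example: a daily task that last pinged at 02:00 UTC.
last = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
on_time = heartbeat_is_overdue(last, last + timedelta(hours=24), timedelta(hours=24))
missed = heartbeat_is_overdue(last, last + timedelta(hours=25), timedelta(hours=24))
```

The key property is that the alert comes from the absence of a signal, which is why this catches tasks that never started at all.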

Create a management command that includes a heartbeat ping:

# myapp/management/commands/send_daily_digest.py
import requests
from django.core.management.base import BaseCommand
from django.conf import settings

class Command(BaseCommand):
    help = 'Send daily digest emails'

    def handle(self, *args, **options):
        try:
            # Your actual task logic here
            self._send_digest_emails()

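            # Ping the heartbeat URL on success
            heartbeat_url = getattr(settings, 'DIGEST_HEARTBEAT_URL', None)
            if heartbeat_url:
                try:
                    requests.get(heartbeat_url, timeout=5)
                except requests.RequestException as e:
                    # A failed ping should not fail a digest that was already sent
                    self.stderr.write(f'Heartbeat ping failed: {e}')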

            self.stdout.write(self.style.SUCCESS('Digest sent successfully'))
        except Exception as e:
            self.stderr.write(f'Error: {e}')
            raise

    def _send_digest_emails(self):
        # your logic here
        pass

In your settings:

# settings.py
DIGEST_HEARTBEAT_URL = 'https://vigilmon.online/heartbeats/your-unique-token'

In Vigilmon, create a Heartbeat Monitor:

Click New Monitor → Heartbeat
Set the expected interval (e.g. every 24 hours)
Copy the unique ping URL
Paste it into your settings as DIGEST_HEARTBEAT_URL

Now if your command fails midway — or your cron job stops firing entirely — Vigilmon will alert you within one missed interval.
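The ping URL contains your monitor's token, so you may prefer to keep it out of version control and read it from the environment instead. The sketch below assumes an environment variable with the same name as the setting; the helper `heartbeat_url_from_env` is a hypothetical name. Returning None when the variable is unset fits the `if heartbeat_url:` guard in the command, so development machines without the variable never ping the production monitor.

```python
import os


def heartbeat_url_from_env(name: str, environ=os.environ):
    """Return the stripped URL stored under `name`, or None if unset or blank."""
    value = environ.get(name, "").strip()
    return value or None


# settings.py: None here means the command skips the ping.
DIGEST_HEARTBEAT_URL = heartbeat_url_from_env("DIGEST_HEARTBEAT_URL")
```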

Works with Celery too

Same pattern applies to Celery tasks:

# tasks.py
from celery import shared_task
import requests
from django.conf import settings
import logging

logger = logging.getLogger(__name__)

@shared_task
def process_pending_orders():
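    try:
        # Your task logic (_do_processing stands in for your own function)
        _do_processing()

        # Ping on success
        url = getattr(settings, 'ORDER_PROCESSING_HEARTBEAT_URL', None)
        if url:
            try:
                requests.get(url, timeout=5)
            except requests.RequestException:
                # A failed ping should not fail work that already completed
                logger.warning('Heartbeat ping failed', exc_info=True)
    except Exception:
        # logger.exception records the traceback along with the message
        logger.exception('Order processing failed')
        raise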
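Once you have more than a couple of monitored tasks, repeating the ping logic gets tedious. One option is a small decorator, sketched here with only the standard library. The names `with_heartbeat` and `send_ping` are hypothetical, and the ping function is injectable so you can swap in `requests` or a test double.

```python
import functools
import logging
import urllib.request

logger = logging.getLogger(__name__)


def send_ping(url, timeout=5.0):
    """Best-effort GET: a failed ping is logged and never raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    except Exception as exc:  # network errors, malformed URLs, bad responses
        logger.warning("Heartbeat ping to %s failed: %s", url, exc)


def with_heartbeat(get_url, ping=send_ping):
    """Ping get_url() after the wrapped function returns without raising."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # If func raises, the exception propagates and no ping is sent.
            result = func(*args, **kwargs)
            url = get_url()  # looked up at call time, so settings are loaded
            if url:
                ping(url)
            return result
        return wrapper
    return decorator
```

For a Celery task, you would place `@shared_task` above `@with_heartbeat(lambda: getattr(settings, 'ORDER_PROCESSING_HEARTBEAT_URL', None))` so that Celery registers the wrapped function. The URL is passed as a callable rather than a value so that nothing touches Django settings at import time.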

Step 4: Webhook alerts to Slack or Discord

Set up alert delivery in Vigilmon:

For Slack:

Create an incoming webhook in your Slack workspace
In Vigilmon, go to Notifications → New Channel → Slack
Paste the webhook URL
Enable it on your monitors

For Discord:

In your Discord server, go to channel settings → Integrations → Webhooks
Create a new webhook, copy the URL
In Vigilmon, go to Notifications → New Channel → Discord
Paste the Discord webhook URL

You'll get alerts like:

DOWN: yourdomain.com/health/ is DOWN
Checked from: US-East, EU-West
Status: 500 Internal Server Error
Started: 2 minutes ago

And when it recovers:

RESOLVED: yourdomain.com/health/ is back UP
Downtime: 8 minutes
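Before relying on a channel, it's worth sending one test message yourself to confirm the webhook URL is valid. The sketch below builds the JSON body each service expects: Slack incoming webhooks read a `text` field, and Discord webhooks read a `content` field. The function names and the User-Agent string are illustrative assumptions; the explicit User-Agent is there because some services reject urllib's default one.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, message: str,
                          service: str) -> urllib.request.Request:
    """Build a POST request carrying `message` in the shape `service` expects."""
    if service == "slack":
        payload = {"text": message}      # Slack incoming webhooks read "text"
    elif service == "discord":
        payload = {"content": message}   # Discord webhooks read "content"
    else:
        raise ValueError(f"unsupported service: {service!r}")
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "webhook-test/1.0",  # hypothetical identifier
        },
        method="POST",
    )


def send_test_message(webhook_url: str, message: str, service: str,
                      timeout: float = 5.0) -> int:
    """Send the message and return the HTTP status code."""
    request = build_webhook_request(webhook_url, message, service)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A 2xx status from `send_test_message` suggests the webhook accepted the message; an invalid or revoked URL raises `urllib.error.HTTPError` instead.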

Step 5: Public status page

A public status page builds trust with your users and reduces support load during incidents — they can self-serve the "is it just me?" question.

In Vigilmon:

Go to Status Pages → New Status Page
Give it a name and choose which monitors to display
Share the public URL (e.g. status.yourdomain.com)

You can embed it in your docs or link to it from your site footer.

What you've built

In under 30 minutes you've set up:

What	How
HTTP uptime monitoring	`django-health-check` + Vigilmon HTTP monitor
Database/cache health	`health_check.db` and `health_check.cache` plugins
Scheduled task monitoring	Heartbeat ping at the end of each management command or Celery task
Outage alerts	Slack/Discord webhook notifications
Public status page	Vigilmon status page

The full setup costs $0 on the free tier and takes less time than debugging a silent failure that's been going on for three days.

Next steps

Add health_check.contrib.celery to monitor your Celery workers directly
Set up response time monitoring to catch performance regressions before they become outages
Create a heartbeat for each critical scheduled task — treat them like uptime monitors

Get started free at vigilmon.online.
