DEV Community

AttractivePenguin

I Built an Uptime Monitor in a Weekend — And Saved $180/Year

I was paying $15/month for Uptime Robot. I used maybe 30% of it — basic HTTP checks on a handful of endpoints, email alerts when something went down, and a status page nobody really looked at.

That's $180/year for a glorified cron job.

So I built my own. In a weekend. Here's exactly how, what I learned, and whether you should do the same.

Why Build Your Own?

Let's be clear: Uptime Robot, Pingdom, and friends are excellent services. If you're running production infrastructure at scale, use them. But if you're a solo developer or small team monitoring a handful of personal projects, side hustles, or staging environments, the math is painful:

Service        Free Tier                       Paid Tier
Uptime Robot   50 monitors, 5-min intervals    $7-84/month
Pingdom        None                            $15/month+
Better Stack   10 monitors, 3-min intervals    $24/month+

For my use case — 5-10 endpoints, 1-minute intervals, Slack alerts — the free tiers were too limited and the paid tiers were overkill.

What I actually needed:

  • HTTP health checks every 60 seconds
  • Response time tracking
  • Slack/email notifications on downtime
  • A simple dashboard to see current status

That's maybe 200 lines of code. Let me prove it.

Architecture

Here's the setup I landed on after experimenting:

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Scheduler  │────▶│  HTTP Check   │────▶│    SQLite   │
│  (node-cron) │     │  (fetch API)  │     │   (results) │
└─────────────┘     └──────────────┘     └─────────────┘
                            │
                            ▼
                    ┌──────────────┐
                    │   Alerter    │
                    │ (Slack/Email)│
                    └──────────────┘
                            │
                            ▼
                    ┌──────────────┐
                    │   Dashboard  │
                    │   (Express)  │
                    └──────────────┘

Why SQLite? Because this is a low-volume workload (one small insert per endpoint per minute, plus occasional dashboard reads) that fits comfortably in a single file. No need to run a database server. If you're monitoring <1,000 endpoints, SQLite handles it effortlessly.
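
To put a rough number on that, here is a back-of-the-envelope row count. It is only a sizing sketch: rowsPerDay is a hypothetical helper, and it assumes one row per endpoint per check, which is how the schema in Step 2 stores results.

```javascript
// Rough sizing sketch: how many rows does the checks table gain per day?
// Assumes one row per endpoint per check.
function rowsPerDay(endpointCount, intervalMinutes) {
  const checksPerDay = (24 * 60) / intervalMinutes;
  return endpointCount * checksPerDay;
}

const daily = rowsPerDay(7, 1);   // 7 endpoints, checked every minute
const yearly = daily * 365;
console.log(`${daily} rows/day, ${yearly} rows/year`);
```

A few million small rows a year is well within what a single SQLite file can hold, though it does suggest pruning old rows at some point.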

Why Node.js? Because the fetch API is built in (Node 18+), the async model fits perfectly for concurrent HTTP checks, and you probably already have it installed.

Step 1: Project Setup

mkdir uptime-monitor && cd uptime-monitor
npm init -y
npm install better-sqlite3 node-cron express dotenv

Create a .env file:

MONITOR_INTERVAL=1
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/HERE
ALERT_EMAIL=you@example.com
PORT=3000

Step 2: Database Setup

// db.js
const Database = require('better-sqlite3');
const db = new Database('monitor.db');

db.exec(`
  CREATE TABLE IF NOT EXISTS checks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL,
    status INTEGER,
    response_time_ms INTEGER,
    up INTEGER NOT NULL,
    checked_at DATETIME DEFAULT CURRENT_TIMESTAMP
  );

  CREATE INDEX IF NOT EXISTS idx_checks_url ON checks(url);
  CREATE INDEX IF NOT EXISTS idx_checks_checked_at ON checks(checked_at);
`);

// Prepared statements for performance
const insertCheck = db.prepare(`
  INSERT INTO checks (url, status, response_time_ms, up)
  VALUES (@url, @status, @response_time_ms, @up)
`);

const getRecentChecks = db.prepare(`
  SELECT * FROM checks
  WHERE url = ?
  ORDER BY checked_at DESC
  LIMIT ?
`);

module.exports = { db, insertCheck, getRecentChecks };

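Reusing prepared statements matters here: you'll be inserting thousands of rows, and a statement that is compiled once and run many times avoids re-parsing the SQL on every insert. The named parameters also keep values bound rather than concatenated into the query string.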

Step 3: The Health Checker

// checker.js
const { insertCheck } = require('./db');

const ENDPOINTS = [
  { url: 'https://myapp.com', expectedStatus: 200 },
  { url: 'https://api.myapp.com/health', expectedStatus: 200 },
  { url: 'https://myportfolio.dev', expectedStatus: 200 },
];

async function checkEndpoint(endpoint) {
  const start = Date.now();
  try {
    const response = await fetch(endpoint.url, {
      signal: AbortSignal.timeout(10000), // 10s timeout
    });
    const response_time_ms = Date.now() - start;
    const up = response.status === endpoint.expectedStatus ? 1 : 0;

    insertCheck.run({
      url: endpoint.url,
      status: response.status,
      response_time_ms,
      up,
    });

    return { url: endpoint.url, up, status: response.status, response_time_ms };
  } catch (err) {
    const response_time_ms = Date.now() - start;
    insertCheck.run({
      url: endpoint.url,
      status: null,
      response_time_ms,
      up: 0,
    });

    // status: null lets the alerter tell a connection failure apart from a bad HTTP status
    return { url: endpoint.url, up: 0, status: null, error: err.message, response_time_ms };
  }
}

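async function runAllChecks() {
  const results = await Promise.allSettled(
    ENDPOINTS.map(ep => checkEndpoint(ep))
  );
  // checkEndpoint catches fetch errors itself; a rejection here means something else
  // failed (e.g. the database write), so report that endpoint as down with its URL intact
  return results.map((r, i) =>
    r.status === 'fulfilled'
      ? r.value
      : { url: ENDPOINTS[i].url, up: 0, status: null, error: String(r.reason), response_time_ms: 0 }
  );
}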

module.exports = { runAllChecks, ENDPOINTS };

Key detail: AbortSignal.timeout(10000) is a Node.js 18+ feature that gives you a clean timeout without manual AbortController plumbing. If you're on an older Node version, use setTimeout + AbortController.
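
For older runtimes, the manual version might look like the sketch below. fetchWithTimeout is a hypothetical helper name, and the fetch implementation is passed in as a parameter because Node versions before 18 have no global fetch, so you would supply one from a package of your choice. It also assumes a global AbortController, which Node provides from version 15 onward.

```javascript
// Sketch: timeout via setTimeout + AbortController instead of AbortSignal.timeout().
// fetchImpl is any fetch-compatible function that honours the `signal` option.
function fetchWithTimeout(url, timeoutMs, fetchImpl = globalThis.fetch) {
  const controller = new AbortController();
  // Abort the request if it has not settled within timeoutMs
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  return fetchImpl(url, { signal: controller.signal })
    // Always clear the timer so it cannot fire after the request settles
    .finally(() => clearTimeout(timer));
}
```

In checkEndpoint, the call would then read something like fetchWithTimeout(endpoint.url, 10000) in place of the fetch call with AbortSignal.timeout.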

Step 4: Alerting

// alerter.js
require('dotenv').config();
const { getRecentChecks } = require('./db');

const SLACK_WEBHOOK = process.env.SLACK_WEBHOOK_URL;
// Alerts fire on state transitions (UP to DOWN), so no time-based cooldown constant is needed

async function sendSlackAlert(url, status, responseTime) {
  if (!SLACK_WEBHOOK) return;

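  // == null also matches undefined, so a failed request is never reported as "Got undefined"
  const failed = status == null;
  const emoji = failed ? '🔴' : '🟡';
  const message = failed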
    ? `${emoji} *DOWN*: ${url} — Connection failed (${responseTime}ms)`
    : `${emoji} *DEGRADED*: ${url} — Got ${status} (${responseTime}ms)`;

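  try {
    const res = await fetch(SLACK_WEBHOOK, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text: message }),
    });
    if (!res.ok) console.error(`Slack webhook returned ${res.status}`);
  } catch (err) {
    // A failed alert must not take the scheduler down with it
    console.error(`Slack alert failed: ${err.message}`);
  }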
}

// Transition check: only alert if the previous check was UP
async function shouldAlert(url) {
  const recent = getRecentChecks.all(url, 2);
  if (recent.length < 2) return true; // First checks — always alert on down
  // Only alert if the previous check was UP (prevents spam)
  return recent[1].up === 1;
}

async function processAlerts(results) {
  for (const result of results) {
    if (!result.up) {
      const shouldSend = await shouldAlert(result.url);
      if (shouldSend) {
        await sendSlackAlert(result.url, result.status, result.response_time_ms);
      }
    }
  }
}

module.exports = { processAlerts };

The deduplication logic is critical. Without it, you'll get a Slack message every single minute something is down. With it, you only get alerted at the transition, when a site goes from UP to DOWN. Add a recovery notification the same way, by looking for the opposite transition.
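
One way to cover both directions is to classify the transition instead of only checking for failures. This is a sketch, not part of the original code: classifyTransition is a hypothetical helper that takes rows in the shape and order getRecentChecks returns (newest first).

```javascript
// Sketch: classify the latest state change from the two most recent rows.
// `recent` is newest-first, the same order getRecentChecks returns.
function classifyTransition(recent) {
  if (recent.length === 0) return null;
  const current = recent[0].up === 1;
  // With no earlier row, treat the endpoint as previously UP so a first failure still alerts
  const previous = recent.length > 1 ? recent[1].up === 1 : true;

  if (previous && !current) return 'down';
  if (!previous && current) return 'recovered';
  return null; // no change, stay quiet
}

console.log(classifyTransition([{ up: 0 }, { up: 1 }])); // down
console.log(classifyTransition([{ up: 1 }, { up: 0 }])); // recovered
console.log(classifyTransition([{ up: 0 }, { up: 0 }])); // null
```

processAlerts could call this for every result, not only failing ones, and send a green message when it returns 'recovered'.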

Step 5: Dashboard (30 Lines)

// server.js
const express = require('express');
const { db, getRecentChecks } = require('./db');
const { ENDPOINTS } = require('./checker');

const app = express();

app.get('/', (req, res) => {
  const statuses = ENDPOINTS.map(ep => {
    const checks = getRecentChecks.all(ep.url, 60); // Last 60 checks
    const current = checks[0];
    // Guard against division by zero before the first check has run (would render NaN)
    const total = checks.length || 1;
    const uptimePercent = checks.filter(c => c.up).length / total * 100;
    const avgResponseTime = checks.reduce((sum, c) => sum + c.response_time_ms, 0) / total;

    return {
      url: ep.url,
      up: current?.up ?? false,
      status: current?.status,
      responseTime: current?.response_time_ms,
      uptime: uptimePercent.toFixed(1),
      avgResponseTime: Math.round(avgResponseTime),
    };
  });

  res.send(`
    <html><head><title>Uptime Monitor</title>
    <style>
      body { font-family: system-ui; max-width: 800px; margin: 2rem auto; padding: 0 1rem; }
      .endpoint { border: 1px solid #ddd; border-radius: 8px; padding: 1rem; margin: 0.5rem 0; }
      .up { border-left: 4px solid #22c55e; } .down { border-left: 4px solid #ef4444; }
      .dot { display: inline-block; width: 12px; height: 12px; border-radius: 50%; margin-right: 8px; }
      .dot.up { background: #22c55e; } .dot.down { background: #ef4444; }
    </style></head><body>
    <h1>📊 Uptime Monitor</h1>
    ${statuses.map(s => `
      <div class="endpoint ${s.up ? 'up' : 'down'}">
        <span class="dot ${s.up ? 'up' : 'down'}"></span>
        <strong>${s.url}</strong><br>
        Status: ${s.status || 'N/A'} | Response: ${s.responseTime != null ? s.responseTime + 'ms' : 'N/A'} |
        Uptime (last 60 checks): ${s.uptime}% | Avg: ${s.avgResponseTime}ms
      </div>
    `).join('')}
    <p style="color:#888;margin-top:2rem">Checks every ${process.env.MONITOR_INTERVAL || 1} min</p>
    </body></html>
  `);
});

app.listen(process.env.PORT || 3000);

No React, no bundler, no CSS framework. Just server-rendered HTML. It loads in under 100ms and shows everything you need at a glance.
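
The template interpolates each URL straight into the page, which is fine while ENDPOINTS is a hardcoded array. If the list ever comes from a config file or user input, escaping becomes worth the few extra lines. A minimal sketch, with escapeHtml as a hypothetical helper that is not in the original code:

```javascript
// Sketch: escape the five characters that matter in HTML text and attribute values.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('https://example.com/?a=1&b=<script>'));
// https://example.com/?a=1&amp;b=&lt;script&gt;
```

In the template, ${s.url} would then become ${escapeHtml(s.url)}.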

Step 6: Tie It All Together

// index.js
require('dotenv').config();
const cron = require('node-cron');
const { runAllChecks } = require('./checker');
const { processAlerts } = require('./alerter');
require('./server'); // Start dashboard

// Run every minute (or adjust the cron expression)
const interval = process.env.MONITOR_INTERVAL || 1;
cron.schedule(`*/${interval} * * * *`, async () => {
  console.log(`[${new Date().toISOString()}] Running checks...`);
  const results = await runAllChecks();
  await processAlerts(results);

  const downCount = results.filter(r => !r.up).length;
  if (downCount > 0) {
    console.log(`⚠️  ${downCount} endpoint(s) DOWN`);
  } else {
    console.log(`✅ All ${results.length} endpoints UP`);
  }
});

console.log(`🚀 Uptime monitor running. Dashboard at http://localhost:${process.env.PORT || 3000}`);

Start it with:

node index.js

That's it. You now have a fully functional uptime monitor with alerting and a dashboard.
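
One thing the scheduler takes on trust is MONITOR_INTERVAL: a typo in .env ends up inside the cron expression unchecked. A small guard could look like the sketch below, where toCronExpression is a hypothetical helper and the 1 to 59 range is an assumption that follows from the step sitting in the minute field.

```javascript
// Sketch: validate MONITOR_INTERVAL before building the cron expression.
// The */N step lives in the minute field, so only whole numbers from 1 to 59 make sense.
function toCronExpression(rawInterval) {
  const minutes = Number(rawInterval ?? 1); // default to 1 minute when unset
  if (!Number.isInteger(minutes) || minutes < 1 || minutes > 59) {
    throw new Error(`MONITOR_INTERVAL must be a whole number from 1 to 59, got "${rawInterval}"`);
  }
  return `*/${minutes} * * * *`;
}

console.log(toCronExpression('5'));       // */5 * * * *
console.log(toCronExpression(undefined)); // */1 * * * *
```

Intervals that do not divide 60 evenly (7, for example) still work, but the gap shrinks at the top of each hour because the step restarts at minute 0.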

Real-World Scenarios

Personal Projects (What I Built This For)

I monitor 7 endpoints across 3 personal projects. The added cost is $0/month, since it runs on a $4 DigitalOcean droplet that already hosts the projects themselves. Response time data lives in a 12MB SQLite file.

Small Team (2-5 People)

Add Slack alerting per channel (one per service), and you've got team-wide visibility. Rotate the SQLite database monthly with a cron job that archives old data. You're still at $0.
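
A minimal sketch of the clean-up half of that rotation follows. Note that it simply deletes old rows rather than archiving them, so copy monitor.db somewhere first if you want history. pruneOldChecks and the 30-day default are illustrative, and db is assumed to be the better-sqlite3 instance exported from db.js.

```javascript
// Sketch: delete rows older than a retention window.
// `db` is expected to be a better-sqlite3 Database; prepare(...).run(...) returns { changes }.
function pruneOldChecks(db, retentionDays = 30) {
  if (!Number.isInteger(retentionDays) || retentionDays < 1) {
    throw new Error('retentionDays must be a positive whole number');
  }
  // checked_at is written by CURRENT_TIMESTAMP as UTC 'YYYY-MM-DD HH:MM:SS',
  // the same format datetime() produces, so a plain comparison works.
  const info = db
    .prepare("DELETE FROM checks WHERE checked_at < datetime('now', ?)")
    .run(`-${retentionDays} days`);
  return info.changes; // number of rows removed
}
```

Since node-cron is already a dependency, this could run from a second schedule (say, once a night) instead of a system cron job.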

The Cost Comparison

Approach             Monthly Cost   Annual Cost
Uptime Robot (Pro)   $7             $84
Pingdom              $15            $180
Better Stack         $24            $288
This solution        $0*            $0*

*Assuming you already have a server. If not, a $4/month droplet handles this + your apps.

FAQ

What if my monitor server goes down?

This is the classic "who monitors the monitor?" problem. Two options: (1) Use a free-tier external service as a backup check on your monitor, or (2) Set up a simple healthcheck on a separate $0 cloud function. For personal projects, this edge case rarely matters.
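
Either option needs something meaningful to hit, because the dashboard can keep answering even if the scheduler has silently stopped. One approach is a staleness test on the newest row. The sketch below assumes a hypothetical isStale helper and a three-minute threshold; a health route could return 503 when it reports true.

```javascript
// Sketch: decide whether the monitor itself looks stuck.
// `lastCheckedAt` is a checked_at value from SQLite (UTC, 'YYYY-MM-DD HH:MM:SS').
function isStale(lastCheckedAt, nowMs = Date.now(), maxAgeMs = 3 * 60 * 1000) {
  if (!lastCheckedAt) return true; // no checks recorded yet
  // Convert SQLite's format to ISO 8601 and mark it as UTC before parsing
  const lastMs = Date.parse(lastCheckedAt.replace(' ', 'T') + 'Z');
  return Number.isNaN(lastMs) || nowMs - lastMs > maxAgeMs;
}

const now = Date.UTC(2024, 0, 1, 12, 5, 0);
console.log(isStale('2024-01-01 12:04:30', now)); // false
console.log(isStale('2024-01-01 11:50:00', now)); // true
```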

Can I scale this to 1,000+ endpoints?

Yes, but switch SQLite to PostgreSQL. Add connection pooling and batch your inserts. The checker logic stays the same.
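
On the checker side, one change that may help at that size is capping how many requests are in flight, since runAllChecks starts every fetch at once. A sketch under that assumption; runInBatches and the batch size are illustrative, not from the original code.

```javascript
// Sketch: run an async worker over items in fixed-size batches,
// so at most `batchSize` requests are in flight at any time.
async function runInBatches(items, batchSize, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // allSettled keeps one failing item from discarding the rest of its batch
    const settled = await Promise.allSettled(batch.map((item) => worker(item)));
    results.push(...settled);
  }
  return results; // same { status, value | reason } shape as Promise.allSettled
}
```

runAllChecks could then call runInBatches(ENDPOINTS, 50, checkEndpoint) instead of mapping over the whole array. A batch only moves on when its slowest request finishes, so a proper concurrency pool would be faster, but this keeps the code short.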

What about SSL certificate monitoring?

Add this check to checker.js:

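const https = require('https');

// Resolves with certificate validity for an https:// URL; never rejects
function checkSSLCert(url) {
  return new Promise((resolve) => {
    const req = https.request(url, { method: 'HEAD', timeout: 10000 }, (res) => {
      const cert = res.socket.getPeerCertificate();
      res.resume(); // discard the response so the socket is released
      const daysLeft = Math.floor(
        (new Date(cert.valid_to) - Date.now()) / (1000 * 60 * 60 * 24)
      );
      resolve({ valid: daysLeft > 0, daysLeft, issuer: cert.issuer?.O });
    });
    // An expired or untrusted certificate fails the TLS handshake and lands here
    req.on('error', () => resolve({ valid: false }));
    // Destroying the request on timeout triggers the error handler above
    req.on('timeout', () => req.destroy());
    req.end();
  });
}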

How do I handle cron in production?

Use PM2 for process management: pm2 start index.js --name uptime-monitor. It handles restarts, logs, and monitoring. No Docker needed for something this simple.

Won't SQLite corrupt under concurrent writes?

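No. SQLite serializes writers with file locks, so competing writes can block or fail with SQLITE_BUSY, but they do not corrupt the file. In this setup everything runs in a single process at one write per minute per endpoint, so contention is not a practical concern. Note that better-sqlite3 does not turn on WAL mode by itself; SQLite defaults to a rollback journal. Enabling WAL with db.pragma('journal_mode = WAL'); is still worthwhile if another process (a backup script, the sqlite3 CLI) reads the file while the monitor is writing.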

What I Learned

  1. Most monitoring features are noise. I was paying for 40+ integrations, custom status pages, and team management I never used. Identify what you actually need.

  2. SQLite is underrated. For small, append-mostly monitoring data with infrequent writes, it's perfect. Don't default to PostgreSQL for everything.

  3. Alerting is harder than monitoring. The checking part is trivial. The hard part is knowing when not to alert: cooldowns, deduplication, and escalation policies. Keep it simple at first.

  4. Server-rendered HTML is fine. My dashboard is 30 lines of template literals and loads instantly. Not everything needs React.

  5. Own your data. When Uptime Robot had an outage last year, I couldn't access my monitoring data. With SQLite on my own server, that data is always mine.
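
To make point 3 concrete: the alerter in this post deduplicates on state transitions, which is a different technique from a time-based cooldown. A cooldown helps with endpoints that flap between UP and DOWN. The sketch below shows one possible shape; createCooldown and canAlert are hypothetical names, and the state lives in memory only.

```javascript
// Sketch: time-based cooldown, at most one alert per endpoint per window.
// State is kept in memory, so it resets when the process restarts.
function createCooldown(windowMs) {
  const lastAlertAt = new Map(); // url -> timestamp (ms) of the last alert sent

  return function canAlert(url, nowMs = Date.now()) {
    const last = lastAlertAt.get(url);
    if (last !== undefined && nowMs - last < windowMs) return false;
    lastAlertAt.set(url, nowMs);
    return true;
  };
}

const canAlert = createCooldown(5 * 60 * 1000);
console.log(canAlert('https://myapp.com', 0));       // true
console.log(canAlert('https://myapp.com', 60_000));  // false
console.log(canAlert('https://myapp.com', 300_000)); // true
```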

Conclusion

Building a custom uptime monitor took me a weekend, costs nothing to run, and does exactly what I need. No feature bloat, no subscription, no vendor lock-in.

That said — if you're running production systems at a company, use a real service. Monitoring is critical infrastructure. The $15/month is worth the reliability, compliance, and support. But for personal projects and small teams? Roll your own. It's more fun, you'll learn something, and you'll save money.

The full code is available in the examples above — copy, paste, deploy. If you build something with this, I'd love to hear about it.


Happy monitoring! 🚀
