I Built an API Monitoring Platform Because My Own API Went Down and I Had No Idea
TL;DR: My deployed API went down silently. I found out hours later. UptimeRobot felt like overkill for a student project. So I built Monitorly — a real-time API monitoring platform with a live dashboard, email alerts (without spamming you), and uptime tracking. It's live and open source.
The Problem
I deployed my first production backend and felt great about it.
Then it went down. Silently. No alert, no notification, nothing. I found out hours later when I went to check it manually.
That's when I realised two things:
- Every deployed project needs uptime monitoring
- Existing tools like UptimeRobot felt like overkill — too many settings, too much noise, not built for someone just learning how production works
So I built my own. And in the process, I learned more about real-time systems, cron jobs, and backend architecture than any tutorial had taught me.
What is Monitorly?
Monitorly is a production-grade API uptime monitoring platform. You add your endpoints, it checks them on a schedule, shows you a live dashboard, and emails you when something goes down — without flooding your inbox.
Live: urlzap.me/pulsewatch
GitHub: github.com/vbv0507/api-monitoring-system
Features
1. Real-Time Dashboard with Socket.io
No page refresh needed. When a monitor check completes, the result is pushed to your dashboard over a Socket.io connection, so you see status changes as they happen.
2. Email Alerts — Without Spamming You
This was a deliberate product decision. A monitor that emails you on every failed check gets noisy fast: with 1-minute checks, a single hour-long outage means 60 emails.
Monitorly only sends an alert when the status changes. When a monitor goes down, one alert is sent. When it comes back up, one recovery email is sent. One email per status change, not one per check.
3. Uptime Percentage Calculation
Every monitor keeps its check history, and uptime is calculated over a rolling window (the last 7 days by default). You can see at a glance whether your API has been 99.9% up or has quietly degraded over time.
4. Configurable Check Intervals
Checks run every 1, 5, or 15 minutes depending on how closely you need to watch an endpoint. The cron engine handles all of it automatically.
5. Per-User Monitor Isolation
Every user only sees their own monitors. JWT authentication identifies the caller on each request, and monitor queries are scoped to that user's ID, so you can't accidentally see or affect someone else's data.
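A minimal sketch of how that isolation can be enforced at the query level, using an in-memory array in place of MongoDB. The helper names (`ownedFilter`, `findOwned`) and the sample data are hypothetical and not taken from the Monitorly codebase.

```javascript
// Build a filter that always includes the authenticated user's ID.
// In a Mongoose app the same object could be passed to Monitor.findOne().
function ownedFilter(userId, monitorId) {
  return { _id: monitorId, userId };
}

// In-memory stand-in for a database lookup: every filter field must match.
function findOwned(monitors, filter) {
  const match = monitors.find((m) =>
    Object.keys(filter).every((key) => String(m[key]) === String(filter[key]))
  );
  return match || null;
}

// Hypothetical sample data
const monitors = [
  { _id: 'm1', userId: 'alice', url: 'https://example.com/health' },
  { _id: 'm2', userId: 'bob', url: 'https://example.org/health' },
];

const own = findOwned(monitors, ownedFilter('alice', 'm1'));
const foreign = findOwned(monitors, ownedFilter('alice', 'm2'));
console.log(own && own.url); // alice's own monitor
console.log(foreign);        // null: bob's monitor is invisible to alice
```

Because the user ID is part of every filter, a guessed monitor ID belonging to someone else simply returns nothing.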
Tech Stack
| Layer | Technology |
|---|---|
| Runtime | Node.js |
| Framework | Express.js |
| Database | MongoDB + Mongoose |
| Real-time | Socket.io |
| Scheduling | node-cron |
| HTTP checks | Axios |
| Email alerts | Nodemailer |
| Auth | JWT |
| Deployment | Azure App Service |
How It Works — The Core Architecture
The Monitoring Engine
The heart of Monitorly is the cron-based monitoring engine. When a user adds a monitor, the server registers a cron job that fires at the configured interval.
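const cron = require('node-cron');
const axios = require('axios');
// Map the supported intervals (in minutes) to cron expressions
const intervals = {
  1: '* * * * *',     // every 1 minute
  5: '*/5 * * * *',   // every 5 minutes
  15: '*/15 * * * *'  // every 15 minutes
};
function scheduleMonitor(monitor) {
  // Fall back to 5 minutes for an unknown interval
  const expression = intervals[monitor.interval] || '*/5 * * * *';
  // Return the task so the caller can stop() it when the monitor is deleted
  return cron.schedule(expression, async () => {
    try {
      await runCheck(monitor);
    } catch (err) {
      // A failed check (e.g. database unavailable) should not crash the scheduler
      console.error(`Check failed for ${monitor.url}:`, err.message);
    }
  });
}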
Running a Health Check
Each check measures response time, status code, and whether the endpoint is reachable at all.
async function runCheck(monitor) {
  const startTime = Date.now();
  let status = 'down';
  let responseTime = null;
  let statusCode = null;
  try {
    const response = await axios.get(monitor.url, { timeout: 10000 });
    responseTime = Date.now() - startTime;
    statusCode = response.status;
    status = statusCode >= 200 && statusCode < 400 ? 'up' : 'down';
  } catch (err) {
    // Timeouts, network errors, and (with Axios defaults) non-2xx responses
    // land here, so status stays 'down'. err.response is only set if the server replied.
    responseTime = Date.now() - startTime;
    statusCode = err.response?.status || null;
  }
  // Save the log
  await MonitorLog.create({
    monitorId: monitor._id,
    status,
    responseTime,
    statusCode,
    checkedAt: new Date()
  });
  // Push real-time update to dashboard
  io.to(monitor.userId.toString()).emit('monitor-update', {
    monitorId: monitor._id,
    status,
    responseTime,
    statusCode
  });
  // Handle alert logic
  await handleAlerts(monitor, status);
}
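The up/down decision itself is small enough to pull out and test on its own. A sketch, assuming the same rule as the check above (any 2xx or 3xx status counts as up); `classifyCheck` is a hypothetical helper name, not part of the original code.

```javascript
// Classify a finished check. statusCode is null when no HTTP response
// arrived at all (DNS failure, connection refused, timeout).
function classifyCheck(statusCode) {
  if (statusCode === null || statusCode === undefined) return 'down';
  return statusCode >= 200 && statusCode < 400 ? 'up' : 'down';
}

console.log(classifyCheck(200));  // up
console.log(classifyCheck(503));  // down
console.log(classifyCheck(null)); // down
```

Keeping this rule in one function makes it easier to tighten later, for example to treat only 2xx as healthy.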
The Alert Logic — No Spam
This is the part I'm most proud of. Instead of alerting on every failed check, Monitorly tracks the previous status and only triggers an email when the state changes.
async function handleAlerts(monitor, newStatus) {
  const previousStatus = monitor.lastStatus;
  // Only act on status change
  if (newStatus === previousStatus) return;
  // Update stored status, plus the in-memory copy that the cron job keeps
  // reusing; otherwise every later check compares against a stale status
  await Monitor.findByIdAndUpdate(monitor._id, { lastStatus: newStatus });
  monitor.lastStatus = newStatus;
  // userId is an ObjectId reference, so load the user to get the address
  // (assumes a User model with an email field)
  const user = await User.findById(monitor.userId);
  if (!user) return;
  if (newStatus === 'down') {
    // Send downtime alert
    await sendEmail({
      to: user.email,
      subject: `🔴 ${monitor.name} is down`,
      body: `Your endpoint ${monitor.url} is not responding. Detected at ${new Date().toISOString()}.`
    });
  }
  if (newStatus === 'up' && previousStatus === 'down') {
    // Send recovery alert
    await sendEmail({
      to: user.email,
      subject: `🟢 ${monitor.name} is back up`,
      body: `Your endpoint ${monitor.url} has recovered. Downtime ended at ${new Date().toISOString()}.`
    });
  }
}
One status change = one email. That's it.
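To see the rule in action without a database or mail server, the transition logic can be replayed over a list of check results. This is a sketch with hypothetical helper names (`alertFor`, `replay`); it mirrors the conditions in `handleAlerts` but is not the project's actual code.

```javascript
// Decide which email (if any) a status transition should trigger.
// Returns 'down', 'recovery', or null.
function alertFor(previousStatus, newStatus) {
  if (newStatus === previousStatus) return null;
  if (newStatus === 'down') return 'down';
  if (newStatus === 'up' && previousStatus === 'down') return 'recovery';
  return null; // e.g. 'pending' -> 'up' on the first successful check
}

// Replay a sequence of check results and collect the emails sent.
function replay(statuses, initialStatus = 'pending') {
  const emails = [];
  let previous = initialStatus;
  for (const status of statuses) {
    const alert = alertFor(previous, status);
    if (alert) emails.push(alert);
    previous = status; // remember the latest status for the next check
  }
  return emails;
}

const emails = replay(['up', 'up', 'down', 'down', 'down', 'up']);
console.log(emails); // [ 'down', 'recovery' ]
```

Three consecutive failed checks produce a single downtime email. An endpoint that alternates between up and down on every check would still produce one email per transition, so this rule removes repeat alerts for an ongoing outage rather than all noise.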
Uptime Percentage Calculation
async function getUptimePercentage(monitorId, days = 7) {
  const since = new Date();
  since.setDate(since.getDate() - days);
  const range = { monitorId, checkedAt: { $gte: since } };
  // Count in the database instead of loading every log into memory
  const [total, upCount] = await Promise.all([
    MonitorLog.countDocuments(range),
    MonitorLog.countDocuments({ ...range, status: 'up' })
  ]);
  if (total === 0) return null;
  // toFixed returns a string such as "99.90"
  return ((upCount / total) * 100).toFixed(2);
}
Simple, accurate, and runs off the existing log data with no extra storage.
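A quick worked example of the same formula on in-memory data, to show what the numbers look like. The helper name `uptimeFromLogs` and the sample week are hypothetical.

```javascript
// Uptime over a set of check logs, as a percentage string with two decimals.
// Returns null when there are no logs, like getUptimePercentage.
function uptimeFromLogs(logs) {
  if (logs.length === 0) return null;
  const upCount = logs.filter((log) => log.status === 'up').length;
  return ((upCount / logs.length) * 100).toFixed(2);
}

// Hypothetical week of 1-minute checks: 10,080 checks, 10 of them failed.
const week = Array.from({ length: 10080 }, (_, i) => ({
  status: i < 10 ? 'down' : 'up',
}));

console.log(uptimeFromLogs(week)); // 99.90 (a string, because of toFixed)
console.log(uptimeFromLogs([]));   // null
```

Every check carries equal weight here, so the result is the share of successful checks in the window rather than a measure of wall-clock downtime.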
Real-Time Dashboard with Socket.io
When the monitoring engine runs a check, it emits a monitor-update event to the user's socket room. On the frontend, the dashboard listens and updates the UI instantly.
// Server: emit to the user's room
io.to(monitor.userId.toString()).emit('monitor-update', updatePayload);
// Client: listen and update the UI
socket.on('monitor-update', (data) => {
  updateMonitorCard(data.monitorId, data.status, data.responseTime);
});
No polling. No page refresh. The dashboard just stays live.
MongoDB Schema Design
const mongoose = require('mongoose');
// Monitor: what the user wants to track
const monitorSchema = new mongoose.Schema({
  userId: { type: mongoose.Schema.Types.ObjectId, ref: 'User', required: true },
  name: { type: String, required: true },
  url: { type: String, required: true },
  interval: { type: Number, enum: [1, 5, 15], default: 5 },
  lastStatus: { type: String, enum: ['up', 'down', 'pending'], default: 'pending' },
  createdAt: { type: Date, default: Date.now }
});
// Log: individual check result
const monitorLogSchema = new mongoose.Schema({
  monitorId: { type: mongoose.Schema.Types.ObjectId, ref: 'Monitor', required: true },
  status: { type: String, enum: ['up', 'down'], required: true },
  responseTime: { type: Number },
  statusCode: { type: Number },
  checkedAt: { type: Date, default: Date.now }
});
// Index for fast log queries per monitor
monitorLogSchema.index({ monitorId: 1, checkedAt: -1 });
Separating monitors from logs is important — logs grow fast, and you don't want a single document ballooning with embedded arrays.
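To put "logs grow fast" into numbers, here is a back-of-the-envelope calculation for a single monitor at each supported interval. The function names are hypothetical, and the arithmetic assumes exactly one log document per check.

```javascript
// Checks per day for a given interval in minutes.
function checksPerDay(intervalMinutes) {
  return (24 * 60) / intervalMinutes;
}

// Number of log documents after `days` days for one monitor.
function logsAfter(intervalMinutes, days) {
  return checksPerDay(intervalMinutes) * days;
}

for (const interval of [1, 5, 15]) {
  console.log(`${interval} min: ${checksPerDay(interval)}/day, ${logsAfter(interval, 365)}/year`);
}
// 1 min: 1440/day, 525600/year
// 5 min: 288/day, 105120/year
// 15 min: 96/day, 35040/year
```

A single 1-minute monitor writes over half a million log documents a year, which is why the logs live in their own indexed collection.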
What I Learned Building This
1. Socket.io rooms are perfect for multi-user real-time apps.
Instead of broadcasting every update to every connected client, I put each user in their own room (socket.join(userId)). Clean isolation with one line of code.
2. Cron jobs need restart handling.
node-cron keeps its jobs in process memory, so when the server restarts, every scheduled job is gone. I reload all active monitors from MongoDB on server startup and reschedule them. Always think about what happens on restart.
3. Alert logic is a UX problem, not just a technical one.
The "no spam" decision came from thinking about what I'd actually want as a user. Technical correctness (alert on every failure) is not the same as a good user experience. Think about both.
4. Separate your logs from your main documents.
Early on I stored check results as an embedded array inside the Monitor document. It hit MongoDB's 16 MB document limit faster than expected. A separate, indexed collection is the right pattern for time-series data.
5. Timeouts are not optional.
Without a timeout on Axios requests, a hanging endpoint would block the check indefinitely. Always set a timeout on outbound HTTP calls in a monitoring system.
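The restart handling from point 2 can be sketched as a small job registry that is refilled at startup. The database query and the scheduler are passed in as functions so the sketch runs without MongoDB or node-cron; `registerMonitor`, `rescheduleAll`, and the fakes below are hypothetical names, not the project's actual code.

```javascript
// Registry of running jobs, keyed by monitor ID, so a job can be
// stopped when its monitor is deleted or rescheduled.
const jobs = new Map();

// `schedule` is injected: in a real app it could wrap cron.schedule and
// return the task it creates (anything with a stop() method works here).
function registerMonitor(monitor, schedule) {
  const id = String(monitor._id);
  if (jobs.has(id)) jobs.get(id).stop(); // avoid duplicate jobs
  jobs.set(id, schedule(monitor));
}

// Call once at startup, after the database connection is ready.
// `loadMonitors` is injected: e.g. () => Monitor.find() in a Mongoose app.
async function rescheduleAll(loadMonitors, schedule) {
  const monitors = await loadMonitors();
  for (const monitor of monitors) registerMonitor(monitor, schedule);
  return monitors.length;
}

// Demo with in-memory fakes instead of MongoDB and node-cron.
const fakeMonitors = [{ _id: 'm1', interval: 1 }, { _id: 'm2', interval: 5 }];
const fakeSchedule = (monitor) => ({
  monitor,
  stopped: false,
  stop() { this.stopped = true; },
});

const ready = rescheduleAll(async () => fakeMonitors, fakeSchedule);
ready.then((count) => console.log(`rescheduled ${count} monitors`));
```

Keeping the returned tasks in a map also gives the delete and update routes something to call `stop()` on, so an edited monitor does not end up with two jobs.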
What's Next
- [ ] SMS alerts via Twilio
- [ ] Response body validation (not just status codes)
- [ ] Public status pages per user
- [ ] Webhook support for Slack / Discord notifications
- [ ] Multi-region checks
Try It
Live: urlzap.me/pulsewatch
GitHub: github.com/vbv0507/api-monitoring-system
Add your own endpoints and watch the dashboard update in real time. If something breaks, you'll know immediately — not hours later like I did.
Questions about any part of the build? Drop them in the comments.
Also built: URLzap — a free URL shortener with custom aliases, because Bit.ly paywalls them. And yes, the short links in this post are from my own shortener.
Tags: #node #javascript #webdev #showdev