Every day, millions of Minecraft players search for multiplayer servers. But finding one that's actually online, has players, and runs the right version? That's surprisingly hard.
I built Minecraft ServerHub to solve this. It pings 5000+ servers every 5 minutes and shows real-time player counts, uptime history, and version info. Here's how the monitoring system works under the hood.
The Architecture
The stack:
- Next.js 14 (App Router) for the frontend and API routes
- PostgreSQL with Prisma ORM for persistent storage
- Redis for caching and rate limiting
- minecraft-server-util for direct server pings
The key insight: instead of relying on third-party APIs (which have rate limits), we ping Minecraft servers directly using the Minecraft protocol.
Direct Server Pinging
Minecraft servers speak a well-defined protocol. When you connect to a server, the first thing that happens is a "status" handshake that returns the server's MOTD, player count, version, and favicon.
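Under the hood, the Java Edition protocol frames every packet with a VarInt length prefix. We never build these by hand (the library handles framing for us), but a minimal VarInt encoder shows what the wire format looks like — this is a sketch for illustration, not code from the project:

```typescript
// Encode a non-negative integer as a Minecraft-protocol VarInt:
// 7 bits of payload per byte, high bit set while more bytes follow.
function writeVarInt(value: number): Buffer {
  const bytes: number[] = [];
  do {
    let temp = value & 0x7f;
    value >>>= 7; // unsigned shift: consume 7 bits
    if (value !== 0) temp |= 0x80; // continuation bit
    bytes.push(temp);
  } while (value !== 0);
  return Buffer.from(bytes);
}

// The default Java Edition port, 25565, encodes to three bytes: dd c7 01
```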
We use the minecraft-server-util library to handle this:
```typescript
import { status, statusBedrock } from "minecraft-server-util";

async function pingServer(ip: string, port: number, edition: "java" | "bedrock") {
  const timeout = 5000; // 5 second timeout

  if (edition === "bedrock") {
    const response = await statusBedrock(ip, port, { timeout });
    return {
      online: true,
      players: response.players.online,
      maxPlayers: response.players.max,
      version: response.version.name,
      motd: response.motd.clean,
    };
  }

  const response = await status(ip, port, { timeout });
  return {
    online: true,
    players: response.players.online,
    maxPlayers: response.players.max,
    version: response.version.name,
    motd: response.motd.clean,
    favicon: response.favicon ?? null,
  };
}
```
This gives us zero API rate limits. The only constraint is network throughput.
Batch Processing at Scale
Pinging 5000 servers sequentially would take forever. So we batch them with controlled concurrency:
```typescript
const BATCH_SIZE = 400;
const CONCURRENT_PINGS = 15;

async function processBatch(servers: Server[]) {
  const results = [];
  for (let i = 0; i < servers.length; i += CONCURRENT_PINGS) {
    const chunk = servers.slice(i, i + CONCURRENT_PINGS);
    const chunkResults = await Promise.allSettled(
      chunk.map((server) =>
        pingServer(server.ipAddress, server.port, server.edition)
      )
    );
    results.push(...chunkResults);
  }
  return results;
}
```
Why Promise.allSettled instead of Promise.all? Because servers go offline. We don't want one timeout to kill the entire batch.
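Since pingServer throws on timeout or refusal, a fulfilled result means the server answered. A small helper — hypothetical, not from the codebase — turns a settled batch into online/offline counts:

```typescript
// Summarize a batch of settled ping promises.
// Fulfilled = the server responded; rejected = timeout or connection error.
function countOnline(results: PromiseSettledResult<unknown>[]) {
  const online = results.filter((r) => r.status === "fulfilled").length;
  return { online, offline: results.length - online };
}
```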
With 15 concurrent connections and a 5-second timeout, the worst case is 15 servers every 5 seconds, or 180 servers per minute. At that rate the entire database needs under 30 minutes of total ping time, which we spread across cron invocations: each 400-server batch finishes in a little over 2 minutes, comfortably inside the 5-minute cron window. In practice most pings return well before the timeout, so real throughput is much higher.
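The worst-case arithmetic is easy to sanity-check (the function name is just for illustration):

```typescript
// Worst case: every ping runs to the full timeout, so one wave of
// `concurrent` pings completes per `timeoutMs`.
function worstCaseServersPerMinute(concurrent: number, timeoutMs: number): number {
  return concurrent * (60_000 / timeoutMs);
}

const perMinute = worstCaseServersPerMinute(15, 5000); // 180
const minutesForAll = Math.ceil(5000 / perMinute); // 28 minutes for 5000 servers
```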
Multi-Layer Caching with Redis
Every ping result is cached to avoid hitting the database on every page load:
```typescript
const CACHE_TTLS = {
  SERVER_LIST: 300, // 5 minutes
  SERVER_DETAIL: 300, // 5 minutes
  SERVER_PING: 360, // 6 minutes (slightly longer than the ping interval)
  TAGS: 3600, // 1 hour
  GLOBAL_STATS: 86400, // 1 day
};
```
```typescript
async function getCachedServer(id: string) {
  const cacheKey = `server:${id}`;

  // Try Redis first
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // Fall back to the database
  const server = await prisma.server.findUnique({ where: { id } });

  // Cache for the next request
  if (server) {
    await redis.setex(cacheKey, CACHE_TTLS.SERVER_DETAIL, JSON.stringify(server));
  }
  return server;
}
```
We also have an in-memory fallback cache for when Redis is unavailable. It uses a stale-while-revalidate pattern with 2x TTL:
```typescript
class MemoryCache {
  private cache = new Map<string, { data: any; expires: number; stale: number }>();

  set(key: string, data: any, ttlMs: number) {
    const now = Date.now();
    // Entries stay servable for up to 2x TTL while a refresh runs
    this.cache.set(key, { data, expires: now + ttlMs, stale: now + 2 * ttlMs });
  }

  get(key: string) {
    const entry = this.cache.get(key);
    if (!entry) return null;
    if (Date.now() < entry.expires) return entry.data; // Fresh
    if (Date.now() < entry.stale) return entry.data; // Stale but usable
    this.cache.delete(key);
    return null;
  }
}
```
Storing Ping History for Uptime Calculation
Every ping creates a PingHistory record:
```prisma
model PingHistory {
  id        String   @id @default(cuid())
  serverId  String
  isOnline  Boolean
  players   Int      @default(0)
  latency   Int?     // round-trip ms
  createdAt DateTime @default(now())
  server    Server   @relation(fields: [serverId], references: [id])

  @@index([serverId, createdAt]) // Composite index for uptime queries
}
```
Uptime is calculated from a 30-day rolling window:
```typescript
async function calculateUptime(serverId: string): Promise<number> {
  const thirtyDaysAgo = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);

  const pings = await prisma.pingHistory.groupBy({
    by: ["isOnline"],
    where: {
      serverId,
      createdAt: { gte: thirtyDaysAgo },
    },
    _count: true,
  });

  const online = pings.find((p) => p.isOnline)?._count ?? 0;
  const total = pings.reduce((sum, p) => sum + p._count, 0);
  return total > 0 ? Math.round((online / total) * 10000) / 100 : 0;
}
```
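The rounding trick in the return statement keeps two decimal places. As a worked example with hypothetical counts (a 5-minute cycle yields at most 8,640 pings in 30 days):

```typescript
// Same rounding as calculateUptime: percentage with two decimal places.
function uptimePercent(online: number, total: number): number {
  return total > 0 ? Math.round((online / total) * 10000) / 100 : 0;
}

uptimePercent(8612, 8640); // 99.68 — a server that missed 28 pings in 30 days
uptimePercent(0, 0); // 0 — no data yet, avoid dividing by zero
```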
The composite index on [serverId, createdAt] is critical — without it, this query would do a full table scan on millions of rows.
The Cron Setup
We use Next.js API routes as cron endpoints, authenticated with a secret:
```typescript
// /api/cron/ping-servers/route.ts
import { NextRequest, NextResponse } from "next/server";

export async function GET(request: NextRequest) {
  const authHeader = request.headers.get("authorization");
  if (authHeader !== `Bearer ${process.env.CRON_SECRET}`) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  const servers = await getNextBatch(); // Gets servers not pinged recently
  const results = await processBatch(servers);
  await savePingResults(results);

  return NextResponse.json({
    processed: results.length,
    // A fulfilled ping means the server responded, i.e. it's online
    online: results.filter((r) => r.status === "fulfilled").length,
  });
}
```
An external cron service (we use cron-job.org) calls this endpoint every 5 minutes. The batch system ensures each invocation processes a different set of servers.
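getNextBatch isn't shown above; conceptually it's a least-recently-pinged rotation. Here's a simplified in-memory sketch of that logic (the real version is presumably a Prisma query ordering by last ping time — names and fields here are illustrative):

```typescript
interface ServerRow {
  id: string;
  lastPingedAt: number; // epoch ms; 0 if never pinged
}

// Pick the servers that have gone longest without a ping, so every
// cron invocation advances through a different slice of the database.
function nextBatch(servers: ServerRow[], batchSize: number): ServerRow[] {
  return [...servers]
    .sort((a, b) => a.lastPingedAt - b.lastPingedAt)
    .slice(0, batchSize);
}
```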
Embeddable Status Badges
One feature that turned out to be really popular: embeddable SVG badges that server owners put on their websites.
https://minecraft-serverhub.com/api/badge/play.example.com?style=rounded&players=true
This returns a dynamic SVG that shows online/offline status and player count. The SVG is regenerated on each request with a 2-minute cache. Server owners embed it on their websites, forums, and Discord — and each badge links back to the server's page on ServerHub.
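The real renderer supports styles and themes; a stripped-down sketch of the idea looks like this (dimensions, colors, and the function name are illustrative):

```typescript
// Render a minimal status badge as an SVG string. The route handler
// would serve this with Content-Type: image/svg+xml and a short max-age.
function renderBadge(online: boolean, players: number): string {
  const label = online ? `Online · ${players} players` : "Offline";
  const color = online ? "#3fb950" : "#f85149";
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" width="170" height="28">`,
    `  <rect rx="6" width="170" height="28" fill="${color}"/>`,
    `  <text x="85" y="18" text-anchor="middle" font-family="sans-serif" font-size="12" fill="#fff">${label}</text>`,
    `</svg>`,
  ].join("\n");
}
```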
Results
After 3 months of running this system:
- 5000+ servers monitored across Java and Bedrock editions
- 170,000+ players tracked simultaneously at peak
- 99.7% uptime on the monitoring system itself
- < 200ms average response time thanks to Redis caching
The full platform is live at minecraft-serverhub.com with a live statistics dashboard showing real-time ecosystem data.
Key Takeaways
- Direct protocol access beats APIs — no rate limits, faster, more reliable
- Batch with concurrency control — Promise.allSettled with chunking handles failures gracefully
- Multi-layer caching is essential — Redis + in-memory prevents cascade failures
- Composite indexes matter — adding the index on ping history cut a 2-second query to 200ms
- SVG badges are link magnets — server owners embed them everywhere
If you're building any kind of monitoring system, the pattern of "direct protocol ping → batch process → cache aggressively → serve from cache" scales surprisingly well with minimal infrastructure.
Check out the server badge generator if you run a Minecraft server — it's free and takes 30 seconds to set up.