When you're building a platform that indexes thousands of game servers with real-time data, your database design makes or breaks you. A bad schema means slow queries, missing data, and frustrated users. A good one scales effortlessly.
Here's how I designed the database behind Minecraft ServerHub, a server discovery platform that tracks 5000+ servers with live player counts, uptime history, and community voting.
The Core Problem
A Minecraft server listing needs to:
- Store server metadata (name, IP, port, version, description)
- Track real-time status (online/offline, player count, latency)
- Calculate uptime from historical pings
- Support voting, reviews, and rankings
- Handle search with filters (gamemode, version, country, player count)
- Aggregate statistics across the entire ecosystem
All of this needs to feel fast — sub-200ms response times for listing pages.
Schema Design: The Server Model
The Server model is the core of everything. Here's the simplified version:
```prisma
model Server {
  id        String   @id @default(cuid())
  name      String
  slug      String   @unique
  ipAddress String
  port      Int      @default(25565)
  edition   String   @default("java")
  country   String?
  createdAt DateTime @default(now())

  // Status fields updated by ping system
  isOnline      Boolean @default(false)
  playersOnline Int     @default(0)
  maxPlayers    Int     @default(0)
  version       String?
  software      String?
  uptime        Float   @default(0)
  latency       Int?

  // Ranking signals
  votesThisMonth Int @default(0)
  totalClicks    Int @default(0)
  uniqueVisitors Int @default(0)

  // Lifecycle
  status ServerStatus @default(PENDING)
  tier   ServerTier   @default(FREE)

  // Relations
  pingHistory PingHistory[]
  reviews     Review[]
  votes       Vote[]
  tags        ServerTag[]

  @@index([status, votesThisMonth])
  @@index([status, country])
  @@index([status, createdAt])
}
```
Why These Indexes?
The three composite indexes cover our most common query patterns:
- `[status, votesThisMonth]` — The default server listing, sorted by votes. This is the homepage query.
- `[status, country]` — Country-specific pages like `/servers/country/us`. We have 30+ country landing pages.
- `[status, createdAt]` — "Recently added" sorting.
Every query filters by status = ACTIVE first, which is why status is always the first field in the composite index.
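To make that concrete, here is what the homepage query looks like. The helper below is a hypothetical sketch (not code from the real codebase) that builds the Prisma `findMany` arguments as a plain object, so the shape the index serves is easy to see: an equality filter on `status` plus a sort on `votesThisMonth`.

```typescript
// Hypothetical sketch: arguments for prisma.server.findMany() on the homepage.
// The [status, votesThisMonth] index serves exactly this shape — equality on
// the leading column, then an ordered scan on the second.
function homepageQueryArgs(page: number, pageSize = 20) {
  return {
    where: { status: "ACTIVE" },             // equality filter → leading index column
    orderBy: { votesThisMonth: "desc" },     // sort → second index column
    skip: (page - 1) * pageSize,             // offset pagination
    take: pageSize,
  };
}
```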
The Dual-Table Strategy
We have two server tables: Server (user-submitted, with ownership) and SeedServer (pre-populated from discovery crawlers).
Why two tables instead of a source field on one table?
- Different data models — Server has ownership, verification, premium tiers, and votes. SeedServer is just raw data.
- Query isolation — Listing queries only hit Server. Search queries both tables and merges the results in application code.
- No contamination — Seed data never appears in rankings until a user claims and verifies it.
Search across both tables:
```typescript
async function searchServers(query: string) {
  const [servers, seedServers] = await Promise.all([
    prisma.server.findMany({
      where: {
        status: "ACTIVE",
        OR: [
          { name: { contains: query, mode: "insensitive" } },
          { ipAddress: { contains: query } },
        ],
      },
      take: 20,
      orderBy: { votesThisMonth: "desc" },
    }),
    prisma.seedServer.findMany({
      where: {
        OR: [
          { name: { contains: query, mode: "insensitive" } },
          { ipAddress: { contains: query } },
        ],
      },
      take: 10,
    }),
  ]);

  // Verified servers win: drop any seed row whose IP already appears above.
  return [
    ...servers,
    ...seedServers.filter(
      (seed) => !servers.some((s) => s.ipAddress === seed.ipAddress)
    ),
  ];
}
```
Ping History: Time-Series Without TimescaleDB
We track every ping in a PingHistory table. With 5000 servers pinged every 5 minutes, that's ~1.4 million rows per day.
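The row-volume estimate follows directly from the numbers above, as this small back-of-envelope calculation shows:

```typescript
// Back-of-envelope PingHistory volume from the figures in the text:
// 5000 servers, one ping per server every 5 minutes.
const servers = 5000;
const pingsPerServerPerDay = (24 * 60) / 5; // 288 pings per server per day
const rowsPerDay = servers * pingsPerServerPerDay; // 1,440,000 ≈ 1.4M rows/day
```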
```prisma
model PingHistory {
  id        String   @id @default(cuid())
  serverId  String
  isOnline  Boolean
  players   Int      @default(0)
  latency   Int?
  createdAt DateTime @default(now())

  server Server @relation(fields: [serverId], references: [id])

  @@index([serverId, createdAt])
}
```
The composite index on [serverId, createdAt] is essential. Without it, calculating uptime for a single server would scan millions of rows.
For uptime, we use groupBy instead of loading all rows:
```typescript
const pings = await prisma.pingHistory.groupBy({
  by: ["isOnline"],
  where: {
    serverId,
    createdAt: { gte: thirtyDaysAgo },
  },
  _count: true,
});
```
This returns at most two rows (an online count and an offline count) regardless of how many pings exist. PostgreSQL handles the aggregation.
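Turning those two rows into a percentage is then trivial. The helper below is a minimal sketch (the shape of `PingGroup` mirrors what Prisma's `groupBy` returns with `_count: true`; the function name is hypothetical):

```typescript
// Shape of one row returned by the groupBy above.
type PingGroup = { isOnline: boolean; _count: number };

// Hypothetical helper: convert the grouped counts into an uptime percentage.
function uptimeFromGroups(groups: PingGroup[]): number {
  const online = groups.find((g) => g.isOnline)?._count ?? 0;
  const offline = groups.find((g) => !g.isOnline)?._count ?? 0;
  const total = online + offline;
  // No pings recorded yet → report 0 rather than divide by zero.
  return total === 0 ? 0 : (online / total) * 100;
}
```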
The Voting System: Preventing Abuse
Votes determine server ranking, so they're a target for manipulation:
```prisma
model Vote {
  id                String   @id @default(cuid())
  serverId          String
  ipHash            String   // SHA-256 of IP — no raw IPs stored
  minecraftUsername String?
  fingerprint       String?
  votedAt           DateTime @default(now())

  @@unique([serverId, ipHash, votedAt])
  @@index([serverId, votedAt])
}
```
Key design decisions:
- IP hashing — We never store raw IP addresses; a SHA-256 hash is enough for deduplication. (Salting the hash is wise, since the IPv4 space is small enough to brute-force an unsalted one.)
- Composite unique constraint — One vote per IP per server per day. For the constraint to enforce a daily limit, votedAt must be written truncated to the day; a full timestamp would make every row unique.
- Fingerprinting — A browser fingerprint catches multi-account voting.
- Votifier integration — Servers can verify votes and reward players.
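The write-time hygiene described above can be sketched in a few lines. This is an illustrative sketch, not the production code: the helper names (`hashIp`, `voteDay`) and the salt-prefix scheme are assumptions, built on Node's built-in `crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hash an IP with a server-side salt before it ever
// touches the database, so no raw IP is stored.
function hashIp(ip: string, salt: string): string {
  return createHash("sha256").update(`${salt}:${ip}`).digest("hex");
}

// Hypothetical helper: bucket a timestamp to its UTC day, so the
// (serverId, ipHash, day) unique constraint means "one vote per day".
function voteDay(at: Date): string {
  return at.toISOString().slice(0, 10); // e.g. "2024-05-01"
}
```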
Aggregated Statistics
The statistics page shows ecosystem-wide data. We pre-aggregate this instead of computing it live:
```typescript
const [
  totalServers,
  onlineServers,
  totalPlayers,
  gamemodeDistribution,
  versionDistribution,
] = await Promise.all([
  prisma.server.count({ where: { status: "ACTIVE" } }),
  prisma.server.count({ where: { status: "ACTIVE", isOnline: true } }),
  prisma.server.aggregate({
    where: { status: "ACTIVE" },
    _sum: { playersOnline: true },
  }),
  prisma.serverTag.groupBy({ by: ["tagId"], _count: true }),
  prisma.server.groupBy({
    by: ["version"],
    _count: true,
    where: { status: "ACTIVE" },
  }),
]);
```
With Redis caching at a 5-minute TTL, the statistics page loads in under 100ms.
Lessons Learned
1. Composite indexes are not optional at scale. Our most important optimization was adding `@@index([status, votesThisMonth])`. Before that, the homepage query took 800ms. After: 12ms.
2. Two tables > one table with a type field. The Server/SeedServer split keeps queries clean and prevents accidental mixing of verified and unverified data.
3. Aggregate, don't scan. Using groupBy for uptime calculation instead of loading all ping records reduced memory usage by 100x.
4. Hash sensitive data at write time. Storing IP hashes instead of raw IPs means a database breach doesn't expose user locations.
5. Pre-compute expensive stats. The statistics page runs 18 queries, but they're all served from cache 99% of the time.
The Numbers
The full schema has 70+ models covering servers, users, analytics, payments, and moderation. At peak:
- 5000+ servers indexed
- 170,000+ players tracked
- ~1.4M ping records/day
- < 200ms p95 response time
- PostgreSQL handling it all on a single instance
You can see the live data on the Minecraft ServerHub statistics dashboard.
Building something similar? The key is starting with the right indexes. Profile your top 5 queries, add composite indexes for each, and cache aggressively. Your database will thank you.