MongoDB Operations Toolkit
Everything you need to run MongoDB in production with confidence. This toolkit covers schema design patterns that prevent document bloat, aggregation pipelines for real analytics, sharding strategies with step-by-step deployment, automated backup and restore scripts, and monitoring dashboards that surface problems before your users notice. Built for MongoDB 6.0+ and tested on both Atlas and self-hosted deployments.
Key Features
- 6 schema design patterns — embedded, referenced, bucket, computed, subset, and extended reference with sizing guidelines
- 15 aggregation pipeline templates for reporting, time-series analysis, graph lookups, and windowed computations
- Sharding playbook covering shard key selection, chunk balancing, zone-based sharding for geo-distributed deployments
- Automated backup scripts using mongodump, mongorestore, and Atlas continuous backup with point-in-time recovery
- Monitoring dashboard configs for Prometheus + Grafana with 12 panels covering oplog lag, connection pools, and WiredTiger cache
- Index analysis queries that identify unused indexes, missing indexes, and index intersection opportunities
- Connection pooling tuning for Mongoose, PyMongo, and the native driver with recommended settings per workload type
- Migration helpers for resharding live collections and rolling index builds without downtime
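As a taste of the pooling guidance, here is a minimal sketch of per-workload option sets for the official Node.js driver. The specific numbers are illustrative starting points, not the toolkit's exact recommendations; tune them against your own connection metrics.

```javascript
// Options objects you would pass to `new MongoClient(uri, options)`
// with the official Node.js driver. Values are illustrative defaults.
const webWorkloadOptions = {
  maxPoolSize: 100,         // cap concurrent sockets for bursty web traffic
  minPoolSize: 10,          // keep warm connections to avoid handshake latency
  maxIdleTimeMS: 60000,     // recycle sockets idle for more than a minute
  waitQueueTimeoutMS: 5000  // fail fast instead of queueing forever
};

const batchWorkloadOptions = {
  maxPoolSize: 10,          // few long-running operations, small pool
  minPoolSize: 1,
  socketTimeoutMS: 0        // batch jobs may hold a cursor for a long time
};
```

All four option names (`maxPoolSize`, `minPoolSize`, `maxIdleTimeMS`, `waitQueueTimeoutMS`, `socketTimeoutMS`) are standard driver connection options, so the same shapes work when set as connection-string parameters.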
Quick Start
```bash
unzip mongodb-operations-toolkit.zip
cd mongodb-operations-toolkit/

# Connect to your MongoDB instance
mongosh "mongodb://admin:YOUR_PASSWORD_HERE@localhost:27017/admin"
```

Then, inside the mongosh session:

```javascript
// Run the diagnostic health check
// Output: collection sizes, index stats, replication status
load("src/diagnostics/health_check.js")

// Run the index analysis
load("src/indexes/unused_index_finder.js")
```
Quick index analysis to find waste:
```javascript
// Find indexes that haven't been used since the last server restart.
// Note: $indexStats is per-node — check every replica set member
// before dropping anything.
db.getCollectionNames().forEach(function (coll) {
  var stats = db.getCollection(coll).aggregate([{ $indexStats: {} }]).toArray();
  stats.forEach(function (idx) {
    // accesses.ops is a Long, so compare numerically rather than with ===
    if (idx.accesses.ops == 0 && idx.name !== "_id_") {
      print("UNUSED: " + coll + "." + idx.name +
            " | size: " + db.getCollection(coll).stats().indexSizes[idx.name]);
    }
  });
});
```
Architecture / How It Works
```
mongodb-operations-toolkit/
├── src/
│   ├── schema_patterns/
│   │   ├── embedded_pattern.js     # One-to-few relationships
│   │   ├── bucket_pattern.js       # Time-series bucketing
│   │   ├── computed_pattern.js     # Pre-computed aggregations
│   │   └── subset_pattern.js       # Hot/cold data separation
│   ├── aggregation/
│   │   ├── reporting_pipelines.js  # Revenue, user activity, funnels
│   │   ├── time_series.js          # Windowed aggregations
│   │   └── graph_lookups.js        # Recursive $graphLookup examples
│   ├── sharding/
│   │   ├── shard_key_analysis.js   # Evaluate candidate shard keys
│   │   ├── setup_sharded_cluster.sh
│   │   └── zone_sharding.js        # Geo-based zone configuration
│   ├── backup/
│   │   ├── backup.sh               # Automated mongodump with rotation
│   │   ├── restore.sh              # Point-in-time restore procedure
│   │   └── verify_backup.sh        # Backup integrity validation
│   ├── indexes/
│   │   └── unused_index_finder.js  # Find and report unused indexes
│   └── diagnostics/
│       └── health_check.js         # Comprehensive server diagnostics
├── examples/
│   ├── ecommerce_schema.js
│   └── iot_time_series.js
└── config.example.yaml
```
Usage Examples
Bucket pattern for time-series IoT data:
```javascript
// Instead of one document per reading (millions of tiny docs),
// bucket into hourly documents (far fewer, larger docs)
db.sensor_readings.insertOne({
  sensor_id: "sensor-42",
  bucket_start: ISODate("2026-03-23T14:00:00Z"),
  bucket_end: ISODate("2026-03-23T15:00:00Z"),
  count: 60,
  readings: [
    { ts: ISODate("2026-03-23T14:00:00Z"), temp: 22.4, humidity: 45 },
    { ts: ISODate("2026-03-23T14:01:00Z"), temp: 22.5, humidity: 44 }
    // ... up to 60 readings per bucket
  ],
  summary: { avg_temp: 22.45, min_temp: 21.8, max_temp: 23.1 }
});

// Index on sensor_id + bucket_start for efficient range queries
db.sensor_readings.createIndex(
  { sensor_id: 1, bucket_start: 1 },
  { name: "idx_sensor_time_range" }
);
```
Aggregation pipeline for monthly revenue report:
```javascript
db.orders.aggregate([
  { $match: {
      status: "completed",
      created_at: { $gte: ISODate("2026-01-01"), $lt: ISODate("2026-04-01") }
  }},
  { $group: {
      _id: { $dateToString: { format: "%Y-%m", date: "$created_at" } },
      total_revenue: { $sum: "$total" },
      order_count: { $sum: 1 },
      avg_order_value: { $avg: "$total" }
  }},
  { $sort: { _id: 1 } },
  { $project: {
      _id: 0,                      // suppress _id so only `month` carries the key
      month: "$_id",
      total_revenue: { $round: ["$total_revenue", 2] },
      order_count: 1,
      avg_order_value: { $round: ["$avg_order_value", 2] }
  }}
]);
```
Automated backup script with retention:
```bash
#!/bin/bash
# backup.sh — run via cron: 0 2 * * * /opt/mongodb/backup.sh
set -euo pipefail

BACKUP_DIR="/backups/mongodb/$(date +%Y%m%d_%H%M%S)"
RETENTION_DAYS=30
MONGO_URI="mongodb://backup_user:YOUR_PASSWORD_HERE@localhost:27017"

mkdir -p "$BACKUP_DIR"
mongodump --uri="$MONGO_URI" --out="$BACKUP_DIR" --gzip --oplog

# Verify backup integrity — the exit status is more reliable than
# grepping mongorestore's log output
if mongorestore --uri="$MONGO_URI" --dir="$BACKUP_DIR" --dryRun --gzip; then
  echo "Backup verified"
else
  echo "BACKUP VERIFICATION FAILED" >&2
fi

# Rotate old backups
find /backups/mongodb -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} +
```
Configuration
```yaml
# config.example.yaml
mongodb:
  uri: "mongodb://admin:YOUR_PASSWORD_HERE@localhost:27017/admin"
  database: myapp
  auth_source: admin

backup:
  schedule: "0 2 * * *"          # daily at 2 AM
  retention_days: 30
  compression: gzip
  include_oplog: true            # required for point-in-time recovery
  verify_after_backup: true

sharding:
  enabled: false
  config_servers: 3              # always use 3 for production
  shards: 2
  default_chunk_size_mb: 128

monitoring:
  exporter_port: 9216
  scrape_interval: 15s
  alert_on_repl_lag_seconds: 10
  alert_on_connections_percent: 80
```
Best Practices
- Choose shard keys based on query patterns, not data distribution alone. A monotonically increasing shard key creates a "hot shard" that handles all new writes.
- Cap embedded arrays at 500 elements. Beyond that, use the bucket pattern or move to a referenced design to avoid document growth limits.
- Since MongoDB 4.2, index builds no longer block reads and writes the way old foreground builds did, and the `{background: true}` option is ignored. On busy production replica sets, use a rolling index build (one member at a time) to keep impact minimal.
- Always include `--oplog` in `mongodump` for replica sets. Without it, you cannot do point-in-time recovery.
- Monitor WiredTiger cache usage. If the cache dirty percentage stays above 20%, your write workload is exceeding disk flush capacity.
- Use read preference `secondaryPreferred` for reporting queries to reduce load on the primary node.
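The `secondaryPreferred` advice is usually applied per connection rather than per query, so reporting jobs need nothing beyond their own URI. A small sketch; hosts, credentials, and the replica set name are placeholders:

```javascript
// Read preference can be set in the connection string itself.
// Host names and credentials here are placeholders.
const reportingUri =
  "mongodb://reporting_user:YOUR_PASSWORD_HERE@replica-a:27017," +
  "replica-b:27017,replica-c:27017/myapp" +
  "?replicaSet=rs0&readPreference=secondaryPreferred";

// ...or per operation in mongosh:
// db.orders.find().readPref("secondaryPreferred")
```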
Troubleshooting
| Problem | Cause | Fix |
|---|---|---|
| Slow queries despite correct indexes | Index not fitting in RAM | Check `db.collection.stats().indexSizes` and ensure total index size < WiredTiger cache |
| Replication lag increasing | Large write batches or slow secondary | Check oplog window with `rs.printReplicationInfo()` and resize oplog if under 24h |
| mongodump fails mid-backup | Insufficient disk space or auth error | Verify free space with `df -h` and ensure backup user has the `backup` role |
| Document exceeds 16MB limit | Unbounded embedded array growth | Migrate to bucket pattern or referenced design; add app-level size guard |
This is 1 of 9 resources in the Database Admin Pro toolkit. Get the complete [MongoDB Operations Toolkit] with all files, templates, and documentation for $39.
Or grab the entire Database Admin Pro bundle (9 products) for $109 — save 30%.