DEV Community

Thesius Code

Posted on • Originally published at datanest-stores.pages.dev

MongoDB Operations Toolkit

Everything you need to run MongoDB in production with confidence. This toolkit covers schema design patterns that prevent document bloat, aggregation pipelines for real analytics, sharding strategies with step-by-step deployment, automated backup and restore scripts, and monitoring dashboards that surface problems before your users notice. Built for MongoDB 6.0+ and tested on both Atlas and self-hosted deployments.

Key Features

  • 6 schema design patterns — embedded, referenced, bucket, computed, subset, and extended reference with sizing guidelines
  • 15 aggregation pipeline templates for reporting, time-series analysis, graph lookups, and windowed computations
  • Sharding playbook covering shard key selection, chunk balancing, zone-based sharding for geo-distributed deployments
  • Automated backup scripts using mongodump, mongorestore, and Atlas continuous backup with point-in-time recovery
  • Monitoring dashboard configs for Prometheus + Grafana with 12 panels covering oplog lag, connection pools, and WiredTiger cache
  • Index analysis queries that identify unused indexes, missing indexes, and index intersection opportunities
  • Connection pooling tuning for Mongoose, PyMongo, and the native driver with recommended settings per workload type
  • Migration helpers for resharding live collections and rolling index builds without downtime
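As a taste of the pooling guidance, here is a minimal sketch of pool settings for a request/response web workload, expressed with the native Node.js driver's option names (also accepted by Mongoose). The option names are real driver options; the specific values are illustrative assumptions, not the toolkit's exact per-workload recommendations:

```javascript
// Pool settings for a typical web workload. The numbers are illustrative
// starting points, not hard rules -- tune against your own connection metrics.
const webWorkloadPoolOptions = {
  maxPoolSize: 50,                  // cap concurrent sockets per client
  minPoolSize: 5,                   // keep a few warm connections for bursts
  maxIdleTimeMS: 60000,             // close sockets idle longer than a minute
  waitQueueTimeoutMS: 5000,         // fail fast instead of queueing forever
  serverSelectionTimeoutMS: 10000,  // bound retries when the cluster is down
};

// Usage: new MongoClient(uri, webWorkloadPoolOptions)
```

Batch/ETL workloads generally want a higher `maxPoolSize` and longer wait-queue timeout than this; the point is that the pool is sized per workload, not per server.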

Quick Start

unzip mongodb-operations-toolkit.zip
cd mongodb-operations-toolkit/

# Connect to your MongoDB instance
mongosh "mongodb://admin:YOUR_PASSWORD_HERE@localhost:27017/admin"

# Run the diagnostic health check
load("src/diagnostics/health_check.js")
# Output: collection sizes, index stats, replication status

# Run the index analysis
load("src/indexes/unused_index_finder.js")

Quick index analysis to find waste:

// Find indexes that haven't been used since the last server restart
db.getCollectionNames().forEach(function (coll) {
  var stats = db.getCollection(coll).aggregate([{ $indexStats: {} }]).toArray();
  var indexSizes = db.getCollection(coll).stats().indexSizes;
  stats.forEach(function (idx) {
    if (idx.accesses.ops.valueOf() === 0 && idx.name !== "_id_") {
      print("UNUSED: " + coll + "." + idx.name +
            " | size: " + indexSizes[idx.name] + " bytes");
    }
  });
});

Architecture / How It Works

mongodb-operations-toolkit/
├── src/
│   ├── schema_patterns/
│   │   ├── embedded_pattern.js       # One-to-few relationships
│   │   ├── bucket_pattern.js         # Time-series bucketing
│   │   ├── computed_pattern.js       # Pre-computed aggregations
│   │   └── subset_pattern.js         # Hot/cold data separation
│   ├── aggregation/
│   │   ├── reporting_pipelines.js    # Revenue, user activity, funnels
│   │   ├── time_series.js            # Windowed aggregations
│   │   └── graph_lookups.js          # Recursive $graphLookup examples
│   ├── sharding/
│   │   ├── shard_key_analysis.js     # Evaluate candidate shard keys
│   │   ├── setup_sharded_cluster.sh
│   │   └── zone_sharding.js          # Geo-based zone configuration
│   ├── backup/
│   │   ├── backup.sh                 # Automated mongodump with rotation
│   │   ├── restore.sh                # Point-in-time restore procedure
│   │   └── verify_backup.sh          # Backup integrity validation
│   ├── indexes/
│   │   └── unused_index_finder.js    # Find and report unused indexes
│   └── diagnostics/
│       └── health_check.js           # Comprehensive server diagnostics
├── examples/
│   ├── ecommerce_schema.js
│   └── iot_time_series.js
└── config.example.yaml

Usage Examples

Bucket pattern for time-series IoT data:

// Instead of one document per reading (millions of tiny docs),
// bucket into hourly documents (much fewer, larger docs)
db.sensor_readings.insertOne({
  sensor_id: "sensor-42",
  bucket_start: ISODate("2026-03-23T14:00:00Z"),
  bucket_end: ISODate("2026-03-23T15:00:00Z"),
  count: 60,
  readings: [
    { ts: ISODate("2026-03-23T14:00:00Z"), temp: 22.4, humidity: 45 },
    { ts: ISODate("2026-03-23T14:01:00Z"), temp: 22.5, humidity: 44 }
    // ... up to 60 readings per bucket
  ],
  summary: { avg_temp: 22.45, min_temp: 21.8, max_temp: 23.1 }
});

// Index on sensor_id + bucket_start for efficient range queries
db.sensor_readings.createIndex(
  { sensor_id: 1, bucket_start: 1 },
  { name: "idx_sensor_time_range" }
);
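In practice buckets are built incrementally: each new reading is appended with a single upsert that pushes the reading, bumps the count, and maintains running min/max. Here is a sketch of that update document, written as plain JavaScript so the shapes are easy to test (mongosh would use ISODate instead of Date). The 60-reading cap follows the example above; keeping a running temp_sum and deriving the average at read time is my assumption, since $avg cannot be maintained by an update operator:

```javascript
// Build the filter + update for appending one reading to an hourly bucket.
// A full bucket (count >= 60) fails the filter, so the upsert starts a new one.
function buildBucketUpsert(sensorId, reading) {
  const hour = new Date(reading.ts);
  hour.setUTCMinutes(0, 0, 0);               // truncate to the hour boundary
  return {
    filter: {
      sensor_id: sensorId,
      bucket_start: hour,
      count: { $lt: 60 },                    // per-bucket cap (assumption)
    },
    update: {
      $push: { readings: reading },
      $inc: { count: 1, "summary.temp_sum": reading.temp }, // avg = temp_sum / count
      $min: { "summary.min_temp": reading.temp },
      $max: { "summary.max_temp": reading.temp },
      $setOnInsert: { bucket_end: new Date(hour.getTime() + 3600000) },
    },
    options: { upsert: true },
  };
}

// Usage (mongosh):
//   const u = buildBucketUpsert("sensor-42", { ts: new Date(), temp: 22.4, humidity: 45 });
//   db.sensor_readings.updateOne(u.filter, u.update, u.options);
```

Because the filter includes the capped count, a full bucket simply stops matching and the upsert creates the next one, so the write path never needs a separate "is the bucket full?" read.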

Aggregation pipeline for monthly revenue report:

db.orders.aggregate([
  { $match: {
    status: "completed",
    created_at: { $gte: ISODate("2026-01-01"), $lt: ISODate("2026-04-01") }
  }},
  { $group: {
    _id: { $dateToString: { format: "%Y-%m", date: "$created_at" } },
    total_revenue: { $sum: "$total" },
    order_count: { $sum: 1 },
    avg_order_value: { $avg: "$total" }
  }},
  { $sort: { _id: 1 } },
  { $project: {
    month: "$_id",
    total_revenue: { $round: ["$total_revenue", 2] },
    order_count: 1,
    avg_order_value: { $round: ["$avg_order_value", 2] }
  }}
]);
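The same pipeline can feed the computed pattern: append a $merge stage to materialize the report into a summary collection that dashboards read cheaply instead of re-aggregating orders on every page load. A sketch, defined as a plain array so its shape is testable; the monthly_revenue collection name is an assumption:

```javascript
// Materialize the monthly report. Stages mirror the pipeline above;
// $merge upserts one document per month, so reruns are idempotent.
const monthlyRevenuePipeline = [
  { $match: { status: "completed" } },
  { $group: {
      _id: { $dateToString: { format: "%Y-%m", date: "$created_at" } },
      total_revenue: { $sum: "$total" },
      order_count: { $sum: 1 },
  }},
  { $merge: {
      into: "monthly_revenue",          // assumed target collection
      on: "_id",                        // one document per month
      whenMatched: "replace",           // reruns overwrite stale totals
      whenNotMatched: "insert",
  }},
];

// Usage (mongosh): db.orders.aggregate(monthlyRevenuePipeline);
```

Run it on a schedule (or after each batch import) and the reporting query collapses to a plain find() on monthly_revenue.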

Automated backup script with retention:

#!/bin/bash
# backup.sh — run via cron: 0 2 * * * /opt/mongodb/backup.sh
BACKUP_DIR="/backups/mongodb/$(date +%Y%m%d_%H%M%S)"
RETENTION_DAYS=30
MONGO_URI="mongodb://backup_user:YOUR_PASSWORD_HERE@localhost:27017"

mkdir -p "$BACKUP_DIR"
mongodump --uri="$MONGO_URI" --out="$BACKUP_DIR" --gzip --oplog

# Verify backup integrity (check mongorestore's exit status rather than
# grepping log text, which varies across mongorestore versions)
if mongorestore --uri="$MONGO_URI" --dir="$BACKUP_DIR" --dryRun --gzip >/dev/null 2>&1; then
  echo "Backup verified"
else
  echo "BACKUP VERIFICATION FAILED" >&2
fi

# Rotate old backups
find /backups/mongodb -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} +

Configuration

# config.example.yaml
mongodb:
  uri: "mongodb://admin:YOUR_PASSWORD_HERE@localhost:27017/admin"
  database: myapp
  auth_source: admin

backup:
  schedule: "0 2 * * *"        # daily at 2 AM
  retention_days: 30
  compression: gzip
  include_oplog: true          # required for point-in-time recovery
  verify_after_backup: true

sharding:
  enabled: false
  config_servers: 3            # always use 3 for production
  shards: 2
  default_chunk_size_mb: 128

monitoring:
  exporter_port: 9216
  scrape_interval: 15s
  alert_on_repl_lag_seconds: 10
  alert_on_connections_percent: 80

Best Practices

  1. Choose shard keys based on query patterns, not data distribution alone. A monotonically increasing shard key creates a "hot shard" that handles all new writes.
  2. Cap embedded arrays at 500 elements. Beyond that, use the bucket pattern or move to a referenced design to avoid document growth limits.
  3. Plan index builds on production replica sets. Since MongoDB 4.2 the background option is deprecated and ignored: all builds use an optimized process that holds an exclusive lock only briefly at the start and end. For very large collections, use a rolling index build across secondaries to minimize impact.
  4. Always include --oplog in mongodump for replica sets. Without it, you cannot do point-in-time recovery.
  5. Monitor WiredTiger cache usage. If cache dirty percentage stays above 20%, your write workload is exceeding disk flush capacity.
  6. Use read preference secondaryPreferred for reporting queries to reduce load on the primary node.
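Practice 5 can be automated with a small helper that computes cache pressure from db.serverStatus() output. The field names below are the real WiredTiger cache counters reported by serverStatus(); the 20% threshold comes from the practice above. A sketch:

```javascript
// Flag write pressure from db.serverStatus() (best practice 5).
// Field names are the WiredTiger cache counters serverStatus() reports.
function cachePressure(serverStatus) {
  const cache = serverStatus.wiredTiger.cache;
  const max = cache["maximum bytes configured"];
  const dirtyPct = 100 * cache["tracked dirty bytes in the cache"] / max;
  const usedPct = 100 * cache["bytes currently in the cache"] / max;
  return {
    dirtyPct,
    usedPct,
    alert: dirtyPct > 20,   // sustained >20% dirty = writes outpacing flushes
  };
}

// Usage (mongosh): cachePressure(db.serverStatus())
```

Wire the same ratio into the Grafana dashboards from the toolkit so the alert fires before latency does.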

Troubleshooting

| Problem | Cause | Fix |
| --- | --- | --- |
| Slow queries despite correct indexes | Index not fitting in RAM | Check db.collection.stats().indexSizes and ensure total index size < WiredTiger cache |
| Replication lag increasing | Large write batches or slow secondary | Check oplog window with rs.printReplicationInfo() and resize oplog if under 24h |
| mongodump fails mid-backup | Insufficient disk space or auth error | Verify free space with df -h and ensure backup user has the backup role |
| Document exceeds 16MB limit | Unbounded embedded array growth | Migrate to bucket pattern or referenced design; add app-level size guard |
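The replication-lag check above can be scripted: mongosh's db.getReplicationInfo() returns the oplog window directly (timeDiffHours), so a small guard can alert before the window drops below the 24-hour floor from the table. A sketch:

```javascript
// Warn when the oplog window drops below a floor (24h, per the table above).
// `replInfo` is the object returned by mongosh's db.getReplicationInfo().
function oplogWindowOk(replInfo, minHours = 24) {
  return {
    windowHours: replInfo.timeDiffHours,
    ok: replInfo.timeDiffHours >= minHours,
  };
}

// Usage (mongosh): oplogWindowOk(db.getReplicationInfo())
// If not ok, resize: db.adminCommand({ replSetResizeOplog: 1, size: <newSizeMB> })
```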

This is 1 of 9 resources in the Database Admin Pro toolkit. Get the complete MongoDB Operations Toolkit with all files, templates, and documentation for $39.

Get the Full Kit →

Or grab the entire Database Admin Pro bundle (9 products) for $109 — save 30%.

Get the Complete Bundle →

