Ahmed Rakan

Building a Production-Grade MongoDB Cluster on Kubernetes: A Complete Guide to Horizontal Scalability

Introduction

Distributed systems expertise remains one of the most sought-after skills in software engineering. Engineers who can design, implement, and scale distributed databases command premium compensation for good reason—these systems form the backbone of modern applications serving millions of users.

In this comprehensive guide, we'll build a highly available, horizontally scalable MongoDB cluster using Kubernetes. You'll learn how to create a production-ready database infrastructure that can grow from a single node to hundreds of nodes, scaling seamlessly to meet demanding workloads.

Technology Stack

Our infrastructure leverages three powerful open-source technologies:

  • MongoDB: A distributed NoSQL database designed for horizontal scalability and high availability
  • MicroK8s: An ultra-lightweight Kubernetes distribution from Canonical (the creators of Ubuntu), optimized for both development and production environments
  • OpenEBS: A cloud-native distributed storage solution for Kubernetes that provides persistent volume management

This combination enables true horizontal scalability—you can expand your cluster's capacity by adding more nodes rather than being limited by vertical scaling constraints.

Prerequisites and Initial Setup

Installing MicroK8s

First, set up your MicroK8s cluster. The installation process is straightforward and well-documented:

MicroK8s Getting Started Guide: https://microk8s.io/docs/getting-started

Follow the official documentation to install MicroK8s on your nodes. Once complete, verify your installation:

microk8s status --wait-ready

Enabling OpenEBS Storage

OpenEBS integration with MicroK8s is remarkably simple, requiring just two commands:

microk8s enable community
microk8s enable openebs

These commands enable the community addon repository and install OpenEBS components into your cluster.
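Before moving on, you can confirm the OpenEBS components came up (pod names vary between OpenEBS releases):

microk8s kubectl get pods -n openebs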

Configuring Distributed Storage

Installing iSCSI on Every Node

OpenEBS relies on iSCSI (Internet Small Computer Systems Interface) for distributed block storage. This protocol enables nodes to access block-level storage over TCP/IP networks, which is essential for our distributed architecture.

Critical: Install iSCSI on every node in your cluster:

sudo apt update
sudo apt install open-iscsi
sudo systemctl enable open-iscsi
sudo systemctl enable iscsid
sudo systemctl start iscsid

Verify that the iSCSI daemon is running:

systemctl status iscsid

You should see output similar to:

● iscsid.service - iSCSI initiator daemon (iscsid)
     Loaded: loaded (/lib/systemd/system/iscsid.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2025-11-18 20:05:03 +03; 1 week 1 day ago
TriggeredBy: ● iscsid.socket
       Docs: man:iscsid(8)
   Main PID: 9887 (iscsid)
      Tasks: 2 (limit: 9298)
     Memory: 5.4M
        CPU: 9.154s
     CGroup: /system.slice/iscsid.service
             ├─9886 /sbin/iscsid
             └─9887 /sbin/iscsid

Verifying Storage Classes

Check that OpenEBS storage classes are available in your cluster:

kubectl get storageclass

Expected output includes multiple OpenEBS storage classes:

NAME                          PROVISIONER            RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
microk8s-hostpath (default)   microk8s.io/hostpath   Delete          WaitForFirstConsumer   false                  256d
openebs-device                openebs.io/local       Delete          WaitForFirstConsumer   false                  7d20h
openebs-hostpath              openebs.io/local       Delete          WaitForFirstConsumer   false                  7d20h
openebs-jiva                  jiva.csi.openebs.io    Delete          Immediate              true                   44m
openebs-jiva-csi-default      jiva.csi.openebs.io    Delete          Immediate              true                   7d20h

Configuring High Availability with Replica Count

For production deployments, configure the replication factor for your storage. This ensures data redundancy and high availability:

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-jiva
provisioner: jiva.csi.openebs.io
parameters:
  replicaCount: "3"  # Use 3 for production, 2 minimum for HA
  policy: openebs-policy-default
allowVolumeExpansion: true
EOF

Replication guidelines:

  • 3 replicas: Recommended for production (tolerates 1 node failure)
  • 2 replicas: Minimum for high availability
  • Ensure your cluster has at least as many nodes as your replica count
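Once applied, confirm the class registered with your replica settings:

kubectl describe storageclass openebs-jiva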

Deploying MongoDB

Creating the Namespace and Service Account

First, create a dedicated namespace for your databases:

kubectl create namespace databases

Now create a service account with appropriate permissions. MongoDB's sidecar container needs to discover other pods in the replica set:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: mongo
  namespace: databases

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: read-pod-service-endpoint
rules:
- apiGroups: [""]
  resources: ["pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:serviceaccount:databases:mongo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: read-pod-service-endpoint
subjects:
- kind: ServiceAccount
  name: mongo
  namespace: databases

Apply the configuration:

kubectl apply -f service-account.yaml

Creating Credentials Secret

Before deploying MongoDB, create a secret for authentication:

kubectl create secret generic mongo-secret \
  --from-literal=mongo-user=admin \
  --from-literal=mongo-password='YourSecurePassword123!' \
  -n databases

Security note: In production, use a secrets management solution like HashiCorp Vault or sealed-secrets instead of plain Kubernetes secrets.

Deploying the StatefulSet

StatefulSets are designed for stateful applications like databases. Unlike Deployments, they provide:

  • Stable, unique network identifiers
  • Stable, persistent storage
  • Ordered, graceful deployment and scaling

Here's the complete MongoDB StatefulSet configuration:

apiVersion: v1
kind: Service
metadata:
  name: mongo
  namespace: databases
  labels:
    name: mongo
spec:
  selector:
    app: mongo
  ports:
  - protocol: TCP
    port: 27017
    targetPort: 27017
  clusterIP: None  # Headless service for StatefulSet

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
  namespace: databases
  labels:
    app: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
        role: mongo
        environment: production
    spec:
      serviceAccountName: mongo
      automountServiceAccountToken: true
      terminationGracePeriodSeconds: 30

      containers:
      - name: mongodb
        image: mongo:5.0
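        # Note: overriding `command` below bypasses the image's entrypoint script,
        # so the MONGO_INITDB_ROOT_* variables further down are not applied
        # automatically and access control is not enabled. A hardened replica set
        # would also need a shared keyFile (--keyFile) for internal member auth.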
        command:
        - mongod
        - "--replSet=rs0"
        - "--bind_ip=0.0.0.0"
        ports:
        - name: mongodb
          containerPort: 27017
          protocol: TCP

        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "2Gi"
            cpu: "2"

        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongo-secret
              key: mongo-user
        - name: MONGO_INITDB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongo-secret
              key: mongo-password

        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db

        livenessProbe:
          exec:
            command:
            - mongosh
            - --eval
            - "db.adminCommand('ping')"
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5

        readinessProbe:
          exec:
            command:
            - mongosh
            - --eval
            - "db.adminCommand('ping')"
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3

      - name: mongo-sidecar
        image: morphy/k8s-mongo-sidecar
        env:
        - name: KUBERNETES_POD_LABELS
          value: "app=mongo,role=mongo"
        - name: KUBERNETES_SERVICE_NAME
          value: "mongo"
        - name: KUBERNETES_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: MONGODB_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongo-secret
              key: mongo-user
        - name: MONGODB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mongo-secret
              key: mongo-password

  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "openebs-jiva"  # the class configured earlier with replicaCount: 3
      resources:
        requests:
          storage: 50Gi

Key configuration details:

  • Headless Service (clusterIP: None): Provides stable DNS entries for each pod (mongo-0, mongo-1, mongo-2)
  • Volume Claim Templates: Each pod gets its own 50GB persistent volume
  • Health Probes: Liveness and readiness probes ensure pod health
  • Sidecar Container: Automatically manages replica set configuration
  • Resource Limits: Prevents resource exhaustion and enables proper scheduling
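
For applications, connect through the replica set rather than a single pod: list all members and the replica set name in the URI so the driver discovers the primary and fails over automatically. A typical connection string, assuming the credentials from mongo-secret and a user stored in the admin database:

mongodb://admin:YourSecurePassword123!@mongo-0.mongo.databases.svc.cluster.local:27017,mongo-1.mongo.databases.svc.cluster.local:27017,mongo-2.mongo.databases.svc.cluster.local:27017/?replicaSet=rs0&authSource=admin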

Deploy the StatefulSet:

kubectl apply -f statefulset.yaml

Monitor the deployment:

kubectl get pods -n databases -w

Wait for all pods to reach the Running state.
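
Expected output once the rollout finishes (ages will differ; READY 2/2 means both the mongodb container and the sidecar are up):

NAME      READY   STATUS    RESTARTS   AGE
mongo-0   2/2     Running   0          3m
mongo-1   2/2     Running   0          2m
mongo-2   2/2     Running   0          1m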

Initializing the Replica Set

Once all pods are running, initialize the MongoDB replica set. Connect to the first pod:

kubectl exec -it mongo-0 -n databases -- mongosh -u admin -p

Enter your password when prompted, then initialize the replica set:

rs.initiate({
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "mongo-0.mongo.databases.svc.cluster.local:27017" },
    { _id: 1, host: "mongo-1.mongo.databases.svc.cluster.local:27017" },
    { _id: 2, host: "mongo-2.mongo.databases.svc.cluster.local:27017" }
  ]
})

The DNS names follow the pattern: <pod-name>.<service-name>.<namespace>.svc.cluster.local:27017
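
You can confirm these records resolve from inside the cluster with a short-lived pod (busybox is just a convenient image that ships nslookup):

kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -n databases -- nslookup mongo-0.mongo.databases.svc.cluster.local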

Verifying Replica Set Status

Check the replica set configuration:

rs.status()

Look for these key indicators of a healthy cluster:

  • "ok": 1 at the end of the output
  • One PRIMARY member
  • Two SECONDARY members
  • All members showing "health": 1

Example output:

{
  set: 'rs0',
  members: [
    {
      _id: 0,
      name: 'mongo-0.mongo.databases.svc.cluster.local:27017',
      health: 1,
      state: 1,
      stateStr: 'PRIMARY',
      ...
    },
    {
      _id: 1,
      name: 'mongo-1.mongo.databases.svc.cluster.local:27017',
      health: 1,
      state: 2,
      stateStr: 'SECONDARY',
      ...
    },
    ...
  ],
  ok: 1
}

Testing and Validation

Testing Data Persistence

Let's verify that data persists correctly across the cluster. Insert a test document through the current primary (mongo-1 in this walkthrough; writes are rejected on secondaries):

kubectl exec -it -n databases mongo-1 -- mongosh -u admin -p --eval '
db = db.getSiblingDB("testdb");
db.testcollection.insertOne({
  name: "test-record",
  timestamp: new Date(),
  message: "Testing OpenEBS Jiva storage",
  node: "mongo-1"
});
db.testcollection.find().pretty();
'

Expected output:

[
  {
    _id: ObjectId('6926f6c7d79787542f544ca7'),
    name: 'test-record',
    timestamp: ISODate('2025-11-26T12:47:03.263Z'),
    message: 'Testing OpenEBS Jiva storage',
    node: 'mongo-1'
  }
]

Now verify the data is replicated by querying a different pod. Because mongo-0 may be a secondary, enable secondary reads first or the query will be rejected:

kubectl exec -it -n databases mongo-0 -- mongosh -u admin -p --eval '
db.getMongo().setReadPref("secondaryPreferred");
db = db.getSiblingDB("testdb");
db.testcollection.find().pretty();
'

You should see the same document, confirming replication is working.
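
If you want the insert itself to guarantee replication instead of checking after the fact, pass an explicit write concern from the mongosh prompt on the primary. This variant of the earlier insert blocks until a majority of members acknowledge the write (values are illustrative):

db.getSiblingDB("testdb").testcollection.insertOne(
  { name: "majority-write-test", timestamp: new Date() },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
);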

Testing High Availability

The true test of high availability is failover. Let's simulate a node failure by deleting the primary pod.

First, identify the primary:

kubectl exec -n databases mongo-0 -- mongosh -u admin -p 'YourSecurePassword123!' --quiet --eval "rs.isMaster().primary"

Output example:

mongo-1.mongo.databases.svc.cluster.local:27017

Now delete the primary pod:

kubectl delete pod mongo-1 -n databases

The replica set should automatically elect a new primary. Check the new primary:

kubectl exec -n databases mongo-0 -- mongosh -u admin -p 'YourSecurePassword123!' --quiet --eval "rs.isMaster().primary"

Output:

mongo-0.mongo.databases.svc.cluster.local:27017
Enter fullscreen mode Exit fullscreen mode

What just happened?

  1. The primary pod was deleted
  2. The remaining members detected the failure within seconds
  3. An automatic election occurred
  4. A new primary was elected
  5. Kubernetes recreated the deleted pod
  6. The recreated pod rejoined as a secondary

This demonstrates true high availability—your application experiences minimal disruption during node failures.

Testing Read Operations During Failover

For a more realistic test, run continuous read operations while deleting a pod:

# In terminal 1, start continuous reads (password inlined for brevity;
# in practice read it from the secret)
while true; do
  kubectl exec -n databases mongo-0 -- mongosh -u admin -p 'YourSecurePassword123!' --quiet --eval '
    db.getMongo().setReadPref("secondaryPreferred");
    db.getSiblingDB("testdb").testcollection.findOne()
  ' 2>/dev/null && echo "✓ Read successful" || echo "✗ Read failed"
  sleep 1
done

# In terminal 2, delete the primary
kubectl delete pod mongo-1 -n databases

You'll notice only a brief interruption (typically 5-10 seconds) during the election process.

Scaling Your Cluster

Horizontal Scaling

To scale your MongoDB cluster, simply increase the replica count:

kubectl scale statefulset mongo --replicas=5 -n databases

After the new pods are running, add them to the replica set (the sidecar may have added them already; check rs.status() first):

kubectl exec -it mongo-0 -n databases -- mongosh -u admin -p

Then, from the mongosh prompt:

rs.add("mongo-3.mongo.databases.svc.cluster.local:27017")
rs.add("mongo-4.mongo.databases.svc.cluster.local:27017")
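
Two details worth keeping in mind as you scale. MongoDB recommends an odd number of voting members to avoid tied elections, so prefer jumping from 3 to 5 rather than 4. And a PodDisruptionBudget protects quorum during voluntary disruptions such as node drains; a minimal sketch using the labels from the StatefulSet above:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mongo-pdb
  namespace: databases
spec:
  minAvailable: 2  # preserves a majority of a 3-member set; raise to 3 for five members
  selector:
    matchLabels:
      app: mongo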

Vertical Scaling

To increase resources for existing pods, update the StatefulSet:

resources:
  requests:
    memory: "2Gi"
    cpu: "2"
  limits:
    memory: "4Gi"
    cpu: "4"

Apply the changes and perform a rolling update:

kubectl apply -f statefulset.yaml
kubectl rollout status statefulset/mongo -n databases

Storage Expansion

OpenEBS Jiva supports volume expansion. To increase storage for an existing pod:

kubectl patch pvc mongo-persistent-storage-mongo-0 -n databases -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'

Note: Not all storage classes support volume expansion. Verify with kubectl get storageclass and check the ALLOWVOLUMEEXPANSION column.
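
You can watch the claim until the new size is reflected (exact status conditions vary by CSI driver version):

kubectl get pvc mongo-persistent-storage-mongo-0 -n databases -w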

Monitoring and Maintenance

Essential Monitoring Metrics

Deploy monitoring for these critical metrics:

  1. Replica Set Health
   kubectl exec -n databases mongo-0 -- mongosh -u admin -p 'YourSecurePassword123!' --quiet --eval "rs.status()" | grep -A 3 "stateStr"

  2. Pod Status
   kubectl get pods -n databases -o wide

  3. Storage Usage
   kubectl exec -n databases mongo-0 -- df -h /data/db

  4. Resource Consumption
   kubectl top pods -n databases

Backup Strategy

Implement regular backups using mongodump:

kubectl exec -n databases mongo-0 -- mongodump \
  --username=admin \
  --password='YourSecurePassword123!' \
  --authenticationDatabase=admin \
  --out=/tmp/backup-$(date +%Y%m%d)

Remember that /tmp lives on the pod's filesystem: copy the dump off the pod (for example with kubectl cp) or write it to a dedicated backup volume, otherwise it disappears with the pod.
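
For scheduled dumps without extra tooling, a Kubernetes CronJob is a reasonable starting point. A minimal sketch, assuming the mongo-secret created earlier and a pre-provisioned backup PVC named mongo-backup (hypothetical); dumping through mongo-0 is a simplification, since any reachable member works:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongo-backup
  namespace: databases
spec:
  schedule: "0 2 * * *"  # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: mongodump
            image: mongo:5.0  # ships the mongodump tool
            command: ["/bin/sh", "-c"]
            args:
            - |
              mongodump --host=mongo-0.mongo.databases.svc.cluster.local:27017 \
                --username="$MONGO_USER" --password="$MONGO_PASSWORD" \
                --authenticationDatabase=admin \
                --out=/backup/$(date +%Y%m%d)
            env:
            - name: MONGO_USER
              valueFrom: { secretKeyRef: { name: mongo-secret, key: mongo-user } }
            - name: MONGO_PASSWORD
              valueFrom: { secretKeyRef: { name: mongo-secret, key: mongo-password } }
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: mongo-backup  # hypothetical pre-created PVC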

For production environments, consider using:

  • Velero: Kubernetes-native backup solution
  • MongoDB Ops Manager: MongoDB's enterprise backup solution
  • Kanister: Application-level data management platform

Production Considerations

Security Hardening

  1. Enable TLS/SSL: Encrypt data in transit
   - "--tlsMode=requireTLS"
   - "--tlsCertificateKeyFile=/etc/mongodb/certs/mongodb.pem"

  2. Network Policies: Restrict pod-to-pod communication
   apiVersion: networking.k8s.io/v1
   kind: NetworkPolicy
   metadata:
     name: mongo-netpol
     namespace: databases
   spec:
     podSelector:
       matchLabels:
         app: mongo
     policyTypes:
     - Ingress
     ingress:
     - from:
       - namespaceSelector:
           matchLabels:
             name: application
       ports:
       - protocol: TCP
         port: 27017

  3. Pod Security Standards: Apply baseline security policies

  4. Secrets Management: Use external secrets management (Vault, AWS Secrets Manager)

Performance Optimization

  1. Anti-Affinity Rules: Distribute pods across nodes
   affinity:
     podAntiAffinity:
       requiredDuringSchedulingIgnoredDuringExecution:
       - labelSelector:
           matchExpressions:
           - key: app
             operator: In
             values:
             - mongo
         topologyKey: kubernetes.io/hostname

  2. Resource Tuning: Adjust based on workload patterns
  3. WiredTiger Cache: Configure based on available memory (see the sketch below)
  4. Connection Pooling: Optimize application connection settings
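
One concrete example for the WiredTiger item above: MongoDB's guidance for containerized deployments is to set the cache size explicitly when the container sees less memory than the node, since the default is derived from system RAM (roughly 50% of RAM minus 1 GB). A sketch of the extra mongod flag in the StatefulSet's command (0.5 GB is an illustrative value for a 2Gi limit, not a universal recommendation):

command:
- mongod
- "--replSet=rs0"
- "--bind_ip=0.0.0.0"
- "--wiredTigerCacheSizeGB=0.5"  # keep the cache well under the container's memory limit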

Disaster Recovery

  1. Multi-Region Deployment: Deploy across availability zones
  2. Regular Backup Testing: Verify backup integrity and restoration procedures
  3. Runbook Documentation: Document recovery procedures
  4. Automated Failover Testing: Regularly test failover mechanisms

Conclusion

You've successfully built a production-grade, horizontally scalable MongoDB cluster on Kubernetes. This setup provides:

  • High Availability: Automatic failover with minimal downtime
  • Horizontal Scalability: Scale from 3 to hundreds of nodes
  • Data Durability: Replicated storage with OpenEBS
  • Operational Flexibility: Kubernetes-native management

While this guide makes the deployment appear straightforward, the reality of managing distributed databases in production is complex. This complexity explains why managed database services like MongoDB Atlas command premium pricing—they handle the operational burden of:

  • 24/7 monitoring and alerting
  • Automated backups and point-in-time recovery
  • Performance optimization and query analysis
  • Security patches and upgrades
  • Multi-region replication and disaster recovery
  • Expert support and SLA guarantees

When to self-host vs. use managed services:

Self-hosting makes sense when:

  • You have experienced DevOps and database engineers
  • You require specific configurations not available in managed services
  • Cost optimization is critical at scale (100+ nodes)
  • You need on-premises deployment for compliance reasons
  • You want complete control over your infrastructure

Managed services make sense when:

  • Your team lacks Kubernetes and database operations expertise
  • You want to focus on application development, not infrastructure
  • You need guaranteed uptime SLAs
  • You require enterprise support and consulting
  • Your workload doesn't justify a dedicated operations team

The skills you've developed in this guide—Kubernetes orchestration, distributed systems design, and operational excellence—are valuable regardless of your deployment choice. Understanding how these systems work at a fundamental level makes you a better engineer, whether you're managing your own infrastructure or architecting applications on managed platforms.

Remember: the goal isn't to rebuild MongoDB Atlas, but to understand the principles that make distributed databases resilient and scalable. This knowledge will serve you well in designing and operating any distributed system.

Next Steps

To further enhance your deployment:

  1. Implement Prometheus monitoring with MongoDB exporter
  2. Deploy Grafana dashboards for visualization
  3. Set up automated backups with Velero or Kanister
  4. Configure alerting with AlertManager
  5. Implement GitOps with ArgoCD or Flux for declarative management
  6. Explore sharding for extreme scale (10TB+ datasets)
  7. Test disaster recovery procedures regularly

The journey from a basic deployment to a robust, production-ready system is iterative. Start with this foundation and continuously improve based on your specific requirements and operational experience.

