Introduction
A Kubernetes CronJob runs Jobs on a time-based schedule, much like cron tasks on a Linux or UNIX system.
In this post, we’ll use a Kubernetes CronJob to schedule a recurring backup of a MongoDB database and upload the backup archive to AWS S3. All the source code is available in the GitHub repository.
tuladhar / k8s-backup-mongodb
Schedule MongoDB Backup to S3 using Kubernetes CronJob.
Get Started
First, let’s create a MongoDB user dedicated to performing backups with minimal privileges.
Log in to the MongoDB shell as the root user.
mongo admin --host <hostname> --authenticationDatabase admin -u root
Run the following command to create the backup user.
db.createUser({
  user: 'backup_user',
  pwd: 'oO9eV5cG6cF2oM1r',
  roles: [{ role: 'backup', db: 'admin' }]
})
Kubernetes Namespace
Create a dedicated namespace in Kubernetes to deploy the cronjob.
kubectl apply -f https://raw.githubusercontent.com/tuladhar/k8s-backup-mongodb/main/kubernetes/namespace.yaml
The output is similar to this:
namespace/backup-mongodb created
Set this namespace as the default for the current context so that all subsequent kubectl commands run in it.
kubectl config set-context --current --namespace=backup-mongodb
Kubernetes Secrets
Kubernetes Secrets allow us to store and manage sensitive information. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image.
Store MongoDB URI
export MONGODB_URI='mongodb://backup_user:oO9eV5cG6cF2oM1r@<mongodb-hostname>:27017'
kubectl create secret generic mongodb-uri --from-literal=MONGODB_URI="$MONGODB_URI"
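If MongoDB runs inside the same cluster, the hostname in the URI is usually the Service's cluster DNS name, of the form service-name.namespace.svc.cluster.local. The following is a minimal sketch of composing such a URI. The helper name mongodb_uri, the Service name mongodb, and the namespace default are assumptions for illustration, and cluster.local is only the default cluster domain, so your cluster may differ.

```shell
# Hypothetical helper: compose a MongoDB URI that points at an in-cluster Service.
# Arguments: user, password, Service name, namespace, optional port (default 27017).
mongodb_uri() {
  printf 'mongodb://%s:%s@%s.%s.svc.cluster.local:%s\n' \
    "$1" "$2" "$3" "$4" "${5:-27017}"
}

# Assumed example values: a Service named "mongodb" in the "default" namespace.
MONGODB_URI=$(mongodb_uri backup_user 'oO9eV5cG6cF2oM1r' mongodb default)
export MONGODB_URI
echo "$MONGODB_URI"
```

Note that a password containing characters such as @, : or / must be percent-encoded before it is placed in a MongoDB URI; this sketch does not do that.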
Store AWS credentials and S3 bucket URI
export AWS_ACCESS_KEY_ID=***
export AWS_SECRET_ACCESS_KEY=***
export BUCKET_URI=s3://bucket-name
export AWS_DEFAULT_REGION=us-east-1
kubectl create secret generic aws \
  --from-literal=AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  --from-literal=AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  --from-literal=BUCKET_URI="$BUCKET_URI" \
  --from-literal=AWS_DEFAULT_REGION="$AWS_DEFAULT_REGION"
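Before deploying, it can help to confirm that a key was stored with the value you expect. Here is a small sketch of one way to read a value back; the helper name secret_value is hypothetical, and it assumes kubectl is pointed at the namespace that holds the Secret.

```shell
# Hypothetical helper: print the decoded value of one key in a Secret.
# Arguments: Secret name, key name.
secret_value() {
  # Secret data is stored base64-encoded, so decode it after extraction.
  kubectl get secret "$1" --output="jsonpath={.data.$2}" | base64 --decode
}
```

For example, running secret_value aws BUCKET_URI should print the bucket URI you exported earlier. Keep in mind that base64 is an encoding, not encryption, so anyone with read access to the Secret can recover the values the same way.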
Deploy CronJob
Now we can go ahead and deploy the MongoDB backup cronjob by running the following command:
kubectl apply -f https://raw.githubusercontent.com/tuladhar/k8s-backup-mongodb/main/kubernetes/cronjob.yaml
The output is similar to this:
cronjob.batch/backup-mongodb created
The default schedule is to run every hour. To adjust the schedule, run the following command and modify the schedule property:
kubectl edit cronjob backup-mongodb
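If you prefer a non-interactive change (for example, in a script), the same property can be set with kubectl patch. This is a sketch rather than part of the repository: the helper names schedule_patch and set_schedule are hypothetical, and the six-hour schedule is only an example value.

```shell
# Hypothetical helper: build a merge patch that sets the CronJob schedule.
schedule_patch() {
  printf '{"spec":{"schedule":"%s"}}' "$1"
}

# Hypothetical helper: apply the patch to a CronJob in the current namespace.
# Arguments: CronJob name, cron expression.
set_schedule() {
  kubectl patch cronjob "$1" --patch "$(schedule_patch "$2")"
}

# Show the patch for "every six hours, on the hour" without applying it.
schedule_patch '0 */6 * * *'; echo
```

To apply it, you would call set_schedule backup-mongodb '0 */6 * * *'. Quote the cron expression so the shell does not expand the asterisks.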
After creating the cronjob, you can get its status by running the following command:
kubectl get cronjob
The output is similar to this:
NAME             SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE
backup-mongodb   0 */1 * * *   False     0        <none>
As the output shows, the cronjob has not scheduled or run any jobs yet. Once it has, you can list the jobs by running the following command:
kubectl get jobs
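Rather than waiting for the next scheduled run, you can start a one-off Job from the CronJob's template with kubectl create job --from. The sketch below wraps that command; the helper name run_backup_now and the "-manual-<timestamp>" naming scheme are assumptions for illustration.

```shell
# Hypothetical helper: start a one-off Job from a CronJob's template.
# Argument: CronJob name (defaults to backup-mongodb).
# Prints the generated Job name on stdout; kubectl's own output goes to stderr.
run_backup_now() {
  cronjob=${1:-backup-mongodb}
  # Append a Unix timestamp so repeated manual runs get distinct names.
  job_name="${cronjob}-manual-$(date +%s)"
  kubectl create job "$job_name" --from="cronjob/${cronjob}" >&2 || return 1
  printf '%s\n' "$job_name"
}
```

Calling run_backup_now creates the Job and prints its name, which you can then use wherever a job name is needed.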
To view the Pod logs for a job, run the following command:
pods=$(kubectl get pods --selector=job-name=<job-name> --output=jsonpath='{.items[*].metadata.name}')
kubectl logs $pods
Top comments (1)
Hi,
Thanks for this article!
I get crashloopbackoff on the cron job pod that says:
Failed: bad option: --oplog mode only supported on full dumps
Removing oplog ENV from the .yaml works.
However, there is still an error saying that it cannot connect to the server. I am guessing it is a problem with MONGODB_URI.
I am using mongodb://backup_user:password@default/mongodb-0:27017 as the URI. I tried removing default/ and also just using the service name, which is mongodb. I have mongo deployed as a StatefulSet.
Error in the pod: Failed: can't create session: could not connect to server: server selection error: server selection timeout
Do you happen to know the reason? Thanks!
UPDATE: for those who have similar problems, here is the solution:
The MongoDB host inside the cluster can be accessed using service-name.namespace.svc.cluster.local
For example, the host becomes: mongodb.default.svc.cluster.local