DEV Community

Michael Lin


How to back up HashiCorp Vault with Raft storage on Kubernetes


Our team is experimenting with HashiCorp Vault as our new credentials management solution. Thanks to the official Vault Helm chart, we were able to get an almost production-ready Vault cluster running on Kubernetes with minimal effort.


Our 5-node Vault cluster achieves high availability through the Integrated Storage (Raft) backend. The cluster runs as a Kubernetes StatefulSet, and each node has its own data volume, backed by IBM Cloud Block Storage through a PersistentVolumeClaim.

The Problem

Unfortunately, open source Vault does not provide an out-of-the-box automated backup solution. That feature is only offered in Vault Enterprise, and our team does not have deep enough pockets to pay the license fee.

That said, the snapshot feature is still accessible from the CLI and the HTTP API; it is just not automated. We use vault operator raft snapshot save from the Vault CLI to automate backups with a CronJob that runs alongside the Vault deployment on Kubernetes. The CronJob periodically takes a snapshot of the Vault cluster and uploads it to our S3 storage.
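For reference, a manual snapshot can be taken through either interface. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, and it assumes VAULT_ADDR and VAULT_TOKEN are already exported and that the token is allowed to read sys/storage/raft/snapshot.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative, not part of Vault.

# Build the snapshot endpoint URL from a Vault address.
snapshot_url() {
  # Strip one trailing slash so the joined URL has no double slash.
  printf '%s/v1/sys/storage/raft/snapshot\n' "${1%/}"
}

# Take a snapshot with the Vault CLI (reads VAULT_ADDR and VAULT_TOKEN).
snapshot_via_cli() {
  vault operator raft snapshot save "$1"
}

# Take the same snapshot through the HTTP API with curl.
snapshot_via_api() {
  curl -sS --fail --header "X-Vault-Token: ${VAULT_TOKEN}" \
    --output "$1" "$(snapshot_url "${VAULT_ADDR}")"
}
```

Calling snapshot_via_cli /tmp/vault-raft.snap or snapshot_via_api /tmp/vault-raft.snap should produce an equivalent snapshot file.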



Prerequisites

  • You have a working vault cluster
  • You have sufficient access to the cluster
  • You have a working S3 storage instance

Setup Policy and Authentication

This is mostly stolen from adfinis-sygroup/vault-raft-backup-agent#approle-authentication

Create a minimal policy for our snapshot agent to perform the backup job.

echo '
path "sys/storage/raft/snapshot" {
   capabilities = ["read"]
}' | vault policy write snapshot -

The approle auth method allows machines or apps to authenticate with Vault-defined roles.

The AppRole auth method is well suited for the snapshot agent to authenticate with our Vault cluster. Note the role-id and secret-id printed by the last two commands below; you will need them later.

vault auth enable approle
vault write auth/approle/role/snapshot-agent token_ttl=2h token_policies=snapshot
# flags such as -format go before the path so the CLI parses them as flags
vault read -format=json auth/approle/role/snapshot-agent/role-id | jq -r .data.role_id
vault write -f -format=json auth/approle/role/snapshot-agent/secret-id | jq -r .data.secret_id

Prepare Secrets

Let's save all our sensitive information as Secrets. We will use them later.

apiVersion: v1
kind: Secret
metadata:
  name: vault-snapshot-agent-token
type: Opaque
data:
  # we use gotmpl here
  # you can replace them with base64-encoded values
  VAULT_APPROLE_ROLE_ID: {{ .Values.approle.roleId | b64enc | quote }}
  VAULT_APPROLE_SECRET_ID: {{ .Values.approle.secretId | b64enc | quote }}
---
apiVersion: v1
kind: Secret
metadata:
  name: vault-snapshot-s3
type: Opaque
data:
  # we use gotmpl here
  # you can replace them with base64-encoded values
  AWS_ACCESS_KEY_ID: {{ .Values.backup.accessKeyId | b64enc | quote }}
  AWS_SECRET_ACCESS_KEY: {{ .Values.backup.secretAccesskey | b64enc | quote }}
  AWS_DEFAULT_REGION: {{ .Values.backup.region | b64enc | quote }}
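If you are not templating the manifests, the values have to be base64-encoded by hand. Here is a small sketch of one way to do that; the helper name b64_value is hypothetical, and it assumes a base64 binary is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: base64-encode a value for a Secret's data field.
b64_value() {
  # printf avoids the trailing newline that echo would add to the encoded value;
  # tr removes the line wrapping some base64 implementations apply to long input.
  printf '%s' "$1" | base64 | tr -d '\n'
  # End the output with a single newline for readability.
  printf '\n'
}
```

For example, b64_value "$ROLE_ID" prints a string that can be pasted in place of the template expression.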

The CronJob

Let's create the CronJob that actually does the work.

We set the VAULT_ADDR environment variable to http://vault-active.vault.svc.cluster.local:8200. Using the vault-active Service ensures the snapshot request is made against the leader node, assuming Service Registration is enabled, which is the default. The exact URL may vary depending on the release name and target namespace of your Vault Helm chart deployment.
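To adapt the address to your own deployment, something like the sketch below may help. The helper name vault_active_addr is hypothetical, and it assumes the Service is named <fullname>-active and that Vault listens on port 8200 without TLS, as in the address above.

```shell
#!/bin/sh
# Hypothetical helper: compose VAULT_ADDR for the active-node Service.
# Assumes a Service named "<fullname>-active" and plain HTTP on port 8200.
vault_active_addr() {
  fullname="$1"   # resource name prefix used by the chart, e.g. "vault"
  namespace="$2"  # namespace the chart was installed into
  printf 'http://%s-active.%s.svc.cluster.local:8200\n' "$fullname" "$namespace"
}
```

Running kubectl get svc in the target namespace lists the actual Service names, so you can confirm the value before putting it in the CronJob.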

I may have over-engineered the CronJob by using multiple containers for a simple backup-and-upload task, but the intention is to avoid building, and then maintaining, yet another custom image.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: vault-snapshot-cronjob
spec:
  schedule: "@every 12h"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
          - name: share
            emptyDir: {}
          containers:
          - name: snapshot
            image: vault:1.7.2
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -ec
            # The official vault docker image actually doesn't come with `jq`. You can
            # - install it during runtime (not a good idea and your security team may not like it)
            # - ship `jq` static binary in a standalone image and mount it using a shared volume from `initContainers`
            # - build your custom `vault` image
            - |
              curl -sS | sh
              export VAULT_TOKEN=$(vault write -format=json auth/approle/login role_id="$VAULT_APPROLE_ROLE_ID" secret_id="$VAULT_APPROLE_SECRET_ID" | /jq/jq -r .auth.client_token);
              vault operator raft snapshot save /share/vault-raft.snap;
            envFrom:
            - secretRef:
                name: vault-snapshot-agent-token
            env:
            - name: VAULT_ADDR
              value: http://vault-active.vault.svc.cluster.local:8200
            volumeMounts:
            - mountPath: /share
              name: share
          - name: upload
            image: amazon/aws-cli:2.2.14
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -ec
            # the script waits until the snapshot file is available
            # then uploads it to s3
            # for folks using non-aws S3 like IBM Cloud Object Storage service, add a `--endpoint-url` option
            # run `aws --endpoint-url <https://your_s3_endpoint> s3 cp ...`
            # change the s3://<path> to your desired location
            - |
              until [ -f /share/vault-raft.snap ]; do sleep 5; done;
              aws s3 cp /share/vault-raft.snap s3://vault/vault_raft_$(date +"%Y%m%d_%H%M%S").snap;
            envFrom:
            - secretRef:
                name: vault-snapshot-s3
            volumeMounts:
            - mountPath: /share
              name: share
          restartPolicy: OnFailure
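One thing to be aware of: the until loop in the upload container waits forever if the snapshot container never produces a file. A bounded wait is one possible alternative. The sketch below is only an illustration; the wait_for_file helper and its timeout values are assumptions, not part of the original job.

```shell
#!/bin/sh
# Hypothetical helper: wait for a file, but give up after a timeout.
# Usage: wait_for_file <path> <timeout_seconds> [poll_interval_seconds]
wait_for_file() {
  path="$1"
  timeout="$2"
  interval="${3:-5}"
  waited=0
  until [ -f "$path" ]; do
    # Stop polling once the accumulated wait reaches the timeout.
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $path" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}
```

In the upload script, wait_for_file /share/vault-raft.snap 300 could replace the until line; because the shell runs with -e, a timeout would then fail the container instead of leaving it hanging.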

Wrapping Up

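Now you have all the resources needed to automate Vault backups for the Raft backend. You can either apply the manifests directly with kubectl apply -f (after replacing the template expressions with base64-encoded values) or package them as your own Helm chart and distribute it through your private chart repository.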
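Rather than waiting for the next scheduled run, you can trigger the job once by hand to check that everything works. This is a rough sketch; the helper names and the job name prefix are hypothetical, and the namespace is whatever you deployed the CronJob into.

```shell
#!/bin/sh
# Hypothetical helpers for a one-off test run of the CronJob.

# Generate a unique Job name from the current Unix timestamp.
manual_job_name() {
  printf 'vault-snapshot-manual-%s\n' "$(date +%s)"
}

# Create a Job from the CronJob's template in the given namespace.
run_snapshot_now() {
  kubectl create job "$(manual_job_name)" \
    --from=cronjob/vault-snapshot-cronjob --namespace "$1"
}
```

After run_snapshot_now vault, kubectl get jobs and kubectl logs on the resulting pod show whether the snapshot and upload succeeded.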

