Raphaël Pinson for Camptocamp Infrastructure Solutions

Originally published at camptocamp.com

Backup your Container Data

Containers have become a convenient way to deploy applications, whether locally or on orchestrated clusters.

However, containers are ephemeral, meaning their data should be stored externally. When possible, this data can be stored in databases or object storage. Most often, though, you will need to resort to data volumes mounted inside your containers. How, then, can we back up this data?

Data location is known

Contrary to the traditional situation in application deployment, the location of critical data in containers is known, since containers use named volumes. We can thus connect to the Docker socket or the API managing the volumes to list them and perform the backups.

Introducing Bivac

Bivac is a tool created to do just that. It can be plugged into a Docker socket, a Rancher API, or a Kubernetes server. It then lists the volumes on the platform and automatically backs them up on a regular basis, using Restic to transfer the data to an object storage provider (e.g. AWS S3).

In addition, Bivac can provide metrics on the backup statuses as it exposes a Prometheus endpoint.

Using the REST client, you can list backups, trigger them on demand, and restore volumes.

Installation

Bivac can easily be installed as a binary or a container. Here are some examples, deploying it locally on Docker or on Kubernetes.

Using Docker

The following docker-compose.yml file can be used to deploy the Bivac manager:

---
version: '3'
services:
  bivac:
    image: camptocamp/bivac:2.2
    command: "manager -v"
    ports:
      - "8182:8182"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    environment:
      BIVAC_AGENT_IMAGE: camptocamp/bivac:2.1
      BIVAC_SERVER_PSK: super-secret-psk
      RESTIC_PASSWORD: not-so-good-password
      BIVAC_TARGET_URL: s3:my-bucket
      AWS_ACCESS_KEY_ID: XXXXX
      AWS_SECRET_ACCESS_KEY: XXXXX

You can also deploy a local Prometheus server to retrieve the metrics. See the full example.

Using Kubernetes

The easiest way to deploy a Bivac manager on Kubernetes is to use Camptocamp's Helm chart:

$ helm repo add camptocamp http://charts.camptocamp.com
$ helm install camptocamp/bivac --version 1.0.0

Using the CLI

The CLI can be downloaded from the releases page. Once the binary is installed, you can use it to list backups, perform backups, or restore data.

Connecting to the manager

The CLI needs to connect to the Bivac manager using its HTTP URL and PSK (defined in the deployment). You can provide them either with the --remote.address and --server.psk options, or by setting the BIVAC_REMOTE_ADDRESS and BIVAC_SERVER_PSK environment variables.
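
As a minimal sketch, the environment-variable route could look like this in a shell session. The address assumes the manager from the docker-compose example is reachable on localhost port 8182, and the PSK is the placeholder value from that same file; adjust both to your own deployment.

```shell
#!/bin/sh
# Placeholder values: the address assumes the manager is published on
# localhost:8182, and the PSK must match BIVAC_SERVER_PSK on the manager.
export BIVAC_REMOTE_ADDRESS="http://localhost:8182"
export BIVAC_SERVER_PSK="super-secret-psk"

# Equivalent with options (their placement on the command line is an assumption):
#   bivac --remote.address "$BIVAC_REMOTE_ADDRESS" --server.psk "$BIVAC_SERVER_PSK" volumes

# Only call the CLI if it is actually installed on this machine.
if command -v bivac >/dev/null 2>&1; then
    bivac volumes
fi
```

Exporting the variables once keeps the PSK out of each command line, which is handy when you run several bivac commands in a row.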

Listing backups

The bivac volumes command lets you list the volumes managed by Bivac:

$ bivac volumes
ID          Name        Hostname  Mountpoint      LastBackupDate       LastBackupStatus  Backing up
mysql       mysql       testing   /var/lib/mysql  2019-06-13 01:33:44  Success           false
ssh_config  ssh_config  testing   /etc/ssh        2019-06-13 01:43:12  Success           false
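
If you want to spot problems from a script, the listing can be filtered with standard tools. The sketch below assumes the column layout of the sample listing, where the status is the second-to-last whitespace-separated field; unhealthy_volumes is a hypothetical helper name, and rows with an empty date or status would need extra handling.

```shell
#!/bin/sh
# unhealthy_volumes reads `bivac volumes` output on stdin and prints the ID
# of every volume whose LastBackupStatus is not "Success".
# Assumption: the status is the second-to-last field, as in the sample listing.
unhealthy_volumes() {
    # NR > 1 skips the header line; $1 is the volume ID.
    awk 'NR > 1 && NF >= 2 && $(NF - 1) != "Success" { print $1 }'
}

# Typical use (requires the CLI and a reachable manager):
#   bivac volumes | unhealthy_volumes
```

Reading from stdin keeps the helper independent from the CLI, so you can try it on a saved listing first.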

Perform backups

While Bivac automatically performs backups at a regular interval, the CLI can also be used to trigger backups manually:

$ bivac backup ssh_config
Backing up `ssh_config'...
ID: ssh_config
Name: ssh_sshconfig
Mountpoint: /etc/ssh
Backup date: 2019-06-13 09:35:38
Backup status: Success
Logs:
testInit
init
backup [0]
Files: 0 new, 0 changed, 11 unmodified
Dirs: 0 new, 1 changed, 0 unmodified
Added to the repo: 702 B
processed 11 files, 299.375 KiB in 0:01
snapshot 1c21ee5b saved
forget [0] Applying Policy: keep the last 15 daily snapshots
snapshots for (host [testing]):
keep 15 snapshots:
ID        Time                 Host     Tags  Reasons         Paths
--------------------------------------------------------------------------------------------------------------
1c21ee5b  2019-06-13 09:35:32  testing        daily snapshot  /etc/ssh
--------------------------------------------------------------------------------------------------------------
1 snapshots
repository contains 18 packs (44 blobs) with 317.585 KiB
processed 44 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 15 snapshots
[0:00] 100.00% 15 / 15 snapshots
found 42 of 44 data blobs still in use, removing 2 blobs
will remove 0 invalid files
will delete 1 packs and rewrite 0 packs, this frees 763B
counting files in repo
[0:00] 100.00% 17 / 17 packs
finding old index files
saved new indexes as [f027febd]
remove 2 old index files
[0:00] 100.00% 1 / 1 packs deleted
done
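
When you trigger backups from a script, for instance right before a risky migration, you may want to react to the result. Here is a small sketch that reads the "Backup status:" line printed by bivac backup. It assumes the output format shown above; backup_status, backup_volume and BIVAC_CMD are hypothetical names, and the sketch deliberately does not rely on the CLI's exit code, whose behavior is not covered here.

```shell
#!/bin/sh
# backup_status reads `bivac backup` output on stdin and prints the value of
# the first "Backup status:" line (format assumed from the sample output).
backup_status() {
    awk -F': *' '/^Backup status:/ { print $2; exit }'
}

# backup_volume runs a manual backup for one volume ID and returns non-zero
# unless the reported status is "Success".
# BIVAC_CMD can point to another command, which is useful for testing.
backup_volume() {
    status=$("${BIVAC_CMD:-bivac}" backup "$1" | backup_status)
    [ "$status" = "Success" ]
}

# Typical use:
#   backup_volume ssh_config || echo "backup of ssh_config failed" >&2
```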

Restore data

Bivac stores Restic backups on object storage and lets you restore them using the bivac restore command:

$ bivac restore canary
Restoring `canary'...
ID: canary
Name: canary
Mountpoint: /var/lib/docker/volumes/canary/_data
Backup date: 2019-06-13 07:56:36
Backup status: Success
Logs:
restore [0] restoring <Snapshot 15583d4b of [/var/lib/docker/volumes/canary/_data] at 2019-06-13 07:56:13.905600644 +0000 UTC by root@testing> to /var/lib/docker/volumes/canary/_data/h3bf5TfCxKtisKYF
snapshots [0] [{"time":"2019-06-13T07:56:13.905600644Z","tree":"e6790a6cf2fd100d01b3bcac795c8787411b0879c85d60514f109403d26890bf","paths":["/var/lib/docker/volumes/canary/_data"],"hostname":"testing","username":"root","id":"15583d4b11605ec552be08fd1fd76d7549aefa0104ab4111f629737d5c7f7a17","short_id":"15583d4b"}]

Manage a remote Restic repository

If you want to list a volume's snapshots or retrieve some stats, you will have to use Restic, and Bivac provides a good abstraction to do it.

Let's say you have a volume called canary and you want to list the associated snapshots. You simply run:

$ bivac restic --volume canary snapshots
ID        Time                 Host    Tags  Paths
-------------------------------------------------------------------------------------------
9d22678e  2019-01-13 03:35:01  canary        /mnt/geoserver_geodata
-------------------------------------------------------------------------------------------
1 snapshots

If you'd like to run a more complex command, you must use -- as follows:

$ bivac restic --volume canary -- forget --prune --keep-daily 15
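
Since forgetting the -- separator is an easy mistake, a tiny wrapper can add it for you. This is only a sketch: bivac_restic and BIVAC_CMD are hypothetical names, and it assumes that the -- form is also accepted for simple subcommands such as snapshots or stats.

```shell
#!/bin/sh
# bivac_restic forwards any Restic subcommand to the repository of one
# volume, always inserting the `--` separator before the Restic arguments.
# BIVAC_CMD can be set to `echo` to print the command instead of running it.
bivac_restic() {
    volume=$1
    shift
    "${BIVAC_CMD:-bivac}" restic --volume "$volume" -- "$@"
}

# Typical use:
#   bivac_restic canary snapshots
#   bivac_restic canary forget --prune --keep-daily 15
```

Setting BIVAC_CMD=echo gives you a dry run, so you can check the final command line before touching the repository.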

Troubleshooting

My backup failed because the remote repository is locked.

The first thing to do is to check the date and the user who created the lock. From this information, you should be able to determine whether the lock is "legit" (a backup is running) or a remnant of a forgotten backup. If you think it's safe to remove it, you can run:

$ bivac backup [VOLUME_ID] --force

With the --force option, Bivac will unlock the Restic repository before doing a backup.
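
To look at the lock before forcing anything, one option is to go through the Restic passthrough. Restic itself provides list locks (lock IDs) and cat lock (lock details, including time, hostname and username). The sketch below chains the two; show_locks and BIVAC_CMD are hypothetical names, and it assumes that bivac restic prints Restic's raw output, one lock ID per line.

```shell
#!/bin/sh
# show_locks prints the details of every lock held on a volume's repository,
# using Restic's `list locks` and `cat lock <id>` through `bivac restic`.
# Assumption: the passthrough returns one lock ID per line, with no extra text.
show_locks() {
    volume=$1
    "${BIVAC_CMD:-bivac}" restic --volume "$volume" -- list locks |
    while read -r lock_id; do
        # Skip empty lines, then print the details of the lock.
        [ -n "$lock_id" ] || continue
        "${BIVAC_CMD:-bivac}" restic --volume "$volume" -- cat lock "$lock_id"
    done
}

# Typical use:
#   show_locks canary
```

If the lock turns out to belong to a backup that is still running, wait for it to finish rather than using --force.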

Going further

More features are available, such as the ability to manage a remote Restic repository. See the documentation for more information.

Conclusion

Bivac lets you easily back up your data, monitor the status of the backups, and restore them, whether you are using raw Docker, Rancher volumes, or Kubernetes.

This post was originally published on https://www.camptocamp.com/actualite/backup-your-container-data/
