Jonathan Mourtada

FreeNAS ZFS snapshot backup to Amazon S3

Originally posted at www.mourtada.se

I've been looking for a way to back up my FreeNAS ZFS snapshots to an offsite location. I didn't find much information on how to do this, so I had to come up with my own solution.

In this post I'm going to show you how to save your encrypted ZFS snapshots in Amazon S3. We're going to use a FreeBSD jail together with GnuPG and s3cmd.

Adding a jail in FreeNAS

Go to the FreeNAS web UI and click Jails. Click Add and choose a name. If you click Advanced here you can change the IP address for the jail (I wanted to use DHCP).

Adding an empty jail in FreeNAS

Click OK and FreeNAS will set up a new jail for you, which takes a minute or two.

From now on we will work in the FreeNAS shell (SSH must be enabled under Services in the web UI).

To list all the jails running on your FreeNAS host we can run:

$ jls

Verify that the jail you created is listed.
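
With a single jail named backup, the output looks roughly like this (the JID, IP address, and path below are made up):

   JID  IP Address      Hostname                      Path
     1  192.168.1.50    backup                        /mnt/tank/jails/backup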

To enter the jail run:

$ jexec your_jail_name
$ # Verify that you're in the jail
$ hostname
backup
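
Depending on the FreeBSD version, jexec may want an explicit command. If the plain jexec call above doesn't drop you into a shell, name one yourself:

$ jexec your_jail_name /bin/csh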

We're going to need to install some packages. First we need GnuPG, which we'll use to encrypt our snapshots. Then we need s3cmd, which is used to upload our snapshots to Amazon S3.

$ pkg install security/gnupg
$ pkg install net/py-s3cmd

I'm going to use symmetric AES256 encryption with a passphrase file because I don't want to store my data in the cloud unencrypted. Generate a random passphrase and store it in multiple locations (not just inside the jail); if the passphrase is lost, your backups will be worthless. The passphrase file needs to be accessible by the backup script. I have placed my passphrase file in the root directory.

$ echo "mypassphrase" > /root/snapshot-gpg-passphrase
$ chmod 400 /root/snapshot-gpg-passphrase
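
Rather than typing a passphrase by hand, you can generate a random one. A minimal sketch, assuming the openssl that ships with FreeBSD is available in the jail:

$ openssl rand -base64 32 > /root/snapshot-gpg-passphrase
$ chmod 400 /root/snapshot-gpg-passphrase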

Next we'll create a directory that holds our current set of encrypted snapshots, which we're going to keep synced with S3.

$ mkdir /root/s3_sync_bucket
$ chmod 700 /root/s3_sync_bucket

We also need to configure s3cmd, so run this and answer all the questions:

$ s3cmd --configure
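
s3cmd stores your answers (including the access and secret keys) in ~/.s3cfg inside the jail. To check that the configuration works, you can list the bucket you intend to use (assuming it already exists):

$ s3cmd ls s3://your-bucket/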

The backup script

This script should be run on the FreeNAS host. What it does:

  1. Creates a snapshot of the specified dataset
  2. Sends it to the backup jail, where it's encrypted and saved to a file
  3. Removes the snapshot on the FreeNAS host
  4. Removes encrypted snapshot files older than 7 days from the sync directory
  5. Syncs the local S3 bucket directory with S3 using s3cmd

#!/bin/sh

# The first argument should be the name of the dataset to backup
if test -z "$1"; then
  echo "Please specify a valid dataset"
  exit 1
fi

# Passphrase file
passphrase_file=/root/snapshot-gpg-passphrase

# Local directory to sync with Amazon S3
bucket_dir=/root/s3_sync_bucket

# The S3 bucket url
s3_bucket="s3://your-bucket/"

# Generate a snapshot name
snapshot_name="$1@$(date +%Y-%m-%d)"

# Convert to valid filename
filename=$(echo $snapshot_name | sed "s/\//-/g").gpg

echo "Using bucket_dir: $bucket_dir"
echo "Using s3 bucket: $s3_bucket"
echo "Using snapshot_name: $snapshot_name"
echo "Using filename: $filename"

# Create the snapshot, send it to the jail, encrypt it 
zfs snapshot $snapshot_name
zfs send -v $snapshot_name | jexec backup gpg --batch --symmetric --cipher-algo AES256 --passphrase-file $passphrase_file --output $bucket_dir/$filename
zfs destroy $snapshot_name

# Remove old snapshots
search_word=$(echo $filename | sed "s/@.*$//g")
echo "Searching for \"$search_word\" in $bucket_dir older than 7 days"
jexec backup find $bucket_dir -name "${search_word}*" -mtime +7 -exec rm {} \;

# Sync to s3
jexec backup s3cmd sync --delete-removed $bucket_dir $s3_bucket
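
Note that the script sends a full stream of the dataset on every run. One possible variation, not part of the script above, is an incremental send with zfs send -i, which only transfers the changes since the previous snapshot; you would then have to keep the previous snapshot around instead of destroying it right away, and a restore needs the full stream plus each incremental in order. A rough sketch with made-up snapshot names:

zfs send -v -i my-pool/my-dataset@2017-01-01 my-pool/my-dataset@2017-01-02 | jexec backup gpg --batch --symmetric --cipher-algo AES256 --passphrase-file /root/snapshot-gpg-passphrase --output /root/s3_sync_bucket/my-pool-my-dataset@2017-01-02.incr.gpg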

Create the script and run it manually or with crontab:

$ touch /root/backup_script.sh
$ chmod 700 /root/backup_script.sh
$ /root/backup_script.sh my-pool/my-dataset
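
To run the backup on a schedule, a root crontab entry on the FreeNAS host could look like the line below (a made-up schedule: every Sunday at 03:00). FreeNAS manages cron through the web UI under Tasks, so adding the job there is usually safer than editing the crontab by hand.

0 3 * * 0 /root/backup_script.sh my-pool/my-dataset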

Edit the script to fit your needs.

Decrypting a backup

To decrypt a backup:

$ gpg --batch --decrypt --passphrase-file /root/snapshot-gpg-passphrase < backup_file
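
The decrypted output is a raw ZFS stream, so to actually restore it you can pipe it straight into zfs receive. A minimal sketch with made-up file and dataset names:

$ gpg --batch --decrypt --passphrase-file /root/snapshot-gpg-passphrase < my-pool-my-dataset@2017-01-01.gpg | zfs receive my-pool/restored-dataset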

Top comments (2)

andrey a

Hi Jonathan! Thanks for writing about ZFS and offsite backups.
Could you comment on the bandwidth of such a setup? How long does it take to upload, let's say, a 1 TB snapshot?
ZFS is nice in that snapshots are incremental, but we still have a lot of data day-to-day.

Also, could you comment on the cost of such a backup? Say, 3 years of storage for 1 TB of data, including the cost of uploading.

Jonathan Mourtada

Hi!

I just did these simple tests to see if the idea was doable and haven't moved on to actually using it, so I can't give any insight into upload times or costs. Even if I had, I don't have that amount of data that's important enough to back up.

A big factor is how much the data changes and how long the snapshot retention needs to be. Data that doesn't change much will not create big incremental snapshots, because of how ZFS works.

For your information, I did try replicating the snapshots to another machine running Ubuntu with ZFS on Linux (ZoL), which also seemed to work well in the small amount of time I tried it.

At work we have an offsite freenas which we have periodic snapshots and snapshot replication set up with the freenas GUI. This works well and we replicate around 2 TB of data. This is done over the internet and not over big geographical distances. We haven't had any problems here with throughput or long running replications. Both endpoints are 100/100 Mbit connections though i don't think the actual measured bandwidth gets up to that speed. The costs of this for us is only the hardware.