ZFS Backup Strategy with Sanoid and Syncoid
In my previous post, I discussed how I migrated my VMs to new storage. That gave me cause to also review my backup configuration, to make sure I can still recover from catastrophic events.
I use Sanoid and Syncoid for this purpose. Let’s see how that may look.
Primary backup
My backup strategy is multi-pronged. I take frequent snapshots of the file system where my VMs live. This is convenient for small oopsies, but of course doesn’t protect me should a catastrophic storage failure occur. Both the snapshots and their eventual cleanup are managed by Sanoid.
The relevant parts of my /etc/sanoid/sanoid.conf:
[ssdpool]
use_template = ssdpool
recursive = yes
process_children_only = no
[template_ssdpool]
frequently = 0
hourly = 36
daily = 30
monthly = 0
yearly = 0
autosnap = yes
autoprune = yes
We can see that the ssdpool file system gets recursive hourly and daily snapshots. Thirty-six of the hourly snapshots are kept, along with 30 of the daily snapshots, on the primary storage.
At every step, I can verify that this actually works by listing the ZFS snapshots:
sudo zfs list -t snapshot
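For completeness: Sanoid itself has to run periodically in order to take and prune these snapshots. Depending on your distribution this is handled by a packaged systemd timer or a cron entry roughly like the following (an illustration of the general idea rather than my exact setup; the path to sanoid may differ on your system):
*/15 * * * * root /usr/sbin/sanoid --cron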
Secondary backup
To protect against pool failures, I have a secondary pool in my hypervisor host, set up for backup storage. I call Syncoid from cron for this purpose.
This is handled by this part of my /etc/cron.d/syncoid:
02 * * * * root syncoid -r --no-sync-snap --compress=lzo --quiet ssdpool backuppool/ssdpoolbackup
At two minutes past every hour, Syncoid recursively sends the ZFS snapshots from ssdpool to backuppool/ssdpoolbackup.
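If I want to check that the replication is keeping up, the same kind of snapshot listing works on the target side, for example (dataset name as per my layout above):
sudo zfs list -t snapshot -r backuppool/ssdpoolbackup | tail -n 5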
Of course, I also have to ensure the backup volume doesn’t fill up with snapshots. Sanoid to the rescue again:
[backuppool/ssdpoolbackup]
use_template = backuppoolssd
recursive = yes
process_children_only = no
[template_backuppoolssd]
frequently = 0
hourly = 36
daily = 30
monthly = 0
yearly = 0
autosnap = no
autoprune = yes
For backuppool/ssdpoolbackup I don’t create new snapshots, but I keep the same pruning configuration that I have for the primary snapshots.
Tertiary backup
Finally, I have a completely separate physical server where I store an additional copy of my backups. This one is configured to connect to the primary server and fetch snapshots, again using Syncoid. Pulling from the backup server rather than pushing from the primary makes it slightly harder to mess things up on the backup server from the primary server.
The cron configuration line for syncoid on this machine looks like this:
20 * * * * root syncoid --sshkey=/root/.ssh/syncoid -r --no-sync-snap --compress=lzo --quiet syncoid@prodsrv1:backuppool/ssdpoolbackup backuppool/prodsrv1/ssdpool
It’s very similar to the one running on the main server, but it fetches the snapshots from the main server’s backup pool over SSH and stores them in a local backup pool. As you can see, it runs a bit later in the hour than the job on the main server (minute 20 rather than minute 02), so in most circumstances the tertiary copy follows shortly after the secondary one, keeping potential data loss small should something go wrong.
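For a pull-based setup like this, the backup server’s SSH key (the one referenced by --sshkey) has to be authorized for the syncoid user on the primary server, and that user needs enough ZFS permissions to send snapshots. I won’t reproduce my exact setup here, but the rough shape is something like this sketch; the exact permission set depends on which syncoid options you use:
# On prodsrv1 (the source): allow the syncoid user to send snapshots
# from the backup dataset. Adjust the permission list to your needs.
sudo zfs allow syncoid send,hold backuppool/ssdpoolbackup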
Of course here too, I make sure to clean up snapshots to avoid filling the disks over time:
[backuppool/prodsrv1/ssdpool]
use_template = backupssd
recursive = yes
process_children_only = no
[template_backupssd]
frequently = 0
hourly = 36
daily = 90
monthly = 0
yearly = 0
autosnap = no
autoprune = yes
The only big difference from the primary and secondary copies is that I keep the daily snapshots for longer on the backup server: 90 of them instead of 30.
Testing backups
To verify that the backup works, we can perform a small experiment: I have an otherwise empty VM, where I’ve simply created a text file in my home directory:
$ echo Test > is_this_file_gone.txt
$ cat is_this_file_gone.txt
Test
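Before deleting the file, it’s worth confirming that at least one snapshot has been taken since the file was created. Listing the snapshots of the VM’s dataset on the main server (path as per my layout above) shows the most recent ones:
sudo zfs list -t snapshot -r ssdpool/vdisks/testsrv1 | tail -n 3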
After waiting for a snapshot to be made, I remove the test file:
$ rm is_this_file_gone.txt
$ cat is_this_file_gone.txt
cat: is_this_file_gone.txt: No such file or directory
Now let’s restore the VM from my tertiary copy. If that works, we know that the intermediate backup routines work too.
From my backup server, I will send the snapshot back to the production machine, then see if I can start the VM from the snapshot.
First we find the relevant snapshots:
$ zfs list -t snapshot
NAME                                                                               USED  AVAIL  REFER  MOUNTPOINT
backuppool/prodsrv1/ssdpool/vdisks/testsrv1@autosnap_2025-11-16_11:00:25_daily       0B      -    96K  -
backuppool/prodsrv1/ssdpool/vdisks/testsrv1@autosnap_2025-11-16_11:00:25_hourly      0B      -    96K  -
backuppool/prodsrv1/ssdpool/vdisks/testsrv1@autosnap_2025-11-16_12:00:25_hourly      0B      -  2.84G  -
Now let’s take a specific snapshot and send it back:
sudo syncoid --sshkey=/root/.ssh/syncoid \
-r \
--no-sync-snap \
--compress=lzo \
--include-snaps=autosnap_2025-11-16_12:00:25_hourly \
backuppool/prodsrv1/ssdpool/vdisks/testsrv1 \
syncoid@prodsrv1:ssdpool/restoretest
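Back on the production server, a quick listing should confirm that the dataset and its snapshot arrived (names as used in the command above):
sudo zfs list -t all -r ssdpool/restoretest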
To test the contents of the snapshot non-destructively, we’ll stop the VM on the main server, move its current disk image out of the way, and copy in the disk image we pulled from the backup.
cd /ssdpool/vdisks/testsrv1
sudo virsh shutdown testsrv1
# After waiting for the VM to stop:
sudo mv testsrv1.qcow2 testsrv1.qcow2.actual
sudo cp --sparse=always /ssdpool/restoretest/testsrv1.qcow2 ./
sudo virsh start testsrv1
Once the machine starts up, we can test our backup:
$ cat is_this_file_gone.txt
Test
Happy that everything works, we clean up the VM directory and restart the machine with its original disk image:
sudo virsh shutdown testsrv1
sudo rm testsrv1.qcow2
sudo mv testsrv1.qcow2.actual testsrv1.qcow2
sudo virsh start testsrv1
I always like to perform a dry-run before running destructive ZFS changes:
$ sudo zfs destroy -nvpr ssdpool/restoretest
destroy ssdpool/restoretest@autosnap_2025-11-16_12:00:25_hourly
destroy ssdpool/restoretest
This looks like exactly what we want to do. Drop the n from the flags to actually execute the command:
sudo zfs destroy -vpr ssdpool/restoretest
And that’s it, really. I’ve confirmed that snapshots work and that I can restore them.