ZFS snapshots and cold storage

OK, this isn’t really a practice I personally have any use for. I much, much prefer replicating snapshots to live systems using syncoid as a wrapper for zfs send and receive. But if you want to use cold storage, or just want to understand conceptually the way snapshots and streams work, read on.

Our working dataset

Let’s say you have a ZFS pool and dataset named, creatively enough, poolname/setname, and let’s say you’ve taken snapshots over a period of time, named @0 through @9.

root@banshee:~# zfs list -rt all poolname/setname
NAME                USED  AVAIL  REFER  MOUNTPOINT
poolname/setname    1004M  21.6G    19K  /poolname/setname
poolname/setname@0   100M      -   100M  -
poolname/setname@1   100M      -   100M  -
poolname/setname@2   100M      -   100M  -
poolname/setname@3   100M      -   100M  -
poolname/setname@4   100M      -   100M  -
poolname/setname@5   100M      -   100M  -
poolname/setname@6   100M      -   100M  -
poolname/setname@7   100M      -   100M  -
poolname/setname@8   100M      -   100M  -
poolname/setname@9   100M      -   100M  -

By looking at the USED and REFER columns, we can see that each snapshot has 100M of data in it, unique to that snapshot, which in total adds up to 1004M of data.

Sending a full backup to cold storage

Now if we want to use zfs send to move stuff to cold storage, we have to start out with a full backup of the oldest snapshot, @0:

root@banshee:~# zfs send poolname/setname@0 | pv > /coldstorage/setname@0.zfsfull
 108MB 0:00:00 [ 297MB/s] [<=>                                                 ]
root@banshee:~# ls -lh /coldstorage
total 1.8M
-rw-r--r-- 1 root root 108M Sep 15 16:26 setname@0.zfsfull

Sending incremental backups to cold storage

Let’s go ahead and take some incrementals of setname now:

root@banshee:~# zfs send -i poolname/setname@0 poolname/setname@1 | pv > /coldstorage/setname@0-@1.zfsinc
 108MB 0:00:00 [ 316MB/s] [<=>                                                 ]
root@banshee:~# zfs send -i poolname/setname@1 poolname/setname@2 | pv > /coldstorage/setname@1-@2.zfsinc
 108MB 0:00:00 [ 299MB/s] [<=>                                                 ]
root@banshee:~# ls -lh /coldstorage
total 5.2M
-rw-r--r-- 1 root root 108M Sep 15 16:33 setname@0-@1.zfsinc
-rw-r--r-- 1 root root 108M Sep 15 16:26 setname@0.zfsfull
-rw-r--r-- 1 root root 108M Sep 15 16:33 setname@1-@2.zfsinc

OK, so now we have one full and two incrementals. Before we do anything that works, let’s look at some of the things we can’t do, because these are really important to know about.

Without a full, incrementals are useless

First of all, we can’t restore an incremental without a full:

root@banshee:~# pv < /coldstorage/setname@0-@1.zfsinc | zfs receive poolname/restore
cannot receive incremental stream: destination 'poolname/restore' does not exist
  64kB 0:00:00 [94.8MB/s] [>                                   ]  0%            

Good thing we’ve got that full backup of @0, eh?

Restoring a full backup from cold storage

root@banshee:~# pv < /coldstorage/setname@0.zfsfull | zfs receive poolname/restore
 108MB 0:00:00 [ 496MB/s] [==================================>] 100%            
root@banshee:~# zfs list -rt all poolname/restore
NAME                USED  AVAIL  REFER  MOUNTPOINT
poolname/restore     100M  21.5G   100M  /poolname/restore
poolname/restore@0      0      -   100M  -

Simple – we restored our full backup of poolname to a new dataset named restore, which is in the condition poolname was in when @0 was taken, and contains @0 as a snapshot. Now, keeping in tune with our “things we cannot do” theme, can we skip straight from this full of @0 to our incremental from @1-@2?

Without ONE incremental, following incrementals are useless

root@banshee:~# pv < /coldstorage/setname@1-@2.zfsinc | zfs receive poolname/restore
cannot receive incremental stream: most recent snapshot of poolname/restore does not
match incremental source
  64kB 0:00:00 [49.2MB/s] [>                                   ]  0%

No, we cannot restore a later incremental directly onto our full of @0. We must use an incremental which starts with @0, which is the only snapshot we actually have present in full. Then we can restore the incremental from @1 to @2 after that.

Restoring incrementals in order

root@banshee:~# pv < /coldstorage/setname@0-@1.zfsinc | zfs receive poolname/restore
 108MB 0:00:00 [ 452MB/s] [==================================>] 100%            
root@banshee:~# pv < /coldstorage/setname@1-@2.zfsinc | zfs receive poolname/restore
 108MB 0:00:00 [ 448MB/s] [==================================>] 100%
root@banshee:~# zfs list -rt all poolname/restore
NAME                                                 USED  AVAIL  REFER  MOUNTPOINT
poolname/restore                                      301M  21.3G   100M  /poolname/restore
poolname/restore@0                                    100M      -   100M  -
poolname/restore@1                                    100M      -   100M  -
poolname/restore@2                                       0      -   100M  -

There we go – first do a full restore of @0 onto a new dataset, then do incremental restores of @0-@1 and @1-@2, in order, and once we’re done we’ve got a restore of setname in the condition it was in when @2 was taken, and containing @0, @1, and @2 as snapshots within it.

Incrementals that skip snapshots

Let’s try one more thing – what if we take, and restore, an incremental directly from @2 to @9?

root@banshee:~# zfs send -i poolname/setname@2 poolname/setname@9 | pv > /coldstorage/setname@2-@9.zfsinc
 108MB 0:00:00 [ 324MB/s] [<=>                                                 ]
root@banshee:~# pv < /coldstorage/setname@2-@9.zfsinc | zfs receive banshee/restore
 108MB 0:00:00 [ 464MB/s] [==================================>] 100%            
root@banshee:~# zfs list -rt all poolname/restore
NAME                USED  AVAIL  REFER  MOUNTPOINT
poolname/restore     402M  21.2G   100M  /poolname/restore
poolname/restore@0   100M      -   100M  -
poolname/restore@1   100M      -   100M  -
poolname/restore@2   100M      -   100M  -
poolname/restore@9      0      -   100M  -

Note that when we restored an incremental of @2-@9, we did restore our dataset to the condition it was in when @9 was taken, but we did not restore the actual snapshots between @2 and @9. If we’d wanted to save those in a single operation, we should’ve used an incremental snapshot stream.

Sending incremental streams to cold storage

To save an incremental stream, we use -I instead of -i when we invoke zfs send. Let’s send an incremental stream from @0 through @9 to cold storage:

root@banshee:~# zfs send -I poolname/setname@0 poolname/setname@9 | pv > /coldstorage/setname@0-@9.zfsincstream
 969MB 0:00:03 [ 294MB/s] [   <=>                                              ]
root@banshee:~# ls -lh /coldstorage
total 21M
-rw-r--r-- 1 root root 108M Sep 15 16:33 setname@0-@1.zfsinc
-rw-r--r-- 1 root root 969M Sep 15 16:48 setname@0-@9.zfsincstream
-rw-r--r-- 1 root root 108M Sep 15 16:26 setname@0.zfsfull
-rw-r--r-- 1 root root 108M Sep 15 16:33 setname@1-@2.zfsinc

Incremental streams can’t be used without a full, either

Again, what can’t we do? We can’t use our incremental stream by itself, either: we still need that full of @0 to do anything with the stream of @0-@9.
Let’s demonstrate that by starting over from scratch.

root@banshee:~# zfs destroy -R poolname/restore
root@banshee:~# pv < /coldstorage/setname@0-@9.zfsincstream | zfs receive poolname/restore cannot receive incremental stream: destination 'poolname/restore' does not exist 256kB 0:00:00 [ 464MB/s] [> ] 0%

Restoring from cold storage using snapshot streams

All right, let’s restore from our full first, and then see what the incremental stream can do:

root@banshee:~# pv < /coldstorage/setname@0.zfsfull | zfs receive poolname/restore
 108MB 0:00:00 [ 418MB/s] [==================================>] 100%            
root@banshee:~# pv < /coldstorage/setname@0-@9.zfsincstream | zfs receive poolname/restore
 969MB 0:00:06 [ 155MB/s] [==================================>] 100%            
root@banshee:~# zfs list -rt all poolname/restore
NAME                USED  AVAIL  REFER  MOUNTPOINT
poolname/restore    1004M  20.6G   100M  /poolname/restore
poolname/restore@0   100M      -   100M  -
poolname/restore@1   100M      -   100M  -
poolname/restore@2   100M      -   100M  -
poolname/restore@3   100M      -   100M  -
poolname/restore@4   100M      -   100M  -
poolname/restore@5   100M      -   100M  -
poolname/restore@6   100M      -   100M  -
poolname/restore@7   100M      -   100M  -
poolname/restore@8   100M      -   100M  -
poolname/restore@9      0      -   100M  -

There you have it – our incremental stream is a single file containing incrementals for all snapshots transitioning from @0 to @9, so after restoring it on top of a full from @0, we have a restore of setname which is in the condition setname was in when @9 was taken, and contains all the actual snapshots from @0 through @9.

Conclusions about cold storage backups

I still don’t particularly recommend doing backups this way due to the ultimate fragility of requiring lots of independent bits and bobs. If you delete or otherwise lose the full of @0, ALL incrementals you take since then are useless – which means eventually you’re going to want to take a new full, maybe of @100, and start over again. The problem is, as soon as you do that, you’re going to take up double the space on cold storage you were using in production, since you have all the data from @0-@99 in incrementals, AND all of the surviving data from those snapshots in @100, along with any new data in @100!

You can delete the full of @0 after the full of @100 finishes, and as soon as you do, all those incrementals of @1 to @99 become completely useless and need to be deleted as well. If you screw up somehow and think you have a good full of @100 when you really don’t, and delete @0… all of your backups become useless, and you’re completely dead in the water. So be careful.

By contrast, if you use syncoid to replicate directly to a second server or pool, you never need more space than you took up in production, you no longer have the nail-biting period when you delete old fulls, and you no longer have to worry about going agonizingly through ten or a hundred separate incremental restores to get back where you were – you can delete snapshots from the back or the middle of a working dataset on another pool without any concern that it’ll break snapshots that come later than the one you’re deleting, and you can restore them all directly in a single operation by replicating in the opposite direction.

But, if you’re still determined to send snapshots to cold storage – local, Glacier, or whatever – at least now you know how.