AlBlue’s Blog

Macs, Modularity and More

ZFS on Mac - using a ZFS pool

Howto Zfs Mac 2008

In my previous post, I discussed getting ZFS up and running. I'll assume you've followed those instructions and have a pool called dyskWorld available. (If you've rebooted since you originally created it, you might need to zpool destroy dyskWorld and start again, since the /tmp file system will have been cleaned.)

Now that we've got a pool, what can we do? Well, a pool contains one or more file systems. Each file system may be given its own set of properties (quotas etc.), mounted, unmounted and so forth. In fact, a pool is merely a container for file systems; all the interesting stuff happens at the file system level.

At this point, it's worth noting that other uses of the term 'file system' more commonly imply an on-disk structure that can't be nested. That's not really the case in ZFS. A better analogy might be directories, in that directories can be nested and can have different permissions. Furthermore, because ZFS is fairly fluid, one can create and decommission file systems on the fly, quickly and with minimal overhead. So where a traditional Unix install may have a single file system for everything (or, in a more structured install, a separate partition for home, a separate partition for var etc.), ZFS allows you to go to town and create a new file system per user if you want. Think of a ZFS file system as a 'subset of data' and a ZFS pool as where the ZFS file systems are stored, and you're getting the right feel for what it is.

Back to our pool. At the moment, we've only got the root file system which we've been creating data on. We can do much better than that. Let's say we want to create spaces for different places on our dyskWorld pool:


apple[~] zfs create dyskWorld/AnkhMorpork
apple[~] zfs create dyskWorld/Pseudopolis
apple[~] zfs create dyskWorld/Quirm
apple[~] zfs create dyskWorld/StoLat

Note that each of these commands is pretty quick, returning in under a second. Creating a ZFS file system is a cheap operation, and is expected to be done pretty regularly. You'll notice in the Mac's Finder that the dyskWorld mount now looks like a regular disk, and that inside, each of these shows up as a shared disk.
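Because creation is this cheap, it lends itself to scripting. Here's a minimal sketch, assuming the dyskWorld pool from above. The run and create_filesystems helpers and the DRY_RUN variable are names made up for this example, and by default the script only prints the commands it would run (set DRY_RUN=0 to execute them):

```shell
#!/bin/sh
# Hypothetical helper: print a command (the default) or execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

POOL="dyskWorld"

# Create one ZFS file system in $POOL for each name given.
create_filesystems() {
    for name in "$@"; do
        run zfs create "$POOL/$name"
    done
}

create_filesystems AnkhMorpork Pseudopolis Quirm StoLat
```

The same loop would work for a file system per user, feeding it a list of user names instead of place names.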

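Now that we've got these separate file systems, what can we do with them? Well, time to introduce zfs properties. These are values that can be set on a file system as a whole; most of them (compression, for instance) are inherited by child file systems. (A quota isn't inherited as such, but a quota set on a parent caps the parent and all of its children together.) Let's look at the quota property to get a feel for how properties are used: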


apple[~] zfs set quota=10m dyskWorld/AnkhMorpork
apple[~] zfs set quota=5m dyskWorld/Pseudopolis
apple[~] zfs set quota=5m dyskWorld/Quirm
apple[~] zfs set quota=5m dyskWorld/StoLat
apple[~] zfs list
NAME                    USED  AVAIL  REFER  MOUNTPOINT
dyskWorld               279K   123M    71K  /Volumes/dyskWorld
dyskWorld/AnkhMorpork    22K  9.98M    22K  /Volumes/dyskWorld/AnkhMorpork
dyskWorld/Pseudopolis    22K  4.98M    22K  /Volumes/dyskWorld/Pseudopolis
dyskWorld/Quirm          22K  4.98M    22K  /Volumes/dyskWorld/Quirm
dyskWorld/StoLat         22K  4.98M    22K  /Volumes/dyskWorld/StoLat

Even though the dyskWorld pool has a lot of space available, the file systems can be restricted and given quotas on a case-by-case basis. Let's put one to the test:


apple[~] mkfile 10m /Volumes/dyskWorld/Quirm/cheese
mkfile: (/Volumes/dyskWorld/Quirm/cheese removed) Write Error: Disc quota exceeded
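The failure is plain arithmetic: the file asks for 10 MB and the quota allows 5 MB. Here's a rough sketch of that check. The to_bytes and fits_quota helpers are hypothetical, and the comparison ignores the file system overhead that ZFS also counts against a quota, so treat it as an upper bound rather than an exact test:

```shell
#!/bin/sh
# Hypothetical helper: convert sizes such as 10k, 5m or 1g to bytes.
to_bytes() {
    num=${1%[kKmMgG]}          # strip a trailing unit letter, if any
    case $1 in
        *[kK]) echo $(($num * 1024)) ;;
        *[mM]) echo $(($num * 1024 * 1024)) ;;
        *[gG]) echo $(($num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

# Succeeds (exit status 0) when a file of size $1 is no bigger than quota $2.
fits_quota() {
    [ "$(to_bytes "$1")" -le "$(to_bytes "$2")" ]
}

if fits_quota 10m 5m; then
    echo "10m fits in a 5m quota"
else
    echo "10m does not fit in a 5m quota"
fi
```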

So far so good. We've got any number of file systems, and we can quota them all we like. If you've ever managed a Linux system and wondered about putting /var and /usr onto different partitions, then ZFS' answer is to create a separate file system for each and manage them individually. But the quota isn't the only thing we can control. We can also control whether compression is enabled for the file system, which means we can do the impossible:


apple[~] zfs set compression=on dyskWorld/Quirm
apple[~] mkfile 10m /Volumes/dyskWorld/Quirm/cheese
apple[~] ls -lh /Volumes/dyskWorld/Quirm/
total 1
-rw-------  1 me  me    10M  2 Apr 01:43 cheese
apple[Volumes] zfs list
NAME                    USED  AVAIL  REFER  MOUNTPOINT
dyskWorld               310K   123M    71K  /Volumes/dyskWorld
dyskWorld/AnkhMorpork    22K  9.98M    22K  /Volumes/dyskWorld/AnkhMorpork
dyskWorld/Pseudopolis    22K  4.98M    22K  /Volumes/dyskWorld/Pseudopolis
dyskWorld/Quirm          22K  4.98M    22K  /Volumes/dyskWorld/Quirm
dyskWorld/StoLat         22K  4.98M    22K  /Volumes/dyskWorld/StoLat

Yes, we now have a 10M file in a file system that has a maximum quota of 5M and, in addition, it isn't taking up any space. Of course, that's not really what's happening: when you use 'mkfile', it creates a file full of zeros, which is pretty easy to compress (try 'mkfile 10m /tmp/foo; gzip -9 /tmp/foo; ls -lh /tmp/foo.gz').
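If you don't have mkfile to hand, the same experiment can be run as a pipeline. This is only a sketch: head -c on /dev/zero stands in for mkfile, and gzip stands in for whatever ZFS does internally, so the numbers illustrate the idea rather than ZFS' actual on-disk size:

```shell
#!/bin/sh
# Compress 10 MB of zeros in a pipeline and report both sizes.
# No temporary file is needed; nothing is written to disk.
original=$((10 * 1024 * 1024))

# tr strips the padding that some versions of wc add around the count.
compressed=$(head -c "$original" /dev/zero | gzip -9 | wc -c | tr -d ' ')

echo "original:   $original bytes"
echo "compressed: $compressed bytes"
```

The compressed figure should come out as a tiny fraction of the original, which is why the 10M file fits so comfortably under a 5M quota.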

The point is that compression can be enabled for the file system as a whole. Unlike Windows' implementation (right-click a folder and select “Compress contents of this folder”), setting the compression=on property doesn't actually compress everything that was there beforehand. It simply applies to newly written files. In fact, that's generally true of the ZFS properties on the whole; they don't change what's on disk, but affect subsequent operations (just like reducing the quota isn't going to get rid of any files that are there already).

Now, there's a bunch of stuff that falls into the 'compressible' category that you might have on disk. There's also a bunch of less compressible content. Photos and music are generally not well suited to compression (they're usually stored in compressed formats already), but text-based documentation (including web pages) generally is. In fact, on a Mac, /Documentation and /Developer/Documentation are pretty easily compressible, and are good candidates for hosting on a compressed ZFS file system.
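To get a feel for the difference, here's a quick sketch comparing the two extremes with gzip. Random bytes stand in for already-compressed photos or music, and a repeated sentence stands in for text. Both are assumptions made for the sake of a self-contained example, and real documentation will compress less dramatically than a single repeated line:

```shell
#!/bin/sh
# Compare how well gzip shrinks 1 MB of repetitive text versus
# 1 MB of random bytes (a stand-in for photos or music).
size=$((1024 * 1024))

text_gz=$(yes "The Disc rests on four elephants and a turtle." \
    | head -c "$size" | gzip -9 | wc -c | tr -d ' ')
random_gz=$(head -c "$size" /dev/urandom | gzip -9 | wc -c | tr -d ' ')

echo "text:   $size -> $text_gz bytes"
echo "random: $size -> $random_gz bytes"
```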

The compression algorithm is lzjb by default, but you can use gzip instead if you'd prefer. For the smallest possible size, running zfs set compression=gzip-9 on a file system will give you the biggest compression benefit for the data, at the (potential) expense of the time it takes to compress it.
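Here's a small sketch of switching algorithm. The run and set_compression helpers are invented for this example, and the list of accepted values covers only the ones mentioned in this post (plus off and the numbered gzip levels); other ZFS releases may accept more. As written it just prints the commands; set DRY_RUN=0 to run them:

```shell
#!/bin/sh
# Hypothetical helper: print a command (the default) or execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Validate a compression value before applying it to a file system.
set_compression() {
    dataset=$1
    value=$2
    case $value in
        on|off|lzjb|gzip|gzip-[1-9])
            run zfs set "compression=$value" "$dataset"
            ;;
        *)
            echo "unsupported compression value: $value" >&2
            return 1
            ;;
    esac
}

set_compression dyskWorld/Quirm gzip-9
# Read back the setting and the ratio achieved on data written so far.
run zfs get compression,compressratio dyskWorld/Quirm
```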

Now for something completely different. Let's say we've put together our dyskWorld structure (including the cheese) and we want to take a backup. No problem; in ZFS terms, these are called snapshots:


apple[~] zfs snapshot dyskWorld/Quirm@initial
apple[~] zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
dyskWorld                 314K   123M    71K  /Volumes/dyskWorld
dyskWorld/AnkhMorpork      22K  9.98M    22K  /Volumes/dyskWorld/AnkhMorpork
dyskWorld/Pseudopolis      22K  4.98M    22K  /Volumes/dyskWorld/Pseudopolis
dyskWorld/Quirm          23.5K  4.98M  23.5K  /Volumes/dyskWorld/Quirm
dyskWorld/Quirm@initial      0      -  23.5K  -
dyskWorld/StoLat           22K  4.98M    22K  /Volumes/dyskWorld/StoLat

We've got our named snapshot (“initial” is the name under the dyskWorld/Quirm file system), and it's currently taking up zero space. The reason it's taking up zero space is that it's currently sharing the same data as the file system; there haven't been any changes. We can simulate some data changes to do different things:
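Since snapshots start out free, taking them regularly is cheap. One possible convention (an assumption of this sketch, not something ZFS requires) is to name them after the date. The run and snapshot_name helpers are hypothetical, and the script prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Hypothetical helper: print a command (the default) or execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build "filesystem@label"; the label defaults to today's date (YYYY-MM-DD).
snapshot_name() {
    echo "$1@${2:-$(date +%Y-%m-%d)}"
}

run zfs snapshot "$(snapshot_name dyskWorld/Quirm initial)"
run zfs snapshot "$(snapshot_name dyskWorld/Quirm)"
```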


apple[Quirm] rm cheese
apple[Quirm] mkfile 10k blue
apple[Quirm] zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
dyskWorld                 346K   123M    71K  /Volumes/dyskWorld
dyskWorld/AnkhMorpork      22K  9.98M    22K  /Volumes/dyskWorld/AnkhMorpork
dyskWorld/Pseudopolis      22K  4.98M    22K  /Volumes/dyskWorld/Pseudopolis
dyskWorld/Quirm          41.5K  4.96M  23.5K  /Volumes/dyskWorld/Quirm
dyskWorld/Quirm@initial    18K      -  23.5K  -
dyskWorld/StoLat           22K  4.98M    22K  /Volumes/dyskWorld/StoLat

Our initial snapshot now takes up more space than it did before, even though nothing in the snapshot itself has changed, because the file we removed (cheese) has gone from being owned by the dyskWorld/Quirm parent to being owned by dyskWorld/Quirm@initial instead. It also means that the data is still there should you want to see it; you should be able to do cd .zfs/snapshot/initial/ from within /Volumes/dyskWorld/Quirm and browse the contents as they were at that time. At the present time, though, that doesn't work on the Mac; it's one of the known issues that's being worked on.
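On ZFS implementations where the hidden snapshot directory is available, a snapshot's contents live under .zfs/snapshot/ followed by the snapshot's name, inside the file system's mount point. Here's a tiny sketch that works out the path; snapshot_dir is a hypothetical helper, and the mount point is the one from the listings above:

```shell
#!/bin/sh
# Hypothetical helper: given a mount point and a snapshot written as
# filesystem@name, print the directory where the snapshot can be browsed.
snapshot_dir() {
    mountpoint=$1
    snapshot=$2                     # e.g. dyskWorld/Quirm@initial
    # ${snapshot#*@} keeps only the part after the @ sign.
    echo "$mountpoint/.zfs/snapshot/${snapshot#*@}"
}

snapshot_dir /Volumes/dyskWorld/Quirm dyskWorld/Quirm@initial
```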

What can we do with a snapshot? Well, in the meantime we can mount it as a clone, which gives us a read-write copy of the snapshot (without changing the snapshot's contents).


apple[~] zfs clone dyskWorld/Quirm@initial dyskWorld/QuirmInitial
apple[~] ls -lh /Volumes/dyskWorld/Quirm
total 1
-rw-------  1 me  me    10K  2 Apr 02:12 blue
apple[~] ls -lh /Volumes/dyskWorld/QuirmInitial/
total 1
-rw-------  1 me  me    10M  2 Apr 01:43 cheese

When we've finished recovering (or diffing) whatever files we wanted, we can get rid of the newly cloned data:


apple[~] zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
dyskWorld                 354K   123M    71K  /Volumes/dyskWorld
dyskWorld/AnkhMorpork      22K  9.98M    22K  /Volumes/dyskWorld/AnkhMorpork
dyskWorld/Pseudopolis      22K  4.98M    22K  /Volumes/dyskWorld/Pseudopolis
dyskWorld/Quirm          41.5K  4.96M  23.5K  /Volumes/dyskWorld/Quirm
dyskWorld/Quirm@initial    18K      -  23.5K  -
dyskWorld/QuirmInitial       0   123M  23.5K  /Volumes/dyskWorld/QuirmInitial
dyskWorld/StoLat           22K  4.98M    22K  /Volumes/dyskWorld/StoLat
apple[~] zfs destroy dyskWorld/QuirmInitial
apple[~] zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
dyskWorld                 348K   123M    71K  /Volumes/dyskWorld
dyskWorld/AnkhMorpork      22K  9.98M    22K  /Volumes/dyskWorld/AnkhMorpork
dyskWorld/Pseudopolis      22K  4.98M    22K  /Volumes/dyskWorld/Pseudopolis
dyskWorld/Quirm          41.5K  4.96M  23.5K  /Volumes/dyskWorld/Quirm
dyskWorld/Quirm@initial    18K      -  23.5K  -
dyskWorld/StoLat           22K  4.98M    22K  /Volumes/dyskWorld/StoLat

Note that the snapshot is still there, ready to go back to if we need it. Also, when we created the clone, it had zero usage because it was sharing all of its data with the Quirm@initial snapshot. Much like the way that Time Machine doesn't back up unchanged files, ZFS doesn't needlessly copy data; it manages things so that when data is removed from one place, ownership of it passes to whatever else still refers to it.
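Put together, recovering a single file via a clone might look like the sketch below. The run and recover_file helpers are made up for this example, and the paths assume the default /Volumes mount points shown in the listings above. It only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Hypothetical helper: print a command (the default) or execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Clone a snapshot, copy one file back into the live file system,
# then destroy the clone again.
recover_file() {
    snapshot=$1                 # e.g. dyskWorld/Quirm@initial
    clone=$2                    # e.g. dyskWorld/QuirmInitial
    file=$3                     # e.g. cheese
    dataset=${snapshot%@*}      # the part before the @ sign
    run zfs clone "$snapshot" "$clone"
    run cp -p "/Volumes/$clone/$file" "/Volumes/$dataset/$file"
    run zfs destroy "$clone"
}

recover_file dyskWorld/Quirm@initial dyskWorld/QuirmInitial cheese
```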

Lastly, we can roll back to a snapshot, which discards any changes made to the file system since the snapshot was taken:


apple[~] zfs rollback dyskWorld/Quirm@initial
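Since a rollback throws away whatever has changed since the snapshot, it may be worth asking before doing it. Here's a minimal sketch of such a prompt; run and rollback_with_confirm are hypothetical helpers, and the command is only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Hypothetical helper: print a command (the default) or execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Ask for confirmation on standard input before rolling back.
rollback_with_confirm() {
    snapshot=$1
    printf 'Roll back to %s and discard later changes? [y/N] ' "$snapshot" >&2
    read -r answer
    case $answer in
        y|Y|yes)
            run zfs rollback "$snapshot"
            ;;
        *)
            echo "rollback cancelled" >&2
            return 1
            ;;
    esac
}
```

Calling rollback_with_confirm dyskWorld/Quirm@initial and answering y goes ahead; any other answer leaves the file system untouched.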

That's about it for this instalment. Next time, I'll look at the RAID characteristics of ZFS and how you might use a ZFS system yourself.