So, now we've got ZFS installed on Mac, what can we do with it? Well, not a lot initially:
apple[~] zpool
You must be root in order to load the ZFS kext
internal error: failed to initialize ZFS library
At the moment, you have to be root to load the ZFS kernel extension. It's not clear whether this is a temporary requirement or whether the extension will eventually be loaded automatically at boot time.
In order to create a ZFS filesystem, we have two choices:
- Use an existing disk (or partition on the disk)
- Use a file to act as a pseudo-partition for experimentation
It's possible to mount ZFS on USB devices, but owing to issues with removing them, it might not be what everyone wants to do. The advantage of the file-based approach is that you get to play around with ZFS without having to commit to any major repartitioning of disks.
apple[~] sudo zpool
missing command
usage: zpool command args ...
where 'command' is one of the following:
...
apple[~] mkfile 64m /tmp/dyskA
apple[~] ls -l /tmp/dyskA
-rw------- 1 me me 67108864 20 Mar 00:25 /tmp/dyskA
apple[~] zpool create dyskWorld /tmp/dyskA
apple[~] cd /Volumes/dyskWorld
apple[dyskWorld] mount
dyskWorld on /Volumes/dyskWorld (zfs, local, nodev, nosuid, mounted by me)
What was all that? Well, the mkfile command says 'create an empty file with a size of 64 megabytes', which we can use as a (virtual) disk, and the zpool create command creates a ZFS pool called dyskWorld based on the (virtual) disk.
Note that ZFS automatically mounts the drive (in the default location, /Volumes/dyskWorld) and makes it available for our use. We can then do all the things we'd normally expect to do with a filesystem: copy files and the like.
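If you're following along on a system that lacks mkfile, a backing file of the same size can be made with dd. This is only a minimal sketch and was not part of the session above; the make_backing_file helper and the temporary file name are hypothetical, chosen purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: create a zero-filled file of N mebibytes,
# roughly what 'mkfile 64m' did in the session above.
make_backing_file() {
    path=$1
    mebibytes=$2
    # 1048576 bytes = 1 MiB, so N blocks of that size give N MiB.
    dd if=/dev/zero of="$path" bs=1048576 count="$mebibytes" 2>/dev/null
}

# Use a scratch file rather than /tmp/dyskA so nothing real is overwritten.
backing=$(mktemp "${TMPDIR:-/tmp}/dysk.XXXXXX")
make_backing_file "$backing" 64

# wc -c reports the byte count; tr strips the padding some systems add.
size=$(wc -c < "$backing" | tr -d ' ')
echo "$size"

rm -f "$backing"
```

The byte count printed should match the 67108864 shown by ls -l above, since 64 x 1024 x 1024 = 67108864.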
So, why do we bother doing this? Well, it turns out that ZFS pools are neat. Firstly, the formatting was quick; if you've ever had to format a disk on Linux and watch it write out inode tables and superblocks, you'll know it can take a while to go through and set up the disk ready for use. A ZFS disk, on the other hand, is ready to go with a few minor structures in place, so the setup time is much less affected by the size of the disk. As disks (and people's media collections) grow, that quick setup becomes a key advantage.
Secondly, ZFS lets us throw more storage at a pool. If we try to create a 100m file on our newly initialized 64m pool, it's not going to work (clearly). Still, we can give it a go:
apple[dyskWorld] mkfile 100m vimes
mkfile: (vimes removed) Write Error: No space left on device
Note that it'll take a while to do this - performance of an FS on top of an FS is never likely to be that good, so don't read much into the timings at this stage.
OK, we've got a 100m file and nowhere to put it. Hang on, have we got another disk anywhere? We can use that disk and add it to the pool.
apple[dyskWorld] mkfile 250m /tmp/dyskB
apple[dyskWorld] zpool add dyskWorld /tmp/dyskB
apple[dyskWorld] mkfile 100m vimes
We found a spare disk somewhere (dyskB) and lobbed it at the pool with zpool add dyskWorld, and now suddenly we've got more space; enough to create our 100m file. Neat things to observe:
- If you run out of space, whack in another disk
- Adding the disk, like initial formatting, was almost instantaneous
- The filesystem was live throughout; in fact, we were in the dyskWorld directory all along
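The arithmetic behind the failed and then successful mkfile can be sketched as a toy model. This is an assumption-laden illustration, not how ZFS accounts for space: it treats the pool as the plain sum of its devices and ignores metadata overhead, and the add_device and fits helpers are hypothetical names.

```shell
#!/bin/sh
# Toy model only: the pool is the plain sum of its devices.
# Real usable space is lower because ZFS keeps its own metadata.
pool_mb=0

# Hypothetical helper: grow the pool by a device of the given size (in MB).
add_device() { pool_mb=$((pool_mb + $1)); }

# Hypothetical helper: succeed if a file of the given size (in MB) would fit.
fits() { [ "$1" -le "$pool_mb" ]; }

add_device 64                        # /tmp/dyskA
fits 100 && before=yes || before=no  # the first 'mkfile 100m vimes'

add_device 250                       # /tmp/dyskB
fits 100 && after=yes || after=no    # the second attempt

echo "before=$before after=$after total=${pool_mb}m"
```

In this model the 100m file does not fit in the 64m pool, but does once the 250m device brings the total to 314m, which mirrors what the session showed.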
The key thing is we could do this all on the fly without having to unmount the filesystem whilst we did it. There are no /etc/raidtab files to edit; it just happens. I think it's summed up in this quote from The Future of Filesystems:
When you have a server and you want to upgrade its memory, the process is pretty straightforward. You power down the server, plug in some DIMMs, power it back on, and you're done. You don't run dimmconfig, you don't edit /etc/dimmtab, and you don't create virtual DIMMs that you mount on applications. The memory is simply a pooled resource that's managed by the operating system on behalf of the application.
With ZFS, we asked this question: why can't your on-disk storage be the same way? That's exactly what we do in ZFS. We have a pooled storage model. The disks are like DIMMs, and the file systems are like applications. You add devices into the storage pool, and now the file system is no longer tied to the concept of a physical disk. It grabs data from the pool as it needs to store your files, and as you remove or delete your files, it releases that storage back to the pool for other file systems to use.
The most important property of the filesystem is the fact that it can detect (otherwise silent) corruption. We can, at any time, issue a zpool status command:
apple[dyskWorld] zpool status
  pool: dyskWorld
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        dyskWorld     ONLINE       0     0     0
          /tmp/dyskA  ONLINE       0     0     0
          /tmp/dyskB  ONLINE       0     0     0

errors: No known data errors
The key part is No known data errors. Our disks are healthy, and they're working fine. If we're not sure, we can run a scrub, which is akin to an fsck or a chkdsk, except that the filesystem remains on-line rather than having to boot into single-user mode:
apple[dyskWorld] zpool scrub dyskWorld
apple[dyskWorld] zpool status
  pool: dyskWorld
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: scrub completed with 0 errors on Thu Mar 20 01:03:16 2008
config:

        NAME          STATE     READ WRITE CKSUM
        dyskWorld     ONLINE       0     0     0
          /tmp/dyskA  ONLINE       0     0     0
          /tmp/dyskB  ONLINE       0     0     0

errors: No known data errors
If you catch it during the scrub (on larger disks, for example), you'll see a scrub message there telling you it's in progress. In this case, it was too quick for me.
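The idea behind a scrub (re-read the data and compare it against a checksum recorded earlier) can be sketched with ordinary tools. This illustrates checksum-based detection in general and says nothing about how ZFS implements it internally; the scratch file and its contents are made up for the example.

```shell
#!/bin/sh
# Illustration of checksum-based detection in general, not of ZFS internals.
data=$(mktemp "${TMPDIR:-/tmp}/scrub.XXXXXX")
printf 'The turtle moves\n' > "$data"

# Record a checksum while the data is known to be good.
good=$(cksum < "$data")

# Simulate silent corruption: overwrite one byte in place, keeping the length.
printf 'X' | dd of="$data" bs=1 seek=4 conv=notrunc 2>/dev/null

# The "scrub": re-read the data and compare against the stored checksum.
now=$(cksum < "$data")
if [ "$good" = "$now" ]; then
    result="no errors"
else
    result="checksum mismatch"
fi
echo "$result"

rm -f "$data"
```

The file's size and timestamp-free contents look plausible after the overwrite, which is why the damage counts as silent; only the checksum comparison notices it.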
There's also another message saying that the pool is using an older on-disk format. Mac OS X 10.5 has kernel modules to read ZFS on-disk format 6, and as a result, the Mac OS X drivers are configured to use that by default. However, if we want to upgrade our pool (to take advantage of newer features) then we can do so:
apple[dyskWorld] zpool upgrade dyskWorld
This system is currently running ZFS pool version 8.
Successfully upgraded 'dyskWorld' from version 6 to version 8
apple[dyskWorld] zpool status
  pool: dyskWorld
 state: ONLINE
 scrub: scrub completed with 0 errors on Thu Mar 20 01:03:16 2008
config:

        NAME          STATE     READ WRITE CKSUM
        dyskWorld     ONLINE       0     0     0
          /tmp/dyskA  ONLINE       0     0     0
          /tmp/dyskB  ONLINE       0     0     0

errors: No known data errors
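If you wanted to check the pool from a script rather than by eye, the line to watch is the one beginning errors:. Here is a rough sketch that works on captured status text instead of a live pool; the pool_is_clean helper is a hypothetical name, and the sample text is abbreviated from the output above.

```shell
#!/bin/sh
# Hypothetical helper: read 'zpool status' text on stdin and succeed
# only if it reports no known data errors.
pool_is_clean() {
    grep -q '^errors: No known data errors$'
}

# Sample text abbreviated from the session above, standing in for a live pool.
sample='  pool: dyskWorld
 state: ONLINE
errors: No known data errors'

if printf '%s\n' "$sample" | pool_is_clean; then
    verdict=clean
else
    verdict=check-pool
fi
echo "$verdict"
```

On a machine with the pool present, the same helper could be fed directly, as in zpool status dyskWorld | pool_is_clean.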
That's about it for this instalment; here are the key takeaways:
- From the time that the filesystem was originally created on dyskWorld to the end of this post, the filesystem has been on-line and usable by others throughout. Even though we've upgraded the on-disk format, checked the disk for errors and even added extra storage, the filesystem has been up and serving requests.
- If you run out of space on a ZFS pool, throw more disks at the problem. The ZFS pool will grow automatically to take advantage of the storage. (Astute readers will be asking questions to do with RAID levels; however, the examples here show a RAID-0-like, disk-spanning use of ZFS. I'll cover more about redundancy later.)
- Formatting a disk is quick; scrubbing (checksumming) is proportional to the size of the data, not the size of the disk.
- ZFS pools are automatically mounted upon creation in the normal place. (They can be mounted anywhere, but I didn't show that here.)
- There are lots more good things about ZFS, which I'll tell you about later ...