
AlBlue’s Blog

Macs, Modularity and More

Data loss on Leopard USB drives

2007, crap, mac, rant, zfs

One of the cardinal rules in upgrading operating systems is to always wait for at least the .2 release before putting it into production use. Leopard is no different in this matter.

There is a massive bug in the USB drivers for external drives that will cause data loss. If you are using an external USB drive for storing any important data, disconnect it from any Leopard (10.5.1) systems that you have, and don't use it until you know this bug has been fixed. Note that this also applies to any USB-based hardware, such as a camera.

Frankly, I have no idea whether the upcoming 10.5.2 solves the problem or not; but when it's made public, I'll find out.

I discovered this whilst building a ZFS array on a pair of external USB drives on a 10.5.1 system. One of the great things about ZFS is that it has built-in checksums for data on disk, which can detect when a block has gone bad on the drive. After copying a few GB of data from my existing HFS array to the ZFS array, I ran a customary zpool scrub and found several hundred disk read errors. That's not what you expect to see on a brand new disk, a brand new USB enclosure, and a new Mac mini.

To rule out a fault in some other part of the system, I ran the same test on a second drive and second enclosure (it always makes sense to buy them at least in pairs; that way, you've got spare hardware in case one fails) and got the same results. I suspected (hoped?) that it was a memory problem on that machine, but the TechTool Deluxe memory test found no problems. I tried a second Mac running Leopard to see if I could replicate the problem, and it was visible there too. I also rebuilt the drives as HFS+ to ensure that it wasn't an issue with the (beta) ZFS implementation.

Having narrowed the issue down to the USB enclosure or hard drive, I wondered whether the problem was specific to the enclosure. Drive errors on two new disks are unlikely (although not impossible), so the next test I ran was on a Mac still running 10.4. To my great surprise, no disk errors were reported on 10.4, yet the problem was easily reproducible on 10.5.

This, frankly, came as something of a surprise. Although .0 releases are usually to be avoided on principle, with stability coming along later, losing data is a heinous failure. We've not seen this kind of failure since the Panther FireWire debacle. It seems that I'm not the only one noticing problems, and there's a lot of chatter on the Cooler Master forums regarding lack of support in Leopard.

I'm leaving my file system hosted on a 10.4 server for the moment, although I'd really like to get to ZFS as soon as possible. I might even have to investigate the likelihood of getting a back-port of the 10.5 ZFS to 10.4 (from zfs.macosforge.org), but I suspect that a more recent release of Leopard will bring ZFS stability anyway. I've raised radar bug 5665635 to represent the issue.

The conclusion of this story is: always have data checking (I used md5 to determine whether files were corrupted in a non-ZFS environment), and ensure that you have several backups of your data, especially before moving from 10.4 to 10.5.