AlBlue’s Blog

Macs, Modularity and More

Merged ZFS from OpenSolaris to OSX

2010, mac, zfs

Wow, that took a long time. But I've finally managed to get the OpenSolaris codebase (shadowed at http://github.com/alblue/onnv-gate-zfs/tree/onnv_72, in case you're interested) merged in with the original Apple code changes of zfs119. The merged code is available at http://github.com/alblue/mac-zfs/commit/6d20fbb74f11a6765ca41d0144bd31609c15c5a9 if you want to see the changes in all their glory. Here's the announcement on the mailing list.

Caution: The merge is not ready for production use. There's a critical bug affecting unmounting of ZFS shares, which should deter anyone from using this in a real environment.

What does this mean for the ZFS project on Mac OS X? Well, for one thing, it's still being worked on, even if progress isn't that great. Furthermore, whilst it builds on 10.5 and 10.6, there's functionality that is still missing (for a list of open issues, see the Google Project Issues page). But having got to a stage where it's as merged as it can get with the OpenSolaris codebase, we should be able to roll forward with the changes over the coming weeks and months.

It's not all plain sailing, though; there are a significant number of changes in the Mac OS X codebase (for example, the way you free a node or determine if something is a directory, as well as ACL type permission checks) which will need to be re-applied to each incoming change from the upstream OpenSolaris codebase. And, there's a lot of Mac-specific code in there which isn't referenced (of course) and may need further changes.

It's also worth noting that this set of changes still needs to have the tyres kicked; whilst it's simple enough to build and install, it needs to undergo a lot more testing in order to determine whether it's safe for general use (and I'd prefer to hold off generating the installers until that time). If you're willing to give it a go, though, you can download the project, compile and build within Xcode, and then test it out for yourself.

Things I've learnt

Merging is never easy at the best of times, but really, if you aren't working with a DVCS then you're shooting yourself in the feet before you even start. Even if you're forced to work in a CVCS by day, if you need to do a massive merge like this, consider importing and building a DVCS for the purposes of the merge, then exporting back again afterwards. You can even keep it around; I know of people who use Git on their desktops to get stuff in and out, and then synchronise with Git's svn module or export to the filing system for CVS's benefit.

Secondly, whether the repository is Git or Mercurial doesn't really matter, but it's a pain to work with both. The OpenSolaris codebase is hosted with Mercurial (at http://hub.opensolaris.org/bin/view/Project+onnv/) but you can't merge directly between a Mercurial repository and a Git one. Instead, I cloned it locally, filtered it (using hg convert with a filemap, at least, when the filemap doesn't have errors in it – oops) into a second Hg repository, and then pushed it via http://hg-git.github.com/, which makes GitHub look like a Mercurial DVCS. Heck, if you prefer the Hg clients, you can still use GitHub. So, we now have http://github.com/alblue/onnv-gate-zfs/tree/onnv_72, which I plan to sync up with other versions in the future.
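For anyone wanting to repeat the filtering step, here's a rough sketch. The filemap entries, directory names and GitHub URL are illustrative assumptions rather than the ones actually used for onnv-gate-zfs, and the hg commands are left as comments because they need Mercurial with the convert and hg-git extensions enabled.

```shell
#!/bin/sh
# Write a filemap for `hg convert` that keeps only the ZFS-related trees.
# These paths are illustrative guesses, not the real filter list.
write_filemap() {
    cat > "$1" <<'EOF'
include usr/src/uts/common/fs/zfs
include usr/src/lib/libzfs
include usr/src/cmd/zfs
EOF
}

FILEMAP="${TMPDIR:-/tmp}/zfs-filemap.txt"
write_filemap "$FILEMAP"

# With the convert and hg-git extensions enabled in ~/.hgrc, the remaining
# steps would look roughly like this (URLs and names are placeholders):
#   hg clone <onnv-gate URL> onnv-gate
#   hg convert --filemap "$FILEMAP" onnv-gate onnv-gate-zfs
#   cd onnv-gate-zfs && hg push git+ssh://git@github.com/<user>/<repo>.git
```

Anything not matched by an include line is dropped from the converted repository, which is why a typo in the filemap is so easy to miss until much later.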

The key with a DVCS is the ease of merging. 708fa1 was the commit tree for the older ZFS-119 build from Apple, whilst fe4492 corresponds with the onnv-gate-zfs tree at onnv_72. And, with a hop, step and git merge, the two were brought together.

...except merging is never quite that easy. Yes, there's automatic symmetry if the files are in the same locations (which, thanks to a fair amount of earlier effort, they already were); but even so, the files were different and had evolved.
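To get a feel for what that looks like in miniature, here's a scratch-repository sketch. The branch names, file name and contents are all made up for illustration; it just shows two sides editing the same line and how to list the files Git couldn't merge on its own.

```shell
#!/bin/sh
# Scratch demo: two branches change the same line, then get merged.
REPO=$(mktemp -d)
git init -q "$REPO"
git -C "$REPO" config user.email "demo@example.invalid"
git -C "$REPO" config user.name "Demo"

# Common ancestor.
printf 'int zfs_demo(void) { return 0; }\n' > "$REPO/demo.c"
git -C "$REPO" add demo.c
git -C "$REPO" commit -q -m "common ancestor"
git -C "$REPO" branch upstream

# Apple-side change on its own branch.
git -C "$REPO" checkout -q -b apple
printf 'int zfs_demo(void) { return 1; } /* apple */\n' > "$REPO/demo.c"
git -C "$REPO" commit -q -am "apple-side change"

# Upstream-side change to the same line.
git -C "$REPO" checkout -q upstream
printf 'int zfs_demo(void) { return 2; } /* upstream */\n' > "$REPO/demo.c"
git -C "$REPO" commit -q -am "upstream-side change"

# The merge stops with a conflict, so git exits non-zero here.
git -C "$REPO" merge apple >/dev/null 2>&1 || true

# List the files still waiting for manual resolution.
CONFLICTS=$(git -C "$REPO" diff --name-only --diff-filter=U)
echo "Conflicted files: $CONFLICTS"
```

On a real tree the list from the last command is the to-do list for the merge; each file in it contains conflict markers until it is fixed up and added again.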

It seems that earlier irrelevant code was simply deleted from the Apple codebase, which meant the merges were much more difficult than they needed to be. Fortunately, as time has gone on, the coding style appears to have been to #ifdef sections of code applicable to Apple or not, with the result that the changes are much easier to process.

Tools

I missed a good Git GUI for showing merge changes. I spent a lot of time with multiple terminal windows open and results of diff as I was going through. I guess we'll never see it in Xcode, but even other tools (like Eclipse EGit) aren't quite ready for showing diffs for merge conflict resolution at this stage. I also really wish I'd found diff --ifdef earlier; that would have saved me a lot of time in some of the simpler merge cases; though the right answer would probably be to accept the OpenSolaris implementation at times. However, all the automated tools in the world don't help when Apple's functions – particularly those in zfs_vnops.c and zfs_vfsops.c – have been completely renamed and their signatures changed.
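As a small illustration of what diff --ifdef does, here's a sketch using two made-up versions of a directory check. The function and file names are hypothetical, and it assumes GNU diff, since other diff implementations may not support the option.

```shell
#!/bin/sh
# Two hypothetical versions of the same fragment: upstream and Apple-modified.
WORK=$(mktemp -d)
printf 'int is_dir(vnode_t *vp)\n{\n\treturn (vp->v_type == VDIR);\n}\n' > "$WORK/upstream.c"
printf 'int is_dir(vnode_t *vp)\n{\n\treturn (vnode_isdir(vp));\n}\n' > "$WORK/apple.c"

# diff exits with status 1 when the inputs differ, so that isn't a failure here.
MERGED=$(diff --ifdef=__APPLE__ "$WORK/upstream.c" "$WORK/apple.c" || true)
printf '%s\n' "$MERGED"
```

The output is a single merged file in which the lines that differ are wrapped in preprocessor conditionals on __APPLE__, with the first file's version on the "not defined" side and the second file's version on the "defined" side. That's only useful when the two versions differ in small, local ways, which is why it wouldn't have helped with the renamed functions.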

In the end, I caught the last few (panic-inducing) errors using a little utility I created to do diffs. It's basically along the lines of:

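# Hide #include lines from the preprocessor and drop #pragma noise,
# then let gcc's preprocessor resolve the #ifdefs.
sed -e 's/#include/@include/' \
    -e 's/#pragma.*//' "${INPUT}" |
  gcc -D_KERNEL -P -E - > "${OUTPUT}"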

What this does is run through a file, change the #includes to @includes (so that they don't get picked up by the pre-processor) and then pass the result through gcc's preprocessor to resolve the set of #ifdefs. That way, I can test the after-effects of my #ifdefs (rather than just git diff) to see what I had done wrong. (The other bit strips out #pragmas, since they were causing unnecessary noise in the process.) With a quick find . -type f -exec diffone on my tree, combined with the outputs of the original codebase, I could see the exact program diffs, rather than the source code diffs.

It turns out I had made 4 panic-inducing errors in the merge, which is a symptom of doing some of these merges late into the night – or in some cases, early into the morning:

  • Refactoring char *tmp into user32_addr_t *tmp instead of user32_addr_t tmp (zfs_vfsops.c:1084)
  • Putting a *vpp = ZTOV(zp); just after, instead of just before, an #ifdef, with the result it was effectively removed (zfs_vnops.c:1716)
  • Missing off a return(error) somewhere
  • Missing a _ off one #ifdef __APPLE__ somewhere
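The last of those is the kind of slip a quick grep can catch. Here's a coarse sketch; the function name is made up, and it only looks at preprocessor conditionals that mention APPLE without the exact spelling __APPLE__, so it will miss a line that has both a good and a bad spelling and will also flag legitimate macros that merely contain APPLE.

```shell
#!/bin/sh
# List #if/#ifdef/#ifndef/#elif lines that mention something that looks like
# the APPLE macro but do not contain the exact token __APPLE__.
# Prints matches as path:line:text; exits non-zero when nothing is found.
find_bad_apple_guards() {
    grep -rnHE '^[[:space:]]*#[[:space:]]*(if|ifdef|ifndef|elif)[^A-Za-z0-9_].*APPLE' "$@" |
        grep -vE '(^|[^A-Za-z0-9_])__APPLE__([^A-Za-z0-9_]|$)'
}
```

Running find_bad_apple_guards over the source directory before building would have pointed straight at the offending #ifdef.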

The fact that I managed to get by with just these few merge errors is a combination of the power of a DVCS, as well as the hard work and dedication by those on the original ZFS Apple team (thanks again for everything, Noël et al). It was also made possible by the serious refactoring ahead of time to get the folders to match up – without which, the merge wouldn't have been able to happen.

Lastly, I'd just like to thank the others involved in the Mac ZFS project – both those that are helping with the code and those supplying the enthusiasm behind a filing system which Apple has clearly left behind. We've had a bit of a lull getting to this point, but it's all downhill from here.