Wednesday, June 4, 2008

Sharing a ZFS pool between Linux and Solaris

If you are multi-booting between Linux and Solaris (and others like FreeBSD, OpenBSD and Mac OS X, I expect) you will sooner or later encounter the problem of how to share disk space between the operating systems. FAT32 is not satisfactory due to its lack of POSIX features, in particular file ownership and access modes, not to mention its sub par performance. ext2/3 is not an option because you only get read-only support for it in Solaris, and similarly UFS enjoys only read-only support in Linux. The whole situation is rather depressing.


Enter ZFS.


This all started because I discovered that I can have a ZFS root file system without having to install OpenSolaris. The trick as some of you may know, is to select "Solaris Express" from the first menu on booting the install disk, and then select one of the two "Interactive Text" options from the next menu. This puts you back into 1984 in terms of installers, but you get the option of using ZFS for root!


Note: It might be possible to do this with the default installer, but on my computer the installer just would not run (I got some daft error about fonts and mouse themes). With a ZFS root, the Swap and Dump automatically goes onto dedicated vdevs, and you save a lot in terms of pre-allocated space.


I have of course used ZFS on my laptop previously as a test, but the benefits were limited by the fact that I still had "slices" for the OS and a small ZFS pool on a spare slice.


I'm not sure which build of Nevada first introduced the ZFS root option in the installer, but it is available in build 90 at least.


My choice of Linux distribution is Ubuntu 8.04. The steps to setting up a ZFS pool shared across operating systems are as follow:


1. Select a Partitioning scheme with minimal space allocated to each of Ubuntu and Nevada.
I decided to put Ubuntu in an Extended partition with a 10 GB Logical Partition for the OS, /var and /home, and a 1 GB Logical partition for Swap.
For Solaris I allocated a 24 GB primary partition to become the ZFS root pool, which includes Swap, Dump, OS and Live-upgrade space.
The balance of the 100 GB disk will be shared between Ubuntu and Solaris using ZFS.


Note: Linux and Solaris has got some different views on how disk partitioning works.
Due to historical reasons, in particular due to compatibility with Solaris on SPARC hardware, Solaris slices live in a single primary partition with an identifier of 0x82 (SOLARIS) or 0xbf (SOLARIS2) somewhat like how logical fdisk partitions live inside an "extended partition".


2. Install Ubuntu first, creating only the partitions for it. Remember to not have any external drives connected as it can screw up the order in which drives are detected and as a result bugger up the Grub menu list.


During the installation you create an Admin user. This will eventually in the future become a "backup" admin user.


3. Reboot and load patches/updates, and backup the Grub /boot/grub/menu.lst file to an external media such as a USB thumb drive for easy access. The Ubuntu Grub does not understand ZFS, so you need to use Nevada's Grub to manage the multi-booting.


4. Also set Ubuntu to use the hardware clock as local time in stead of UTC. (This is what Solaris uses) To do this change UTC=yes to UTC=no in /etc/default/rcS, then reboot.


5. Install Nevada. Use either of the Interactive Text installer options, but for simplicity's sake specify the system as non-networked.


6. Reboot and create a user for every-day use, and add this user to the "Primary Administrator" role using usermod -P "Primary Administrator" <userid>


7. Add the Ubuntu Grub entries you saved in step 3 to the end of the Nevada grub menu.lst file. This will be stored in /boot/grub/menu.lst (The default pool name is rpool)


8. Reboot back into Ubuntu, then follow the Linux ZFS-FUSE instalation instructions to get ZFS-FUSE installed. I used the trunk to get the latest ZFS updates from Opensolaris.org included. Also see this Ubuntu Wiki page, and Ralf Hildebrand's blog for more info.


For reference, this is the procedure I used

apt-get install mercurial build-essential scons libfuse-dev libaio-dev devscripts build-essential zlib1g-dev
cd ~
hg clone http://www.wizy.org/mercurial/zfs-fuse/trunk
cd trunk/src
scons
sudo scons install


9. Create an fdisk partition for the shared ZFS pool using the remaining disk space. I used a primary partition and set the identifier to W95 FAT32, though this is probably unimportant.


10. While still running running Ubuntu, create a ZFS pool on this new fdisk partition using a command like this:

sudo /usr/local/sbin/zfs-fuse
sudo /usr/local/sbin/zpool create -m /shared SHARED


I like to give my ZFS pools names in all-capitals, purely because it makes the ZFS pool devices stand out better in the output from df and mount.


WARNING: I found that if I created the ZFS pool under Solaris, it refused to import into Ubuntu, but if I created it under Linux it imports/exports just fine in both directions. Both pools are created as version 10 pools, so the reason for this is not obvious. If you do decide to experiment with creating the pool under Solaris, when you want to realy get rid of the pool you will discover you need to dd zeros over the pool before creating it again, otherwise the condition remains unchanged despite destroying and re-creating the pool. If you do experiment with this please do share your results!


11. Export the ZFS pool using

/usr/local/sbin/zpool export SHARED


12. Reboot into Nevada and import the pool using

/usr/local/sbin/zpool import SHARED


Note: If you forget to export before you shut down, you will need to add -f to force the import after booting into the other OS.


At this point I just sat there and stared in wonder at how well it actually works. There is beauty in finally seeing this working!


13. Create some init.d / rc scripts to automate the import/export on shutdown/startup.


14. Now you can start customizing both operating environments. You may want to setup Automatic network configuration by enabling the SMF for NWAM in Solaris, eg by doing:

pfexec svcadm disable physical:default
pfexec svcadm enable physical:nwam


I'm looking forward to testing Live Upgrade on my setup with ZFS root, and to getting a shared home directory to work well for both Solaris and Ubuntu. I have created a login ID with the same gid/uid and a home directory under the shared ZFS pool, but after a few changes it got broken under Ubuntu, probably due to subtle differences in how Gnome/Desktop config items are stored and/or expected.


Despite my initial sceptism about FUSE, it is actually quite functional. All-in-all I love being able to share a file system, well, many files systems, between the two operating environments!



No comments: