Monthly Archives: January 2014

Why and how to run ZFS on Linux

I’m writing a bit about ZFS these days, and I thought I’d write a bit about why I am using it, why it might or might not be interesting for you, and what you might do about it.

ZFS Features and Background

ZFS is not just a filesystem in the traditional sense, though you can use it that way. It is an integrated storage stack, which can completely replace the need for LVM, md-raid, and even hardware RAID controllers. This permits quite a bit of flexibility and optimization not present when building a stack involving those components. For instance, if a drive in a RAID fails, it needs only rebuild the parts that have actual data stored on them.

Let’s look at some of the features of ZFS:

  • Full checksumming of all data and metadata, providing protection against silent data corruption. The only other Linux filesystem to offer this is btrfs.
  • ZFS is a transactional filesystem that ensures consistent data and metadata.
  • ZFS is copy-on-write, with snapshots that are cheap to create and impose virtually undetectable performance hits. Compare to LVM snapshots, which make writes notoriously slow and require an fsck and mount to get to a readable point.
  • ZFS supports easy rollback to previous snapshots.
  • ZFS send/receive can perform incremental backups much faster than rsync, particularly on systems with many unmodified files. Since it works from snapshots, it guarantees a consistent point-in-time image as well.
  • Snapshots can be turned into writeable “clones”, which simply use copy-on-write semantics. It’s like a cp -r that completes almost instantly and takes no space until you change it.
  • The datasets (“filesystems” or “logical volumes” in LVM terms) in a zpool (“volume group”, to use LVM terms) can shrink or grow dynamically. They can have individual maximum and minimum sizes set, but unlike LVM, where if, say, /usr gets bigger than you thought, you have to manually allocate more space to it, ZFS datasets can use any space available in the pool.
  • ZFS is designed to run well in big iron, and scales to massive amounts of storage. It supports SSDs as L2 cache and ZIL (intent log) devices.
  • ZFS has some built-in compression methods that are quite CPU-efficient and can yield not just space but performance benefits in almost all cases involving compressible data.
  • ZFS pools can host zvols, a block device under /dev that stores its data in the zpool. zvols support TRIM/DISCARD, so are ideal for storing VM images, as they can instantly release space released by the guest OS. They can also be snapshotted and backed up like the rest of ZFS.

Although it is often considered a server filesystem, ZFS has been used in plenty of other situations for some time now, with ports to FreeBSD, Linux, and MacOS. I find it particularly useful:

  • To have faith that my photos, backups, and paperwork archives are intact. zpool scrub at any time will read the entire dataset and verify the integrity of every bit.
  • I can create snapshots of my system before running apt-get dist-upgrade, making it easy to track down issues or roll back to a known-good configuration. Ideal for people tracking sid or testing. One can also easily simply boot from a previous snapshot.
  • Many scripts exist that make frequent snapshots, and retain the for a period of time as a way of protecting work in progress against an accidental rm. There is no reason not to snapshot /home every 5 minutes, for instance. It’s almost as good as storing / in git.

The added level of security in having cheap snapshots available is almost worth it by itself.

ZFS drawbacks

Compared to other Linux filesystems, there are a few drawbacks of ZFS:

  • CDDL will prevent it from ever being part of the Linus kernel tree
  • It is more RAM-hungry than most, although with tuning it can even run on the Raspberry Pi.
  • A 64-bit kernel is strongly preferred, even in low-memory situations.
  • Performance on many small files may be less than ext4
  • The ZFS cache does not shrink and expand in response to changing RAM usage conditions on the system as well as the normal Linux cache does.
  • Compared to btrfs, ZFS lacks some features of btrfs, such as being able to shrink an existing pool or easily change storage allocation on the fly. On the other hand, the features in ZFS have never caused me a kernel panic, and half the things I liked about btrfs seem to have.
  • ZFS is already quite stable on Linux. However, the GRUB, init, and initramfs code supporting booting from a ZFS root and /boot is less stable. If you want to go 100% ZFS, be prepared to tweak your system to get it to boot properly. Once done, however, it is quite stable.

Converting to ZFS

I have written up an extensive HOWTO on converting an existing system to use ZFS. It covers workarounds for all the boot-time bugs I have encountered as well as documenting all steps needed to make it happen. It works quite well.

Additional Hints

If setting up zvols to be used by VirtualBox or some such system, you might be interested in managing zvol ownership and permissions with udev.

Debian-Live Rescue image with ZFS On Linux; Ditched btrfs

I’m a geek. I enjoy playing with different filesystems, version control systems, and, well, for that matter, radios.

I have lately started to worry about the risks of silent data corruption, and as such, looked to switch my personal systems to either ZFS or btrfs, both of which offer built-in checksumming of all data and metadata. I initially opted for btrfs, because of its tighter integration into the Linux kernel and ability to shrink an existing btrfs filesystem.

However, as I wrote last month, that experiment was not a success. I had too many serious performance regressions and one too many kernel panics and decided it wasn’t worth it. And that the SuSE people got it wrong, deeply wrong, when they declared btrfs ready for production. I never lost any data, to its credit. But it simply reduces uptime too much.

That left ZFS. Before I build a system, I always want to make sure I can repair it. So I started with the Debian Live rescue image, and added the zfsonlinux.org repository to it, along with some key packages to enable the ZFS kernel modules, GRUB support, and initramfs support. The resulting image is described, and can be downloaded from, my ZFS Rescue Disc wiki page, which also has a link to my source tree on github.

In future blog posts in the series, I will describe the process of converting existing Debian installations to use ZFS, of getting them to boot from ZFS, some bugs I encountered along the way, and some surprising performance regressions in ZFS compared to ext4 and btrfs.

Married!

One week before the wedding, to Laura: “Mono won’t just clear up right away.”

One week before the wedding, to me: “That’s going to need stitches.”

Yes, not long before the wedding, Laura had come down with mononucleosis and I had cut into my finger with a very sharp knife while cutting bread requiring a trip to the emergency room to get stitches. Two days before we got married, instead of moving furniture, I was getting stitches out of my finger.

It wasn’t the kind of week we had planned.

But it was the happiest, most amazing occasion I could have ever imagined.

As I wrote last month, I am richly blessed indeed.

Our wedding was three days after Christmas. The church was still decorated for Christmas, with the tree in on corner, glittering stars suspended in mid-air on cables from the walls, wreaths and candles in the windows, and it was a joy-filled day.

Before the ceremony, we took pictures — the only part of the day Jacob and Oliver weren’t thrilled with. Nevertheless, we got some fun ones.

Laura and I seem to know quite a few pastors between us – and not just because Laura is a pastor. My brother officiated with the wedding vows, his wife with scripture and a prayer of blessing, and the church’s pastor gave the message.

Laura and I wrote in our wedding program, “Music has long been a thread running through both our lives. We have enjoyed singing together, playing piano and pennywhistle duets, attending concerts, and even exploring old hymnals. Music is also one of the best ways to have a conversation – even a conversation with God.” We wrote a page in the program about each of the hymns that were a part of the wedding, and why we picked them. The combined church choirs of my home church and Laura’s church sang John Rutter’s beautiful arrangement of For the Beauty of the Earth (click here to listen to a different choir). Hearing “For the beauty of each hour”, “For the joy of human love”, and “Lord of all, to thee we raise this our joyful hymn of praise” was perfect for the day.

It was with such great happiness that we walked out of the sanctuary, a married couple, to the sound of the congregation singing Joy to the World!

Jacob and Oliver were so very excited on our wedding day. They happily explored the church while waiting for things to happen. We had them help us light our unity candle, and they were pleased with that. Jacob loved his suit, which made him look just like me. And they were, of course, delighted with the cake and in the middle of it all.

For our honeymoon, we managed to get two weeks of vacation, and spent about half of it at home. We had looked at various options for retreats in the country, but eventually concluded that our house is a retreat in the country, so might as well enjoy it at home.

We also went to the Palo Duro Canyon area near Amarillo in the Texas panhandle, staying in a small B&B in Canyon, TX. Palo Duro is the second-largest canyon in North America, and quite colorful year-round. What a beautiful place to go for our honeymoon! By the time we got there, Laura was getting past mono, and we went for hikes in the canyon on two different days — hiking a total of 10 miles, including a hike up the side of the canyon.

After we got back home, on the last weekday of our honeymoon, we went back to the Flint Hills of Kansas, to some of the same places we had spent our third date. We climbed the windy staircase at the Chase County Courthouse, the oldest courthouse still in use in Kansas.

And peered out its famous oval window.

We found the last remnant of the old ghost town of Elk, ate at the same restaurant we had that day. It brought back wonderful memories, and it was a good day in itself. Because even though it was a gold, drizzly, overcast day in January, this time, we were married.

And this year, Thanksgiving is all year.