
Detailed Smart Card Cryptographic Token Security Guide

August 6, 2015 | Debian, Hardware, Linux, Technology | encryption, security | John Goerzen

After my first post about smartcards under Linux, I thought I would share some information I’ve been gathering.

This post is already huge, so I am not going to dive into — much — specific commands, but I am linking to many sources with detailed instructions.

I’ve reviewed several types of cards. For this review, I will focus on the OpenPGP card and the Yubikey NEO, since the Cardomatic Smartcard-HSM is not supported by the gpg version in Jessie.

Both cards are produced by people with strong support for the Free Software ecosystem and have strong cross-platform support with source code.

OpenPGP card: Basics with GnuPG

The OpenPGP card is well-known as one of the first smart cards to work well on Linux. It is a single-application card focused on use with GPG. Generally speaking, by the way, you want GPG2 for use with smartcards.

This card contains three key slots: decryption, signing, and authentication. The concept is that the private key portions of the keys used for these items are stored only on the card, can never be extracted from the card, and the cryptographic operations are performed on the card. There is more information in my original post. In a fairly rare move for smartcards, this card supports 4096-bit RSA keys; most are restricted to 2048-bit keys.

The FSF Europe hands these out to people and has a lot of good information about them online, including some HOWTOs. The official GnuPG smart card howto is 10 years old, and although it has some good background, I’d suggest using the FSFE instructions instead.

As you’ll see in a bit, most of this information also pertains to the OpenPGP mode of the Yubikey Neo.

OpenPGP card: Other uses

Of course, this is already pretty great to enhance your GPG security, but there’s a lot more that you can do with this card to add two-factor authentication (2FA) to a lot of other areas. Here are some pointers:

OpenPGP card: remote authentication with ssh

You can store the private part of your ssh key on the card. Traditionally, this was only done by using the ssh agent emulation mode of gnupg-agent. This is still possible, of course.

Now, however, the OpenSC project supports the OpenPGP card as a PKCS#11 and PKCS#15 card, which means it works natively with ssh-agent as well. Try just ssh-add -s /usr/lib/x86_64-linux-gnu/pkcs11/opensc-pkcs11.so if you’ve put a key in the auth slot with GPG. ssh-add -L will then print its public key for insertion into authorized_keys. Very simple!

As an aside: Comments that you need scute for PKCS#11 support are now outdated. I do not recommend scute. It is quite buggy.

OpenPGP card: local authentication with PAM

You can authenticate logins to a local machine by using the card with libpam-poldi — here are some instructions.

Between the use with ssh and the use with PAM, we have now covered 2FA for both local and remote use in Unix environments.

OpenPGP card: use on Windows

Let’s move on to Windows environments. The standard suggestion here seems to be the mysmartlogon OpenPGP mini-driver. It works with some sort of Windows CA system, or the local accounts using EIDAuthenticate. I have not yet tried this.

OpenPGP card: Use with X.509 or Windows Active Directory

You can use the card in X.509 mode via these gpgsm instructions, which apparently also work with Windows Active Directory in some fashion.

You can also use it with web browsers to present a certificate from a client for client authentication. For example, here are OpenSC instructions for Firefox.

OpenPGP card: Use with OpenVPN

Via the PKCS#11 mode, this card should be usable to authenticate a client to OpenVPN. See the official OpenVPN HOWTO or these other instructions for more.

OpenPGP card: a note on PKCS#11 and PKCS#15 support

You’ll want to install the opensc-pkcs11 package, and then give the path /usr/lib/x86_64-linux-gnu/pkcs11/opensc-pkcs11.so whenever something needs the PKCS#11 library. There seem to be some locking/contention issues between GPG2 and OpenSC, however. Usually killing pcscd and scdaemon will resolve this.

I would recommend doing manipulation operations (setting PINs, generating or uploading keys, etc.) via GPG2 only. Use the PKCS#11 tools only to access.

OpenPGP card: further reading

Background on how these things work
Fellowship homepage, HOWTO
Debian portal page with lots of info
g10code page from the person behind the hardware
Ordering information

Kernel Concepts also has some nice readers; you can get this card in a small USB form-factor by getting the mini-card and the Gemalto reader.

Yubikey Neo Introduction

The Yubikey Neo is a fascinating device. It is a small USB and NFC device, a little smaller than your average USB drive. It is a multi-application device that actually has six distinct modes:

OpenPGP JavaCard Applet (pc/sc-compatible)
Personal Identity Verification [PIV] (pc/sc-compatible, PKCS#11-compatible in Windows and OpenSC)
Yubico HOTP, via your own auth server or Yubico’s
OATH, with its two sub-modes:
- OATH TOTP, with a mobile or desktop helper app (drop-in for Google Authenticator)
- OATH HOTP
Challenge-response mode
U2F (Universal 2nd Factor) with Chrome

There is a ton to digest with this device.

Yubikey Neo Basics

By default, the Yubikey Neo is locked to only a subset of its features. Using the yubikey-personalization tool (you’ll need the version in stretch; jessie is too old), you can use ykpersonalize -m86 to unlock the full possibilities of the card. Run that command, then unplug and replug the device.

It will present itself as a USB keyboard as well as a PC/SC-compatible card reader. It has a capacitive button, which is used to have it generate keystrokes to input validation information for HOTP or HMAC validation. It has two “slots” that can be configured with HMAC and HOTP; a short button press selects the default slot #1 and a long press selects slot #2.

But before we get into that, let’s step back to some basics.

opensc-tool --list-algorithms claims this card supports RSA with 1024, 2048, and 3072 sizes, and EC with 256 and 384-bit sizes. I haven’t personally verified anything other than RSA-2048 though.

Yubikey Neo: OpenPGP support

In this mode, the card is mostly compatible with the physical OpenPGP card. I say “mostly” because there are a few protocol differences I’ll get into later. It is also limited to 2048-bit keys.

Support for this is built into GnuPG and the GnuPG features described above all work fine.

In this mode, it uses firmware from the Yubico fork of the JavaCard OpenPGP Card applet. There are Yubico-specific tutorials available, but again, most of the general GPG stuff applies.

You can use gnupg-agent to use the card with SSH as before. However, due to some incompatibilities, the OpenPGP applet on this card cannot be used as a PKCS#11 card with either scute or OpenSC. That is not exactly a huge problem, however, as the card has another applet (PIV) that is compatible with OpenSC and so this still provides an avenue for SSH, OpenVPN, Mozilla, etc.

It should be noted that the OpenPGP applet on this card can also be used with NFC on Android with the OpenKeychain app. Together with pass (or its Windows, Mac, or phone ports), this makes a nicely secure system for storing passwords.

Yubikey Neo: PKCS#11 with the PIV applet

There is also support for the PIV standard on the Yubikey Neo. This is supported by default on Linux (via OpenSC) and Windows and provides a PKCS#11-compatible store. It should, therefore, be compatible with ssh-agent, OpenVPN, Active Directory, and all the other OpenPGP card features described above. The only difference is that it uses storage separate from the OpenPGP applet.

You will need one of the Yubico PIV tools to configure the key for it; in Debian, the yubico-piv-tool from stretch does this.

Here are some instructions on using the Yubikey Neo in PIV mode:

A final note: for security, it’s important to change the management key and PINs before deploying the PIV mode.

I couldn’t get this to work with Firefox, but it worked pretty much everywhere else.

Yubikey Neo: HOTP authentication

This is the default mode for your Yubikey; all other modes require enabling with ykpersonalize. (Yubico’s own name for it is Yubico OTP; despite the label used here, it is a different scheme from the OATH HOTP described below.) In this mode, a 128-bit AES key stored on the Yubikey is used to generate one-time passwords (OTP). (This key was shared in advance with the authentication server.) A typical pattern would be for three prompts: username, password, and Yubikey HOTP. The user clicks in the Yubikey HOTP field, touches the Yubikey, and the device, acting as a USB keyboard, types in the one-time token.

In the background, the service being authenticated to contacts an authentication server. This authentication server can be either your own (there are several open source implementations in Debian) or the free Yubicloud.

Either way, the server decrypts the encrypted part of the OTP, performs validity checks (making sure that the counter is larger than any counter it’s seen before, etc) and returns success or failure back to the service demanding authentication.

The first few characters of the posted auth contain the unencrypted key ID, and thus it can also be used to provide username if desired.
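To make the structure of such a token a little more concrete, here is a minimal Python sketch, assuming the usual layout of a public ID followed by 32 modhex characters that carry the AES-encrypted block. It only splits the token and tracks counters; the AES decryption step is left out, and the names split_otp and ReplayGuard are made up for this illustration, not taken from any Yubico library.

```python
# Modhex is the 16-character alphabet the Yubikey "types"; it maps onto hex digits.
MODHEX = "cbdefghijklnrtuv"
_MODHEX_TO_HEX = str.maketrans(MODHEX, "0123456789abcdef")


def modhex_decode(text):
    """Convert a modhex string (even length) into raw bytes."""
    return bytes.fromhex(text.translate(_MODHEX_TO_HEX))


def split_otp(otp):
    """Split a token into (public_id, encrypted_part).

    Assumption: the last 32 modhex characters are the encrypted 16-byte
    block and everything before them is the unencrypted public ID.
    """
    if len(otp) < 32 or any(ch not in MODHEX for ch in otp):
        raise ValueError("not a well-formed modhex OTP")
    return otp[:-32], otp[-32:]


class ReplayGuard:
    """Hypothetical helper: remembers the highest counter seen per public ID."""

    def __init__(self):
        self._last_seen = {}

    def accept(self, public_id, counter):
        """Accept only counters strictly larger than any seen before."""
        last = self._last_seen.get(public_id)
        if last is not None and counter <= last:
            return False  # replayed or stale token
        self._last_seen[public_id] = counter
        return True
```

A real validation server would decrypt the second part with the shared AES key and read the counters out of the plaintext before calling something like accept(); here the counter is simply passed in, for example as a (usage counter, session counter) tuple.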

Yubico has provided quite a few integrations and libraries for this mode. A few highlights:

Windows auth
libpam-yubico, a quite versatile PAM module

You can also find some details on the OTP mode. Here’s another writeup.

This mode is simple to implement, but it has a few downsides. One is that it is specific to the Yubico line of products, and thus has a vendor lock-in factor. Another is the dependence on the authentication server; this creates a potential single point of failure and can be undesirable in some circumstances.

Yubikey Neo: OATH and HOTP and TOTP

First, a quick note: OATH and OAuth are not the same. OATH is an authentication protocol, and OAuth is an authorization protocol. Now then…

Like Yubikey HOTP, OATH (both HOTP and TOTP) modes rely on a pre-shared key. (See details in the Yubico article.) Let’s talk about TOTP first. With TOTP, there is a pre-shared secret with each service. Each time you authenticate to that service, your TOTP generator combines the timestamp with the shared secret using an HMAC algorithm and produces an OTP that changes every 30 seconds. Google Authenticator is a common example of this protocol, and this is a drop-in replacement for it. Gandi has a nice description of it that includes links to software-only solutions on various platforms as well.
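For the curious, the TOTP calculation is small enough to sketch in a few lines of Python using only the standard library. This follows my reading of RFC 6238 (which builds on the HOTP truncation from RFC 4226) with HMAC-SHA1 and a 30-second step; the function names are my own, and a real deployment would normally receive the secret base32-encoded rather than as raw bytes.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HOTP value for a raw secret and an integer counter (HMAC-SHA1)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, now=None, step=30, digits=6):
    """TOTP is HOTP with the counter replaced by the number of elapsed steps."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)


if __name__ == "__main__":
    # Demo secret only; never hard-code a real one.
    print(totp(b"12345678901234567890"))
```

Since the only time-dependent input is int(now) // step, this also shows why a token without a clock needs a helper app: something has to supply the current timestamp.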

With the Yubikey, the shared secrets are stored on the card and processed within it. You cannot extract the shared secret from the Yubikey. Of course, if someone obtains physical access to your Yubikey they could use the shared secret stored on it, but there is no way they can steal the shared secret via software, even by compromising your PC or phone.

Since the Yubikey does not have a built-in clock, TOTP operations cannot be completed solely on the card. You can use a PC-based app or the Android application (Play store link) with NFC to store secrets on the device and generate your TOTP codes. Command-line users can also use the yubikey-totp tool in the python-yubico package.

OATH can also use HOTP. With HOTP, an authentication counter is used instead of a clock. This means that HOTP passwords can be generated entirely within the Yubikey. You can use ykpersonalize to configure either slot 1 or 2 for this mode, but one downside is that it can really only be used with one service per slot.
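On the server side, counter-based codes need a little bookkeeping, because the token’s counter advances on every button press whether or not the code ever reaches the server. A common approach is a small look-ahead window; the sketch below is an illustration of that idea under my own assumptions (the name verify_hotp and the window size of 5 are arbitrary), not the behaviour of any particular PAM module.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """HOTP value for a raw secret and an integer counter (HMAC-SHA1)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_hotp(secret, submitted, server_counter, look_ahead=5):
    """Check a submitted code against the next few counter values.

    Returns the new server counter on success, or None on failure.
    Codes for counters below server_counter are never accepted, which
    is what prevents replay of an old code.
    """
    for candidate in range(server_counter, server_counter + look_ahead + 1):
        if hmac.compare_digest(hotp(secret, candidate), submitted):
            return candidate + 1  # resynchronise past the used counter
    return None
```

The caller is expected to persist the returned counter; because that state is kept per shared secret, this is also one way to see why a slot is tied to a single service.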

OATH support is all over the place; for instance, there’s libpam-oath from the OATH toolkit for Linux platforms. (Some more instructions on this exist.)

Note: There is another tool from Yubico (not in Debian) that can apparently store multiple TOTP and HOTP codes in the Yubikey, although ykpersonalize cannot and the other documentation does not mention it. It is therefore unclear to me whether multiple HOTP codes are supported, and how.

Yubikey Neo: Challenge-Response Mode

This can be useful for doing offline authentication, and is similar to OATH-HOTP in a sense. There is a shared secret to start with, and the service trying to authenticate sends a challenge to the token, which must supply an appropriate response. This makes it only suitable for local authentication, but means it can be done fairly automatically and optionally does not even require a button press.

To muddy the waters a bit, it supports both “Yubikey OTP” and HMAC-SHA1 challenge-response modes. I do not really know the difference. However, it is worth noting that libpam-yubico works with HMAC-SHA1 mode. This makes it suitable, for instance, for logon passwords.
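The HMAC-SHA1 variant of challenge-response is easy to picture in code. In this minimal Python sketch, token_respond stands in for the hardware and the verifier holds a copy of the shared secret; both are simplifying assumptions for illustration, and this is not how libpam-yubico itself is implemented.

```python
import hashlib
import hmac
import secrets


def new_challenge(length=32):
    """The verifier picks a fresh random challenge for every attempt."""
    return secrets.token_bytes(length)


def token_respond(secret, challenge):
    """Stand-in for the token: HMAC-SHA1 of the challenge under the secret."""
    return hmac.new(secret, challenge, hashlib.sha1).digest()


def verify_response(secret, challenge, response):
    """The verifier recomputes the HMAC and compares in constant time."""
    expected = hmac.new(secret, challenge, hashlib.sha1).digest()
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    shared_secret = secrets.token_bytes(20)  # provisioned once, in advance
    challenge = new_challenge()
    print(verify_response(shared_secret, challenge,
                          token_respond(shared_secret, challenge)))
```

Because each challenge is random and used once, a recorded response is of no use for a later login, and no clock or counter has to stay in sync.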

Yubikey Neo: U2F

U2F is a new protocol for web-based apps. Yubico has some information, but since it is only supported in Chrome, it is not of interest to me right now.

Yubikey Neo: Further resources

Yubico has a lot of documentation, and in particular a technical manual that is actually fairly detailed.

Closing comments

Do not think a hardware security token is a panacea. It is best used as part of a multi-factor authentication system; you don’t want a lost token itself to lead to a breach, just as you don’t want a compromised password due to a keylogger to lead to a breach.

These things won’t prevent someone that has compromised your PC from abusing your existing ssh session (or even from establishing new ssh sessions from your PC, once you’ve unlocked the token with the passphrase). What it will do is prevent them from stealing your ssh private key and using it on a different PC. It won’t prevent someone from obtaining a copy of things you decrypt on a PC using the Yubikey, but it will prevent them from decrypting other things that used that private key. Hopefully that makes sense.

One also has to consider the security of the hardware. On that point, I am pretty well satisfied with the Yubikey; large parts of it are open source, and they have put a lot of effort into hardening the hardware. It seems pretty much impervious to non-government actors, which is about the best guarantee a person can get about anything these days.

I hope this guide has been helpful.

First steps with smartcards under Linux and Android — hard, but it works

July 16, 2015 | Debian, Linux | John Goerzen

Well this has been an interesting project.

It all started with a need to get better password storage at work. We wound up looking heavily at a GPG-based solution. This prompted the question: how can we make it even more secure?

Well, perhaps, smartcards. The theory is this: a smartcard holds your private keys in a highly-secure piece of hardware. The PC can never actually access the private keys. Signing and decrypting operations are done directly on the card to prevent the need to export the private key material to the PC. There are lots of “standards” to choose from (PKCS#11, PKCS#15, and OpenPGP card specs) that are relevant here. And there are ways to use SSH and OpenVPN with some of these keys too. Access to the card is protected by a passphrase (called a “PIN” in smartcard lingo, even though it need not be numeric). These smartcards might be USB sticks, or cards you pop into a reader. In any case, you can pop them out when not needed, pop them in to use them, and… well, pretty nice, eh?

So that’s the theory. Let’s talk a bit of reality.

First of all, it is hard for a person like me to evaluate how secure my data is in hardware. There was a high-profile bug in the OpenPGP JavaCard applet used by Yubico that caused the potential to use keys without a PIN, for instance. And how well protected is the key in the physical hardware? Granted, in most of these cards you’re talking serious hardware skill to compromise them, but still, this is unknown in absolute terms.

Here’s the bigger problem: compatibility. There are all sorts of card readers, but compatibility with pcsc-tools and pcscd on Linux seems pretty good. But the cards themselves — oh my. PKCS#11 defines an interface API, but each vendor would provide their own .so or .dll file to interface. Some cards (for instance, the ACOS5-64 mentioned on the Debian wiki!) are made by vendors that charge $50 for the privilege of getting the drivers needed to make them work… and they’re closed-source proprietary drivers at that.

Some attempts

I ordered several cards to evaluate: the OpenPGP card, specifically designed to support GPG; the ACOS5-64 card, the JavaCOS A22, the Yubikey Neo, and a simple reader listed on the GPG smartcard howto.

The OpenPGP card and ACOS5-64 are the only ones in the list that support 4096-bit RSA keys, which are computationally demanding for a card. The others all support 2048-bit RSA keys.

The JavaCOS requires the user to install a JavaCard applet to the card to make it useable. The Yubico OpenPGP applet works here, along with GlobalPlatform to install it. I am not sure just how solid it is. The Yubikey Neo has yet to arrive; it integrates some interesting OATH (HOTP and TOTP) capabilities as well.

I found that Debian’s wiki page for smartcards lists a bunch of them that are not really useable using the tools in main. The ACOS5-64 was such a dud. But I got the JavaCOS A22 working quite nicely. It’s also NFC-enabled and works perfectly with OpenKeyChain on Android (looking like a “Yubikey Neo” to it, once the OpenPGP applet is installed). I’m impressed! Here’s a way to be secure with my smartphone without revealing everything all the time.

Really, most of the time goes into figuring out how all this stuff fits together. I’m getting there, but I’ve got a ways to go yet.

Update: Corrected to read “signing and decrypting” rather than “signing and encrypting” operations are being done on the card. Thanks to Benoît Allard for catching this error.

Roundup of remote encrypted deduplicated backups in Linux

June 11, 2015 | Linux | John Goerzen

Since I wrote last about Linux backup tools, back in a 2008 article about BackupPC and similar tools and a 2011 article about deduplicating filesystems, I’ve revisited my personal backup strategy a bit.

I still use ZFS, with my tool “simplesnap” that I wrote about in 2014 to perform local backups to USB drives, which get rotated offsite periodically. This has the advantage of being very fast and very secure, but I also wanted offsite backups over the Internet. I began compiling criteria, which ran like this:

Remote end must not need any special software installed. Storage across rsync, sftp, S3, WebDAV, etc. should all be good candidates. The remote end should not need to support hard links or symlinks, etc.
Cross-host deduplication at the file level or finer is required, so if I move a 4GB video file from one machine to another, my puny DSL wouldn’t have to re-upload it.
All data that is stored remotely must be 100% encrypted 100% of the time. I must not need to have any trust at all in the remote end.
Each backup after the first must send only an incremental’s worth of data across the line. No periodic re-upload of the entire data set should be required.
The repository format must be well-documented and stable.
The tools must preserve POSIX attributes like uid/gid, permission bits, symbolic links, hard links, etc. Support for xattrs is also desirable but not required.

So, how did things stack up?

Didn’t meet criteria

A lot of popular tools didn’t meet the criteria. Here are some that I considered:

BackupPC requires software on the remote end and does not do encryption.
None of the rsync hardlink tree-based tools are suitable here.
rdiff-backup requires software on the remote end and does not do encryption or dedup.
duplicity requires a periodic re-upload of a full backup, or incremental chains become quite long and storage-inefficient. It also does not support dedup, although it does have an impressive list of “dumb” storage backends.
ZFS, if used to do backups the efficient way, would require software to be installed on the remote end. If simple “zfs send” images are used, the same limitations as with duplicity apply.
bup and zbackup are both interesting deduplicators, but do not yet have support for removing old data, so are impractical for this purpose.
burp requires software on the server side.

Obnam and Attic/Borg Backup

Obnam and Attic (and its fork Borg Backup) are both programs that have a similar concept at their heart, which is roughly this: the backup repository stores small chunks of data, indexed by a checksum. Directory trees are composed of files that are assembled out of lists of chunks, so if any given file matches another file already in the repository somewhere, the added cost is just a small amount of metadata.
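The shared idea is easier to see in a toy model. The Python sketch below is my own illustration of a chunk store indexed by checksum, not the repository format of Obnam or Attic; the class name ChunkStore, the use of SHA-256, and the tiny chunk size are all assumptions made to keep the example short.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed repository: chunks keyed by their checksum."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # checksum -> chunk bytes (stored once)
        self.files = {}   # file name -> ordered list of checksums (metadata)

    def add_file(self, name, data):
        """Split data into chunks and store only those not already present."""
        ids = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(chunk_id, chunk)
            ids.append(chunk_id)
        self.files[name] = ids

    def read_file(self, name):
        """Reassemble a file from its list of chunk checksums."""
        return b"".join(self.chunks[chunk_id] for chunk_id in self.files[name])


if __name__ == "__main__":
    store = ChunkStore()
    store.add_file("host1/video", b"abcdefgh")
    store.add_file("host2/video", b"abcdefgh")  # same content, other machine
    print(len(store.chunks), "chunks stored for", len(store.files), "files")
```

Adding the second copy only adds an entry to files; no new chunk data is stored, which is exactly the cross-host dedup behaviour described in the criteria.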

Obnam was eventually my tool of choice. It has built-in support for sftp, but its reliance on local filesystem semantics is very conservative and it works fine atop davfs2 (and, I’d imagine, other S3-backed FUSE filesystems). Obnam’s repository format is carefully documented and it is very conservatively designed through and through — clearly optimized for integrity above all else, including speed. Just what a backup program should be. It has a lot of configurable options, including chunk size, caching information (dedup tables can be RAM-hungry), etc. These default to fairly conservative values, and the performance of Obnam can be significantly improved with a few simple config tweaks.

Attic was also a leading contender. It has a few advantages over Obnam, actually. One is that it uses an rsync-like rolling checksum method. This means that if you add 1 byte at the beginning of a 100MB file, Attic will upload a 1-byte chunk and then reference the other chunks after that, while Obnam will have to re-upload the entire file, since its chunks start at the beginning of the file in fixed sizes. (The only time Obnam has chunks smaller than its configured chunk size is with very small files or the last chunk in a file.) Another nice feature of Attic is its use of “packs”, where it groups chunks together into larger pack files. This can have significant performance advantages when backing up small files, especially over high-latency protocols and links.
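The difference between fixed-size chunks and chunks cut by a rolling checksum can be demonstrated with a small experiment. This Python sketch uses a deliberately simple rolling sum over a 16-byte window as the checksum; it is a toy of my own, not the algorithm either Attic or Obnam actually uses, and the window, divisor, and chunk sizes are arbitrary.

```python
import hashlib
import random


def fixed_chunks(data, size=64):
    """Cut every `size` bytes, counting from the start of the file."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data, window=16, divisor=64):
    """Cut wherever the sum of the last `window` bytes is divisible by `divisor`.

    The sum is a weak rolling checksum: each step adds the incoming byte
    and subtracts the byte that just left the window, so cut points depend
    only on nearby content, not on the offset within the file.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]
        if i + 1 >= window and rolling % divisor == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_chunk_count(old_chunks, new_chunks):
    """Number of distinct chunks of the new file already in the repository."""
    old_ids = {hashlib.sha256(c).digest() for c in old_chunks}
    new_ids = {hashlib.sha256(c).digest() for c in new_chunks}
    return len(old_ids & new_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    original = bytes(rng.randrange(256) for _ in range(20000))
    shifted = b"!" + original  # one byte added at the beginning
    for name, chunker in (("fixed", fixed_chunks),
                          ("content-defined", content_defined_chunks)):
        reused = shared_chunk_count(chunker(original), chunker(shifted))
        print(name, "chunks reused:", reused)
```

With fixed chunks, the inserted byte shifts every boundary, so essentially nothing can be reused. With content-defined cuts, the boundaries fall back into step after the first chunk or two and nearly everything is reused.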

On the downside, Attic has a hardcoded fairly small chunksize that gives it a heavy metadata load. It is not at all as configurable as Obnam, and unlike Obnam, there is nothing you can do about this. The biggest reason I avoided it though was that it uses a single monolithic index file that would have to be uploaded from scratch after each backup. I calculated that this would be many GB in size, if not even tens of GB, for my intended use, and this is just not practical over the Internet. Attic assumes that if you are going remote, you run Attic on the remote so that the rewrite of this file doesn’t have to send all the data across the network. Although it does work atop davfs2, this support seemed like an afterthought and is clearly not very practical.

Attic did perform much better than Obnam in some ways, largely thanks to its pack support, but the monolithic index file was going to make it simply impractical to use.

There is a new fork of Attic called Borg that may, in the future, address some of these issues.

Brief honorable mentions: bup, zbackup, syncany

There are a few other backup tools that people are talking about which do dedup. bup is frequently mentioned, but one big problem with it is that it has no way to delete old data! In other words, it is more of an archive than a backup tool. zbackup is a really neat idea — it dedups anything you feed at it, such as a tar stream or “zfs send” stream, and can encrypt, too. But it doesn’t (yet) support removing old data either.

syncany is fundamentally a syncing tool, but can also be used from the command line to do periodic syncs to a remote. It supports encryption, sftp, WebDAV, etc. natively, and runs on quite a number of platforms easily. However, it doesn’t store a number of POSIX attributes, such as hard links, uid/gid owner, ACL, xattr, etc. This makes it impractical even for backing up my home directory; I make fairly frequent use of ln, both with and without -s. If there were some tool to create/restore archives of metadata, that might work out better.

First impressions and review of OwnCloud

May 8, 2015 | Linux, Software | John Goerzen

In my recent post (I give up on Google), a lot of people suggested using OwnCloud as a replacement for several Google services. I’ve been playing around with it for a few days, and it is something of a mix of awesome and disappointing, in my opinion.

Files

OwnCloud started as a file-sync tool, somewhat akin to Google Drive and Dropbox. It has clients for every platform, and it is also a client for every platform: you can have subfolders of your OwnCloud installation stored on WebDav, *FTP*, Google Drive, Dropbox, you name it. It is a pretty nice integrator of other storage services, and provides the only way to use some of them on Linux (*cough* Google Drive *cough*)

One particularly interesting feature is the live editing in the browser of ODT, DOCX, and TXT files. This is somewhat similar to Google Docs and the only such thing I’ve seen in Open Source software. It writes changes directly back to the documents and, in my limited testing, seems to work well. A very nice feature!

I’ve tested the syncing only on Linux so far, but it looks solid.

There are two surprising issues, however: there is no deduplication and no delta-uploads. Add 10 bytes to the end of a 1GB file, and you re-upload the 1GB file. Thankfully the OwnCloud GUI client is smart enough to use inotify to notice an mv, but my guess is — and I haven’t tested this, but apparently OwnCloud doesn’t use hashes at all — that the CLI client would require a reupload after any mv, because it doesn’t run continuously.

In some situations, Syncany may be a useful work-around for this, as it does chunk-based dedup and client-side encryption. However, you would lose a lot of the sharing features inside OwnCloud by doing this, and the integration with the OwnCloud “apps” for photos, videos, and music.

The Android/mobile apps support all the usual auto-upload options.

Calendar

A lot of people report using OwnCloud as a calendar server, and it does indeed use CalDAV. With a program like DAVDroid or Mozilla Lightning, this makes, in theory, a full-functioning calendar syncing tool. There is, of course, also a web interface to the calendar. It, sadly, is limited. Or shall we say, VERY limited. Even something like sending an invite is missing — and in fact, the GUI for sharing an event is baffling. You can share it with someone, they get no say in whether or not it shows up, and it shows up on their calendar on the web only (not on synced copies) and they have no way to remove it!

Sharing calendars is similar; you can hide the display of any one of your calendars on the web interface, but not of any calendars shared with you. Baffling.

Address Book

I haven’t tested this yet, but there’s not much to test, I suspect. It can be shared with others, which I could see as a nice feature.

Bookmarks

An interesting bookmarks manager, though mysteriously not with Firefox sync support. There is Chrome sync support, and a separate Mozilla Sync support, but it doesn’t provide cross-browser syncing, apparently.

Music

It is designed to present an interface to music that is stored in Files. It provides an Ampache-compatible API, so there are a lot of clients that can stream music. It has very few options, not even for transcoding, so I don’t see how it would be useful for my FLAC collection.

Pictures

Sort of a gallery view of photos synced up with Files. Very basic. Has a sharing button to share a link to an entire folder, but no option to embed photos in blog posts at a lower resolution or shortcut to sharing individual photos.

Notes, Tasks, etc.

I haven’t had the chance to look at this much. Some of them sync to various clients. The Notes are saved as HTML files that get synced down.

Clients overall

There is a very helpful page that lists all the sync clients for OwnCloud — not just for files, but also for calendars, contacts, etc. The list is extensive!

Other options

The two other Open Source options mentioned on my blog post were Kolab and Sogo, and there is also Zimbra which also has a community edition. The Debian Groupware page lists a number of other groupware options as well. Citadel caught my eye (wow, it’s still around!). Sogo has ActiveSync support, which might make phone integration a lot easier. It is not dead-simple to set up like OwnCloud is, though, so I haven’t tried it out, but I will probably be looking at it and Citadel next.

“Has Linux lost its way?” comments prompt a Debian developer to revisit FreeBSD after 20 years

February 17, 2015 | Debian | John Goerzen

I’ll admit it. I have a soft spot for FreeBSD. FreeBSD was the first Unix I ran, and it was somewhere around 20 years ago that I did so, before I switched to Debian. Even then, I still used some of the FreeBSD Handbook to learn Linux, because Debian didn’t have the great Reference that it does now.

Anyhow, some comments in my recent posts (“Has modern Linux lost its way?” and Reactions to that, and the value of simplicity), plus a latent desire to see how ZFS fares in FreeBSD, caused me to try it out. I installed it both in VirtualBox under Debian, and in an old 64-bit Thinkpad sitting in my basement that previously ran Debian.

The results? A mixture of amazing and disappointing. I will say that I am quite glad that both exist; there is plenty of innovation happening everywhere and neat features exist everywhere, too. But I can also come right out and say that the statement that FreeBSD doesn’t have issues like Linux does is false and misleading. In many cases, it’s running the exact same stack. In others, it’s better, but there are also others where it’s worse. Perhaps this article might dispel a bit of the FUD surrounding jessie, while also showing off some of the nice things FreeBSD does. My conclusion: Both jessie and FreeBSD 10.1 are awesome Free operating systems, but both have their warts. This article is more about FreeBSD than Debian, but it will discuss a few of Debian’s warts as well.

The experience

My initial reaction to FreeBSD was: wow, this feels so familiar. It reminds me of a commercial Unix, or maybe of Linux from a few years ago. A minimal, well-documented base system, everything pretty much in logical places in the filesystem, and solid memory management. I felt right at home. It was almost reassuring, even.

Putting together a FreeBSD box is a lot of package installing and config file editing. The FreeBSD Handbook, describing how to install X, talks about editing this or that file for this or that feature. I like being able to learn directly how things fit together by doing this.

But then you start remembering the reasons you didn’t like Linux a few years ago, or the commercial Unixes: maybe it’s that programs like apache are still not as well supported, or maybe it’s that the default vi has this tendency to corrupt the terminal periodically, or perhaps it’s that root’s default shell is csh. Or perhaps it’s that I have to do a lot of package installing and config file editing. It is not quite the learning experience it once was, either; now there are things like “paste this XML file into some obscure polkit location to make your mouse work” or something.

Overall, there are some areas where FreeBSD kills it in a way no other OS does. It is unquestionably awesome in several areas. But there are a whole bunch of areas where it’s about 80% as good as Linux, a number of areas (even polkit, dbus, and hal) where it’s using the exact same stack Linux is (so all these comments about FreeBSD being so differently put together strike me as hollow), and frankly some areas that need a lot of work and make it hard to manage systems in a secure and stable way.

The amazing

Let’s get this out there: I’ve used ZFS too much to use any OS that doesn’t support it or something like it. Right now, I’m not aware of anything like ZFS that is generally stable and doesn’t cost a fortune, so pretty much: if your Unix doesn’t do ZFS, I’m not interested. (btrfs isn’t there yet, but will be awesome when it is.) That’s why I picked FreeBSD for this, rather than NetBSD or OpenBSD.

ZFS on FreeBSD is simply awesome. They have integrated it extremely well. The installer supports root on zfs, even encrypted root on zfs (though neither is a default). top on a FreeBSD system shows a line of ZFS ARC (cache) stats right alongside everything else. The ZFS defaults for maximum cache size, readahead, etc. auto-tune themselves at boot (unless overridden) based on the amount of RAM in a system and the system type. Seriously, these folks have thought of everything and it just reeks of solid. I haven’t seen ZFS this well integrated outside the Solaris-type OSs.

I have been using ZFSOnLinux for some time now, but it is just not as mature as ZFS on FreeBSD. ZoL, for instance, still has some memory tuning issues, and is not really suggested for 32-bit machines. FreeBSD just nails it. ZFS on FreeBSD even supports TRIM, which is not available in ZoL and I think fairly unique even among OpenZFS platforms. It also supports delegated administration of the filesystem, both to users and to jails on the system, seemingly very similar to Solaris zones.

FreeBSD also supports beadm, which is like a similar tool on Solaris. This lets you basically use ZFS snapshots to make lightweight “boot environments”, so you can select which to boot into. This is useful, say, before doing upgrades.

Then there are jails. Linux has tried so hard to get this right, and fallen on its face so many times, a person just wants to take pity sometimes. We’ve had linux-vserver, openvz, lxc, and still none of them match what FreeBSD jails have done for a long time. Linux’s current jail-du-jour is LXC, though it is extremely difficult to configure in a secure way. Even its author comments that “you won’t hear any of the LXC maintainers tell you that LXC is secure” and that it pretty much requires AppArmor profiles to achieve reasonable security. These are still rather in flux, as I found out last time I tried LXC a few months ago. My confidence in LXC being as secure as, say, KVM or FreeBSD is simply very low.

FreeBSD’s jails are simple and well-documented where LXC is complex and hard to figure out. Its security is fairly transparent and easy to control and they just work well. I do think LXC is moving in the right direction and might even get there in a couple years, but I am quite skeptical that even Docker is getting the security completely right.

The simply different

People have been throwing around the word “distribution” with respect to FreeBSD, PC-BSD, etc. in recent years. There is an analogy there, but it’s not perfect. In the Linux ecosystem, there is a kernel project, a libc project, a coreutils project, a udev project, a systemd/sysvinit/whatever project, etc. You get the idea. In FreeBSD, there is a “base system” project. This one project covers the kernel and the base userland. Some of what they use in the base system is code pulled in from elsewhere but maintained in their tree (ssh), some is completely homegrown (kernel), etc. But in the end, they have a nicely-integrated base system that always gets upgraded in sync.

In the Linux world, the distribution makers are responsible for integrating the bits from everywhere into a coherent whole.

FreeBSD is something of a toolkit to build up your system. Gentoo might be an analogy in the Linux side. On the other end of the spectrum, Ubuntu is a “just install it and it works, tweak later” sort of setup. Debian straddles the middle ground, offering both approaches in many cases.

There are pros and cons to each approach. Generally, I don’t think either one is better. They are just different.

The not-quite-there

I said that there are a lot of things in FreeBSD that are about 80% of where Linux is. Let me touch on them here.

Its laptop support leaves something to be desired. I installed it on a few-years-old Thinkpad, basically the best possible platform for working suspend in a Free OS. It has worked perfectly out of the box in Debian for years. In FreeBSD, suspend only works if it’s in text mode. If X is running, the video gets corrupted and the system hangs. I have not tried to debug it further, but would also note that suspend on closed lid is not automatic in FreeBSD; the somewhat obscure instructions tell you what policykit pkla file to edit to make suspend work in XFCE. (Incidentally, they also say what policykit file to edit to make the shutdown/restart options work.)

Its storage subsystem also has some surprising misses. Its rough version of LVM, LUKS, and md-raid is called GEOM. GEOM, however, supports only RAID0, RAID1, and RAID3. It does not support RAID5 or RAID6 in software RAID configurations! Linux’s md-raid, by comparison, supports RAID0, RAID1, RAID4, RAID5, RAID6, etc. There seems to be a highly experimental RAID5 patchset floating around for many years, but it is certainly not integrated into the latest release kernel. The current documentation makes no mention of RAID5, although it seems that a dated logical volume manager supported it. In any case, RAID5 does not seem to be well-supported in software like it is in Linux.

ZFS does have its raidz1 level, which is roughly the same as RAID5. However, that requires full use of ZFS. ZFS also does not support some common operations, like adding a single disk to an existing RAID5 group (which is possible with md-raid and many other implementations.) This is a ZFS limitation on all platforms.

FreeBSD’s filesystem support is rather a miss. They once had support for Linux ext* filesystems using the actual Linux code, but ripped it out because it was in GPL and rewrote it so it had a BSD license. The resulting driver really only works with ext2 filesystems, as it doesn’t work with ext3/ext4 in many situations. Frankly I don’t see why they bothered; they now have something that is BSD-licensed but only works with a filesystem so old nobody uses it anymore. There are only two FreeBSD filesystems that are really useable: UFS2 and ZFS.

Virtualization under FreeBSD is also not all that present. Although it does support the VirtualBox Open Source Edition, this is not really a full-featured or fast enough virtualization environment for a server. Its other option is bhyve, which looks to be something of a Xen clone. bhyve, however, does not support Windows guests, and requires some hoops to even boot Linux guest installers. It will be several years at least before it reaches feature-parity with where KVM is today, I suspect.

One can run FreeBSD as a guest under a number of different virtualization systems, but their instructions for making the mouse work best under VirtualBox did not work. There may have been some X.Org reshuffle in FreeBSD that wasn’t taken into account.

The installer can be nice and fast in some situations, but one wonders a little bit about QA. I had it lock up on me twice. Turns out this is a known bug reported 2 months ago with no activity, in which the installer attempts to use a package manager that it hasn’t set up yet to install optional docs. I guess the devs aren’t installing the docs in testing.

There is nothing like Dropbox for FreeBSD. Apparently this is because FreeBSD has nothing like Linux’s inotify. The Linux Dropbox does not work in FreeBSD’s Linux mode. There are sketchy reports of people getting an OwnCloud client to work, but in something more akin to rsync rather than instant-sync mode, if they get it working at all. Some run Dropbox under wine, apparently.

The desktop environments tend to need a lot more configuration work to get them going than on Linux. There’s a lot of editing of polkit, hal, dbus, etc. config files mentioned in various places. So, not only does FreeBSD use a lot of the same components that cause confusion in Linux, it doesn’t really configure them for you as much out of the box.

FreeBSD doesn’t support as many platforms as Linux. FreeBSD has only two platforms that are fully supported: i386 and amd64. You’ll see people refer to a list of other platforms that are “supported”, but those don’t have security support, official releases, or even built packages. They include arm, ia64, powerpc, and sparc64.

The bad: package management

Roughly 20 years ago, this was one of the things that pulled me to Debian. Perhaps I am spoiled from running the distribution that has been the gold standard for package management for so long, but I find FreeBSD’s package management (even “pkg-ng” in 10.1-RELEASE) to be lacking in a number of important ways.

To start with, FreeBSD actually has two different package management systems: one for the base system, and one for what they call the ports/packages collection (“ports” being the way to install from source, and “packages” being the way to install from binaries, but both related to the same tree.) For the base system, there is freebsd-update which can install patches and major upgrades. It also has a “cron” option to automate this. Sadly, it has no way of automatically indicating to a calling script whether a reboot is necessary.

freebsd-update really manages less than a dozen packages though. The rest are managed by pkg. And pkg, it turns out, has a number of issues.

The biggest: it can take a week to get security updates. The FreeBSD handbook explains pkg audit -F which will look at your installed packages (but NOT the ones in the base system) and alert you to packages that need to be updated, similar to a stripped-down version of Debian’s debsecan. I discovered the delay myself, when pkg audit -F showed a vulnerability in xorg, but pkg upgrade showed my system was up-to-date. It is not documented in the Handbook, but people on the mailing list explained it to me. There are workarounds, but they can be laborious.

If that’s not bad enough, FreeBSD has no way to automatically install security patches for things in the packages collection. Debian has several (unattended-upgrades, cron-apt, etc.) There is “pkg upgrade”, but it upgrades everything on the system, which may be quite a bit more than you want to be upgraded. So: if you want to run Apache with PHP, and want it to just always apply security patches, FreeBSD packages are not up to the job like Debian’s are.

The pkg tool doesn’t have very good error-handling. In fact, its error handling seems to be nonexistent at times. I noticed that some packages had failures during install time, but pkg ignored them and marked the package as correctly installed. I only noticed there was a problem because I happened to glance at the screen at the right moment during messages about hundreds of packages. In Debian, by contrast, if there are any failures, at the end of the run, you get a nice report of which packages failed, and an exit status to use in scripts.
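As a rough illustration of what “a report of which packages failed, and an exit status to use in scripts” buys you, here is a minimal Python sketch of a wrapper that runs several steps and collects the failures. It does not call pkg or apt at all; the step names and the helper `run_steps` are hypothetical, and the two stand-in commands are just tiny Python processes, one of which exits non-zero.

```python
import subprocess
import sys

def run_steps(steps):
    """Run each (name, argv) step and return the names of the ones that failed."""
    failed = []
    for name, argv in steps:
        result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        # A non-zero return code is the only failure signal a script can rely on.
        if result.returncode != 0:
            failed.append(name)
    return failed

# Stand-in commands (hypothetical): one succeeds, one exits with status 3.
steps = [
    ("good-step", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("bad-step", [sys.executable, "-c", "raise SystemExit(3)"]),
]
failed = run_steps(steps)
print("failed:", failed)

# The overall status a calling script would act on.
exit_status = 1 if failed else 0
```

The point is only that this pattern works when the underlying tool reports failures through its exit status; if the tool swallows errors, no wrapper can recover them.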

It also has another issue that Debian resolved about a decade ago: package scripts displaying messages that are important for the administrator, but showing so many of them that they scroll off the screen and are never seen. I submitted a bug report for this one also.

Some of these things just make me question the design of pkg. If I can’t trust it to accurately report if the installation succeeded, or show me the important info I need to see, then to what extent can I trust it?

Then there is the question of testing of the ports/packages. It seems that, automated tests aside, basically everyone is running off the “master” branch of the ports/packages. That’s like running Debian unstable on your servers. I am distinctly uncomfortable with this notion, though it seems FreeBSD people report it mostly works well.

There are some other issues, too: FreeBSD ports make no distinction between development and runtime files like Debian’s packages do. So, just by virtue of wanting to run a graphical desktop, you get all of the static libraries, include files, build scripts, etc for XOrg installed.

For a project as concerned about licensing as FreeBSD, the packages collection does not have separate sections like Debian’s main, contrib, and non-free. It’s all in one big pot: BSD-license, GPL-license, proprietary without source license. There is /usr/local/share/licenses where you can look up a license for each package, but there is no way with FreeBSD, like there is with Debian, to say “never even show me packages that aren’t DFSG-free.” This is useful, for instance, when running in a company to make sure you never install packages that are for personal use only or something.

The bad: ABI stability

I’m used to being able to run binaries I compiled years ago on a modern system. This is generally possible in Linux, assuming you have the correct shared libraries available. In FreeBSD, this is explicitly NOT possible. After every major version upgrade, you must reinstall or recompile every binary on your system.

This is not necessarily a showstopper for me, but it is a hassle for a lot of people.

Update 2015-02-17: Some people in the comments are pointing out compat packages in the ports that may help with this situation. My comment was based on advice in the FreeBSD Handbook stating “After a major version upgrade, all installed packages and ports need to be upgraded”. I have not directly tried this, so if the Handbook is overstating the need, then this point may be in error.

Conclusions

As I said above, I found little validation to the comments that the Debian ecosystem is noticeably worse than the FreeBSD one. Debian has its warts too — particularly with keeping software up-to-date. You can see that the two projects are designed around a different passion: FreeBSD’s around the base system, and Debian’s around an integrated whole system. It would be wrong to say that either of those is always better. FreeBSD’s approach clearly produces some leading features, especially jails and ZFS integration. Yet Debian’s approach also produces some leading features in the way of package management and security maintainability beyond the small base.

My criticism of excessive complexity in the polkit/cgmanager/dbus area still stands. But to those people commenting that FreeBSD hasn’t “lost its way” like Linux has, I would point out that FreeBSD mostly uses these same components also, and FreeBSD has excessive complexity in its ports/package system and system management tools. I think it’s a draw. You pick the best for your use case. If you’re looking for a platform to run a single custom app then perhaps all of the Debian package management benefits don’t apply to you (you may not even need FreeBSD’s packages, or just a few). The FreeBSD ZFS support or jails may well appeal. If you’re looking to run a desktop environment, or a server with some application that needs a ton of PHP, Python, Perl, or C libraries, then Debian’s package management and security handling may well be attractive.

I am disappointed that Debian GNU/kFreeBSD will not be a release architecture in jessie. That project had the promise to provide a best of both worlds for those that want jails or tight ZFS integration.

Reactions to “Has modern Linux lost its way?” and the value of simplicity

February 11, 2015LinuxJohn Goerzen

Apparently I touched a nerve with my recent post about the growing complexity of issues.

There were quite a few good comments, which I’ll mention here. It’s provided me some clarity on the problem, in fact. I’ll try to distill a few more thoughts here.

The value of simplicity and predictability

The best software, whether it’s operating systems or anything else, is predictable. You read the documentation, or explore the interface, and you can make a logical prediction that “when I do action X, the result will be Y.” grep and cat are perfect examples of this.

The more complex the rules in the software, the harder it is for us to predict. It leads to bugs, and it leads to inadvertent security holes. Worse, it leads to people being unable to fix things themselves, which is one of the key freedoms that Free Software is supposed to provide. The more complex software is, the fewer people will be able to fix it by themselves.

Now, I want to clarify: I hear a lot of talk about “ease of use.” Gnome removes options in my print dialog box to make it “easier to use.” (This is why I do not use Gnome. It actually makes it harder to use, because now I have to go find some obscure way to just make the darn thing print.) A lot of people conflate ease of use with ease of learning, but in reality, I am talking about neither.

I am talking about ease of analysis. The Linux command line may not have pointy-clicky icons, but — at least at one time — once you understood ls -l and how groups, users, and permission bits interacted, you could fairly easily conclude who had access to what on a system. Now we have a situation where the answer to this is quite unclear in terms of desktop environments (apparently some distros ship network-manager so that all users on the system share the wifi passwords they enter. A surprise, eh?)
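To make “ease of analysis” concrete, here is a minimal Python sketch that decodes a file mode the way ls -l prints it and answers “who can read this?” from the permission bits alone. The helper name `who_can_read` and the sample mode are illustrative assumptions, not part of any tool discussed here.

```python
import stat

def who_can_read(mode):
    """Return which classes (owner, group, other) hold the read bit."""
    classes = []
    if mode & stat.S_IRUSR:
        classes.append("owner")
    if mode & stat.S_IRGRP:
        classes.append("group")
    if mode & stat.S_IROTH:
        classes.append("other")
    return classes

# A regular file with mode 0640, rendered as "ls -l" would show it.
mode = stat.S_IFREG | 0o640
print(stat.filemode(mode))   # -rw-r-----
print(who_can_read(mode))    # ['owner', 'group']
```

Three bits per class and you have the whole answer; that is the property that gets lost once a separate policy layer can override it.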

I don’t mind reading a manpage to learn about something, so long as the manpage was written to inform.

With this situation of dbus/cgmanager/polkit/etc, here’s what it feels like. This, to me, is the crux of the problem:

It feels like we are in a twisty maze, every passage looks alike, and our flashlight ran out of batteries in 2013. The manpages, to the extent they exist for things like cgmanager and polkit, describe the texture of the walls in our cavern, but don’t give us a map to the cave. Therefore, we are each left to piece it together little bits at a time, but there are traps that keep moving around, so it’s slow going.

And it’s a really big cave.

Other user perceptions

There are a lot of comments on the blog about this. It is clear that the problem is not specific to Debian. For instance:

Christopher writes that on Fedora, “annoying, niggling problems that used to be straightforward to isolate, diagnose and resolve by virtue of the platform’s simple, logical architecture have morphed into a morass that’s worse than the Windows Registry.” Alessandro Perucchi adds that he’s been using Linux for decades, and now his wifi doesn’t work, suspend doesn’t work, etc. in Fedora and he is surprisingly unable to fix it himself.
Nate bargman writes, in a really insightful comment, “I do feel like as though I’m becoming less a master of and more of a slave to the computing software I use. This is not a good thing.”
Singh makes the valid point that this stuff is in such a state of flux that even if a person is one of the few dozen in the world that understand what goes into a session today, the knowledge will be outdated in 6 months. (Hal, anyone?)

This stuff is really important, folks. People being able to maintain their own software, work with it themselves, etc. is one of the core reasons that Free Software exists in the first place. It is a fundamental value of our community. For decades, we have been struggling for survival, for relevance. When I started using Linux, it was both a question and an accomplishment to have a useable web browser on many platforms. (Netscape Navigator was closed source back then.) Now we have succeeded. We have GPL-licensed and BSD-licensed software running on everything from our smartphones to cars.

But we are snatching defeat from the jaws of victory, because just as we are managing to remove the legal roadblocks that kept people from true mastery of their software, we are erecting technological ones that make the step into the Free Software world so much more difficult than it needs to be.

We no longer have to craft Modelines for X, or compile a kernel with just the right drivers. This is progress. Our hardware is mostly auto-detected and our USB serial dongles work properly more often on Linux than on Windows. This is progress. Even our printers and scanners work pretty darn well. This is progress, too.

But in the place of all these things, now we have userspace mucking it up. We have people with mysterious errors that can’t be easily assisted by the elders in the community, because the elders are just as mystified. We have bugs crop up that would once have been shallow, but are now non-obvious. We are going to leave a sour taste in people’s mouth, and stir repulsion instead of interest among those just checking it out.

The ways out

It’s a nasty predicament, isn’t it? What are your ways out of that cave without being eaten by a grue?

Obviously the best bet is to get rid of the traps and the grues. Somehow the people that are working on this need to understand that elegance is a feature — a darn important feature. Sadly I think this ship may have already sailed.

Software diagnosis tools like Enrico Zini’s seat-inspect idea can also help. If we have something like an “ls for polkit” that can reduce all the complexity to something more manageable, that’s great.

The next best thing is a good map — good manpages, detailed logs, good error messages. If software would be more verbose about the permission errors, people could get a good clue about where to look. If manpages for software didn’t just explain the cavern wall texture, but explain how this room relates to all the other nearby rooms, it would be tremendously helpful.

At present, I am unsure if our problem is one of very poor documentation, or is so bad that good documentation like this is impossible because the underlying design is so complex it defies being documented in something smaller than a book (in which case, our ship has not just sailed but is taking on water).

Counter-argument: progress

One theme that came up often in the comments is that this is necessary for progress. To a certain extent, I buy that. I get why udev is important. I get why we want the DE software to interact well. But here’s my thing: this already worked well in wheezy. Gnome, XFCE, and KDE software all could mount/unmount my drives. I am truly still unsure what problem all this solved.

Yes, cloud companies have demanding requirements about security. I work for one. Making security more difficult to audit doesn’t do me any favors, I can assure you.

The systemd angle

To my surprise, systemd came up quite often in the discussion, despite the fact that I mentioned I wasn’t running systemd-sysv. It seems like the new desktop environment ecosystem is “the systemd ecosystem” in a lot of people’s minds. I’m not certain this is justified; systemd was not my first choice, but as I said in an earlier blog post, “jessie will still boot”.

A final note

I still run Debian on all my personal boxes and I’m not going to change. It does awesome things. For under $100, I built a music-playing system, with Raspberry Pis, fully synced throughout my house, using a little scripting and software. The same thing from Sonos would have cost thousands. I am passionate about this community and its values. Even when jessie releases with polkit and all the rest, I’m still going to use it, because it is still a good distro from good people.

Has modern Linux lost its way? (Some thoughts on jessie)

February 9, 2015DebianJohn Goerzen

For years, I used to run Debian sid (unstable) on all my personal machines. Laptops, workstations, sometimes even my personal servers years ago ran sid. Sid was, as its name implies, unstable. Sometimes things broke. But it wasn’t a big deal, because I could always get in there and fix it fairly quickly, whatever it was. It was the price I paid for the latest and greatest.

For the last number of months, I’ve dealt with a small but annoying issue in jessie: None of Nautilus, Thunar, or digikam (yes, that represents Gnome, XFCE, and KDE) can mount USB drives I plug in anymore. I just get “Not authorized to perform operation.” I can, of course, still mount -o uid=1000 /dev/sdc1 /mnt, but I miss the convenience of doing it this way.

One jessie system I switched to systemd specifically to get around this problem. It worked, but I don’t know why. I haven’t had the time to switch my workstation, and frankly I am concerned about it.

Here’s the crux of the issue: I don’t even know where to start looking. I’ve googled this issue, and found all sorts of answers pointing to polkit, or dbus, or systemd-shim, or cgmanager, or lightdm, or XFCE, or… I found a bug report of this exact problem — Debian #760281, but it’s marked fixed, and nobody replied to my comment that I’m still seeing it.

Nowhere is it documented that a Digikam mounting issue should have me looking in polkit, let alone cgmanager. And even once I find those packages, their documentation suffers from Bad Unix Documentation Disease: talking about the nitty-gritty weeds view of something, without bothering to put it in context. Here is the mystifying heading for the cgmanager(8) manpage:

cgmanager is a daemon to manage cgroups. Programs and users can make D-Bus requests to administer cgroups over which they have privilege. To ensure that users may not exceed their privilege in manipulating cgroups, the cgroup manager accepts regular D-Bus requests only from tasks within its own process-id and user namespaces. For tasks in private namespaces (such as containers), SCM-enhanced D-Bus calls are available. Using these manually is not recommended. Rather, each container is advised to run a cgproxy, which will forward plain D-Bus requests as SCM-enhanced D-Bus requests to the host cgmanager.

That’s about as comprehensible as Vogon poetry to me. How is cgmanager started? What does “SCM-enhanced” mean? And I even know what a cgroup is.

This has been going on for months, which has me also wondering: is it only me? (Google certainly suggests it’s not, and there are plenty of hits for this exact problem with many distros, and some truly terrible advice out there to boot.) And if not, why is something so basic and obvious festering for so long? Have we built something that’s too complex to understand and debug?

This is, in my mind, orthogonal to the systemd question. I used to be able to say Linux was clean, logical, well put-together, and organized. I can’t really say this anymore. Users and groups are not really determinative for permissions, now that we have things like polkit running around. (Yes, by the way, I am a member of plugdev.) Error messages are unhelpful (WHY was I not authorized?) and logs are nowhere to be found. Traditionally, one could twiddle who could mount devices via /etc/fstab lines and perhaps some sudo rules. Granted, you had to know where to look, but when you did, it was simple; only two pieces to fit together. I’ve even spent time figuring out where to look and STILL have no idea what to do.

systemd may help with some of this, and may hurt with some of it; but I see the problem more of an attitude of desktop environments to add features fast without really thinking of the implications. There is something to be said for slower progress if the result is higher quality.

Then as I was writing this, of course, my laptop started insisting that it needed the root password to suspend when I close the lid. And it’s running systemd. There’s another quagmire…

Update: Part 2 with some reactions to this and further thoughts is now available.

Home Automation, part 2: Z-Wave and ISY programming

January 31, 2015Home AutomationJohn Goerzen

In my part 1 post yesterday, I wrote about the start of the home automation project. I mentioned that I was using Insteon switches, and they mostly were working well (I forgot to mention an annoyance: you can dim, but not totally shut off, their LED status light.) Anyhow, the Insteon battery-operated sensors seem to not be as good as their Z-Wave competition.

Setting up Z-Wave

Z-Wave devices get joined to the Z-Wave network in much the same way Bluetooth devices get paired with each other. The first time you use a Z-Wave device, you join it to the controller. The controller assigns it an ID on your network, and both devices discover the best route to communicate with each other.

I discovered that the Z-Wave module on the ISY-994i has particularly poor reception. Combined with the generally short range of battery-operated Z-wave devices, this meant that my sensors didn’t work reliably. As with Insteon, AC-powered Z-wave devices tend to be repeaters, but I didn’t have any AC-powered Z-wave devices. I went looking for Z-wave repeaters, and found some. But then I discovered that a Z-wave relay-based appliance switch was actually $5 cheaper, acted as a repeater, and could be used as a switch down the road if needed. A couple of those solved my communications issues. You join them to the network as usual, but then either re-join the battery-powered sensors (so they see the new route) or do a “network heal” (every device on the network re-learns about its neighbors and routes to the other devices) so they see the new route.

Z-Wave Motion / Occupancy Sensors

If you have been in new buildings, chances are you have seen light switches with built-in occupancy sensors. My doctor’s office has these. They are typically the same size as a regular light switch, but with a passive infrared (PIR) motion sensor used to automatically turn the lights off after a timeout (or even on, when someone walks into the room.) These work fine for small rooms, but if you’ve ever been in a bathroom with a PIR lightswitch that goes off while you’re still in the room, you are aware of their faults for larger rooms!

Most of the time, when you read about “occupancy sensors”, it’s really a PIR sensor, and the term is used interchangeably with “motion detector.” Motion detectors in home automation systems are commonly used for a number of purposes: detecting occupancy of rooms for automatic control of lighting or HVAC systems, triggering alerts, or even locking or unlocking doors.

My friend told me he had poor luck with the Insteon PIR sensors, so I tried two different Z-Wave models: the $48 Aeon/Aeotec multi-sensor and the $30 Ecolink PIRZWAVE2-ECO. The Aeon device is clearly the more capable; it can be used indoors or out, and has a lot of options that can be configured by Z-Wave configuration parameters. It also has a ball/socket mount, so it can be easily aimed in different directions. It can draw power from 4 AAA batteries, or a mini USB cable (cable, but not power supply, included). The Aeon multi-sensor also has a temperature, humidity, and illumination (lux) sensor – but, as you’ll see, they have some drawbacks.

The Ecolink device is more basic; it has a few settings that can be altered via jumpers, but none that can be altered via Z-Wave configuration commands. That makes it a bit of a hassle, since you have to open it up to change them, and when you do, it triggers a “tamper alarm” that is both undocumented and never seems to go away. It is powered by a single CR123A 3V Lithium battery; these are about $2 each on Amazon.

Both units have a default PIR timeout of 4 minutes. That means that after sensing motion, they will transmit the “on” signal (meaning motion detected) and then transmit no further signals for at least 4 minutes. This is because operating the radio consumes far more battery power than simply monitoring the PIR sensor, and it cuts down on repeated on/off transmissions. In this configuration, both units should have battery life of 6 months to a year, I figure, with an edge for the Ecolink, perhaps.

Both can also be configured to transmit more frequently; the Ecolink has a jumper that can change its timeout to 10 seconds, whereas the Aeon can be configured over Z-Wave for any timeout between 10 seconds and 65535 seconds (values above 4 minutes are rounded to the nearest minute, for some reason.)
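As a small worked example of the Aeon timeout behavior described above, here is a Python sketch that maps a requested timeout to the value the sensor would presumably end up using. The function name `effective_timeout` is hypothetical, and the rounding rule (round half up to a whole minute) is an assumption; I only know that values above 4 minutes get rounded to the nearest minute, not exactly how.

```python
def effective_timeout(requested_seconds):
    """Approximate the PIR timeout the sensor would use, in seconds.

    ASSUMPTION: values above 240 s are rounded half-up to a whole minute.
    The device's exact rounding rule is not known here.
    """
    if not 10 <= requested_seconds <= 65535:
        raise ValueError("timeout must be between 10 and 65535 seconds")
    if requested_seconds <= 240:
        # At or below 4 minutes the value is used as given.
        return requested_seconds
    minutes = (requested_seconds + 30) // 60
    return minutes * 60

for requested in (10, 240, 250, 270, 300):
    print(requested, "->", effective_timeout(requested))
```

Under that assumption, asking for 250 seconds gets you 240, and asking for 270 gets you 300, which is worth knowing before you tune a lighting program around a precise number.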

Both also have adjustable sensitivity; on the Aeon this is via an adjustment knob, and on the Ecolink it’s via jumpers. And both can report their battery level to enable software alerts when it’s getting low.

The Aeon ships with its temperature, humidity, battery, and lux sensors disabled. This appears to be the cause of much confusion online, as they send one transmission and then no more. Sadly, one has to resort to the hard-to-find but very helpful MultiSensor engineering spec document to figure out the way to enable those sensors and set their interval. (It must be said, however, that I doubt most consumers will understand how to set bitfields, and this is either not covered or covered incorrectly in their other manuals.) This can be a real power drain, so I just set it to report battery level every 6 hours (so I can alert myself if it gets low) and leave it at that.
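To make the bitfield idea concrete, here is a minimal Python sketch of how such a "which reports to send" value is usually composed. The bit positions below are placeholders invented for illustration, not the values from the MultiSensor engineering spec; the real numbers have to come from that document. The names `Report` and `report_mask` are likewise hypothetical.

```python
from enum import IntFlag


class Report(IntFlag):
    # Hypothetical bit positions, for illustration only. The real values
    # must be taken from the MultiSensor engineering spec.
    BATTERY = 0b0001
    TEMPERATURE = 0b0010
    HUMIDITY = 0b0100
    LUX = 0b1000


def report_mask(*reports: Report) -> int:
    """OR the selected report flags together into one integer."""
    mask = Report(0)
    for report in reports:
        mask |= report
    return int(mask)


# Battery only, as described above, versus every report enabled.
battery_only = report_mask(Report.BATTERY)
everything = report_mask(
    Report.BATTERY, Report.TEMPERATURE, Report.HUMIDITY, Report.LUX
)

# A 6-hour reporting interval expressed in seconds.
interval_seconds = 6 * 60 * 60
```

The point is only that each sensor gets its own bit, and the single number written to the configuration parameter is the sum of the bits you want enabled.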

The Ecolink manual claims a detection radius of 39 feet and a total angle of 90° (45° left or right from center) and a 3-year battery life (I’m skeptical). It also is rated for indoor use only. Aeotec claims a detection radius of 16 feet and a total angle of 120°. Be skeptical of all these figures.

Once set up with good reception, both devices have been working fine.

Programming the controller

The ISY-994i-Zw Pro controller I mentioned in Part 1 has its own sort of programming language. It’s a lot better than the sort of GUI clicky mess that is found in most of these things, but it still has a sort of annoying Java-based editor where you select keywords from a menu and such. Although you can backup and restore the device, and import and export the programs, the file formats are XML and not really suitable for hand-editing. Sigh.

The language is limited, but gets the job done. Here is a simple program that turns off the fan in a bathroom 30 minutes after the light was turned off:

If
        Status  '1st Floor / Bathroom + Laundry / Main Bath Fan' is On
    And Status  '1st Floor / Bathroom + Laundry / Main Bath Light' is Off
 
Then
        Wait  30 minutes 
        Set '1st Floor / Bathroom + Laundry / Main Bath Fan' Off
 
Else
   - No Actions - (To add one, press 'Action')

The if/then/else clauses should probably be more properly called when/do/finally. The “if” clauses are evaluated whenever a relevant event occurs. If the If evaluates to true, it executes the Then section. The program in “Then” is uninterruptible except for “wait” and “repeat” statements. So in this case, the program begins, and if the light stays off and the fan stays on, 30 minutes later it turns off the fan. But if the light comes back on, the program aborts (since there is nothing in “else”). Similarly, if the fan goes off, the program aborts.
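The event-driven behavior described above can be modeled in a few lines of Python. This is a toy simulation under stated assumptions, not how the ISY is implemented: the class `BathFanProgram`, its methods, and the use of plain integers as "minutes" are all invented for illustration.

```python
class BathFanProgram:
    """Toy model of the fan program: the If is re-checked on every event."""

    WAIT_MINUTES = 30

    def __init__(self):
        self.fan_on = False
        self.light_on = False
        self.deadline = None  # minute at which the pending Wait expires

    def _condition(self):
        # Mirrors the If section: fan is On and light is Off.
        return self.fan_on and not self.light_on

    def on_event(self, now, *, fan_on=None, light_on=None):
        """A status change arrives; re-evaluate the If section."""
        if fan_on is not None:
            self.fan_on = fan_on
        if light_on is not None:
            self.light_on = light_on
        if self._condition():
            self.deadline = now + self.WAIT_MINUTES  # Then starts its Wait
        else:
            self.deadline = None  # a pending Wait is aborted; Else is empty

    def tick(self, now):
        """Advance simulated time; return True if the fan was switched off."""
        if self.deadline is not None and now >= self.deadline:
            self.fan_on = False
            self.deadline = None
            return True
        return False


program = BathFanProgram()
program.on_event(0, fan_on=True, light_on=True)  # someone is in the room
program.on_event(10, light_on=False)             # light off: Wait begins
fired_early = program.tick(39)                   # too soon, nothing happens
fired_on_time = program.tick(40)                 # 30 minutes later: fan off
```

Turning the light back on before minute 40 would call `on_event` again, clear the deadline, and leave the fan running, which is the "abort" case from the paragraph above.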

A more complicated example: motion-activated lamp

There is a table lamp in our living room controlled by Insteon. By adding a motion sensor in that room, it can be automatically turned on by someone walking through the room. Now, to be useful, I don’t want it to turn on during the day. I also don’t want it to turn on or adjust itself if other lights are on in the room, or if I turned it on myself; that could cause it to go on and off while I’m watching TV, for instance. It’s just to be helpful at night. Because the ISY-994i is pretty limited, having almost no control flow operations, this — as with many tasks — requires several “programs”.

First, here’s my “main”:

If
        Status  '1st Floor / Living + Dining Room / LR Table Lamp' is Off
    And $LR_Lamp_Lockout is 0
    And Status  '1st Floor / Living + Dining Room / ZW 005 Binary Sensor' is On
    And From    Sunset 
        To      Sunrise (next day)
    And $LR_Lamp_Motion_Working is not 0
    And Status  '1st Floor / Living + Dining Room / LR Floor Lamp' is Off
    And Status  '1st Floor / Living + Dining Room / Dining Light' is Off
    And Status  '1st Floor / Living + Dining Room / LR Light' is Off
 
Then
        Set '1st Floor / Living + Dining Room / LR Table Lamp' 22%
        Set '1st Floor / Living + Dining Room / LR S Remote (Table Lamp)' 22%
 
Else
   - No Actions - (To add one, press 'Action')

So, let’s look at the conditions. This program triggers if the lamp is off, the sensor is on, the time is between sunset and sunrise, and three other lights are off. (I will explain the lockout and motion_working variables later). If this is the case, it sets the lamp to 22% (and also informs the wall switch for it that the lamp is at 22%). I pick this precise value because it is unlikely I would manually set it via the wall switch, and therefore “lamp is 22% bright” doubles as a “lamp was turned on by this program” flag.

So this is the turning the lamp on bit. Let’s look at the program that turns it back off later:

If
        Status  '1st Floor / Living + Dining Room / LR Table Lamp' is 22%
    And (
             Status  '1st Floor / Living + Dining Room / ZW 005 Binary Sensor' is Off
          Or $LR_Lamp_Motion_Working is 0
          Or Status  '1st Floor / Living + Dining Room / LR Light' is not Off
          Or Status  '1st Floor / Living + Dining Room / LR Floor Lamp' is not Off
          Or Status  '1st Floor / Living + Dining Room / Dining Light' is not Off
        )
 
Then
        Wait  15 seconds
        Set '1st Floor / Living + Dining Room / LR S Remote (Table Lamp)' Off
        Set '1st Floor / Living + Dining Room / LR Table Lamp' Off
 
Else
   - No Actions - (To add one, press 'Action')

This is using a sensor with a 4-minute timeout, so the lamp will always be on for at least 4 minutes and 15 seconds. This program runs if the lamp is still at the program-set level (so if, for instance, I turned the lamp full on, the program does nothing to override my setting.) Then, it looks for a condition to trigger turning the lamp off, which could be any of the sensor indicating no more motion, another program detecting it’s lost communication with the sensor, or somebody turning on one of the bigger lights in the room. Then it simply sets the light (and the wall switch controlling it) to off.
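For readers who think better in a conventional language, the conditions of the two programs above can be restated as a pair of Python predicates. This is only a restatement for clarity; `RoomState` and the function names are hypothetical, and the three "other lights" are collapsed into one boolean.

```python
from dataclasses import dataclass

AUTO_LEVEL = 22  # percent; doubles as the "turned on by the program" marker


@dataclass
class RoomState:
    table_lamp: int = 0            # brightness in percent, 0 means off
    motion: bool = False           # last state reported by the sensor
    is_night: bool = False         # between sunset and sunrise
    lockout: int = 0               # mirrors $LR_Lamp_Lockout
    motion_working: int = 1        # mirrors $LR_Lamp_Motion_Working
    other_lights_on: bool = False  # floor lamp, dining light, or LR light


def should_turn_on(s: RoomState) -> bool:
    """Conditions of the "main" program."""
    return (
        s.table_lamp == 0
        and s.lockout == 0
        and s.motion
        and s.is_night
        and s.motion_working != 0
        and not s.other_lights_on
    )


def should_turn_off(s: RoomState) -> bool:
    """Conditions of the turn-off program (followed by a 15 second wait)."""
    return s.table_lamp == AUTO_LEVEL and (
        not s.motion or s.motion_working == 0 or s.other_lights_on
    )


walking_through_at_night = RoomState(motion=True, is_night=True)
lamp_set_by_hand = RoomState(table_lamp=100, motion=False, is_night=True)
```

Note that `should_turn_off` is false for `lamp_set_by_hand`: a lamp at any level other than 22% is treated as a manual choice and left alone.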

There are a couple more bits to this puzzle. What if the system turned the lamp on, but I really want it off? If I walked up to the wall switch and pushed “off”, the lamp would go off. And then, a couple seconds later, come back on, since the state of the system met the conditions for the lamp-on program. So we need a lockout that prevents this from happening. Here’s my “trigger lockout” program:

If
        Control '1st Floor / Living + Dining Room / LR Table Lamp' is switched Off
     Or Control '1st Floor / Living + Dining Room / LR Table Lamp' is switched Fast Off
     Or Control '1st Floor / Living + Dining Room / LR S Remote (Table Lamp)' is switched Fast Off
     Or Control '1st Floor / Living + Dining Room / LR S Remote (Table Lamp)' is switched Off
 
Then
        $LR_Lamp_Lockout += 1
        Run Program 'LR Lamp Clear Lockout' (If)
 
Else
   - No Actions - (To add one, press 'Action')

So if someone pushes the “off” button at the lamp switch box at the outlet it’s plugged into (unlikely), or at the wall switch, it increments the lockout variable and runs another program. This other program is unique in that it is flagged “disabled”, meaning it is never run automatically by the system, only when called by another program. Here’s the clear lockout program:

If
        $LR_Lamp_Lockout > 0
 
Then
        Wait  5 minutes 
        $LR_Lamp_Lockout  = 0
 
Else
   - No Actions - (To add one, press 'Action')

Thus by pushing the “off” button on the switch, the motion-triggered program won’t turn the lamp back on for at least 5 minutes.
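The lockout pair amounts to a counter plus a timer. Here is a small Python sketch of that idea, with time passed in as plain seconds. The `Lockout` class is hypothetical, and it assumes that pressing "off" again while the timer is running restarts the 5-minute wait.

```python
class Lockout:
    """Toy model of the trigger-lockout and clear-lockout programs."""

    CLEAR_AFTER = 5 * 60  # seconds

    def __init__(self):
        self.count = 0        # mirrors $LR_Lamp_Lockout
        self.clear_at = None  # when the pending Wait finishes

    def off_pressed(self, now):
        """Someone pushed Off or Fast Off at the lamp or the wall switch."""
        self.count += 1
        # Assumption: re-running the clear program restarts its Wait.
        self.clear_at = now + self.CLEAR_AFTER

    def tick(self, now):
        """Advance simulated time and clear the lockout when the Wait ends."""
        if self.clear_at is not None and now >= self.clear_at:
            self.count = 0
            self.clear_at = None

    @property
    def active(self):
        return self.count > 0


lockout = Lockout()
lockout.off_pressed(now=0)
lockout.tick(now=299)
still_locked = lockout.active   # True: 5 minutes have not passed yet
lockout.tick(now=300)
locked_after_wait = lockout.active  # False: motion may turn the lamp on again
```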

Before I had reliable Z-Wave communication to the device, I had some times where it would simply drop off the Z-Wave network until a reboot. This was particularly annoying if it occurred after having detected motion, since the state of the sensor in the ISY controller is simply whatever state it last received. Therefore, I wrote this program to check if it believes the motion sensor is working:

If
        Status  '1st Floor / Living + Dining Room / ZW 005 Binary Sensor' is On
 
Then
        $LR_Lamp_Motion_Working  = 1
        Wait  1 hour 
        $LR_Lamp_Motion_Working  = 0
 
Else
        $LR_Lamp_Motion_Working  = 1

We don’t really care if the motion sensor is broken when the status is off; all that happens then is no lamp turns on. So this program activates when the status is set to on. It flips the working flag to true, then waits for an hour. If the sensor shows no motion within that hour, the program skips to the else (keeping the flag true). But if it is still on after an hour, it decides “it must not be working” and sets the working flag to false. You can see that flag used in the other programs’ logic. But because of the Else, which is run whenever the conditions that caused the Then clause to run become false, as soon as the system receives “no motion”, it will flag the sensor as working again.
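The same "stuck sensor" check, written as a small Python watchdog. This is a sketch of the logic only; `MotionWatchdog` is a made-up name and time is again simulated as plain seconds.

```python
class MotionWatchdog:
    """Flags the sensor as not working if it reports motion for a full hour."""

    STUCK_AFTER = 60 * 60  # seconds

    def __init__(self):
        self.working = True   # same starting value the boot-time program sets
        self.on_since = None  # when the sensor last switched to "on"

    def sensor_changed(self, now, motion):
        # Both the Then and the Else branch begin by marking it as working.
        self.working = True
        self.on_since = now if motion else None

    def tick(self, now):
        """Advance simulated time; an hour of continuous "on" looks broken."""
        if self.on_since is not None and now - self.on_since >= self.STUCK_AFTER:
            self.working = False


watchdog = MotionWatchdog()
watchdog.sensor_changed(now=0, motion=True)
watchdog.tick(now=3599)
ok_before_hour = watchdog.working      # True
watchdog.tick(now=3600)
ok_after_hour = watchdog.working       # False: assumed to have dropped off
watchdog.sensor_changed(now=4000, motion=False)
ok_after_report = watchdog.working     # True again once "no motion" arrives
```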

The final piece to this puzzle is a program flagged to run at boot time of the controller:

If
   - No Conditions - (To add one, press 'Schedule' or 'Condition')
 
Then
        $LR_Lamp_Motion_Working  = 1
 
Else
   - No Actions - (To add one, press 'Action')

This simply initializes the motion working variable to a known state.

Keypads

The Insteon KeypadLinc is a nice device. It can control a single load directly, but all 8 buttons are fully responsive in the Insteon system. They each also have individually addressable LED backlights. They are commonly used to do things like “ALL OFF”, “TV” (to set lights for watching TV), “AWAY” (to set the system for everyone being out of the house for a while), etc. They are the size of a regular Decora switch, and I have installed one already, but haven’t programmed much of it yet.

REST API

The ISY has an extensive REST API, which I’ve used to integrate it a bit with my Debian systems. More on that another time.

Mobile app

Mobile apps are a common thing people look for in these systems. You can’t use the Insteon app with the ISY, but they recommend Mobilinc Pro. It does the job. Mobilinc tries to sell a $10/mo connection service, which is totally unnecessary if you can figure out SSL, and there are on-screen buttons to bypass the offer. Judging by the Google Play reviews, though, a lot of people thought they had to pay for it and uninstalled the app afterwards.

Future directions

Many people put in electronic door locks. I don’t plan to do that. I do plan to have the house systems be aware of the general state of things (is the house empty? is everyone asleep?) and do appropriate things with lighting and HVAC. I don’t really expect the savings in power for lighting control to pay for the system anytime soon. However, if it can achieve some savings in heating and cooling, it may well be able to pay for itself in a few years. So my big next step is thermostats that can integrate with all this.

I have had a water mess in my basement before, and water leak sensors are a very common item people deploy in these setups. I certainly plan to add a few of them.

Door-open sensors are also useful; they trigger more instantly and reliably than motion detectors and can be used in some nice ways (is it after dark and the door is opening when the house is vacant? If so, turn on the light nearby in case their hands are full.)

Issues

Some issues I ran into so far are already discussed above. One other major one involves SSL on the ISY-994i Pro. The method for adding SSL keys is cumbersome, and the processor on the device (which appears to run some sort of Java) is just not up to working with SSL. Apparently they only recently got it fast enough to work with 2048-bit keys. This is rather undocumented, though, so I obtained a cert for a 4096-bit key, my usual. Attempting to connect to the box with SSL appeared not just to hang that connection but to confuse a lot of other things on it as well. It turned out it wasn’t hung; it just took a minute and 45 seconds to complete the SSL handshake. Moral of the story: use 2048-bit keys, or use stunnel4 or some such to re-wrap the SSL communications with a stronger key.

The KeypadLinc backlights can be completely shut off, and both their on and off levels can be customized. I have it set to shut off the backlight during the day and turn it on at night. The wall switches, however, can’t have their brightness status LED bar entirely shut off. They can be made dim, but don’t ever go away. That’s rather annoying.

Also annoying is that Insteon doesn’t make switches in the traditional toggle switch style in colors other than white. As our house had mostly black switches, I was forced into the Decora style.

Overall thoughts

This has been a great learning experience for me in a number of ways. I have only begun to tap what the system can do, and the real benefits will probably come once I get the heating and cooling into the mix. It’s quite a nice way for a geek to go, and the improvements in lighting have also been popular with everyone else in the house.

First Steps with Home Automation and LED Lighting

January 30, 2015 | Home Automation | John Goerzen

I’ve been thinking about home automation — automating lights, switches, thermostats, etc. — for years. Literally decades, in fact. When I was a child, my parents had a RadioShack X10 control module and one or two target devices. I think I had fun giving people a “light show” turning on or off one switch and one outlet remotely.

But I was stuck — there are a daunting number of standards for home automation these days. Zigbee, UPB, Z-Wave, Insteon, and all sorts of Wifi-enabled things that aren’t really compatible with each other (hellooooo, Nest) or have their own “ecosystem” that isn’t all that open (helloooo, Apple). Frankly I don’t think that Wifi is a great home automation protocol; its power drain completely prohibits it being used in a lot of ways.

Earlier this month, my awesome employer had our annual meeting and as part of that our technical teams had some time for anyone to talk about anything geeky. I used my time to talk about flying quadcopters, but two of my colleagues talked about home automation. I had enough to have a place to start, and was hooked.

People use these systems to do all sorts of things: intelligently turn off lights when rooms aren’t occupied, provide electronic door locks (unlockable via keypad, remote, or software), remote control lighting and heating/cooling via smartphone apps, detect water leakage, control switches with awkward wiring environments, buttons to instantly set multiple switches to certain levels for TV watching, turning off lights left on, etc. I even heard examples of monitoring a swamp cooler to make sure it is being used correctly. The possibilities are endless, and my geeky side was intrigued.

Insteon and Z-Wave

Based on what I heard from my colleagues, I decided to adopt a hybrid network consisting of Insteon and Z-Wave.

Both are reliable protocols (with ACKs and retransmit), so they work far better than X10 did. Both have all sorts of controls and sensors available (browse around on smarthome.com for some ideas).

Insteon is a particularly interesting system: an integrated dual-mesh network. It has both powerline and RF signaling, and most hardwired Insteon devices act as repeaters for both the wired and RF network simultaneously. Insteon packets contain a maximum hop count that is decremented after each relay, and the packets repeat in such a way that they collide and strengthen one another. There is no need to maintain routing tables or anything like that; it simply scales nicely.
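The hop-count idea can be illustrated with a tiny flooding simulation in Python. This models only the "decrement and repeat" rule on an abstract graph; it says nothing about Insteon's timing or signaling, and the function name, the device names, and the `links` layout are all invented for the example.

```python
from collections import deque


def flood(links, source, max_hops):
    """Return every device that hears a message sent from `source`.

    `links` maps a device to the devices within its direct range.
    Each repeater relays only while the hop count is above zero, and
    decrements it when it does. No routing table is involved.
    """
    heard = {source}
    transmissions = deque([(source, max_hops)])
    while transmissions:
        sender, hops_left = transmissions.popleft()
        for neighbor in links.get(sender, ()):
            if neighbor in heard:
                continue  # already got this message, nothing new to do
            heard.add(neighbor)
            if hops_left > 0:
                transmissions.append((neighbor, hops_left - 1))
    return heard


# A chain of five devices, each in range only of its neighbors.
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

reach_with_2_hops = flood(chain, "A", max_hops=2)
reach_with_3_hops = flood(chain, "A", max_hops=3)
```

With two relays allowed, the message from A dies out at D; allowing a third relay gets it to E. Adding more repeaters changes `links` but not the algorithm, which is why nothing needs to be reconfigured when devices are added.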

This system addresses all sorts of potential complexities. It addresses the split-phase problem of powerline-only systems by using an RF bridge. It addresses long distances and outbuildings by using the powerline signaling. I found it to work quite well.

The downside to Insteon is that all the equipment comes from one vendor: Insteon. If you don’t like their thermostat or motion sensor, you don’t have any choice.

Insteon devices can be used entirely without a central controller. Light switches can talk to each other directly, and you can even set them up so that one switch controls dozens of others, if you have enough patience to go around your house pressing tiny “set” buttons.

Enter Z-Wave. Z-Wave is RF-only, and while it is also a mesh network, it is source-routed, meaning that if you move devices around, you have to “heal” your network as all your nodes have to re-learn the path to each other. It also doesn’t have the easy distance traversal of Insteon, of course. On the other hand, hundreds of vendors make Z-Wave products, and they mostly interoperate well. Z-Wave is said to scale practically to maybe two or three dozen devices, which would have been an issue for me, but with Insteon doing the heavy lifting and Z-Wave filling in the gaps, it’s worked out well.
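Why moving a device forces a "heal" is easier to see with a toy model of source routing. In this Python sketch the sender stores the full path in advance, so a path computed for the old layout simply stops working once a device moves. This is a conceptual illustration, not the actual Z-Wave algorithm, and every name in it is hypothetical.

```python
from collections import deque


def find_route(links, source, target):
    """Breadth-first search for a shortest path; stands in for a "heal"."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def route_works(links, route):
    """A stored route is usable only if every consecutive pair is in range."""
    return all(b in links.get(a, ()) for a, b in zip(route, route[1:]))


# Original layout: the sensor is reached through the switch.
before = {
    "controller": ["switch"],
    "switch": ["controller", "sensor"],
    "sensor": ["switch"],
}
stored_route = find_route(before, "controller", "sensor")

# The sensor is moved next to a plug-in module instead.
after = {
    "controller": ["switch", "plug"],
    "switch": ["controller"],
    "plug": ["controller", "sensor"],
    "sensor": ["plug"],
}
stale_route_ok = route_works(after, stored_route)          # False
healed_route = find_route(after, "controller", "sensor")   # via the plug
```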

Controlling it all

While Insteon (and, to a certain extent, Z-Wave) devices can control each other, to really spread your wings, you need more centralized control. This lets you have programs that do things like “if there’s motion in the room on a weekday and it’s dark outside, then turn on a light, and turn it back off 5 minutes later.”

Insteon has several options. One, you can buy their “power line modem” (PLM). This can be hooked up to a PC to run either Insteon’s proprietary software, or something open-source like MisterHouse, written in Perl. Or you can hook it up to a controller I’ll mention in a minute. Those looking for a fairly simple controller might get the Insteon 2242-222 Hub, which has the obligatory smartphone app and basic schedules.

For more sophisticated control, my friend recommended the ISY-994i controller. Not only does it have a much more capable programming language (though still frustrating), it supports both Insteon and Z-Wave in an integrated box, and has a comprehensive REST API for integrating with other things. I went this route.

First step: LED lighting

I began my project by replacing my light bulbs with LEDs. I found that I could get Cree 4-Flow 60W equivs for $10 at Home Depot. They are dimmable, a key advantage over CFL, and also maintain their brightness throughout their life far better. As I wanted to install dimmer switches, I got a combination of Cree 60W bulbs, Cree TW bulbs (which have a better color spectrum coverage for more true colors), and Cree 100W equiv bulbs for places I needed more coverage. I even put in a few LED flood lights to replace my can lights.

Overall I was happy with the LEDs. They are a significant improvement over the CFLs I had been using, and use even less power to boot. I have had issues with three Cree bulbs, though: one arrived broken, and two others have had issues such as being quite dim. They have a good warranty, but it seems their QA could be better. Also, they can have a tendency to flicker when dimmed, though this plagues virtually all LED bulbs.

I had done quite a bit of research. CNET has some helpful brightness guides, and Insteon has a color temperature chart. CNET also had a nifty in-depth test of LED bulbs.

Second step: switches

Once the LED bulbs were in place, I was then able to start installing smart switches. I picked up Insteon’s basic switch, the SwitchLinc 2477D at Menard’s. This is a dimmable switch and requires a neutral wire in the box, but acts as a dual-band repeater for the system as well.

The way Insteon switches work, they can be standalone, or controllers, responders, or both in a “scene”. A scene is where multiple devices act together. You can create virtual 3-way switches in a scene, or more complicated things where different lights are turned on at different levels.

Anyhow, these switches worked out quite well. I have a few boxes where there is no neutral wire, so I had to use the Insteon SwitchLinc 2474D in them. That switch is RF-only and is supposed to have a minimum load of 20W, though they seemed to work OK — albeit with limited range and the occasional glitch — with my LEDs. There is also the relay-based SwitchLinc 2477S for use with non-dimmable lights, fans, etc. You can also get plug-in modules for controlling lamps and such.

I found the Insteon devices mostly lived up to their billing. Occasionally I could provoke a glitch by changing from dimming to brightening in rapid succession on a remote switch controlling a load on a distant one. Twice I had to power cycle an Insteon switch that got confused (rather annoying when they’re hardwired). Other than that, though, it’s been solid. From what I gather, this stuff isn’t ever quite as reliable as a 1950s mechanical switch, but at least in this price range, this is about as good as it gets these days.

Well, this post got quite long, so I will have to follow up with part 2 in a little while. I intend to write about sensors and the Z-Wave network (which didn’t work quite as easily as Insteon), as well as programming the ISY and my lessons learned along the way.

My boys love 1986 computing

November 23, 2014 | Children & Computing | John Goerzen

Yesterday, Jacob (age 8) asked to help me put together a 30-year-old computer from parts in my basement. Meanwhile, Oliver (age 5) asked Laura to help him learn cursive. Somehow, this doesn’t seem odd for a Saturday at our place.

Let me tell you how this came about.

I’ve had a project going on for a while now to load data from old floppies. It’s been fun, and had a surprise twist the other day: my parents gave me an old TRS-80 Color Computer II (aka “CoCo 2”). It was, in fact, my first computer, one they got for me when I was in Kindergarten. It is nearly 30 years old.

I have been musing lately about the great disservice Apple did the world by making computers easy to learn — namely the fact that few people ever bother to learn about them. Who bothers to learn about them when, on the iPhone for instance, the case is sealed shut, the lifespan is 1 or 2 years for many purchasers, and the platform is closed in lots of ways?

I had forgotten how finicky computers used to be. But after some days struggling with IDE incompatibilities, booting issues, etc., when I actually managed to get data off a machine that had last booted in 1999, I had quite the sense of accomplishment, which I rarely have lately. I did something that was hard to do in a world where most of the interfaces don’t work with equipment that old (even if nominally they are supposed to.)

The CoCo is one of those computers normally used with a floppy drive or cassette recorder to store programs. You type DIR, and you feel the clack of the drive heads through the desk. You type CLOAD and you hear the relay click closed to turn on the tape motor. You wiggle cables around until they make contact just right. You power-cycle for the times when the reset button doesn’t quite do the job. The details of how it works aren’t abstracted away by innumerable layers of controllers, interfaces, operating system modules, etc. It’s all right there, literally vibrating your desk.

So I thought this could be a great opportunity for Jacob to learn a few more computing concepts, such as the difference between mass storage and RAM, plus a great way to encourage him to practice critical thinking. So we trekked down to the basement and came up with handfuls of parts. We brought up the computer, some joysticks, all sorts of tangled cables. We needed adapters, an old TV. Jacob helped me hook everything up, and then the moment of truth: success! A green BASIC screen!

I added more parts, but struck out when I tried to connect the floppy drive. The thing just wouldn’t start up right whenever the floppy controller cartridge was installed. I cleaned the cartridge. I took it apart, scrubbed the contacts, even did a re-seat of the chips. No dice.

So I fired up my CoCo emulator (xroar), and virtually “saved” some programs to cassette (a .wav file). I then burned those .wav files to an audio CD, brought up an old CD player from the basement, connected the “cassette in” plug to the CD player’s headphone jack, and presto — instant programs. (Well, almost. It takes a couple of minutes to load a program from audio codes.)

The picture above is Oliver cackling at one of the very simplest BASIC programs there is: “number find.” The computer picks a random number between 1 and 2000, and asks the user to guess it, giving a “too low” or “too high” clue with each incorrect guess. Oliver delighted in giving invalid input (way too high numbers, or things that weren’t numbers at all) and cackled at the sarcastic error messages built into the program. During Jacob’s turn, he got very serious about it, and is probably going to be learning about how to calculate halfway points before too long.
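For the curious, the "halfway point" strategy is a binary search. Here is a minimal Python sketch (not the BASIC program from the CoCo) that counts how many guesses the strategy needs for a number between 1 and 2000; the function name is made up for the example.

```python
def guesses_needed(secret, low=1, high=2000):
    """Count the guesses used when always guessing the halfway point."""
    count = 0
    while low <= high:
        guess = (low + high) // 2  # the halfway point of what is left
        count += 1
        if guess == secret:
            return count
        if guess < secret:
            low = guess + 1   # "too low": discard the lower half
        else:
            high = guess - 1  # "too high": discard the upper half
    raise ValueError("secret is outside the range")


# Try every possible secret number and keep the worst result.
worst_case = max(guesses_needed(n) for n in range(1, 2001))
```

Each guess throws away half of the remaining numbers, so `worst_case` comes out to 11: no number from 1 to 2000 takes more than eleven guesses to find.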

But imagine my pride when this morning, Jacob found the new CD I had made last night (correcting a couple recordings), found my one-line instruction on just part of how to load a program, and correctly figured out by himself all the steps to do in order (type CLOAD on the CoCo, advance the CD to the proper track, press play on the player, wait for it to load on the CoCo, then type RUN).

I ordered a replacement floppy controller off eBay tonight, and paid $5 for a coax adapter that should fix some video quality issues. I rescued some 5.25″ floppies from my trash can from another project, so they should have plenty of tools for exploration.

It is so much easier for them to learn how a disk drive works, and even what the heck a track is, when you can look at a floppy drive with the cover off and see the heads move. There are other things we can do with more modern equipment — Jacob has shown a lot of interest in Arduino projects — but I have so far drawn a blank on ways to really let kids discover how a modern PC (let alone a modern phone or tablet) works.

Update Nov. 24: Every so often, the world surprises me by deciding to, well, read one of my random blog posts. For the benefit of those of you that don’t already know my boys, you might want to know that among their common play activities are turning trees into pretend trains, typing at a manual typewriter, reading, writing their own books, using a cassette recorder, building a PC and learning to use bash or xmonad, making long paper tapes with an adding machine, playing records on a record player, building electric gizmos, and even making mud balls.

I am often asked about the role of the computer in their lives, given that my hobby and profession involve computers. The answer: less than that of most of their peers. I look for opportunities for them to learn by doing, discovering, playing, or imagining. I make no presumption that they will develop the passion for computers that I did. What I want is for them to have the curiosity and drive to learn everything there is to know about whatever they do develop a passion for, so they will be great at it.

The Changelog

Comments on family, technology, and society