Monthly Archives: January 2011

Unix Password and Authority Management

One of the things that everyone seems to do differently is managing passwords. We haven’t looked at ours in quite some time, despite growth of both the company and the IT department.

As I look at moving some things to the cloud, and shifting offsite backups from carrying tapes to a bank to backups via the Internet, I’m aware that the potential for mischief — whether intentional or not — is magnified. With cloud hosting, a person could, with the press of a button, wipe out the equivalent of racks of machines in a few seconds. With disk-based local and Internet-based offsite backups, the danger grows as well; someone could pretty quickly wipe out both the local and the remote backups.

Add to that the mysterious fact that many enterprise-targeted services allow only a single username/password per account, and make no provision for ACLs to delegate permissions to others. Even Rackspace Cloud has this problem, as does their JungleDisk backup product, along with many, many other offsite backup products. Amazon AWS seems to be the only real exception to this rule, and its ACL support is more than a little complicated.

So one of the questions we will have to address is the balance of who holds these passwords. Too many people, and the probability of trouble, intentional or not, rises. Too few, and productivity is harmed, and potentially also the ability to restore. (If only one person has the password, and that person is unavailable, company data may be as well.) The company does have some storage locations, including locked vaults and safe deposit boxes, that no IT people have access to. Putting a record of the passwords in those locations may be a good first step, since it places them in the control of people who can’t use them.

But we’ve been thinking of this as it pertains to our local systems as well. We have, for a number of years now, assigned a unique root password to every server. These passwords are then stored in a password-management tool, encrypted with a master password, and stored on a shared filesystem. Everyone in the department therefore can access every password.

Many places where I’ve worked used this scheme, or some variant of it. The general idea was that if root on one machine was compromised and the attacker got root’s password, unique passwords would prevent the attacker from simply trying that password on the other servers on the network and achieving a greater level of intrusion.

However, the drawback is that we now have more servers than anyone can really remember the passwords for. So many people are just leaving the password tool running. Moreover, while the attack described above is still possible, these days I worry more about automated intrusion attempts that most likely won’t try that attack vector.

A couple of ways we could go would be to use a single root password everywhere, or a small set of root passwords. Another option would be to not log in to root accounts at all — possibly even disabling their passwords — and requiring the use of user accounts plus sudo. This hasn’t been practical to date. We don’t want a bunch of machines to depend on LDAP just to be able to use root, and we haven’t been using a tool such as Puppet or cfengine to manage this stuff. Using such a tool is on our roadmap and could let us manage that approach more easily. But this approach has risks too. One is that if user accounts can get to root on many machines, we’re not really more secure than with a standard root password. Another is that it makes it more difficult to detect and enforce password expiration and systematic password changes.
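
To make that option concrete, here is a minimal sketch of the user-accounts-plus-sudo setup on a Debian-style box. The group and user names are invented, and this is only an illustration, not something we have deployed:

    # Lock root's password so nobody logs in to root directly
    passwd -l root
    # Give administrators personal accounts in a dedicated group
    addgroup admins
    adduser jdoe admins
    # In /etc/sudoers (edit with visudo), let that group do anything via sudo:
    #   %admins ALL=(ALL) ALL
    # And, if desired, refuse root logins over SSH in /etc/ssh/sshd_config:
    #   PermitRootLogin no

Of course, sudo on many machines is still root on many machines, which is exactly the first risk above; the win is mostly per-person credentials and an audit trail of who ran what.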

I’m curious what approaches other people are taking on this.

rdiff-backup, ZFS, and rsync scripts

rdiff-backup vs. ZFS

As I’ve been writing about backups, I’ve gone ahead and run some tests with rdiff-backup. I have been using rdiff-backup personally for many years now — probably since 2002, when I packaged it up for Debian. It’s a nice, stable system, but I always like to look at other options for things every so often.

rdiff-backup stores an uncompressed current mirror of the filesystem, similar to rsync. History is achieved by the use of compressed backwards binary deltas generated by rdiff (using the rsync algorithm). So, you can restore the current copy very easily — a simple cp will do if you don’t need to preserve permissions. rdiff-backup restores previous copies by applying all necessary binary deltas to generate the previous version.
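
For anyone who hasn’t used it, day-to-day operation looks roughly like this (paths and host names are just examples):

    # Back up /home to a local directory, or to a remote host over SSH
    rdiff-backup /home /backups/home
    rdiff-backup /home backupuser@backupbox::/backups/home

    # Restore a single file as it existed 10 days ago
    rdiff-backup -r 10D /backups/home/jdoe/notes.txt /tmp/notes.txt

    # See how much space the increments take, and drop history older than 3 months
    rdiff-backup --list-increment-sizes /backups/home
    rdiff-backup --remove-older-than 3M /backups/home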

Things I like about rdiff-backup:

  1. Bandwidth-efficient
  2. Reasonably space-efficient, especially where history is concerned
  3. Easily scriptable and nice CLI
  4. Unlike tools such as duplicity, there is no need to periodically run full backups — old backups can be deleted without impacting the ability to restore more current backups

Things I don’t like about it:

  1. Speed. It can be really slow. Deleting 3 months’ worth of old history takes hours. It has to unlink vast numbers of files — and that’s pretty much it, but it does it really slowly. Restores, backups, etc. are all slow as well. Even just getting a list of your increment sizes so you’d know how much space would be saved can take a very long time.
  2. The current backup copy is stored without any kind of compression, which is not at all space-efficient
  3. It creates vast numbers of little files that take forever to delete or summarize

So I thought I would examine how efficient ZFS would be. I wrote a script that would replay the rdiff-backup history — first it would rsync the current copy onto the ZFS filesystem and make a ZFS snapshot. Then each previous version was processed by my script (rdiff-backup’s files are sufficiently standard that a shell script can process them), and a ZFS snapshot created after each. This lets me directly compare the space used by rdiff-backup to that used by ZFS using actual history.
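
My actual script walked rdiff-backup’s increment files directly, but the general idea can be sketched like this; the dataset name and dates are invented, and this lazy version simply restores each old version to scratch space instead of parsing the increments:

    # Current mirror first, then a snapshot of it
    rsync -aH --delete --exclude=rdiff-backup-data /srv/rdiff-repo/ /tank/replay/
    zfs snapshot tank/replay@current

    # Then walk backwards through history, snapshotting after each version
    for date in 2010-12-01 2010-11-01 2010-10-01; do
        rdiff-backup --restore-as-of "$date" /srv/rdiff-repo /tmp/restore
        rsync -aH --delete /tmp/restore/ /tank/replay/
        zfs snapshot "tank/replay@$date"
        rm -rf /tmp/restore
    done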

I enabled gzip-3 compression and block dedup in ZFS.
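
That is just two properties on the dataset (the name tank/replay is from the sketch above):

    zfs set compression=gzip-3 tank/replay
    zfs set dedup=on tank/replay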

My backups were nearly 1TB in size and the amount of space I had available for ZFS was roughly 600GB, so I couldn’t test all of them. As it happened, I tested the ones that were the worst-case scenario for ZFS: my photos, music collection, etc. These files had very little duplication and very little compressibility. Plus a backup of my regular server that was reasonably compressible.

The total size of the data backed up with rdiff-backup was 583 GB. With ZFS, this came to 498GB. My dedup ratio on this was only 1.05 (meaning 5% or 25GB saved). The compression ratio was 1.12 (60GB saved). The combined ratio was 1.17 (85GB saved). Interestingly 498 + 85 = 583.

Remember that the data under test here was mostly a worst-case scenario for ZFS. It would probably have done better had I had the time to throw the rest of my dataset at it (such as the 60GB backup of my iPod, which would have mostly deduplicated with the backup of my music server).

One problem with ZFS is that dedup is very memory-hungry. This is common knowledge, and it is advertised that you need roughly 2GB of RAM per TB of disk when using dedup. I don’t have quite that much to dedicate to it, so ZFS got VERY slow and thrashed the disk a lot after the ARC grew to about 300MB. I found some tweakables in zfsrc and the zfs command that let me allow the ARC cache to grow bigger. But the machine in question has only 2GB RAM, and is doing lots of other things as well, so this barely improved anything. Note that this dedup RAM requirement is not out of line with what is expected from these sorts of solutions.

Even if I got an absolutely stellar dedup ratio of 2:1, that would get me at most 1TB. The cost of buying a 1TB disk is less than the cost of upgrading my system to 4GB RAM, so dedup isn’t worth it here.

I think the lesson is: think carefully about where dedup makes sense. If you’re storing a bunch of nearly-identical virtual machine images — the sort of canonical use case for this — go for it. A general fileserver — well, maybe you should just add more disk instead of more RAM.

Then that raises the question: if I don’t need dedup from ZFS, do I bother with it at all, or just use ext4 and LVM snapshots? I think ZFS still makes sense, given its built-in support for compression and very fast snapshots — LVM snapshots are known to cause serious degradation to write performance once enabled, a penalty ZFS doesn’t impose.

So I plan to switch my backups to use ZFS. A few observations on this:

  1. Some testing suggests that the time to delete a few months of old snapshots will be a minute or two with ZFS, compared to hours with rdiff-backup.
  2. ZFS has shown itself to be more space-efficient than rdiff-backup, even without dedup enabled.
  3. There are clear performance and convenience wins with ZFS.

Backup Scripts

So now comes the question of backup scripts. rsync is obviously a pretty nice choice here — and if used with --inplace, it may even play nicely with ZFS snapshots even when dedup is off. But let’s say I’m backing up a few machines at home, or perhaps dozens at work. There is a need to automate all of this. Specifically, there’s a need to:

  1. Provide scheduling, making sure that we don’t hammer the server with 30 clients all at once
  2. Provide for “run before” jobs to do things like snapshot databases
  3. Be silent on success and scream loudly via emails to administrators on any kind of error… and keep backing up other systems when there is an error
  4. Create snapshots and provide an automated way to remove old snapshots (or mount them for reading, as ZFS-fuse doesn’t support the .zfs snapshot directory yet); the snapshot housekeeping itself is sketched below
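
The snapshot part, at least, is easy with the zfs command itself; a rough sketch, with invented dataset names and dates:

    # Take tonight's snapshot and drop the one from three months ago
    zfs snapshot tank/backups/web1@2011-01-20
    zfs destroy tank/backups/web1@2010-10-20

    # List what exists, in a form a cleanup script could parse
    zfs list -t snapshot -o name,used,creation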

To date I haven’t found anything that looks suitable. I found a shell script system called rsbackup that does a large part of this, but something about using a script whose homepage is a forum makes me less than 100% confident.

On the securing the backups front, rsync comes with a good-looking rrsync script (inexplicably installed under /usr/share/doc/rsync/scripts instead of /usr/bin on Debian) that can help secure the SSH authorization. GNU rush also looks like a useful restricted shell.
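
For reference, the usual rrsync pattern is an authorized_keys entry that pins a backup key to a single, optionally read-only, directory. Something along these lines, with the key abbreviated and paths invented (on Debian the script may need to be uncompressed first):

    # Put the script somewhere sensible
    gunzip -c /usr/share/doc/rsync/scripts/rrsync.gz > /usr/local/bin/rrsync
    chmod +x /usr/local/bin/rrsync

    # In ~/.ssh/authorized_keys on the machine being backed up:
    command="/usr/local/bin/rrsync -ro /home",no-port-forwarding,no-pty,no-agent-forwarding ssh-rsa AAAA... backup@backupbox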

Research on deduplicating disk-based and cloud backups

Yesterday, I wrote about backing up to the cloud, looking specifically at cloud backup services. I’ve been researching various options there, but also various options for disk-based backups. I’d like to have both onsite and offsite backups, so both types are needed. It is also useful to think about how the two types of backups can be combined with minimal overhead.

For the onsite backups, I’d want to see:

  1. Preservation of ownership, permissions, etc.
  2. Preservation of symlinks and hardlinks
  3. Space-efficient representation of changes — ideally binary deltas or block-level deduplication
  4. Ease of restoring
  5. Support for backing up Linux and Windows machines

Deduplicating Filesystems for Local Storage

Although I initially thought of block-level deduplicating file systems as something to use for offsite backups, they could also make an excellent choice for onsite disk-based backups.

rsync-based dedup backups

One way to use them would be to simply rsync data to them each night. Since copies are essentially free, we could do (or use some optimized version of) cp -r current snapshot/2011-01-20 or some such to save off historic backups. Moreover, we’d get dedup both across and within machines. And, many of these can use filesystem-level compression.
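
With a filesystem that has cheap snapshots, one night’s run for one client could be as simple as the following sketch; host and dataset names are invented:

    # Pull the client's current state into its own dataset...
    rsync -aH --delete --numeric-ids --exclude=/proc/ --exclude=/sys/ root@web1:/ /tank/backups/web1/
    # ...then take the "free copy" as a snapshot rather than a cp -r
    zfs snapshot tank/backups/web1@$(date +%Y-%m-%d)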

The real upshot of this is that the entire history of the backups can be browsed as a mounted filesystem. It would be fast and easy to find files, especially when users call about a file they deleted at some point in the past but don’t remember when, exactly what it was called, or exactly where it was stored. We can do a lot more with find and grep to locate these things than we could with the restore console in Bacula (or any other backup program). Since it is a real mounted filesystem, we could also do fun things like make tarballs of it at will, zip parts up, scp them back to the file server, whatever. We could potentially even give users direct access to their files to restore things they need for themselves.
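
For instance, once the history is just dated directories under a mountpoint (however they got there: cp -r, snapshots, or clones), that hunt for a long-lost file becomes a one-liner or two; the paths below are invented:

    # Which nightly copies still contain the spreadsheet, and where did it live?
    find /tank/backups/fileserver/*/home/jdoe -iname '*budget*'
    # Or search by content across a month of copies
    grep -rl 'Q3 forecast' /tank/backups/fileserver/2011-01-*/home/jdoe/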

The downside of this approach is that rsync can’t store all the permissions unless it’s running as root on the system. Wrappers such as rdup around rsync could help with that. Another downside is that there isn’t a central scheduling/statistics service. We wouldn’t want the backup system to be hammered by 20 servers trying to send it data at once, so there’d be an element of rolling our own scripts, though not too bad. I’d have preferred not to authorize a backup server with root-level access to dozens of machines, but that may be inescapable in this instance.

Bacula and dedup

The other alternative I thought of was a system such as Bacula with disk-based “volumes”. A Bacula volume is normally a tape, but Bacula can just write them to disk files. This lets us use the powerful Bacula scheduling engine, logging service, pre-backup and post-backup jobs, etc. Normally this would be an egregious waste of disk space; Bacula, like most tape-heritage programs, will write out an entire new copy of a file if even one byte changes. I had thought that I could let block-level dedupe reduce the storage size of Bacula volumes, but after looking at the Bacula block format spec, this won’t be possible, as each block will have timestamps and such in it.

The good things about this setup revolve around using the central Bacula director. We need only install bacula-fd on each server to be backed up, and it has a fairly limited set of things it can do. Bacula already has built-in support for defining simple or complicated retention policies. Its director will email us if there is a problem with anything. And its logs and catalog are already extensive and enable us to easily find out things such as how long backups take, how much space they consume, etc. And it backs up Windows machines intelligently and comprehensively in addition to POSIX ones.

The downsides are, of course, that we don’t get all the features of having the entire history on the filesystem at once, and that space is used far less efficiently. Not only that, but recovering from a disaster would require a more extensive bootstrapping process.

A hybrid option may be possible: automatically unpacking Bacula backups onto the local filesystem after they’ve run. Dedupe should ensure this doesn’t take additional space — if the Bacula blocksize aligns with the filesystem blocksize, which is certainly not a given. It may also make sense to use Bacula for Windows and rsync/rdup for Linux systems.

This seems, however, rather wasteful and useless.

Evaluation of deduplicating filesystems

I set up and tested three deduplicating filesystems available for Linux: S3QL, SDFS, and zfs-fuse. I did not examine lessfs. I ran a similar set of tests for each:

  1. Copy /usr/bin into the fs with tar -cpf - /usr/bin | tar -xvpf - -C /mnt/testfs
  2. Run commands to sync/flush the disk cache. Evaluate time and disk used at this point.
  3. Rerun the tar command, putting the contents into a slightly different path in the test filesystem. This should consume very little additional space since the files will have already been there. This will validate that dedupe works as expected, and provide a hint about its efficiency.
  4. Make a tarball of both directories from the dedup filesystem, writing it to /dev/zero (to test read performance)

I did not attempt to flush read caches during this, but I did flush write caches. The test system has 8GB RAM, 5GB of which was free or in use by a cache. The CPU is a Core2 6420 at 2.13GHz. The filesystems which created files atop an existing filesystem had ext4 mounted noatime beneath them. ZFS was mounted on an LVM LV. I also benchmarked native performance on ext4 as a baseline. The data set consists of 3232 files and 516MB. It contains hardlinks and symlinks.

Here are my results. Please note the comments below as SDFS could not accurately complete the test.

Test                        ext4     S3QL     SDFS     zfs-fuse
First copy                  1.59s    6m20s    2m2s     0m25s
Sync/Flush                  8.0s     1m1s     0s       0s
Second copy + sync          N/A      0m48s    1m48s    0m24s
Disk usage after 1st copy   516MB    156MB    791MB    201MB
Disk usage after 2nd copy   N/A      157MB    823MB    208MB
Make tarball                0.2s     1m1s     2m22s    0m54s
Max RAM usage               N/A      150MB    350MB    153MB
Compression                 none     lzma     none     gzip-2

It should be mentioned that these tests pretty much ruled out SDFS. SDFS doesn’t appear to support local compression, and it severely bloated the data store, which was much larger than the original data. Moreover, it permitted any user to create and modify files, even if the permission bits said that the user couldn’t. tar gave many errors unpacking symlinks onto the SDFS filesystem, and du -s on the result threw up errors as well. Besides that, I noted that find found 10 fewer files than in my source data. Between the huge memory consumption, the data integrity concerns, and the inefficient disk storage, SDFS is out of the running for this project.

S3QL is optimized for storage to S3, though it can also store its files locally or on an sftp server — a nice touch. I suspect part of its performance problem stems from being designed for network backends, and using slow compression algorithms. S3QL worked fine, however, and produced no problems. Creating a checkpoint using s3qlcp (faster than cp since it doesn’t have to read the data from the store) took 16s.

zfs-fuse appears to be the most-used ZFS implementation on Linux at the moment. I set up a 2GB ZFS pool for this test, and set dedup=on and compression=gzip-2. When I evaluated compression in the past, I hadn’t looked at lzjb. I found a blog post comparing lzjb to the gzip options supported by ZFS and wound up using gzip-2 for this test.

ZFS really shone here. Compared to S3QL, it took 25s instead of over 6 minutes to copy the data over — and took only 28% more space. I suspect that if I had selected gzip-9 compression it would have been closer to S3QL in both time and space. But creating a ZFS snapshot was nearly instantaneous. Although zfs-fuse probably doesn’t have as many users as ZFS on Solaris, it is available in Debian and has good backing behind it. I feel safer using it than I do using S3QL. So I think ZFS wins this comparison.

I spent quite some time testing ZFS snapshots, which are instantaneous. (Incidentally, ZFS-fuse can’t mount them directly as documented, so you create a clone of the snapshot and mount that.) They worked out as well as could be hoped. Due to dedupe, even deleting and recreating the entire content of the original filesystem resulted in less than 1MB additional storage used. I also tested creating multiple filesystems in the zpool, and confirmed that dedupe even works between filesystems.
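
The clone workaround is only a couple of commands; roughly, with invented names:

    zfs clone tank/testfs@2011-01-20 tank/restore-tmp
    ls /tank/restore-tmp            # the clone mounts like any other dataset
    zfs destroy tank/restore-tmp    # clones are cheap; throw it away when done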

Incidentally — wow, ZFS has a ton of awesome features. I see why you OpenSolaris people kept looking at us Linux folks with a sneer now. Only our project hasn’t been killed by a new corporate overlord, so I guess that maybe didn’t work out so well for you… <grin>.

The Cloud Tie-In

This discussion leaves open another question: what to do about offsite backups? Assuming for the moment that I want to back them up over the Internet to some sort of cloud storage facility, there are roughly three options:

  1. Get an Amazon EC2 instance with EBS storage and rsync files to it. Perhaps run ZFS on that thing.
  2. Use a filesystem that can efficiently store data in S3 or Cloud Files (S3QL is the only contender here)
  3. Use a third-party backup product (JungleDisk appears to be the leading option)

There is something to be said for using a different tool for offsite backups — if there is some tool-level issue with one, the other set of backups is unaffected.

One of the nice things about JungleDisk is that bandwidth is free, and disk is the same $0.15/GB-month that Rackspace normally charges. JungleDisk also does block-level dedup, and has a central management interface. This all spells “nice” for us.

The only remaining question would be whether to just use JungleDisk to back up the backup server, or to put it on each individual machine as well. If it just backs up the backup server, then administrative burdens are lower; we can back up everything there by default and just not worry about it. On the other hand, if there is a problem with our main backups, we could be really stuck. So I’d say I’m leaning towards ZFS plus some sort of rsync solution, with JungleDisk for offsite.

I had two people suggest CrashPlan Pro on my blog. It looks interesting, but it is a very closed product, which makes me nervous. I like using standard tools and formats — that gives me more peace of mind, control, and recovery options. CrashPlan Pro supports multiple destinations and says that they do cloud hosting, but they don’t list pricing anywhere. So I’ll probably not mess with it.

I’m still very interested in what comments people may have on all this. Let me know!

Backing Up to the Cloud

I’ve recently been taking some big-picture looks at how we do things, and one thing that I think could be useful would be for us to back up a limited set of data to an offsite location several states away. Prices are cheap enough for this to make it useful. Services such as Amazon S3 and Rackspace Cloud Files (I’ve heard particularly good things about that one) seem to be perfect for this. I’m not quite finding software that does what I want, though. Here are my general criteria:

  1. Storage fees of $0.15 per gigabyte-month or less
  2. Free or cheap ($0.20 per gigabyte or less) bandwidth fees
  3. rsync-like protocol to avoid having to re-send those 20GB files that have 20MB of changes in their entirety every night
  4. Open Source and cross-platform (Linux, Windows, Mac, Solaris ideally; Linux and Windows at a minimum)
  5. Compression and encryption
  6. Easy way to restore the entire backup set or individual files
  7. Versatile include/exclude rules
  8. Must be runnable from scripts, cron, etc. without a GUI
  9. Nice to have: block or file-level de-duplication
  10. Nice to have: support for accurately backing up POSIX (user, group, permission bits, symlink, hard links, sparse files) and Windows filesystem attributes
  11. Nice to have: a point-and-click interface for the non-Unix folks to use to restore Windows files and routine restore requests

So far, here’s what I’ve found. I should note that not a single one of these solutions appears to handle hard links or sparse files correctly, meaning I can’t rely on them for complete system-level backups. That doesn’t mean they’re useless — I could still use them to back up critical user data — just less useful.

Of the Free Software solutions, Duplicity is a leading contender. It has built-in support for Amazon S3 and Rackspace Cloud Files storage. It uses rdiff, which is a standalone implementation of the rsync binary delta algorithm. So you send up a full backup, then binary deltas from that for incrementals. That makes it bandwidth-efficient for incremental backups, and storage-efficient. However, periodic full backups will have to be run, which will make it less bandwidth-efficient. (Perhaps not incredibly *often*, but they will still be needed.) Duplicity doesn’t offer block-level de-duplication or a GUI for the point-and-click folks. But it DOES offer the most Unixy approach and feels like a decent match for the task overall.
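
A typical duplicity session against S3 looks roughly like the following; the bucket name and paths are placeholders, and credentials come from the usual environment variables:

    export AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=...
    export PASSPHRASE=...          # used for the GnuPG encryption of the archives

    # The first run is a full backup; later runs send binary deltas
    duplicity /home/jdoe s3+http://example-bucket/home-jdoe

    # Restore one file as it was a week ago
    duplicity restore -t 7D --file-to-restore docs/report.odt \
        s3+http://example-bucket/home-jdoe /tmp/report.odt

    # Periodically start a new full chain, then expire old chains
    duplicity full /home/jdoe s3+http://example-bucket/home-jdoe
    duplicity remove-older-than 6M --force s3+http://example-bucket/home-jdoe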

The other service relying on Free Software is rsync.net, which supports rsync, sftp, scp, etc. protocols directly. That would be great, as it could preserve hard links and be compatible with any number of rsync-based backup systems. The downside is that it’s expensive — really expensive. Their cheapest rate is $0.32 per GB-month and that’s only realized if you store more than 2TB with them. The base rate is $0.80 per GB-month. They promise premium support and such, but I just don’t think I can justify that for what is, essentially, secondary backup.

On the non-Open Source side, there’s JungleDisk, which has a Server Edition that looks like a good fit. The files are stored on either S3 or Rackspace, and it seems to be a very slick and full-featured solution. The client, however, is proprietary though it does seem to offer a non-GUI command-line interface. They claim to offer block-level de-duplication which could be very nice. The other nice thing is that the server management is centralized, which presumably lets you easily automate things like not running more than one backup at a time in order to not monopolize an Internet link. This can, of course, be managed with something like duplicity with appropriate ssh jobs kicked off from appropriate places, but it would be easier if the agent just handled it automatically.

What are people’s thoughts about this sort of thing?

24 hours with Jacob

Friday, I wrote about the train trip Jacob and I were planning to take. Here’s the story about it.

Friday night, Jacob was super excited. He was running around the house, talking about trains. I had him pack his own backpack with toys this time, which were — you guessed it — trains. Plus train track. His usual bedtime is around 7. He was still awake in his room at about 11, too excited to sleep.

The train was an hour late into Newton, so I got up, got ready, and then went into Jacob’s room at 3:15AM. I put my arm around him and said his name softly. No response. I said, just a little louder, “Jacob, it’s time to wake up to go to the train station.” There was about a 2-second pause and then he sat bolt upright rubbing his eyes. A couple seconds later, in a very tired but clear voice, “OK dad, let’s go!” That is, I believe, a record waking-up speed for Jacob.

We went downstairs, got coats, mittens, hats, etc. on, made sure we had the stuffed butterfly he always sleeps with, and went out the door.

As usual, Jacob chattered happily during the entire 15-minute drive to the Amtrak station. One of these days I need to remember to record it because it’s unique. He described things to me ranging from the difference between freight and passenger trains, to what the dining car is all about, to tractors and how to ride them safely. Newton has some “winter lights”, and a few places still had Christmas lights, which were of course big hits.

We had to wait a few minutes at the Amtrak station, and Jacob hadn’t shown any signs of slowing down yet. He wanted to look at every Amtrak poster, picture, logo, or sign in the building. This generally meant me holding him up high while he leaned over to touch it and make out a few words. Then, of course, he would pick out minute details about the trains, such as how many coach cars he thought they had, and we’d visit about that for awhile.

We got on at about 4:20. We found our seats, and Jacob showed no signs of calming down, despite having had only 4 hours of sleep (instead of his usual 11) so far. We checked out the buttons for lights. And, of course, he excitedly yelled out, “Dad, the train is moving!”

He spent the next while mostly watching out his window, but also still exploring his space. Finally at about 5, I said, “Jacob, I am really tired. I am going to sleep now. Will you sleep too?” His response: “Oh sure dad, I will sleep with my eyes open!” As a result, no sleep was had for Jacob, and only a little for me.

The dining car opens for breakfast at 6:30, which is normally a rather foreign time for breakfast on the train for us. But we were both awake so I figured might as well go. So Jacob and I went to the dining car. We sat with a woman going from New Mexico to Lawrence for her grandpa’s funeral, though it was expected and she was having a good time on the train. Jacob turned completely shy, and refused to say a word, except maybe a few whispered into my ear.

He got his favorite railroad French toast, and had me “drizzle” some syrup on it. I used the word “drizzle” for syrup the first time he had French toast on the train, and if I fail to use that word in the dining car, I will hear about it in no uncertain terms from Jacob.

He loved his dining car breakfast, but we spent about an hour and a half there. He was really slow at eating because his face was pressed up against the window so much. But that was just fine; we had nowhere else to be, and the person eating breakfast with us enjoyed visiting (and, apparently, scaring the dining car staff with tales of bears in the New Mexico mountains). This was what the train trip was all about, after all.

We played in the lounge car for awhile. The almost floor-to-ceiling wrap-around windows provided a great view for him, and more opportunities to press his face against a window. We talked about the freight trains he saw, and noticed the snow on some of them. Then we found the back of the train and he got to look out the back window.

Back at our seat, he played with his toys for about 10 minutes, which was about all the use they got the entire trip. There was just too much else to enjoy.

When we used the restroom on the train, he’d comment on how much he liked the Amtrak soap. “It smells SO very very good!” He wanted to wash his hands on the train. By late morning, he had decided: “Dad, I LOVE this Amtrak soap. It smells like peaches! Shall your hands smell like peaches too?” And, when we’d get back up to our seats, he’d put his hands in my face, saying, “Dad, smell that! My hands smell like peaches! It was from the AMTRAK SOAP!”

At some point, he discovered the airline-style safety brochures in the seat back pockets. These were filled with diagrams of the train car, a few photos, and lots of icons with descriptions. I don’t know how many times I read the thing to him, or really how many times he then recited it to me from memory. It was a lot. He spent hours with those brochures.

Jacob had already told me that he wanted pizza for lunch, so I got him the kid-sized pizza. It wasn’t all that big, and he could have devoured at least half of it when hungry. But he was getting really tired and ate only a few bites of pizza and a few chips. Pretty soon he was leaning up against me, the window, and eventually had his head on the table in some tomato sauce. But he didn’t quite fall asleep by the time we went back to our seats, and of course was wide awake by that point.

Jacob loves spotting the word “Amtrak” on things. It was very exciting when he noticed his orange juice at breakfast, and milk at lunch, were “Amtrak juice” and “Amtrak milk” due to the logo on the cups. At dinner he noticed we had Amtrak plates, and when I pointed out that his metal fork had the Amtrak logo on it, he got very excited and had to check every piece of silverware within reach. “Dad, I have an Amtrak fork too!…. And dad, YOU also have an Amtrak fork! We ALL have Amtrak forks! *cackling laughter*”

I finally insisted that Jacob lie down for some quiet time. I closed the curtains, and he finally did fall asleep… less than an hour before our arrival in Galesburg. So by 2:15 he was up to 4.75 hours of sleep, I guess.

We stopped in the train station briefly, then started our walk to the Discovery Depot Children’s Museum, which was right nearby. Although I made no comment about it, Jacob said, “Dad, there is a train museum RIGHT HERE!” “Yes, you’re right Jacob. I can see a steam engine and some cars here.” “Let’s go in!” “I don’t think it’s open today.” “It IS open — shall we go check?” It wasn’t, and that was mighty sad — though when he spotted another old caboose sitting outside the children’s museum, the day suddenly seemed brighter. He complained of how cold he was, although my suggestion that he stop walking through the big piles of snowdrifts was met with a whiny, “But dad, I WANT to do that!”

We went inside the museum (having to walk right by the locked caboose — thankfully the people at the desk promised to unlock it for us when we were ready) and Jacob started to explore. There were some wooden play trains big enough for children to climb in, which he enjoyed, but in general he went from one thing to the next every minute or two, as he does when he’s really tired or overstimulated. Until, that is, he discovered the giant toy train table. It had a multi-level wooden track setup, and many toy trains with magnetic hitches. It was like what we have at home, only much bigger and fancier. He spent a LONG time with that. We then briefly explored the rest of the museum and went out into the caboose. It wasn’t the hit it might have been, possibly because there are several at the Great Plains Transportation Museum that he gets to go in on a somewhat regular basis.

After that, he was ready to go back into the museum, but I was feeling rather over-stimulated. On a day when the highs were still well below freezing, it seemed just about every family in Galesburg was crowded into the children’s museum, making it loud and crowded — which I don’t enjoy at all. So I suggested maybe it was snack time instead. A moment’s thought, then he started to pull me out of the caboose before I could get my gloves back on — “Yes dad, I think it IS snack time. Let’s go. Let’s go NOW!”

We walked over to Uncle Billy’s Bakery. Jacob spotted some sugar cookies shaped like mittens. Despite my reluctance to get him more sugar, he was so excited — plus I had barely prevented a meltdown at lunch by promising him that he would get dessert later in the day — so he picked two red mitten cookies. I got myself a wonderful peach muffin and a croissant and we sat down at one of the tables by the window. I taught Jacob how to hang his coat on his chair and he lit into those cookies.

I spotted a guy at the next table over wearing a BNSF jacket, and asked him if he worked for the railroad. He had retired as an engineer a couple of years ago, and had worked various jobs before that. He grew up in Manhattan, KS and so was interested in our trip — and very friendly. While we visited, Jacob devoured his cookies and increasing portions of my snack as well. He told us about a new shop — The Stray Cat — just two stores down that was having a grand opening event today. They make decorations and art out of basically discarded items, and had some really nifty things that I may have bought had I not been wanting for space in our backpack.

Then I spotted Sweets Old-Fashioned Ice Cream, Candy, and Soda Shop across the road. I figured he’d love it and I was already in for the sugar so might as well. He picked out some “birthday cake” flavor ice cream for himself. I got huckleberry ice cream, which he insisted on calling “purpleberry” and managed to get some tastes of as well.

After that, we went to the train station. It was about an hour until our train would be there. I wasn’t sure if we’d find enough to do, but I shouldn’t have worried. Earlier, we had made the happy discovery that the station’s restroom featured the Amtrak soap, so there was that. Then there was the model Amtrak train in the ticket window, which Jacob kept wanting to look at while I’d hold him. And also, the California Zephyr came in. We watched it arrive from the station window, saw people get off and on, and saw it leave — maybe the first time Jacob has witnessed all that in person. And, of course, we looked at the pictures in that train station. The ticketmaster gave Jacob a paper conductor’s hat with puzzles and mazes on the back side.

And then it was time to get onto our train back home. We ate dinner — Jacob again ate little and almost fell asleep — and got back to our seats. I let Jacob stay awake until about 8, when he was starting to get a bit fragile. It took him awhile to fall asleep, but he finally did at about 8:30.

Today he’s still been all excited. He will randomly tell us about bits of the trip, that the man at supper called his grilled cheese sandwich piece “little” when it was really big, what we did at the ice cream store, etc. And I do think that he is now a train safety expert.

All in all, I think that is probably the most excitement he’s ever had in 24 hours and it was a lot of fun to be with him for it!

Jacob & Dad & Trains

Back in July, our family took a train trip from Kansas to New York for Debconf10. And then in September, we went to Indiana.

The only train service from here leaves at about 3AM in both directions. So starting about November, Jacob started asking me, “Dad, will you wake me up in the middle of the night to go to the train station TODAY?” He didn’t seem to get it through his head that we didn’t have another trip planned, although we surely would at some point. It just couldn’t possibly be, right?

So around Christmas, I booked a round trip from here to Galesburg, IL for just Jacob and me. We’ll get on the train at 3AM Saturday morning, get to Galesburg about noon, and then head back home at 5PM, getting home again at, well, 3:30AM.

Jacob is super excited about this. When the tickets arrived, he didn’t yet know about the trip. I thought he’d be excited then, but the ticket sleeve had a picture of a toy train that he didn’t own, so he was somewhat sad. But starting the next day he was very excited. We wrote “Amtrak” on the Jan. 15 spot on his pharmacy calendar (a local pharmacy gives them away free each year). He carefully checked off each day as it went past. And he’s been getting increasingly excited all week.

Tonight he couldn’t really think, couldn’t really play, couldn’t really calm down. He jabbered about how he would sit by the window, how precisely I would wake him up, how his eyes would open up “right away”, and how we’d go straight there. He talked about how he would look out the window at the dark night, and was extra excited when I told him he’d see snow out the window, like in one of the Amtrak videos he likes to watch on YouTube. He has already placed his order for breakfast in the dining car: “French toast with syrup on top.”

He ran past the computer while I was looking at things to do in Galesburg, and saw I had a map up, and immediately noticed the train tracks. Then he pointed to the station, and said, “Dad, that says ‘Galesburg Amtrak’.” A rather stunned dad replied, “Yes indeed it does, Jacob.” I guess it was some combination of pre-reading and detective skills, but that surprised me.

Anyhow, this is the first trip with just Jacob and me. We’re going to have a blast, I’m sure. I may, however, wind up going 24 hours without sleep if his adrenaline level is any guide…

Looking back at 2010: reading

A year ago, I posted my reading list for 2010. I listed a few highlights, and a link to my Goodreads page, pointing out that this wasn’t necessarily a goal, just a list of things that sounded interesting.

I started off with Homer’s Iliad, which I tremendously enjoyed; I found parallels to modern life surprisingly common in that ancient tale. I enjoyed it so much, in fact, that I quickly jumped to a book that wasn’t on my 2010 list: The Odyssey. I made a somewhat controversial post suggesting that the Old Testament of the Bible can be read similarly to how we read The Odyssey. Homer turned out to be much more exciting than I’d expected.

Jordan’s Fires of Heaven (WoT #5) was a good read, though it is one of those books that sometimes is action-packed and interesting, and other times slow-moving and almost depressing. I do plan to continue with the series but I’m not enjoying it as much as I did at first.

War and Peace is something I started late last year. I’m about 400 pages into it, which means I’ve not even read a third of it yet. It has some moving scenes, and is a fun read overall, but the work it takes to keep all the many characters straight can be a bit frustrating at times.

Harvey Cox’s The Future of Faith was one of the highlights of the year. A thought-provoking read by someone who embraces both science and religion, it shows a vision of religion that returns to its earlier roots, less concerned about what particular truths a person believes in than about more fundamental issues.

Marcus Borg’s Jesus: Uncovering the Life, Teachings, and Relevance of a Religious Revolutionary began with a surprisingly engaging history lesson on how agriculture caused the formation of domination societies. It also described in a lot of detail how historians analyze ancient texts — their drafting, copying, etc. It paints a vivid portrait of Jewish society in the time that Jesus would have lived, and follows the same lines of thought as Cox regarding religion finally moving past the importance of intellectual assent to a set of statements.

Among books that weren’t on my 2010 list, I also read — and here I’m not listing all of them, just some highlights:

The Cricket on the Hearth in something of a Christmastime tradition of reading one of the shorter Dickens works. I enjoyed it, but not as much as I enjoyed A Christmas Carol last year. Perhaps I made up for that by watching Patrick Stewart as Scrooge instead.

How to Disappear Completely was a fun short humorous read, with a very well-developed first-person narrative.

Paralleling my interest in amateur radio, I read and studied three books in order to prepare myself for the different exams.

In something of a surprise, I laughed a lot at Sh*t My Dad Says, which was more interesting and funny than I expected it to be. All I can say is that Justin’s got quite the dad and quite the interesting childhood.

I even read two other recent releases: The Politician (about John Edwards) and Game Change (about the 2008 presidential race). Both were interesting, vibrant, and mostly unsourced — so hard to know exactly how much to take from them.

And finally, reflecting on travel before my first trip to Europe, I read Travel as a Political Act, which encourages us to find the fun in “my cultural furniture rearranged and my ethnocentric self-assuredness walloped.” And that was fun.

Now to make up the 2011 list…

Wikis, Amateur Radio, and Debian

As I have been getting involved with amateur radio this year, I’ve been taking notes on what I’m learning about certain things: tips from people on rigging up a bicycle antenna to achieve a 40-mile range, setting up packet radio in Linux, etc. I have long run a personal, private wiki where I put such things.

But I really wanted a convenient place to put this stuff in public. There was no reason to keep it private; in fact, I wanted to share with others what I’ve learned. And, as I wanted to let others add their tips if they wish, I set up a public MoinMoin instance. So far, most of my attention has focused on the amateur radio section of it.

This has worked out pretty well for me. Sometimes I will cut and paste tips from emails into there, and then after trying them out, edit them into a more coherent summary based on my experiences.

Now then, on to packet radio and Debian. Packet radio is a digital communications mode that runs on the amateur radio bands. It is a routable networking protocol that typically runs at 300bps, 1200bps, or 9600bps. My packet radio page gives a better background on it, but essentially AX.25 — the packet protocol — is similar to a scaled-down TCP/IP. One interesting thing about packet is that, since it can use the HF bands, it can have direct transcontinental wireless links; more common, though, are links spanning 30-50 miles on VHF and UHF.

Linux is the only operating system I know of that has AX.25 integrated as a first-class protocol in the kernel. You can create AX.25 sockets and use them with the APIs you’re familiar with already. Not only that, but the Linux AX.25 stack is probably the best there is, and it interfaces easily with TCP/IP — there are global standards for encapsulating TCP/IP within AX.25 and AX.25 within UDP, and both are supported on Linux. Yes, I have telnetted to a machine to work on it over VHF. Of Linux distributions, Debian appears to have the best AX.25 stack built-in.
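
On Debian, the ax25-tools and ax25-apps packages are enough to get on the air from the shell. A rough sketch, with the callsigns, serial device, and IP address invented:

    # /etc/ax25/axports defines a port: name, callsign, serial speed, paclen, window, description
    #   radio   N0CALL-1   1200   255   2   VHF packet

    kissattach /dev/ttyUSB0 radio 44.0.0.1   # attach a serial KISS TNC as AX.25 port "radio"
                                             # (the IP only matters if you run IP over AX.25)
    axcall radio N0CALL-2                    # open an interactive AX.25 connection to another station
    axlisten                                 # monitor decoded AX.25 frames on the attached ports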

The AX.25 support in Linux is great, but it’s rather under-documented. So I set up a page for packet radio on Linux. I’ve had a great deal of fun with this. It’s amazing what you can do running a real networking protocol at 300bps over long-distance radio. I’ve had real-time conversations with people, connected to their personal BBS and sent them mail, and even used AX.25 “nodes” (think of them as a kind of router or bridge; you can connect in to them and then connect back out on the same or different frequencies to extend your reach) to connect out to systems that I can’t reach directly.

MoinMoin has worked out well for this. It has an inviting theme and newbie-friendly interface (I want to encourage drive-by contributions).