Morning in the Skies

IMG_8515

This is morning. Time to fly. Two boys, happy to open the hangar door and get the plane ready.

It’s been a year since I passed the FAA exam and became a pilot. Memories like these are my favorite reminders of why I did. It is such fun to see people’s faces light up with the joy of flying a few thousand feet above ground, with the beauty and freedom and peace of the skies.

I’ve flown 14 different passengers in that time; almost every flight I’ve taken has been with people, which I enjoy. I’ve heard “wow” or “beautiful” so many times, and said it myself even more times.

IMG_6083

I’ve landed in two state parks, visited any number of wonderful small towns, seen historic sites and placid lakes, ascended magically over forests and plains. I’ve landed at 31 airports in 10 states, flying over 13,000 miles.

airports

Not once have I encountered anyone who was anything other than friendly, kind, and outgoing. And why not? After all, we’re working around magic flying carpet machines, right?

IMG_7867_bw

(That’s my brother before a flight with me, by the way)

Some weeks it is easy to be glum. This week has been that way for many, myself included. But then, whether you are in the air or on the ground, if you pay attention, you realize we still live in a beautiful world with many wonderful people.

And, in fact, I got a reminder of that this week. Not long after the election, I got in a plane, pushed in the throttle, and started the takeoff roll down a runway in the midst of an Indiana forest. The skies were the best kind of clear blue, and pretty soon I lifted off and could see for miles. Off in the distance, I could see the last cottony remnants of the morning’s fog, lying still in the valleys, surrounding the little farms and houses as if to give them a loving hug. Wow.

Sometimes the flight is bumpy. Sometimes the weather doesn’t cooperate, and it doesn’t happen at all. Sometimes you can fly across four large states and it feels as smooth as glass the whole way.

Whatever happens, at the end of the day, the magic flying carpet machine gets locked up again. We go home, rest our heads on our soft pillows, and if we so choose, remember the beauty we experienced that day.

Really, this post is not about being a pilot. This post is a reminder to pay attention to all that is beautiful in this world. It surrounds us; the smell of pine trees in the forest, the delight in the faces of children, the gentle breeze in our hair, the kind word from a stranger, the very sunrise.

I hope that more of us will pay attention to the moments of clear skies and wind at our back. Even at those moments when we pull the hangar door shut.

IMG_20160716_093627

Two Boys, An Airplane, Plus Hundreds of Old Computers

“Was there anything you didn’t like about our trip?”

Jacob’s answer: “That we had to leave so soon!”

That’s always a good sign.

When I first heard about the Vintage Computer Festival Midwest, I almost immediately got the notion that I wanted to go. Besides the TRS-80 CoCo II up in my attic, I also have fond memories of an old IBM PC with CGA monitor, a 25MHz 486, an Alpha also in my attic, and a lot of other computers along the way. I didn’t really think my boys would be interested.

But I mentioned it to them, and they just lit up. They remembered the YouTube videos I’d shown them of old line printers and punch card readers, and thought it would be great fun. I thought it could be a great educational experience for them too — and it was.

It also turned into a trip that combined being a proud dad with so many of my other interests. Quite a fun time.

IMG_20160911_061456

(Jacob modeling his new t-shirt)

Captain Jacob

Chicago being not all that close to Kansas, I planned to fly us there. If you’re flying yourself, solid flight planning is always important. I had already planned out my flight using electronic tools, but I always carry paper maps with me in the cockpit for backup. I got them out and the boys and I planned out the flight the old-fashioned way.

Here’s Oliver using a scale ruler (with markings for miles corresponding to the scale of the map) and Jacob doing the calculating for us. We measured the entire route and came to within one mile of the computer’s calculation for each segment — those boys are precise!

20160904_175519

We figured out how much fuel we’d use, where we’d make fuel stops, etc.

The day of our flight, we made it as far as Davenport, Iowa, when a chance of bad weather en route to Chicago convinced me to land there and drive the rest of the way. The boys saw that as part of the exciting adventure!

Jacob is always interested in maps, and had kept wanting to use my map whenever we flew. So I dug an old Android tablet out of the attic, put Avare on it (which has aviation maps), and let him use that. He was always checking it while flying, sometimes saying this over his headset: “DING. Attention all passengers, this is Captain Jacob speaking. We are now 45 miles from St. Joseph. Our altitude is 6514 feet. Our speed is 115 knots. We will be on the ground shortly. Thank you. DING”

Here he is at the Davenport airport, still busy looking at his maps:

IMG_20160909_183813

Every little airport we stopped at featured adults smiling at the boys. People enjoyed watching a dad and his kids flying somewhere together.

Oliver kept busy too. He loves to help me on my pre-flight inspections. He will report every little thing to me – a scratch, a fleck of paint missing on a wheel cover, etc. He takes it seriously. Both boys love to help get the plane ready or put it away.

The Computers

Jacob quickly gravitated towards a few interesting things. He sat for about half an hour watching this old Commodore plotter do its thing (click for video):

VID_20160910_142044

His other favorite thing was the phones. Several people had brought complete analog PBXs with them. They used them to demonstrate various old phone-related hardware; one had several BBSs running with actual modems, another had old answering machines and home-security devices. Jacob learned a lot about phones, including how to operate a rotary-dial phone, which he’d never used before!

IMG_20160910_151431

Oliver was drawn more to the old computers. He was fascinated by the IBM PC XT, which I explained was just about like a model I used to get to use sometimes. They learned about floppy disks and how computers store information.

IMG_20160910_195145

He hadn’t used joysticks much, and found Pong (“this is a soccer game!”) interesting. Somebody had also replaced the guts of a TRS-80 with a Raspberry Pi running a SNES emulator. This thoroughly confused me for a little while, and excited Oliver.

Jacob enjoyed an old TRS-80, which, through a modern Ethernet interface and a little computation help in AWS, provided an interface to Wikipedia. Jacob figured out the text-mode interface quickly. Here he is reading up on trains.

IMG_20160910_140524

I had no idea that Commodore made a lot of adding machines and calculators before they got into the home computer business. There was a vast table with that older Commodore hardware, too much to get in a single photo. But some of the adding machines had their covers off, so the boys got to see all the little gears and wheels and learn how an adding machine can do its printing.

IMG_20160910_145911

And then we get to my favorite: the big iron. Here is a VAX — a working VAX. When you have a computer that huge, it’s easier for the kids to understand just what something is.

IMG_20160910_125451

When we encountered the table from the Glenside Color Computer Club, featuring the good old CoCo IIs like what I used as a kid (and have up in my attic), I pointed out to the boys that “we have a computer just like this that can do these things” — and they responded “wow!” I think they are eager to try out floppy disks and disk BASIC now.

Some of my favorites were the old Unix systems, which are direct ancestors of what I’ve been working with for decades now. Here’s AT&T System V release 3 running on its original hardware:

IMG_20160910_144923

And there were a couple of Sun workstations there, making me nostalgic for my college days. If memory serves, this one is actually running on m68k in the pre-Sparc days:

IMG_20160910_153418

Returning home

After all the excitement of the weekend, both boys zonked out for a while on the flight back home. Here’s Jacob, sleeping with his maps still up.

IMG_20160911_132952

As we were nearly home, we hit a pocket of turbulence, the kind that feels as if the plane is dropping a bit (it’s perfectly normal and safe; you’ve probably felt that on commercial flights too). I was a bit concerned about Oliver; he is known to get motion sick in cars (and even planes sometimes). But what did I hear from Oliver?

“Whee! That was fun! It felt like a roller coaster! Do it again, dad!”

Easily Improving Linux Security with Two-Factor Authentication

2-Factor Authentication (2FA) is a simple way to help improve the security of your systems. It restricts the scope of damage if a machine is compromised. If, for instance, you have a security token or authenticator app on your phone that is required for ssh to a remote machine, then even if every laptop you use to connect to the remote is totally owned, an attacker cannot establish a new ssh session on their own.

There are a lot of tutorials out there on the Internet that get you about halfway there, so here is some more detail.

Background

In this article, I will be focusing on authentication in the style of Google Authenticator, which is a special case of OATH HOTP or TOTP. You can use the Google Authenticator app, FreeOTP, or a hardware token like Yubikey to generate tokens with this. They are all 100% compatible with Google Authenticator and libpam-google-authenticator.

The basic idea is that there is a pre-shared secret key. At each login, a different and unique token is required, which is generated based on the pre-shared secret key and some other information. With TOTP, the “other information” is the current time, implying that both machines must be reasonably well in sync time-wise. With HOTP, the “other information” is a count of the number of times the pre-shared key has been used. Both typically allow a “window” on the server side, so that tokens within a certain number of seconds (TOTP) or a certain number of login attempts (HOTP) still work.

The beauty of this system is that after the initial setup, no Internet access is required on either end to validate the key (though TOTP requires both ends to be reasonably in sync time-wise).

The basics: user account setup and ssh authentication

You can start with the basics by reading one of these articles: one, two, three. Debian/Ubuntu users will find both the pam module and the user account setup binary in libpam-google-authenticator.
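For reference, on Debian or Ubuntu the per-user setup boils down to a couple of commands; a minimal sketch (answer the prompts about time-based vs. counter-based codes, rate limiting, and so on as you prefer):

# install the PAM module and the setup tool
sudo apt-get install libpam-google-authenticator

# run this as the user who will be logging in; it writes ~/.google_authenticator
# and prints a secret / QR code to load into your authenticator app
google-authenticator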

For many, you can stop there. You’re done. But if you want to kick it up a notch, read on:

Enhancement 1: Requiring 2FA even when ssh public key auth is used

Let’s consider a scenario in which your system is completely compromised. Unless your ssh keys are also stored in something like a Yubikey Neo, they could wind up being compromised as well – if someone can read your files and sniff your keyboard, your ssh private keys are at risk.

So we can configure ssh and PAM so that an OTP token is required even for this scenario.

First off, in /etc/ssh/sshd_config, we want to change or add these lines:

UsePAM yes
ChallengeResponseAuthentication yes
AuthenticationMethods publickey,keyboard-interactive

This forces all authentication to pass two verification methods in ssh: publickey and keyboard-interactive. All users will have to supply a public key and then also pass keyboard-interactive auth. Normally keyboard-interactive auth prompts for a password, but we can change that in /etc/pam.d/sshd. I added this line at the very top of /etc/pam.d/sshd:

auth [success=done new_authtok_reqd=done ignore=ignore default=bad] pam_google_authenticator.so

This basically makes Google Authenticator both necessary and sufficient for keyboard-interactive in ssh. That is, whenever the system wants to use keyboard-interactive, rather than prompt for a password, it instead prompts for a token. Note that any user that has not set up google-authenticator already will be completely unable to ssh into their account.
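After changing sshd_config and PAM, restart sshd and test from a second terminal while keeping your existing session open, so a mistake can’t lock you out. Roughly (service name as on Debian; adjust for your init system):

# apply the new sshd configuration
sudo service ssh restart   # or: sudo systemctl restart ssh

# from another terminal or machine: you should be asked for your key's
# passphrase (if any) and then a "Verification code:" prompt from PAM
ssh user@yourserver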

Enhancement 1, variant 2: Allowing automated processes to root

On many of my systems, I have ~root/.ssh/authorized_keys set up to permit certain systems to run locked-down commands for things like backups. These are automated commands, and the above configuration will break them because I’m not going to be typing in codes at 3AM.

If you are very restrictive about what you put in root’s authorized_keys, you can exempt the root user from the 2FA requirement in ssh by adding this to sshd_config:

Match User root
  AuthenticationMethods publickey

This says that the only way to access the root account via ssh is to use the authorized_keys file, and no 2FA will be required in this scenario.

Enhancement 1, variant 3: Allowing non-pubkey auth

On some multiuser systems, some users may still want to use password auth rather than publickey auth. There are a few ways we can support that:

  1. Users without public keys will have to supply an OTP and a password, while users with public keys will have to supply a public key, an OTP, and a password
  2. Users without public keys will have to supply an OTP or a password, while users with public keys will have to supply a public key, an OTP, or a password
  3. Users without public keys will have to supply an OTP and a password, while users with public keys only need to supply the public key

The third option is covered in any number of third-party tutorials. To enable options 1 or 2, you’ll need to put this in sshd_config:

AuthenticationMethods publickey,keyboard-interactive keyboard-interactive

This means that to authenticate, you need to pass either publickey and then keyboard-interactive auth, or just keyboard-interactive auth.

Then in /etc/pam.d/sshd, you put this:

auth required pam_google_authenticator.so

As a sub-variant for option 1, you can add nullok here to permit auth from people who do not have a Google Authenticator configuration.

Or for option 2, change “required” to “sufficient”. You should not add nullok in combination with sufficient, because that could let people without a Google Authenticator config authenticate completely without a password at all.
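As a concrete sketch of option 1, assuming the Debian layout where /etc/pam.d/sshd pulls in the normal password check via common-auth (for option 2, use the same line with sufficient instead):

# /etc/pam.d/sshd (excerpt) -- option 1: verification code AND password
auth required pam_google_authenticator.so
@include common-auth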

Enhancement 2: Configuring su

A lot of other tutorials stop with ssh (and maybe gdm) but forget about the other ways we authenticate or change users on a system. su and sudo are the two most important ones. If your root password is compromised, you don’t want anybody to be able to su to that account without having to supply a token. So you can set up google-authenticator for root.

Then, edit /etc/pam.d/su and insert this line after the pam_rootok.so line:

auth       required     pam_google_authenticator.so nullok

The reason you put this after pam_rootok.so is because you want to be able to su from root to any account without having to input a token. We add nullok to the end of this, because you may want to su to accounts that don’t have tokens. Just make sure to configure tokens for the root account first.

Enhancement 3: Configuring sudo

This one is similar to su, but a little different. This lets you, say, secure the root password for sudo.

Normally, you might sudo from your user account to root (if so configured). You might have sudo configured to require you to enter in your own password (rather than root’s), or to just permit you to do whatever you want as root without a password.

Our first step, as always, is to configure PAM. What we do here depends on your desired behavior: do you want to require someone to supply both a password and a token, just a token, or either a token or a password? If you want to require just a token, put this at the top of /etc/pam.d/sudo:

auth [success=done new_authtok_reqd=done ignore=ignore default=bad] pam_google_authenticator.so

If you want to require a token and a password, change the bracketed string to “required”, and if you want a token or a password, change it to “sufficient”. As before, if you want to permit people without a configured token to proceed, add “nullok”, but do not use that with “sufficient” or the bracketed example here.

Now here comes the fun part. By default, if a user is required to supply a password to sudo, they are required to supply their own password. That does not help us here, because a user logged in to the system can read the ~/.google_authenticator file and then easily supply tokens for themselves. What you want to do is require them to supply root’s password. Here’s how I set that up in sudoers:

Defaults:jgoerzen rootpw
jgoerzen ALL=(ALL) ALL

So now, with the combination of this and the PAM configuration above, I can sudo to the root user without knowing its password — but only if I can supply root’s token. Pretty slick, eh?

Further reading

In addition to the basic tutorials referenced above, consider:

Edit: additional comments

Here are a few other things to try:

First, the libpam-google-authenticator module supports putting the Google Authenticator files in different locations and having them owned by a certain user. You could use this to, for instance, lock down all secret keys to be readable only by the root user. This would prevent users from adding, changing, or removing their own auth tokens, but would also let you do things such as reusing your personal token for the root account without a problem.
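As a hedged illustration (the path is just an example; check the module’s documentation for the exact option names and substitutions it supports), that might look like:

# keep all secrets under a root-owned directory, one file per user, and
# have the module read them as root rather than as the logging-in user
auth required pam_google_authenticator.so secret=/etc/google-authenticator/${USER} user=root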

Also, the pam-oath module does many of the same things as the libpam-google-authenticator module, but without some of the help for setup. It uses a single monolithic root-owned password file for all accounts.
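If you go this route, the PAM line and the monolithic file look roughly like this; take the exact field layout from the oath-toolkit documentation rather than from me, but note the secret is hex-encoded, unlike Google Authenticator’s base32:

# /etc/pam.d/<service>
auth required pam_oath.so usersfile=/etc/users.oath window=20

# /etc/users.oath (root-owned, mode 0600); one line per user
HOTP/T30/6 jgoerzen - 3132333435363738393031323334353637383930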

There is also an oathtool command that can be used to generate authentication codes from the command line.
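For example, to print the current time-based code from a base32 secret (the same kind of secret encoded in a Google Authenticator QR code):

# -b: secret is base32; --totp: time-based rather than counter-based
oathtool --totp -b JBSWY3DPEHPK3PXP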

All Aboard

“Aaaaaall Aboard!” *chug* *chug*

And so began a “trip” aboard our hotel train in Indianapolis, conducted by our very own Jacob and Oliver.

IMG_20160703_101438

Because, well, what could be more fun than spending a few days in the world’s only real Pullman sleeping car, on its original service track, inside a hotel?

IMG_20160703_101520

We were on a family vacation to Indianapolis, staying in what two railfan boys were sure to enjoy: a hotel actually built into part of the historic Indianapolis Union Station complex. This is the original train track and trainshed. They moved in the Pullman cars, then built the hotel around them. Jacob and Oliver played for hours, acting as conductors and engineers, sending their “train” all across the country to pick up and drop off passengers.

Opa!

Have you ever seen a kid’s face when you introduce them to something totally new, and they think it is really exciting, but a little scary too?

That was Jacob and Oliver when I introduced them to saganaki (flaming cheese) at a Greek restaurant. The conversation went a little like this:

“Our waitress will bring out some cheese. And she will set it ON FIRE — right by our table!”

“Will it burn the ceiling?”

“No, she’ll be careful.”

“Will it be a HUGE fire?”

“About a medium-sized fire.”

“Then what will happen?”

“She’ll yell ‘OPA!’ and we’ll eat the cheese after the fire goes out.”

“Does it taste good?”

“Oh yes. My favorite!”

It turned out several tables had ordered saganaki that evening, so whenever I saw it coming out, I’d direct their attention to it. Jacob decided that everyone should call it “opa” instead of saganaki because that’s what the waitstaff always said. Pretty soon whenever they’d see something appear in the window from the kitchen, there’d be craning necks and excited jabbering of “maybe that’s our opa!”

And when it finally WAS our “opa”, there were laughs of delight and I suspect they thought that was the best cheese ever.

Giggling Elevators

IMG_20160703_205544

Fun times were had pressing noses against the glass around the elevator. Laura and I sat on a nearby sofa while Jacob and Oliver sat by the elevators, anxiously waiting for someone to need to go up and down. They would point and wave at elevators coming down, and when elevator passengers waved back, Oliver would burst out giggling and run over to Laura and me with excitement.

Some history

IMG_20160704_161550

We got to see the grand hall of Indianapolis Union Station — what a treat to be able to set foot in this magnificent, historic space, the world’s oldest union station. We even got to see the office where Thomas Edison worked and where, as a hotel employee explained, he was fired for doing too many experiments on the job.

Water and walkways

Indy has a system of elevated walkways spanning quite a section of downtown. It can be rather complex navigating them, and after our first day there, I offered to let Jacob and Oliver be the leaders. Boy did they take pride in that! They stopped to carefully study maps and signs, and proudly announced “this way” or “turn here” – and were usually correct.

20160702_164754_Richtone(HDR)

And it was the same in the paddleboat we took down the canal. Both boys wanted to be in charge of steering, and we only scared a few other paddleboaters.

Fireworks

IMG_20160704_220332

Our visit ended with the grand fireworks show downtown, set off from atop a skyscraper. I had been scouting for places to watch from, and figured that a bridge-walkway would be great. A couple other families had that thought too, and we all watched the 20-minute show in the drizzle.

Loving brothers

By far my favorite photo from the week is this one, of Jacob and Oliver asleep, snuggled up next to each other under the covers. They sure are loving and caring brothers, and had a great time playing together.

IMG_20160702_071015

Building a home firewall: review of pfsense

For some time now, I’ve been running OpenWRT on an RT-N66U device. I initially set that up because I had previously been using my Debian-based file/VM server as a firewall, and this had some downsides: every time I wanted to reboot that, Internet for the whole house was down; shorewall took a fair bit of care and feeding; etc.

I’ve been having indications that all is not well with OpenWRT or the N66U in the last few days, and some long-term annoyances prompted me to search out a different solution. I figured I could buy an embedded x86 device, slap Debian on it, and be set.

The device I wound up purchasing happened to have pfsense preinstalled, so I thought I’d give it a try.

As expected, with hardware like that to work with, it was a lot more capable than OpenWRT and had more features. However, I encountered a number of surprising issues.

The biggest annoyance was that the system wouldn’t allow me to set up a static DHCP entry with the same IP for multiple MAC addresses. This is a very simple configuration in the underlying DHCP server, and OpenWRT permitted it without issue. It is quite useful so my laptop has the same IP whether connected by wifi or Ethernet, and I have used it for years with no issue. Googling it a bit turned up some rather arrogant pfsense people saying that this is “broken” and poor design, and that your wired and wireless networks should be on different VLANs anyhow. They also said “just give it the same hostname for the different IPs” — but it rejects this too. Sigh. I discovered, however, that downloading the pfsense backup XML file, editing the IP within, and re-uploading it gets me what I want with no ill effects!
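For anyone wanting to replicate that workaround: the static mappings live under the DHCP server section of the backup XML, and the end result is simply two entries pointing at the same address. Roughly (element names approximate, and the MACs and IP here are made up):

<!-- two static mappings for the same address, one per MAC -->
<staticmap>
  <mac>aa:bb:cc:dd:ee:01</mac>
  <ipaddr>192.168.1.50</ipaddr>
  <hostname>laptop</hostname>
</staticmap>
<staticmap>
  <mac>aa:bb:cc:dd:ee:02</mac>
  <ipaddr>192.168.1.50</ipaddr>
  <hostname>laptop</hostname>
</staticmap>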

So then I went to set up DNS. I tried to enable the “DNS Forwarder”, but it wouldn’t let me do that while the “DNS Resolver” was still active. Digging in just a bit, it appears that the DNS Forwarder and DNS Resolver both provide forwarding and resolution features; they just have different underlying implementations. This is not clear at all in the interface.

Next stop: traffic shaping. Since I use VOIP for work, this is vitally important for me. I dove in, and found a list of XML filenames for wizards: one for “Dedicated Links” and another for “Multiple Lan/Wan”. Hmmm. Some Googling again turned up that everyone suggests using the “Multiple Lan/Wan” wizard. Fine. I set it up, and noticed that when I started an upload, my download performance absolutely tanked. Some investigation showed that outbound ACKs weren’t being handled properly. The wizard had created a qACK queue, but neglected to create a packet match rule for it, so ACKs were not being dealt with appropriately. I fixed that with a rule of my own design, and now downloads are working better again. I also needed to boost the bandwidth allocated to qACK (setting it to 25% seemed to do the trick).

Then there were the firewall rules. The “interface” section is first-match-wins, whereas the “floating” section is last-match-wins. This is rather non-obvious.

Getting past all the interface glitches, however, the system looks powerful, solid, and well-engineered under the hood, and fairly easy to manage.

A great day for a flight with the boys

I tend to save up my vacation time to use in summer for family activities, and today was one of those days.

Yesterday, Jacob and Oliver enjoyed planning what they were going to do with me. They ruled out all sorts of things nearby, but they decided they would like to fly to Ponca City, explore the oil museum there, then eat at Enrique’s before flying home.

Of course, it is not particularly hard to convince me to fly somewhere. So off we went today for some great father-son time.

The weather on the way was just gorgeous. We cruised along at about a mile above ground, which gave us pleasantly cool air through the vents and a smooth ride. Out in the distance, a few clouds were trying to form.

IMG_20160627_141614

Whether I’m flying or driving, I’m always happy to pass a small airport. Here was the Winfield, KS airport (KWLD):

IMG_20160627_142106

This is a beautiful time of year in Kansas. The freshly-cut wheat fields are still a vibrant yellow. Other crops make a bright green, and colors just pop from the sky. A camera can’t do it justice.

They enjoyed the museum, and then Oliver wanted to find something else to do before we returned to the airport for dinner. A little exploring yielded the beautiful and shady Garfield Park, complete with numerous old stone bridges.

IMG_20160627_162121

Of course, the hit of any visit to Enrique’s is their “ice cream tacos” (sopapillas with ice cream). Here is Oliver polishing off his.

IMG_20160627_174345

They had both requested sightseeing from the sky on our way back, but both fell asleep so we opted to pass on that this time. Oliver slept through the landing, and I had to wake him up when it was time to go. I always take it as a compliment when a 6-year-old sleeps through a landing!

IMG_20160627_191524

Most small airports have a bowl of candy sitting out somewhere. Jacob and Oliver have become adept at finding them, and I will usually let them “talk me into” a piece of candy at one of them. Today, after we got back, they were intent on exploring the small gift shop back home, and each bought a little toy helicopter for $1.25. They may have been too tired to enjoy it though.

They’ve been in bed for a while now, and I’m still smiling about the day. Time goes fast when you’re having fun, and all three of us were. It is fun to see them inheriting my sense of excitement at adventure, and enjoying the world around them as they go.

The lady at the museum asked how we had heard about them, and noticed I drove up in an airport car (most small airports have an old car you can borrow for a couple hours for free if you’re a pilot). I told the story briefly, and she said, “So you flew out to this small town just to spend some time here?” “Yep.” “Wow, that’s really neat. I don’t think we’ve ever had a visitor like you before.” Then she turned to the boys and said, “You boys are some of the luckiest kids in the world.”

And I can’t help but feel like the luckiest dad in the world.

I’m switching from git-annex to Syncthing

I wrote recently about using git-annex for encrypted sync, but due to a number of issues with it, I’ve opted to switch to Syncthing.

I’d been using git-annex with real but noncritical data. Among the first issues I noticed was occasional but persistent high CPU usage spikes, which, once started, would persist apparently forever. I had an issue where git-annex tried to replace files I’d removed from its repo with broken symlinks, but the real final straw was a number of issues with the gcrypt remote repos. git-remote-gcrypt appears to have a number of issues with possible race conditions on the remote, and at least one of them somehow caused encrypted data to appear in a packfile on a remote repo. Why there was data in a packfile there, I don’t know, since git-annex is supposed to keep the data out of packfiles.

Anyhow, git-annex is still an awesome tool with a lot of use cases, but I’m concluding that live sync to an encrypted git remote isn’t quite there yet for me.

So I looked for alternatives. My main criteria were supporting live sync (via inotify or similar) and not requiring the files to be stored unencrypted on a remote system (my local systems all use LUKS). I found Syncthing met these requirements.

Syncthing is pretty interesting in that, like git-annex, it doesn’t require a centralized server at all. Rather, it forms basically a mesh between your devices. Its concept is somewhat similar to the proprietary Bittorrent Sync — basically, all the nodes communicate about what files and chunks of files they have, and the changes that are made, and immediately propagate as much as possible. Unlike, say, Dropbox or Owncloud, Syncthing can actually support simultaneous downloads from multiple remotes for optimum performance when there are many changes.

Combined with syncthing-inotify or syncthing-gtk, it has immediate detection of changes and therefore very quick propagation of them.

Syncthing is particularly adept at figuring out ways for the nodes to communicate with each other. It begins by broadcasting on the local network, so known nearby nodes can be found directly. The Syncthing folks also run a discovery server (though you can use your own if you prefer) that lets nodes find each other on the Internet. Syncthing will attempt to use UPnP to configure firewalls to let it out, but if that fails, the last resort is a traffic relay server — again, a number of volunteers host these online, but you can run your own if you prefer.

Each node in Syncthing has an RSA keypair, and what amounts to part of the public key is used as a globally unique node ID. The initial link between nodes is accomplished by pasting the globally unique ID from one node into the “add node” screen on the other; the user of the first node then must accept the request, and from that point on, syncing can proceed. The data is all transmitted encrypted, of course, so interception will not cause data to be revealed.

Really my only complaint about Syncthing so far is that, although it binds to localhost, the web GUI does not require authentication by default.
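If that bothers you, you can set a GUI username and password, either in the web interface’s settings or directly in Syncthing’s config.xml; the password is stored as a bcrypt hash, so it’s easiest to let the GUI hash it for you. The relevant section looks something like this (hash shortened):

<gui enabled="true" tls="true">
    <address>127.0.0.1:8384</address>
    <user>admin</user>
    <password>$2a$10$...</password>
</gui>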

There is an ITP open for Syncthing in Debian, but until then, their apt repo works fine. For syncthing-gtk, the trusty version of the webupd8 PPA works in Jessie (though be sure to pin it to a low priority if you don’t want it replacing some unrelated Debian packages).
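For reference, at the time of writing their repository setup looks something like this (verify the key and instructions on syncthing.net before trusting them):

# add the Syncthing release key and apt repository, then install
curl -s https://syncthing.net/release-key.txt | sudo apt-key add -
echo "deb https://apt.syncthing.net/ syncthing release" | sudo tee /etc/apt/sources.list.d/syncthing.list
sudo apt-get update
sudo apt-get install syncthing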

Mud, Airplanes, Arduino, and Fun

The last few weeks have been pretty hectic in their way, but I’ve also had the chance to take some time off work to spend with family, which has been nice.

Memorial Day: breakfast and mud

For Memorial Day, I decided it would be nice to have a cookout for breakfast rather than for dinner. So we all went out to the fire ring. Jacob and Oliver helped gather kindling for the fire, while Laura chopped up some vegetables. Once we got a good fire going, I cooked some scrambled eggs in a cast iron skillet, mixed with meat and veggies. Mmm, that was tasty.

Then we all just lingered outside. Jacob and Oliver enjoyed playing with the cats, and the swingset, and then…. water. They put the hose over the slide and made a “water slide” (more mud slide maybe).

IMG_7688

Then we got out the water balloon fillers they had gotten recently, and they loved filling up water balloons. All in all, we all just enjoyed the outdoors for hours.

MVI_7738

Flying to Petit Jean, Arkansas

Somehow, neither Laura nor I have ever really been to Arkansas. We figured it was about time. I had heard wonderful things about Petit Jean State Park from other pilots: it’s rather unique in that it has a small airport right in the park, a feature left over from when Winthrop Rockefeller owned much of the mountain.

And what a beautiful place it was! Dense forests with wonderful hiking trails, dotted with small streams, bubbling springs, and waterfalls all over; a nice lake, and a beautiful lodge to boot. Here was our view down into the valley at breakfast in the lodge one morning:

IMG_7475

And here’s a view of one of the trails:

IMG_7576

The sunset views were pretty nice, too:

IMG_7610

And finally, the plane we flew out in, parked all by itself on the ramp:

IMG_20160522_171823

It was truly a relaxing, peaceful, re-invigorating place.

Flying to Atchison

Last weekend, Laura and I decided to fly to Atchison, KS. Atchison is one of the oldest cities in Kansas, and has quite a bit of history to show off. It was fun landing at the Amelia Earhart Memorial Airport in a little Cessna, and then going to three museums and finding lunch too.

Of course, there is the Amelia Earhart Birthplace Museum, which is a beautifully-maintained old house along the banks of the Missouri River.

IMG_20160611_134313

I was amused to find this hanging in the county historical society museum:

IMG_20160611_153826

One fascinating find was a Regina Music Box, popular in the late 1800s and early 1900s. It operates on the same principles as the cylindrical music boxes you may have seen. But I am particularly impressed with the effort that would have gone into developing these discs in the pre-computer era, as of course the holes at the outer edge of the disc move faster than the inner ones. It would certainly take a lot of careful calculation to produce one of these. I found this one in the Cray House Museum:

VID_20160611_151504

An Arduino Project with Jacob

One day, Jacob and I got going with an Arduino project. He wanted flashing blue lights for his “police station”, so we disassembled our previous Arduino project, put a few things on the breadboard, I wrote some code, and there we go. Then he noticed an LCD in my Arduino kit. I hadn’t ever gotten around to using it yet, and of course he wanted it immediately. So I looked up how to connect it, found an API reference, and dusted off my C skills (that was fun!) to program a scrolling message on it. Here is Jacob showing it off:

VID_20160614_074802.mp4

How git-annex replaces Dropbox + encfs with untrusted providers

git-annex has been around for a long time, but I just recently stumbled across some of the work Joey has been doing to it. This post isn’t about its traditional roots in git or all the features it has for partial copies of large data sets, but rather about its live syncing capabilities like Dropbox. It takes a bit to wrap your head around, because git-annex is just a little different from everything else. It’s sort of like a different-colored smell.

The git-annex wiki has a lot of great information — both low-level reference and a high-level 10-minute screencast showing how easy it is to set up. I found I had to sort of piece together the architecture between those levels, so I’m writing this all down hoping it will benefit others that are curious.

If you just want to use it, you don’t need to know all this. But I like to understand how my tools work.

Overview

git-annex lets you set up a live syncing solution that requires no central provider at all, or can be used with a completely untrusted central provider. Depending on your usage pattern, this central provider could require only a few MBs of space even for repositories containing gigabytes or terabytes of data that is kept in sync.

Let’s take a look at the high-level architecture of the tool. Then I’ll illustrate how it works with some scenarios.

Three Layers

Fundamentally, git-annex takes layers that are all combined in Dropbox and separates them out. There is the storage layer, which stores the literal data bytes that you are interested in. git-annex indexes the data in storage by a hash. There is metadata, which covers things like the filename-to-hash mapping and revision history. And then there is an optional live signaling layer, used to drive the real-time syncing.

git-annex has several modes of operation, and the one that enables live syncing is called the git-annex assistant. It runs as a daemon, and is available for Linux/POSIX platforms, Windows, Mac, and Android. I’ll be covering it here.

The storage layer

The storage layer simply is blobs of data. These blobs are indexed by a hash, and can be optionally encrypted at rest at remote backends. git-annex has a large number of storage backends; some examples include rsync, a remote machine with ssh and git-annex installed, WebDAV, S3, Amazon Glacier, a removable USB drive, etc. There’s a huge list.

One of the git-annex features is that each client knows the state of each storage repository, as well as the capability set of each storage repository. So let’s say you have a workstation at home and a laptop you take with you to work or the coffee shop. You’d like changes on one to be instantly recognized on another. With something like Dropbox or OwnCloud, every file in the set you want synchronized has to reside on a server in the cloud. With git-annex, it can be configured such that the server in the cloud only contains a copy of a file until every client has synced it up, at which point it gets removed. Think about it – that is often what you want anyhow, so why maintain an unnecessary copy after it’s synced everywhere? (This behavior is, of course, configurable.) git-annex can also avoid storing in the cloud entirely if the machines are able to reach each other directly at least some of the time.

The metadata layer

Metadata about your files includes a mapping from the file names to the storage location (based on hashes), change history, and information about the status of each machine that participates in the syncing. On your clients, git-annex stores this using git. This detail is very useful to some, and irrelevant to others.

Some of the git-annex storage backends can support only storage (S3, for instance). Some can support both storage and metadata (rsync, ssh, local drives, etc.) You can even configure a backend to support only metadata (more on why that may be useful in a bit). When you are working with a git-backed repository for git-annex, it can hold data, metadata, or both.

So, to have a working sync system, you must have a way to transport both the data and the metadata. The transport for the metadata is generally rsync or git, but it can also be XMPP, in which case git changesets are basically wrapped up in XMPP presence messages. Joey says, however, that there are some known issues with XMPP servers sometimes dropping or reordering some XMPP messages, so he doesn’t encourage that method currently.

The live signaling layer

So once you have your data and metadata, you can already do syncs via git annex sync --contents. But the real killer feature here will be automatic detection of changes, both on the local and the remote. To do that, you need some way of live signaling. git-annex supports two methods.

The first requires ssh access to a remote machine where git-annex is installed. In this mode of operation, when the git-annex assistant fires up, it opens up a persistent ssh connection to the remote and runs the git-annex-shell over there, which notifies it of changes to the git metadata repository. When a change is detected, a sync is initiated. This is considered ideal.

A substitute can be XMPP, and git-annex actually converts git commits into a form that can be sent over XMPP. As I mentioned above, there are some known reliability issues with this and it is not the recommended option.

Encryption

When it comes to encryption, you generally are concerned about all three layers. In an ideal scenario, the encryption and decryption happens entirely on the client side, so no service provider ever has any details about your data.

The live signaling layer is encrypted pretty trivially; the ssh sessions are, of course, encrypted and TLS support in XMPP is pervasive these days. However, this is not end-to-end encryption; those messages are decrypted by the service provider, so a service provider could theoretically spy on metadata, which may include change times and filenames, though not the contents of files themselves.

The data layer also can be encrypted very trivially. In the case of the “dumb” backends like S3, git-annex can use symmetric encryption or a gpg keypair and all that ever shows up on the server are arbitrarily-named buckets.

You can also use a gcrypt-based git repository. This can cover both data and metadata — and, if the target also has git-annex installed, the live signaling layer. Using a gcrypt-based git repository for the metadata and live signaling is the only way to accomplish live syncing with 100% client-side encryption.

All of these methods are implemented in terms of gpg, and can support symmetric or public-key encryption.
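Setting up an encrypted special remote is a one-liner; here is a rough sketch (remote names, the gpg key id, and the ssh URL are placeholders; see the git-annex special remote documentation for the full parameter list):

# an encrypted S3 special remote; "hybrid" encryption uses your gpg key,
# while encryption=shared stores a symmetric key in the git repository instead
git annex initremote cloud type=S3 encryption=hybrid keyid=2512E3C7

# a gcrypt-based git remote, which can also carry the metadata
git annex initremote cryptgit type=gcrypt gitrepo=ssh://example.com/~/annex keyid=2512E3C7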

It should be noted here that the current release versions of git-annex need a one-character patch in order to fix live syncing with a remote using gcrypt. For those of you running jessie, I recommend the version in jessie-backports, which is presently 5.20151208. For your convenience, I have compiled an amd64 binary that can drop in over /usr/bin/git-annex if you have this version. You can download it and a gpg signature for it. Note that you only need this binary on the clients; the server can use the version from jessie-backports without issue.

Putting the pieces together: some scenarios

Now that I’ve explained the layers, let’s look at how they fit together.

Scenario 1: Central server

In this scenario, you might have a workstation and a laptop that sync up with each other by way of a central server that also has a full copy of the data. This is the scenario that most closely resembles Dropbox, box, or OwnCloud.

Here you would basically follow the steps in the git-annex assistant screencast: install git-annex on a server somewhere, and point your clients to it. If you want full end-to-end encryption, I would recommend letting git-annex generate a gpg keypair for you, which you would then need to copy to both your laptop and workstation (but not the server).

Every change you make locally will be synced to the server, and then from the server to your other PC. All three systems would be configured in the “client” transfer group.

Scenario 1a: Central server without a full copy of the data

In this scenario, everything is configured the same except the central server is configured with the “transfer” transfer group. This means that the actual data synced to it is deleted after it has been propagated to all clients. Since git-annex can verify which repository has received a copy of which data, it can easily enough delete the actual file content from the central server after it has been copied to all the clients. Many people use something like Dropbox or OwnCloud as a multi-PC syncing solution anyhow, so once the files have been synced everywhere, it makes sense to remove them from the central server.
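The assistant configures this through its web interface, but the underlying settings are just group and preferred-content expressions, which you can also set by hand; a sketch, assuming the central repository is known as the remote “server”:

# the central repository only holds content until the clients have it
git annex group server transfer
git annex wanted server standard

# this machine (and your other PCs) stay in the client group
git annex group . client
git annex wanted . standard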

This is often a good idea for people. There are some obvious downsides that are sometimes relevant. For instance, to add a third sync client, it must be able to initially copy down from one of the existing clients. Or, if you intend to access the data from a device such as a cell phone where you don’t intend for it to have a copy of all data all the time, you won’t have as convenient a way to download your data.

Scenario 1b: Split data/metadata central servers

Imagine that you have a shell or rsync account on some remote system where you can run git-annex, but don’t have much storage space. Maybe you have a cheap VPS or shell account somewhere, but it’s just not big enough to hold your data.

The answer to this would be to use this shell or rsync account for the metadata, but put the data elsewhere. You could, for instance, store the data in Amazon S3 or Amazon Glacier. These backends aren’t capable of storing the git-annex metadata, so all you need is a shell or rsync account somewhere to sync up the metadata. (Or, as below, you might even combine a fully distributed approach with this.) Then you can have your encrypted data pushed up to S3 or some such service, which presumably will grow to whatever size you need.

Scenario 2: Fully distributed

Like git itself, git-annex does not actually need a central server at all. If your different clients can reach each other directly at least some of the time, that is good enough. Of course, a given client will not be able to do fully automatic live sync unless it can reach at least one other client, so changes may not propagate as quickly.

You can simply set this up by making ssh connections available between your clients. git-annex assistant can automatically generate appropriate ~/.ssh/authorized_keys entries for you.

Scenario 2a: Fully distributed with multiple disconnected branches

You can even have a graph of connections available. For instance, you might have a couple machines at home and a couple machines at work with no ability to have a direct connection between them (due to, say, firewalls). The two machines at home could sync with each other in real-time, as could the two machines at work. git-annex also supports things like USB drives as a transport mechanism, so you could throw a USB drive in your pocket each morning, pop it in to one client at work, and poof – both clients are synced up over there. Repeat when you get home in the evening, and you’re synced there. The USB drive’s repository can, of course, be in the “transfer” group so data is automatically deleted from it once it’s been synced everywhere.

Scenario 3: Hybrid

git-annex can support LAN sync even if you have a central server. If your laptop, say, travels around but is sometimes on the same LAN as your PC, git-annex can easily sync directly between the two when they are reachable, saving a round-trip to the server. You can assign a cost to each remote, and git-annex will always try to sync first to the lowest-cost path that is available.
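Costs are per-remote git configuration; lower costs are tried first, and only the relative ordering matters. A sketch with made-up remote names:

# prefer the LAN-reachable PC over the round trip through the server
git config remote.homepc.annex-cost 100
git config remote.server.annex-cost 200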

Drawbacks of git-annex

There are some scenarios where git-annex with the assistant won’t be as useful as one of the more traditional instant-sync systems.

The first and most obvious one is if you want to access the files without the git-annex client. For instance, many of the other tools let you generate a URL that you can email to people, and then they can download files without any special client software. This is not directly possible with git-annex. You could, of course, make something like a public_html directory be managed with git-annex, but it wouldn’t provide things like obfuscated URLs, password-protected sharing, time-limited sharing, etc. that you get with other systems. While you can share your repositories with others that have git-annex, you can’t share individual subdirectories; for a given repository, it is all or nothing.

The Android client for git-annex is a pretty interesting thing: it is mostly a small POSIX environment, providing a terminal, git, gpg, and the same web interface that you get on a standalone machine. This means that the git-annex Android client is fully functional compared to a desktop one. It also has a quick setup process for syncing off your photos/videos. On the other hand, the integration with the Android ecosystem is poor compared to most other tools.

Other git-annex features

git-annex has a lot to offer besides the git-annex assistant. Besides the things I’ve already mentioned, any given git-annex repository — including your client repository — can have a partial copy of the full content. Say, for instance, that you set up a git-annex repository for your music collection, which is quite large. You want some music on your netbook, but don’t have room for it all. You can tell git-annex to get or drop files from the netbook’s repository without deleting them remotely. git-annex has quite a few ways to automate and configure this, including making sure that at least a certain number of copies of a file exist in your git-annex ecosystem.
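The commands for that are straightforward; a quick sketch (paths and the copy count are just examples):

# fetch one album onto the netbook, and drop another to free space
# without removing it from the other repositories
git annex get music/FavoriteAlbum
git annex drop music/RarelyPlayed

# insist that at least 2 copies of everything exist across the ecosystem
git annex numcopies 2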

Conclusion

I initially started looking at git-annex due to the security issues with encfs, and the difficulty with setting up ecryptfs in this way. (I had been layering encfs atop OwnCloud). git-annex certainly ticks the box for me security-wise, and obviously anything encrypted with encfs wasn’t going to be shared with others anyhow. I’ll be using git-annex more in the future, I’m sure.

Update 2016-06-27: I had some issues with git-annex in this configuration.