Category Archives: Linux

Long-Range Radios: A Perfect Match for Unix Protocols From The 70s

It seems I’ve been on a bit of a vintage computing kick lately. After connecting an original DEC vt420 to Linux and resurrecting some old operating systems, I dove into UUCP.

In fact, it so happened that earlier in the week, my used copy of Managing UUCP & Usenet by none other than Tim O’Reilly arrived. I was reading about the challenges of networking in the 70s: half-duplex lines, slow transmission rates, and modems that had separate dialers. And then I stumbled upon long-distance radio. It turns out that a lot of modern long-distance radio has much in common with the challenges of communication in the 1970s – 1990s, and some of our old protocols might be particularly well-suited for it. Let me explain — I’ll start with the old software, and then talk about the really cool stuff going on in hardware (some radios that can send a signal for 10-20km or more with very little power!), and finally discuss how to bring it all together.

UUCP

UUCP, for those of you that may literally have been born after it faded in popularity, is a batch system for exchanging files and doing remote execution. For users, the uucp command copies files to or from a remote system, and uux executes commands on a remote system. In practical terms, the most popular use of this was to use uux to execute rmail on the remote system, which would receive an email message on stdin and inject it into the system’s mail queue. All UUCP commands are queued up and transmitted when a “call” occurs — over a modem, TCP, ssh pipe, whatever.

UUCP had to deal with all sorts of line conditions: very slow lines (300bps), half-duplex lines, noisy and error-prone communication, poor or nonexistent flow control, even 7-bit communication. It supports a number of different transport protocols that can accommodate these varying conditions. It turns out that these mesh fairly perfectly with some properties of modern long-distance radio.

AX.25

The AX.25 stack is a frame-based protocol used by amateur radio folks. Its air speed is 300bps, 1200bps, or (rarely) 9600bps. The Linux kernel has support for the AX.25 protocol and it is quite possible to run TCP/IP atop it. I have personally used AX.25 to telnet to a Linux box 15 miles away over a 1200bps air speed, and have also connected all the way from Kansas to Texas and Indiana using 300bps AX.25 using atmospheric skip. AX.25 has “connected” packets (as TCP) and unconnected/broadcast ones (similar to UDP) and is a error-detected protocol with retransmit. The radios generally used with AX.25 are always half-duplex and some of them have iffy carrier detection (which means collision is frequent). Although the whole AX.25 stack has grown rare in recent years, a subset of it is still in wide use as the basis for APRS.

A lot of this is achieved using equipment that’s not particularly portable: antennas on poles, radios that transmit with anywhere from 1W to 100W of power (even 1W is far more than small portable devices normally use), etc. Also, under the regulations of the amateur radio service, transmitters must be managed by a licensed operator and cannot be encrypted.

Nevertheless, AX.25 is just a protocol and it could, of course, run on other kinds of carriers than traditional amateur radios.

Long-range low-power radios

There is a lot being done with radios these days, much of which I’m not going to discuss. I’m not covering very short-range links such as Bluetooth, ZigBee, etc. Nor am I covering longer-range links that require large and highly-directional antennas (such as some are doing in the 2.4GHz and 5GHz bands). What I’m covering is long-range links that can be used by portable devices.

There is always a compromise in radios, and if we are going to achieve long-range links with poor antennas and low power, the compromise is going to be in bitrate. These technologies may scale down to as low at 300bps or up to around 115200bps. They can, as a side bonus, often be quite cheap.

HC-12 radios

HC-12 is a radio board, commonly used with Arduino, that sports 500bps to 115200bps communication. According to the vendor, in 500bps mode, the range is 1800m or 0.9mi, while at 115200bps, the range is 100m or 328ft. They’re very cheap, at around $5 each.

There are a few downsides to HC-12. One is that the lowest air bitrate is 500bps, but the lowest UART bitrate is 1200bps, and they have no flow control. So, if you are running in long-range mode, “only small packets can be sent: max 60 bytes with the interval of 2 seconds.” This would pose a challenge in many scenarios: though not much for UUCP, which can be perfectly well configured to have a 60-byte packet size and a window size of 1, which would wait for a remote ACK before proceeding.

Also, they operate over 433.4-473.0 MHz which appears to fall outside the license-free bands. It seems that many people using HC-12 are doing so illegally. With care, it would be possible to operate it under amateur radio rules, since this range is mostly within the 70cm allocation, but then it must follow amateur radio restrictions.

LoRa radios

LoRa is a set of standards for long range radios, which are advertised as having a range of 15km (9mi) or more in rural areas, and several km in cities.

LoRa can be done in several ways: the main LoRa protocol, and LoRaWAN. LoRaWAN expects to use an Internet gateway, which will tell each node what frequency to use, how much power to use, etc. LoRa is such that a commercial operator could set up roughly one LoRaWAN gateway per city due to the large coverage area, and some areas have good LoRa coverage due to just such operators. The difference between the two is roughly analogous to the difference between connecting two machines with an Ethernet crossover cable, and a connection over the Internet; LoRaWAN includes more protocol layers atop the basic LoRa. I have yet to learn much about LoRaWAN; I’ll follow up later on that point.

The speed of LoRa ranges from (and different people will say different things here) about 500bps to about 20000bps. LoRa is a packetized protocol, and the maximum packet size depends

LoRa sensors often advertise battery life in the months or years, and can be quite small. The protocol makes an excellent choice for sensors in remote or widely dispersed areas. LoRa transceiver boards for Arduino can be found for under $15 from places like Mouser.

I wound up purchasing two LoStik USB LoRa radios from Amazon. With some experimentation, with even very bad RF conditions (tiny antennas, one of them in the house, the other in a car), I was able to successfully decode LoRa packets from 2 miles away! And these aren’t even the most powerful transmitters available.

Talking UUCP over LoRa

In order to make this all work, I needed to write interface software; the LoRa radios don’t just transmit things straight out. So I wrote lorapipe. I have successfully transmitted files across this UUCP link!

Developing lorapipe was somewhat more challenging than I expected. For one, the LoRa modem raw protocol isn’t well-suited to rapid fire packet transmission; after receiving each packet, the modem exits receive mode and must be told to receive again. Collisions with protocols that ACKd data and had a receive window — which are many — were a problem so bad that it rendered some of the protocols unusable. I wound up adding a “expect more data after this packet” byte to every transmission, and have the receiver not transmit until it believes the sender is finished. This dramatically improved things. There’s more detail on this in my lorapipe documentation.

So far, I have successfully communicated over LoRa using UUCP, kermit, and YMODEM. KISS support will be coming next.

I am also hoping to discover the range I can get from this thing if I use more proper antennas (outdoor) and transmitters capable of transmitting with more power.

All in all, a fun project so far.

Connecting A Physical DEC vt420 to Linux

John and Oliver trip to Vintage Computer Festival Midwest 2019. Oliver playing Zork on the Micro PDP-11

Inspired by a weekend visit to Vintage Computer Festival Midwest at which my son got to play Zork on an amber console hooked up to a MicroPDP-11 running 2BSD, I decided it was time to act on my long-held plan to get a real old serial console hooked up to Linux.

Not being satisfied with just doing it for the kicks, I wanted to make it actually usable. 30-year-old DEC hardware meets Raspberry Pi. I thought this would be pretty easy, but it turns out is was a lot more complicated than I realized, involving everything from nonstandard serial connectors to long-standing kernel bugs!

Selecting a Terminal — And Finding Parts

I wanted something in amber for that old-school feel. Sadly I didn’t have the forethought to save any back in the 90s when they were all being thrown out, because now they’re rare and can be expensive. Search eBay and pretty soon you find a scattering of DEC terminals, the odd Bull or Honeywell, some Sperrys, and assorted oddballs that don’t speak any kind of standard protocol. I figured, might as well get a vt, since we’re still all emulating them now, 40+ years later. Plus, my old boss from my university days always had stories about DEC. I wish he were still around to see this.

I selected the vt420 because I was able to find them, and it has several options for font size, letting more than 24 lines fit on a screen.

Now comes the challenge: most of the vt420s never had a DB25 RS-232 port. The VT420-J, an apparently-rare international model, did, but it is exceptionally rare. The rest use a DEC-specific port called the MMJ. Thankfully, it is electrically compatible with RS-232, and I managed to find the DEC H8571-J adapter as well as a BC16E MMJ cable that I need.

I also found a vt510 (with “paperwhite” instead of amber) in unknown condition. I purchased it, and thankfully it is also working. The vt510 is an interesting device; for that model, they switched to using a PS/2 keyboard connector, and it can accept either a DEC VT keyboard or a PC keyboard. It also supports full key remapping, so Control can be left of A as nature intended. However, there’s something about amber that is just so amazing to use again.

Preparing the Linux System

I thought I would use a Raspberry Pi as a gateway for this. With built-in wifi, that would let me ssh to other machines in my house without needing to plug in a serial cable – I could put the terminal wherever. Alternatively, I can plug in a USB-to-serial adapter to my laptop and just plug the terminal into it when I want. I wound up with a Raspberry Pi 4 kit that included some heatsinks.

I had two USB-to-serial adapters laying around: a Keyspan USA-19HS and a Digi I/O Edgeport/1. I started with the Keyspan on a Raspberry Pi 4 on the grounds that I didn’t have the needed Edgeport/1 firmware file laying about already. The Raspberry Pi does have serial capability integrated, but it doesn’t use RS-232 voltages and there have been reports of it dropping characters sometimes, so I figured the easy path would be a USB adapter. That turned out to be only partially right.

Serial Terminals with systemd

I have never set up a serial getty with systemd — it has, in fact, been quite a long while since I’ve done anything involving serial other than the occasional serial console (which is a bit different purpose).

It would have taken a LONG time to figure this out, but thanks to an article about the topic, it was actually pretty easy in the end. I didn’t set it up as a serial console, but spawning a serial getty did the trick. I wound up modifying the command like this:

ExecStart=-/sbin/agetty -8 -o '-p -- \\u' %I 19200 vt420

The vt420 supports speeds up to 38400 and the vt510 supports up to 115200bps. However, neither can process plain text at faster than 19200 so there is no point to higher speeds. And, as you are about to see, they can’t necessarily even muster 19200 all the time.

Flow Control: Oh My

The unfortunate reality with these old terminals is that the processor in them isn’t actually able to keep up with line speeds. Any speed above 4800bps can exceed processor capabilities when “expensive” escape sequences are sent. That means that proper flow control is a must. Unfortunately, the vt420 doesn’t support any form of hardware flow control. XON/XOFF is all it’ll do. Yeah, that stinks.

So I hooked the thing up to my desktop PC with a null-modem cable, and started to tinker. I should be able to send a Ctrl-S down the line and the output from the pi should immediately stop. It didn’t. Huh. I verified it was indeed seeing the Ctrl-S (open emacs, send Ctrl-S, and it goes into search mode). So something, somehow, was interfering.

After a considerable amount of head scratching, I finally busted out the kernel source. I discovered that the XON/XOFF support is part of the serial driver in Linux, and that — ugh — the keyspan serial driver never actually got around to implementing it. Oops. That’s a wee bit of a bug. I plugged in the Edgeport/1 instead of the Keyspan and magically XON/XOFF started working.

Well, for a bit.

You see, flow control is a property of the terminal that can be altered by programs on a running system. It turns out that a lot of programs have opinions about it, and those opinions generally run along the lines of “nobody could possibly be using XON/XOFF, so I’m going to turn it off.” Emacs is an offender here, but it can be configured. Unfortunately, the most nasty offender here is ssh, which contains this code that is ALWAYS run when using a pty to connect to a remote system (which is for every interactive session):

tio.c_iflag &= ~(ISTRIP | INLCR | IGNCR | ICRNL | IXON | IXANY | IXOFF);

Yes, so when you use ssh, your local terminal no longer does flow control. If you are particularly lucky, the remote end may recognize your XON/XOFF characters and process them. Unfortunately, the added latency and buffering in going through ssh and the network is likely to cause bursts of text to exceed the vt420’s measly 100-ish-byte buffer. You just can’t let the remote end handle flow control with ssh. I managed to solve this via GNU Screen; more on that later.

The vt510 supports hardware flow control! Unfortunately, it doesn’t use CTS/RTS pins, but rather DTR/DSR. This was a reasonably common method in the day, but appears to be totally unsupported in Linux. Bother. I see some mentions that FreeBSD supports DTR/DSR flow (dtrflow and dsrflow in stty outputs). It definitely looks like the Linux kernel has never plumbed out the reaches of RS-232 very well. It should be possible to build a cable to swap DTR/DSR over to CTS/RTS, but since the vt420 doesn’t support any of this anyhow, I haven’t bothered.

Character Sets

Back when the vt420 was made, it was pretty hot stuff that it was one of the first systems to support the new ISO-8859-1 standard. DEC was rather proud of this. It goes without saying that the terminal knows nothing of UTF-8.

Nowadays, of course, we live in a Unicode world. A lot of software crashes on ISO-8859-1 input (I’m looking at you, Python 3). Although I have old files from old systems that have ISO-8859-1 encoding, they are few and far between, and UTF-8 rules the roost now.

I can, of course, just set LANG=en_US and that will do — well, something. man, for instance, renders using ISO-8859-1 characters. But that setting doesn’t imply that any layer of the tty system actually converts output from UTF-8 to ISO-8859-1. For instance, if I have a file with a German character in it and use ls, nothing is going to convert it from UTF-8 to ISO-8859-1.

GNU Screen also, as it happens, mostly solves this.

GNU Screen to the rescue, somewhat

It turns out that GNU Screen has features that can address both of these issues. Here’s how I used it.

First, in my .bashrc, I set this:


if [ `tty` = "/dev/ttyUSB0" ]; then
stty -iutf8
export LANG=en_US
export MANOPT="-E ascii"
fi

Then, in my .screenrc, I put this:


defflow on
defencoding UTF-8

This tells screen that the default flow control mode is on, and that the default encoding for the pty that screen creates is UTF-8. It determines the encoding for the physical terminal for the environment, and correctly figures it to be ISO-8859-1. It then maps between the two! Yes!

My little ssh connecting script then does just this:

exec screen ssh "$@"

Which nicely takes care of the flow control issue and (most of) the encoding issue. I say “most” because now things like man will try to render with fancy em-dashes and the like, which have no representation in iso8859-1, so they come out as question marks. (Setting MANOPT=”-E ascii” fixes this) But no matter, it works to ssh to my workstation and read my email! (mu4e in emacs)

What screen doesn’t help with are things that have no ISO-8859-1 versions; em-dashes are the most frequent problems, and are replaced with unsightly question marks.

termcaps, terminfos, and weird things

So pretty soon you start diving down the terminal rabbit hole, and you realize there’s a lot of weird stuff out there. For instance, one solution to the problem of slow processors in terminals was padding: ncurses would know how long it would take the terminal to execute some commands, and would send it NULLs for that amount of time. That calculation, of course, requires knowledge of line speed, which one wouldn’t have in this era of ssh. Thankfully the vt420 doesn’t fall into that category.

But it does have a ton of modes. The Emacs On Terminal page discusses some of the interesting bits: 7-bit or 8-bit control characters, no ESC key, Alt key not working, etc, etc. I believe some of these are addressed by the vt510 (at least in PC mode). I wonder whether Emacs or vim keybindings would be best here…

Helpful Resources

The Desktop Security Nightmare

Back in 1995 or so, pretty much everyone with a PC did all their work as root. We ran graphics editors, word processors, everything as root. Well, not literally an account named “root”, but the most common DOS, Windows, and Mac operating systems of the day had no effective reduced privilege account.

It was that year that I tried my first Unix. “Wow!” A virus can’t take over my system. My programs are safe!

That turned out to be a little short-sighted.

The fundamental problem we have is that we’d like to give users of a computer more access than we would like to give the computer itself.

Many of us have extremely sensitive data on our systems. Emails to family, medical or bank records, Bitcoin wallets, browsing history, the list goes on. Although we have isolation between our user account and root, we have no isolation between applications that run as our user account. We still, in effect, have to be careful about what attachments we open in email.

Only now it’s worse. You might “npm install hello-world”, and audit hello-world itself, but get some totally malicious code as well. How many times do we see instructions to gem install this, pip install that, go get the other, and even curl | sh? Nowadays our risky click isn’t an email attachment. It’s hosted on Github with a README.md.

Not only that, but my /usr/bin has over 4000 binaries. Have every one been carefully audited? Certainly not, and this is from a distro with some of the highest quality control around. What about the PPAs that people add? The debs or rpms that are installed from the Internet? Are you sure that the postinst scripts — which run as root — aren’t doing anything malicious when you install Oracle Virtualbox?

Wouldn’t it be nice if we could, say, deny access to everything in ~/.ssh or ~/bankstatements except for trusted programs when we want it? On mobile, this happens, to an extent. But we have both a legacy of a different API on desktop, and a much more demanding set of requirements.

It feels like our ecosystem is on the cusp of being able to do this, but none of the options I’ve looked at quite get us there. Let’s take a look at some.

AppArmor

AppArmor falls into the “first line of defense — better than nothing” category. It works by imposing mandatory access controls on a per-executable basis. This starts out as a pretty good idea: we can go after some high-risk targets (Firefox, Chromium, etc) and lock them down. Great! Although it’s not exactly intuitive, with a little configuration, you can prevent them from accessing sensitive areas on disk.

But there’s a problem. Actually, several. To start with, AppArmor does nothing by default. On my system, aa-unconfined --paranoid lists 171 processes that have no policies on them. Among them are Firefox, Apache, ssh, a ton of Pythons, and some stuff I don’t even recognize (/usr/lib/geoclue-2.0/demos/agent? What’s this craziness?)

Worse, since AppArmor matches on executable, all shell scripts would match the /bin/bash profile, all Python programs the Python profile, etc. It’s not so useful for them. While AppArmor does technically have a way to set a default profile, it’s not as useful as you might think.

Then you’re still left with problems like: a PDF viewer should not ordinarily have access to my sensitive files — except when I want to see an old bank statement. This can’t really be expressed in AppArmor.

SELinux

From its documentation, it sounds like SELinux might fit the bill well. It allows transitions into different roles after logging in, which is nice. The problem is complexity. The “notebook” for SELinux is 395 pages. The SELinux homepage has a wiki, which says it’s outdated and replaced by a github link with substantially less information. The Debian wiki page on it is enough to be scary in itself: you need to have various filesystem support, even backups are complicated. Ted T’so had a famous comment about never getting some of his life back, and the Debian wiki also warns that it’s not really tested on desktop systems.

We have certainly learned that complexity is an enemy of good security, leading users to just bypass it. I’m not sure we can rely on it.

Mount Tricks

One thing a person could do would be to keep the sensitive data on a separate, ideally encrypted, filesystem. (Maybe even a fuse one such as gocryptfs.) Then, at least, it could be unavailable for most of the time the system is on.

Of course, the downside here is that it’s still going to be available to everything when it is mounted, and there’s the hassle of mounting, remembering to unmount, password typing, etc. Not exactly transparent.

I wondered if mount namespaces might be an answer here. A filesystem could be mounted but left pretty much unavailable to processes unless a proper mount namespace is joined. Indeed that might be a solution. It is somewhat complicated, though, since nsenter requires root to work. Enter sudo, and dropping privileges back to a particular user — a not particularly ideal situation, and complex as well.

Still, it might well have some promise for some of these things.

Firejail

Firejail is a great idea, but suffers from a lot of the problems that AppArmor does: things must explicitly be selected to run within it.

AppImage and related tools

So now there’s your host distro and your bundled distro, each with libraries that may or may not be secure, both with general access to your home directory. I think this is a recipe for worse security, not better. Add to that the difficulty of making those things work; I know that the Digikam people have been working for months to get sound to work reliably in their AppImage.

Others?

What other ideas are out there? I’ve occasionally created a separate user on the system for running suspicious-ish code, or even a VM or container. That’s a fair bit of work, and provides incomplete protection, but has some benefits. Still, it’s again not going to work for everything.

I hope to play around with many of these tools, especially SELinux, before too long and report back how I’ve found them to be.

Finally, I would like to be really clear that I don’t believe this issue is limited to Debian, or even to Linux. It impacts every desktop platform in wide use today. Actually, I think we’re in a better position to address it than some, but it won’t be easy for anyone.

Please stop making the library situation worse with attempts to fix it

I recently had a simple-sounding desire. I would like to run the latest stable version of Digikam. My desktop, however, runs Debian stable, which has 5.3.0, not 5.9.0.

This is not such a simple proposition.


$ ldd /usr/bin/digikam | wc -l
396

And many of those were required at versions that weren’t in stable.

I had long thought that AppImage was a rather bad idea, but I decided to give it a shot. I realized it was worse than I had thought.

The problems with AppImage

About a year ago, I wrote about the problems Docker security. I go into much more detail there, but the summary for AppImage is quite similar. How can I trust all the components in the (for instance) Digikam AppImage image are being kept secure? Are they using the latest libssl and libpng, to avoid security issues? How will I get notified of a security update? (There seems to be no mechanism for this right now.) An AppImage user that wants to be secure has to manually answer every one of those questions for every application. Ugh.

Nevertheless, the call of better facial detection beckoned, and I downloaded the Digikam AppImage and gave it a whirl. The darn thing actually fired up. But when it would play videos, there was no sound. Hmmmm.

I found errors like this:

Cannot access file ././/share/alsa/alsa.conf

Nasty. I spent quite some time trying to make ALSA work, before a bunch of experimentation showed that if I ran alsoft-conf on the host, and selected only the PulseAudio backend, then it would work. I reported this bug to Digikam.

Then I thought it was working — until I tried to upload some photos. It turns out that SSL support in Qt in the AppImage was broken, since it was trying to dlopen an incompatible version of libssl or libcrypto on the host. More details are in the bug I reported about this also.

These are just two examples. In the rather extensive Googling I did about these problems, I came across issue after issue people had with running Digikam in an AppImage. These issues are not limited to the ALSA and SSL issues I describe here. And they are not occurring due to some lack of skill on the part of Digikam developers.

Rather, they’re occurring because AppImage packaging for a complex package like this is hard. It’s hard because it’s based on a fiction — the fiction that it’s possible to make an AppImage container for a complex desktop application act exactly the same, when the host environment is not exactly the same. Does the host run PulseAudio or ALSA? Where are its libraries stored? How do you talk to dbus?

And it’s not for lack of trying. The scripts to build the Digikam appimage support runs to over 1000 lines of code in the AppImage directory, plus another 1300 lines of code (at least) in CMake files that handle much of the work, and another 3000 lines or so of patches to 3rd-party packages. That’s over 5000 lines of code! By contrast, the Debian packaging for the same version of Digikam, including Debian patches but excluding the changelog and copyright files, amounts to 517 lines. Of course, it is reusing OS packages for the dependencies that were already built, but this amounts to a lot simpler build.

Frankly I don’t believe that AppImage really lives up to its hype. Requiring reinventing a build system and making some dangerous concessions on security for something that doesn’t really work in the end — not good in my book.

The library problem

But of course, AppImage exists for a reason. That reason is that it’s a real pain to deal with so many levels of dependencies in software. Even if we were to compile from source like the old days, and even if it was even compatible with the versions of the dependencies in my OS, that’s still a lot of work. And if I have to build dependencies from source, then I’ve given up automated updates that way too.

There’s a lot of good that ELF has brought us, but I can’t help but think that it wasn’t really designed for a world in which a program links 396 libraries (plus dlopens a few more). Further, this world isn’t the corporate Unix world of the 80s; Open Source developers aren’t big on maintaining backwards compatibility (heck, both the KDE and Qt libraries under digikam have both been entirely rewritten in incompatible ways more than once!) The farther you get from libc, the less people seem to care about backwards compatibility. And really, who can blame volunteers? You want to work on new stuff, not supporting binaries from 5 years ago, right?

I don’t really know what the solution is here. Build-from-source approaches like FreeBSD and Gentoo have plenty of drawbacks too. Is there some grand solution I’m missing? Some effort to improve this situation without throwing out all the security benefits that individually-packaged libraries give us in distros like Debian?

Emacs #4: Automated emails to org-mode and org-mode syncing

This is fourth in a series on Emacs and org-mode.

Hopefully by now you’ve started to see how powerful and useful org-mode is. If you’re like me, you’re thinking:

“I’d really like to have this in sync across all my devices.”

and, perhaps:

“Can I forward emails into org-mode?”

This being Emacs, the answers, of course, are “Yes.”

Syncing

Since org-mode just uses text files, syncing is pretty easily accomplished using any number of tools. I use git with git-remote-gcrypt. Due to some limitations of git-remote-gcrypt, each machine tends to push to its own branch, and to master on command. Each machine merges from all the other branches and pushes the result to master after a merge. A cron job causes pushes to the machine’s branch to happen, and a bit of elisp coordinates it all — making sure to save buffers before a sync, refresh them from disk after, etc.

The code for this post is somewhat more extended, so I will be linking to it on github rather than posting inline.

I have a directory $HOME/org where all my org-stuff lives. In ~/org lives a Makefile that handles the syncing. It defines these targets:

  • push: adds, commits, and pushes to a branch named after the machine’s hostname
  • fetch: does a simple git fetch
  • sync: adds, commits, pulls remote changes, merges, and (assuming the merge was successful) pushes to the branch named after the machine’s hostname plus master

Now, in my user’s crontab, I have this:

*/15   *   *  *   *      make -C $HOME/org push fetch 2>&1 | logger --tag 'orgsync'

The accompanying elisp code defines a shortcut (C-c s) to cause a sync to occur. Thanks to the cronjob, as long as files were saved — even if I didn’t explicitly sync on the other boxen — they’ll be pulled in.

I have found this setup to work really well.

Emailing to org-mode

Before going down this path, one should ask the question: do you really need it? I use org-mode with mu4e, and the integration is excellent; any org task can link to an email by message-id, and this is ideal — it lets a person do things like make a reminder to reply to a message in a week.

However, org is not just about reminders. It’s also a knowledge base, authoring system, etc. And, not all of my mail clients use mu4e. (Note: things like MobileOrg exist for mobile devices). I don’t actually use this as much as I thought I would, but it has its uses and I thought I’d document it here too.

Now I didn’t want to just be able to accept plain text email. I wanted to be able to handle attachments, HTML mail, etc. This quickly starts to sound problematic — but with tools like ripmime and pandoc, it’s not too bad.

The first step is to set up some way to get mail into a specific folder. A plus-extension, special user, whatever. I then use a fetchmail configuration to pull it down and run my insorgmail script.

This script is where all the interesting bits happen. It starts with ripmime to process the message. HTML bits are converted from HTML to org format using pandoc. an org hierarchy is made to represent the structure of the email as best as possible. emails can get pretty complicated, with HTML and the rest, but I have found this does an acceptable job with my use cases.

Up next…

My last post on org-mode will talk about using it to write documents and prepare slides — a use for which I found myself surprisingly pleased with it, but which needed a bit of tweaking.

An old DOS BBS in a Docker container

Awhile back, I wrote about my Debian Docker base images. I decided to extend this concept a bit further: to running DOS applications in Docker.

But first, a screenshot:

It turns out this is possible, but difficult. I went through all three major DOS emulators available (dosbox, qemu, and dosemu). I got them all running inside the Docker container, but had a number of, er, fun issues to resolve.

The general thing one has to do here is present a fake modem to the DOS environment. This needs to be exposed outside the container as a TCP port. That much is possible in various ways — I wound up using tcpser. dosbox had a TCP modem interface, but it turned out to be too buggy for this purpose.

The challenge comes in where you want to be able to accept more than one incoming telnet (or TCP) connection at a time. DOS was not a multitasking operating system, so there were any number of hackish solutions back then. One might have had multiple physical computers, one for each incoming phone line. Or they might have run multiple pseudo-DOS instances under a multitasking layer like DESQview, OS/2, or even Windows 3.1.

(Side note: I just learned of DESQview/X, which integrated DESQview with X11R5 and replaced the Windows 3 drivers to allow running Windows as an X application).

For various reasons, I didn’t want to try running one of those systems inside Docker. That left me with emulating the original multiple physical node setup. In theory, pretty easy — spin up a bunch of DOS boxes, each using at most 1MB of emulated RAM, and go to town. But here came the challenge.

In a multiple-physical-node setup, you need some sort of file sharing, because your nodes have to access the shared message and file store. There were a myriad of clunky ways to do this in the old DOS days – Netware, LAN manager, even some PC NFS clients. I didn’t have access to Netware. I tried the Microsoft LM client in DOS, talking to a Samba server running inside the Docker container. This I got working, but the LM client used so much RAM that, even with various high memory tricks, BBS software wasn’t going to run. I couldn’t just mount an underlying filesystem in multiple dosbox instances either, because dosbox did caching that wasn’t going to be compatible.

This is why I wound up using dosemu. Besides being a more complete emulator than dosbox, it had a way of sharing the host’s filesystems that was going to work.

So, all of this wound up with this: jgoerzen/docker-bbs-renegade.

I also prepared building blocks for others that want to do something similar: docker-dos-bbs and the lower-level docker-dosemu.

As a side bonus, I also attempted running this under Joyent’s Triton (SmartOS, Solaris-based). I was pleasantly impressed that I got it all almost working there. So yes, a Renegade DOS BBS running under a Linux-based DOS emulator in a container on a Solaris machine.

The Yellow House Phone Company (Featuring Asterisk and an 11-year-old)

“Well Jacob, do you think we should set up our own pretend phone company in the house?”

“We can DO THAT?”

“Yes!”

“Then… yes. Yes! YES YES YESYESYESYES YES! Let’s do it, dad!”

Not long ago, my parents had dug up the old phone I used back in the day. We still have a landline, and Jacob was having fun discovering how an analog phone works. I told him about the special number he could call to get the time and temperature read out to him. He discovered what happens if you call your own number and hang up. He figured out how to play “Mary Had a Little Lamb” using touchtone keys (after a slightly concerned lecture from me setting out some rules to make sure his “musical dialing” wouldn’t result in any, well, dialing.)

He was hooked. So I thought that taking it to the next level would be a good thing for a rainy day. I have run Asterisk before, though I had unfortunately gotten rid of most of my equipment some time back. But I found a great deal on a Cisco 186 ATA (Analog Telephone Adapter). It has two FXS lines (FXS ports simulate the phone company, and provide dialtone and ring voltage to a connected phone), and of course hooks up to the LAN.

We plugged that in, and Jacob was amazed to see its web interface come up. I had to figure out how to configure it (unfortunately, it uses SCCP rather than SIP, and figuring out Asterisk’s chan_skinny took some doing, but we got there.)

I set up voicemail. He loved it. He promptly figured out how to record his own greetings. We set up a second phone on the other line, so he could call between them. The cordless phones in our house support SIP, so I configured one of them as a third line. He spent a long time leaving himself messages.

IMG_3465

Pretty soon we both started having ideas. I set up extension 777, where he could call for the time. Then he wanted a way to get the weather forecast. Well, weather-util generates a text-based report. With it, a little sed and grep tweaking, the espeak TTS engine, and a little help from sox, I had a shell script worked up that would read back a forecast whenever he called a certain extension. He was super excited! “That’s great, dad! Can it also read weather alerts too?” Sure! weather-util has a nice option just for that. Both boys cackled as the system tried to read out the NWS header (their timestamps like 201711031258 started with “two hundred one billion…”)

Then I found an online source for streaming NOAA Weather Radio feeds – Jacob enjoys listening to weather radio – and I set up another extension he could call to listen to that. More delight!

But it really took off when I asked him, “Would you like to record your own menu?” “You mean those things where it says press 1 or 2 for this or that?” “Yes.” “WE CAN DO THAT?” “Oh yes!” “YES, LET’S DO IT RIGHT NOW!”

So he recorded a menu, then came and hovered by me while I hacked up extensions.conf, then eagerly went back to the phone to try it. Oh the excitement of hearing hisown voice, and finding that it worked! Pretty soon he was designing sub-menus (“OK Dad, so we’ll set it up so people can press 2 for the weather, and then choose if they want weather radio or the weather report. I’m recording that now. Got it?”)

He has informed me that next Saturday we will build an intercom system “like we have at school.” I’m going to have to have some ideas on how to tie Squeezebox in with Asterisk to make that happen, I think. Maybe this will do.

Fixing the Problems with Docker Images

I recently wrote about the challenges in securing Docker container contents, and in particular with keeping up-to-date with security patches from all over the Internet.

Today I want to fix that.

Besides security, there is a second problem: the common way of running things in Docker pretends to provide a traditional POSIX API and environment, but really doesn’t. This is a big deal.

Before diving into that, I want to explain something: I have often heard it said the Docker provides single-process containers. This is unambiguously false in almost every case. Any time you have a shell script inside Docker that calls cp or even ls, you are running a second process. Web servers from Apache to whatever else use processes or threads of various types to service multiple connections at once. Many Docker containers are single-application, but a process is a core part of the POSIX API, and very little software would work if it was limited to a single process. So this is my little plea for more precise language. OK, soapbox mode off.

Now then, in a traditional Linux environment, besides your application, there are other key components of the system. These are usually missing in Docker containers.

So today, I will fix this also.

In my docker-debian-base images, I have prepared a system that still has only 11MB RAM overhead, makes minimal changes on top of Debian, and yet provides a very complete environment and API. Here’s what you get:

  • A real init system, capable of running standard startup scripts without modification, and solving the nasty Docker zombie reaping problem.
  • Working syslog, which can either export all logs to Docker’s logging infrastructure, or keep them within the container, depending on your preferences.
  • Working real schedulers (cron, anacron, and at), plus at least the standard logrotate utility to help prevent log files inside the container from becoming huge.

The above goes into my “minimal” image. Additional images add layers on top of it, and here are some of the features they add:

  • A real SMTP agent (exim4-daemon-light) so that cron and friends can actually send you mail
  • SSH client and server (optionally exposed to the Internet)
  • Automatic security patching via unattended-upgrades and needsrestart

All of the above, including the optional features, has an 11MB overhead on start. Not bad for so much, right?

From here, you can layer on top all your usual Dockery things. You can still run one application per container. But you can now make sure your disk doesn’t fill up from logs, run your database vacuuming commands at will, have your blog download its RSS feeds every few minutes, etc — all from within the container, as it should be. Furthermore, you don’t have to reinvent the wheel, because Debian already ships with things to take care of a lot of this out of the box — and now those tools will just work.

There is some popular work done in this area already by phusion’s baseimage-docker. However, I made my own for these reasons:

  • I wanted something based on Debian rather than Ubuntu
  • By using sysvinit rather than runit, the OS default init scripts can be used unmodified, reducing the administrative burden on container builders
  • Phusion’s system is, for some reason, not auto-built on the Docker hub. Mine is, so it will be automatically revised whenever the underlying Debian system, or the Github repository, is.

Finally a word on the choice to use sysvinit. It would have been simpler to use systemd here, since it is the default in Debian these days. Unfortunately, systemd requires you to poke some holes in the Docker security model, as well as mount a cgroups filesystem from the host. I didn’t consider this acceptable, and sysvinit ran without these workarounds, so I went with it.

With all this, Docker becomes a viable replacement for KVM for various services on my internal networks. I’ll be writing about that later.

Easily Improving Linux Security with Two-Factor Authentication

2-Factor Authentication (2FA) is a simple way to help improve the security of your systems. It restricts the scope of damage if a machine is compromised. If, for instance, you have a security token or authenticator app on your phone that is required for ssh to a remote machine, then even if every laptop you use to connect to the remote is totally owned, an attacker cannot establish a new ssh session on their own.

There are a lot of tutorials out there on the Internet that get you about halfway there, so here is some more detail.

Background

In this article, I will be focusing on authentication in the style of Google Authenticator, which is a special case of OATH HOTP or TOTP. You can use the Google Authenticator app, FreeOTP, or a hardware token like Yubikey to generate tokens with this. They are all 100% compatible with Google Authenticator and libpam-google-authenticator.

The basic idea is that there is a pre-shared secret key. At each login, a different and unique token is required, which is generated based on the pre-shared secret key and some other information. With TOTP, the “other information” is the current time, implying that both machines must be reasably well in-sync time-wise. With HOTP, the “other information” is a count of the number of times the pre-shared key has been used. Both typically have a “window” on the server side that can let times within a certain number of seconds, or a certain number of login accesses, work.

The beauty of this system is that after the initial setup, no Internet access is required on either end to validate the key (though TOTP requires both ends to be reasonably in sync time-wise).

The basics: user account setup and ssh authentication

You can start with the basics by reading one of these articles: one, two, three. Debian/Ubuntu users will find both the pam module and the user account setup binary in libpam-google-authenticator.

For many, you can stop there. You’re done. But if you want to kick it up a notch, read on:

Enhancement 1: Requiring 2FA even when ssh public key auth is used

Let’s consider a scenario in which your system is completely compromised. Unless your ssh keys are also stored in something like a Yubikey Neo, they could wind up being compromised as well – if someone can read your files and sniff your keyboard, your ssh private keys are at risk.

So we can configure ssh and PAM so that a OTP token is required even for this scenario.

First off, in /etc/ssh/sshd_config, we want to change or add these lines:

UsePAM yes
ChallengeResponseAuthentication yes
AuthenticationMethods publickey,keyboard-interactive

This forces all authentication to pass two verification methods in ssh: publickey and keyboard-interactive. All users will have to supply a public key and then also pass keyboard-interactive auth. Normally keyboard-interactive auth prompts for a password, but we can change /etc/pam.d/sshd on this. I added this line at the very top of /etc/pam.d/sshd:

auth [success=done new_authtok_reqd=done ignore=ignore default=bad] pam_google_authenticator.so

This basically makes Google Authenticator both necessary and sufficient for keyboard-interactive in ssh. That is, whenever the system wants to use keyboard-interactive, rather than prompt for a password, it instead prompts for a token. Note that any user that has not set up google-authenticator already will be completely unable to ssh into their account.

Enhancement 1, variant 2: Allowing automated processes to root

On many of my systems, I have ~root/.ssh/authorized_keys set up to permit certain systems to run locked-down commands for things like backups. These are automated commands, and the above configuration will break them because I’m not going to be typing in codes at 3AM.

If you are very restrictive about what you put in root’s authorized_keys, you can exempt the root user from the 2FA requirement in ssh by adding this to sshd_config:

Match User root
  AuthenticationMethods publickey

This says that the only way to access the root account via ssh is to use the authorized_keys file, and no 2FA will be required in this scenario.

Enhancement 1, variant 2: Allowing non-pubkey auth

On some multiuser systems, some users may still want to use password auth rather than publickey auth. There are a few ways we can support that:

  1. Users without public keys will have to supply a OTP and a password, while users with public keys will have to supply public key, OTP, and a password
  2. Users without public keys will have to supply OTP or a password, while users with public keys will have to supply public key, OTP, or a password
  3. Users without public keys will have to supply OTP and a password, while users with public keys only need to supply the public key

The third option is covered in any number of third-party tutorials. To enable options 1 or 2, you’ll need to put this in sshd_config:

AuthenticationMethods publickey,keyboard-interactive keyboard-interactive

This means that to authenticate, you need to pass either publickey and then keyboard-interactive auth, or just keyboard-interactive auth.

Then in /etc/pam.d/sshd, you put this:

auth required pam_google_authenticator.so

As a sub-variant for option 1, you can add nullok to here to permit auth from people that do not have a Google Authenticator configuration.

Or for option 2, change “required” to “sufficient”. You should not add nullok in combination with sufficient, because that could let people without a Google Authenticator config authenticate completely without a password at all.

Enhancement 2: Configuring su

A lot of other tutorials stop with ssh (and maybe gdm) but forget about the other ways we authenticate or change users on a system. su and sudo are the two most important ones. If your root password is compromised, you don’t want anybody to be able to su to that account without having to supply a token. So you can set up google-authenticator for root.

Then, edit /etc/pam.d/su and insert this line after the pam_rootok.so line:

auth       required     pam_google_authenticator.so nullok

The reason you put this after pam_rootok.so is because you want to be able to su from root to any account without having to input a token. We add nullok to the end of this, because you may want to su to accounts that don’t have tokens. Just make sure to configure tokens for the root account first.

Enhancement 3: Configuring sudo

This one is similar to su, but a little different. This lets you, say, secure the root password for sudo.

Normally, you might sudo from your user account to root (if so configured). You might have sudo configured to require you to enter in your own password (rather than root’s), or to just permit you to do whatever you want as root without a password.

Our first step, as always, is to configure PAM. What we do here depends on your desired behavior: do you want to require someone to supply both a password and a token, or just a token, or require a token? If you want to require a token, put this at the top of /etc/pam.d/sudo:

auth [success=done new_authtok_reqd=done ignore=ignore default=bad] pam_google_authenticator.so

If you want to require a token and a password, change the bracketed string to “required”, and if you want a token or a password, change it to “sufficient”. As before, if you want to permit people without a configured token to proceed, add “nullok”, but do not use that with “sufficient” or the bracketed example here.

Now here comes the fun part. By default, if a user is required to supply a password to sudo, they are required to supply their own password. That does not help us here, because a user logged in to the system can read the ~/.google_authenticator file and easily then supply tokens for themselves. What you want to do is require them to supply root’s password. Here’s how I set that up in sudoers:

Defaults:jgoerzen rootpw
jgoerzen ALL=(ALL) ALL

So now, with the combination of this and the PAM configuration above, I can sudo to the root user without knowing its password — but only if I can supply root’s token. Pretty slick, eh?

Further reading

In addition to the basic tutorials referenced above, consider:

Edit: additional comments

Here are a few other things to try:

First, the libpam-google-authenticator module supports putting the Google Authenticator files in different locations and having them owned by a certain user. You could use this to, for instance, lock down all secret keys to be readable only by the root user. This would prevent users from adding, changing, or removing their own auth tokens, but would also let you do things such as reusing your personal token for the root account without a problem.

Also, the pam-oath module does much of the same things as the libpam-google-authenticator module, but without some of the help for setup. It uses a single monolithic root-owned password file for all accounts.

There is an oathtool that can be used to generate authentication codes from the command line.

I’m switching from git-annex to Syncthing

I wrote recently about using git-annex for encrypted sync, but due to a number of issues with it, I’ve opted to switch to Syncthing.

I’d been using git-annex with real but noncritical data. Among the first issues I noticed was occasional but persistent high CPU usage spikes, which once started, would persist apparently forever. I had an issue where git-annex tried to replace files I’d removed from its repo with broken symlinks, but the real final straw was a number of issues with the gcrypt remote repos. git-remote-gcrypt appears to have a number of issues with possible race conditions on the remote, and at least one of them somehow caused encrypted data to appear in a packfile on a remote repo. Why there was data in a packfile there, I don’t know, since git-annex is supposed to keep the data out of packfiles.

Anyhow, git-annex is still an awesome tool with a lot of use cases, but I’m concluding that live sync to an encrypted git remote isn’t quite there yet enough for me.

So I looked for alternatives. My main criteria were supporting live sync (via inotify or similar) and not requiring the files to be stored unencrypted on a remote system (my local systems all use LUKS). I found Syncthing met these requirements.

Syncthing is pretty interesting in that, like git-annex, it doesn’t require a centralized server at all. Rather, it forms basically a mesh between your devices. Its concept is somewhat similar to the proprietary Bittorrent Sync — basically, all the nodes communicate about what files and chunks of files they have, and the changes that are made, and immediately propagate as much as possible. Unlike, say, Dropbox or Owncloud, Syncthing can actually support simultaneous downloads from multiple remotes for optimum performance when there are many changes.

Combined with syncthing-inotify or syncthing-gtk, it has immediate detection of changes and therefore very quick propagation of them.

Syncthing is particularly adept at figuring out ways for the nodes to communicate with each other. It begins by broadcasting on the local network, so known nearby nodes can be found directly. The Syncthing folks also run a discovery server (though you can use your own if you prefer) that lets nodes find each other on the Internet. Syncthing will attempt to use UPnP to configure firewalls to let it out, but if that fails, the last resort is a traffic relay server — again, a number of volunteers host these online, but you can run your own if you prefer.

Each node in Syncthing has an RSA keypair, and what amounts to part of the public key is used as a globally unique node ID. The initial link between nodes is accomplished by pasting the globally unique ID from one node into the “add node” screen on the other; the user of the first node then must accept the request, and from that point on, syncing can proceed. The data is all transmitted encrypted, of course, so interception will not cause data to be revealed.

Really my only complaint about Syncthing so far is that, although it binds to localhost, the web GUI does not require authentication by default.

There is an ITP open for Syncthing in Debian, but until then, their apt repo works fine. For syncthing-gtk, the trusty version of the webupd8 PPD works in Jessie (though be sure to pin it to a low priority if you don’t want it replacing some unrelated Debian packages).