Category Archives: Programming

Long-Range Radios: A Perfect Match for Unix Protocols From The 70s

It seems I’ve been on a bit of a vintage computing kick lately. After connecting an original DEC vt420 to Linux and resurrecting some old operating systems, I dove into UUCP.

In fact, it so happened that earlier in the week, my used copy of Managing UUCP & Usenet (its author list includes none other than Tim O’Reilly) arrived. I was reading about the challenges of networking in the 70s: half-duplex lines, slow transmission rates, and modems that had separate dialers. And then I stumbled upon long-distance radio. It turns out that a lot of modern long-distance radio has much in common with the challenges of communication in the 1970s – 1990s, and some of our old protocols might be particularly well-suited for it. Let me explain — I’ll start with the old software, and then talk about the really cool stuff going on in hardware (some radios that can send a signal for 10-20km or more with very little power!), and finally discuss how to bring it all together.


UUCP, for those of you that may literally have been born after it faded in popularity, is a batch system for exchanging files and doing remote execution. For users, the uucp command copies files to or from a remote system, and uux executes commands on a remote system. In practical terms, the most popular use of this was to use uux to execute rmail on the remote system, which would receive an email message on stdin and inject it into the system’s mail queue. All UUCP commands are queued up and transmitted when a “call” occurs — over a modem, TCP, ssh pipe, whatever.

UUCP had to deal with all sorts of line conditions: very slow lines (300bps), half-duplex lines, noisy and error-prone communication, poor or nonexistent flow control, even 7-bit communication. It supports a number of different transport protocols that can accommodate these varying conditions. It turns out that these mesh remarkably well with some properties of modern long-distance radio.


The AX.25 stack is a frame-based protocol used by amateur radio folks. Its air speed is 300bps, 1200bps, or (rarely) 9600bps. The Linux kernel has support for the AX.25 protocol and it is quite possible to run TCP/IP atop it. I have personally used AX.25 to telnet to a Linux box 15 miles away over a 1200bps air speed, and have also connected all the way from Kansas to Texas and Indiana over 300bps AX.25 using atmospheric skip. AX.25 has “connected” packets (like TCP) and unconnected/broadcast ones (similar to UDP) and is an error-detected protocol with retransmit. The radios generally used with AX.25 are always half-duplex and some of them have iffy carrier detection (which means collisions are frequent). Although the whole AX.25 stack has grown rare in recent years, a subset of it is still in wide use as the basis for APRS.

A lot of this is achieved using equipment that’s not particularly portable: antennas on poles, radios that transmit with anywhere from 1W to 100W of power (even 1W is far more than small portable devices normally use), etc. Also, under the regulations of the amateur radio service, transmitters must be managed by a licensed operator and cannot be encrypted.

Nevertheless, AX.25 is just a protocol and it could, of course, run on other kinds of carriers than traditional amateur radios.

Long-range low-power radios

There is a lot being done with radios these days, much of which I’m not going to discuss. I’m not covering very short-range links such as Bluetooth, ZigBee, etc. Nor am I covering longer-range links that require large and highly-directional antennas (such as some are doing in the 2.4GHz and 5GHz bands). What I’m covering is long-range links that can be used by portable devices.

There is always a compromise in radios, and if we are going to achieve long-range links with poor antennas and low power, the compromise is going to be in bitrate. These technologies may scale down to as low as 300bps or up to around 115200bps. They can, as a side bonus, often be quite cheap.

HC-12 radios

HC-12 is a radio board, commonly used with Arduino, that sports 500bps to 115200bps communication. According to the vendor, in 500bps mode, the range is 1800m (about 1.1mi), while at 115200bps, the range is 100m (328ft). They’re very cheap, at around $5 each.

There are a few downsides to HC-12. One is that the lowest air bitrate is 500bps, but the lowest UART bitrate is 1200bps, and they have no flow control. So, if you are running in long-range mode, “only small packets can be sent: max 60 bytes with the interval of 2 seconds.” This would pose a challenge in many scenarios: though not much for UUCP, which can be perfectly well configured to have a 60-byte packet size and a window size of 1, which would wait for a remote ACK before proceeding.
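To make that constraint concrete, here is a rough Python sketch of the arithmetic. The 60-byte and 2-second figures are the vendor's quoted limits from above; the function names are made up for illustration, and per-packet protocol overhead and ACK turnaround time are ignored, so real throughput would be lower.

```python
import math

MAX_PACKET_BYTES = 60      # vendor's quoted limit in long-range mode
PACKET_INTERVAL_S = 2.0    # vendor's quoted spacing between packets


def split_packets(payload: bytes, size: int = MAX_PACKET_BYTES) -> list:
    """Cut a payload into chunks no larger than one radio packet."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def transfer_seconds(nbytes: int) -> float:
    """Best-case time to send nbytes at one packet per interval."""
    return math.ceil(nbytes / MAX_PACKET_BYTES) * PACKET_INTERVAL_S


def effective_bps() -> float:
    """Best-case payload throughput in bits per second."""
    return MAX_PACKET_BYTES * 8 / PACKET_INTERVAL_S


if __name__ == "__main__":
    message = b"x" * 1000
    print(len(split_packets(message)), "packets")
    print(transfer_seconds(len(message)), "seconds")
    print(effective_bps(), "bps")
```

In other words, the best case works out to 240bps of payload, less than half of the 500bps air rate.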

Also, they operate over 433.4-473.0 MHz which appears to fall outside the license-free bands. It seems that many people using HC-12 are doing so illegally. With care, it would be possible to operate it under amateur radio rules, since this range is mostly within the 70cm allocation, but then it must follow amateur radio restrictions.

LoRa radios

LoRa is a set of standards for long range radios, which are advertised as having a range of 15km (9mi) or more in rural areas, and several km in cities.

LoRa can be done in several ways: the main LoRa protocol, and LoRaWAN. LoRaWAN expects to use an Internet gateway, which will tell each node what frequency to use, how much power to use, etc. LoRa is such that a commercial operator could set up roughly one LoRaWAN gateway per city due to the large coverage area, and some areas have good LoRa coverage due to just such operators. The difference between the two is roughly analogous to the difference between connecting two machines with an Ethernet crossover cable, and a connection over the Internet; LoRaWAN includes more protocol layers atop the basic LoRa. I have yet to learn much about LoRaWAN; I’ll follow up later on that point.

The speed of LoRa ranges from (and different people will say different things here) about 500bps to about 20000bps. LoRa is a packetized protocol, and the maximum packet size depends

LoRa sensors often advertise battery life in the months or years, and can be quite small. The protocol makes an excellent choice for sensors in remote or widely dispersed areas. LoRa transceiver boards for Arduino can be found for under $15 from places like Mouser.

I wound up purchasing two LoStik USB LoRa radios from Amazon. With some experimentation, with even very bad RF conditions (tiny antennas, one of them in the house, the other in a car), I was able to successfully decode LoRa packets from 2 miles away! And these aren’t even the most powerful transmitters available.

Talking UUCP over LoRa

In order to make this all work, I needed to write interface software; the LoRa radios don’t just transmit things straight out. So I wrote lorapipe. I have successfully transmitted files across this UUCP link!

Developing lorapipe was somewhat more challenging than I expected. For one, the LoRa modem raw protocol isn’t well-suited to rapid-fire packet transmission; after receiving each packet, the modem exits receive mode and must be told to receive again. Collisions with protocols that ACK data and have a receive window (and there are many) were a problem so bad that it rendered some of the protocols unusable. I wound up adding an “expect more data after this packet” byte to every transmission, and having the receiver not transmit until it believes the sender is finished. This dramatically improved things. There’s more detail on this in my lorapipe documentation.
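The idea behind that flag byte can be sketched in a few lines of Python. This is only an illustration of the technique, not lorapipe's actual wire format: the flag values, the position of the flag at the front of the packet, and the names `frame_packets` and `Receiver` are all assumptions.

```python
MORE = b"\x01"   # hypothetical flag: sender has more packets queued
LAST = b"\x00"   # hypothetical flag: sender is done, receiver may transmit


def frame_packets(payload: bytes, max_payload: int) -> list:
    """Split payload into packets, each prefixed with a one-byte flag."""
    chunks = [payload[i:i + max_payload]
              for i in range(0, len(payload), max_payload)] or [b""]
    return [(MORE if i < len(chunks) - 1 else LAST) + chunk
            for i, chunk in enumerate(chunks)]


class Receiver:
    """Collects packet bodies and tracks whether it is safe to transmit."""

    def __init__(self):
        self.data = bytearray()
        self.may_transmit = True

    def on_packet(self, packet: bytes) -> None:
        flag, body = packet[:1], packet[1:]
        self.data += body
        # Stay quiet while the sender says more packets are coming.
        self.may_transmit = (flag == LAST)
```

A real implementation would also need a timeout so the receiver does not wait forever if the final packet is lost.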

So far, I have successfully communicated over LoRa using UUCP, kermit, and YMODEM. KISS support will be coming next.

I am also hoping to discover the range I can get from this thing if I use more proper antennas (outdoor) and transmitters capable of transmitting with more power.

All in all, a fun project so far.

The Python Unicode Mess

Unicode has solved a lot of problems. Anyone that remembers the mess of ISO-8859-* vs. CP437 (and of course it’s even worse for non-Western languages) can attest to that. And of course, these days they’re doing the useful work of…. codifying emojis.

Emojis aside, things aren’t all so easy. Today’s cause of pain: Python 3. So much pain.

Python decided to fully integrate Unicode into the language. Nice idea, right?

But here come the problems. And they are numerous.

gpodder, for instance, frequently exits with tracebacks due to Python errors converting podcast titles with smartquotes into ASCII. Then you have the case where the pexpect docs say to use logfile = sys.stdout to show the interaction with the virtual terminal. Only that causes an error these days.

But processing of filenames takes the cake. I was recently dealing with data from 20 years ago, before UTF-8 was a filename standard. These filenames are still valid on Unix. tar unpacks them, and they work fine. But you start getting encoding errors from Python trying to do things like store filenames in strings. For a Python program to properly support all valid Unix filenames, it must use “bytes” instead of strings, which has all sorts of annoying implications. What are the chances that all Python programs do this correctly? Yeah. Not high, I bet.

I recently was processing data generated by mtree, which uses octal escapes for special characters in filenames. I thought this should be easy in Python, eh?

That second link had a mention of an undocumented function, codecs.escape_decode, which does it right. I finally had to do this:

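    # Inside a loop over the mtree lines, each read as bytes:
    if line.startswith(b'#'):
        continue                      # skip comment lines
    fields = line.split()
    # Undo mtree's octal escapes, yielding the raw filename bytes
    filename = codecs.escape_decode(fields[0])[0]
    filetype = getfield(b"type", fields[1:])
    if filetype == b"file":
        ...                           # handle regular files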

And, whatever you do, don’t accidentally write if filetype == "file" — that will silently always evaluate to False, because "file" tests different than b"file". Not that I, uhm, wrote that and didn’t notice it at first…
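Both points are easy to try in isolation. The following is a small self-contained sketch; the sample filename is invented for illustration, and since codecs.escape_decode is undocumented, its behavior could change between Python versions.

```python
import codecs

# mtree writes a space as \040 and non-ASCII bytes as octal escapes.
escaped = b"caf\\303\\251\\040menu.txt"

# escape_decode returns (decoded_bytes, number_of_input_bytes_consumed).
raw, consumed = codecs.escape_decode(escaped)
print(raw)                   # the raw filename bytes

# bytes and str never compare equal, so the first test is silently False.
print(b"file" == "file")     # False
print(b"file" == b"file")    # True
```

Running the interpreter with the -b flag makes it warn about bytes/str comparisons like the first one, and -bb turns them into errors, which helps catch this class of mistake.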

So if you want to actually handle Unix filenames properly in Python, you:

  • Must have a processing path that fully avoids Python strings.
  • Must use sys.{stdin,stdout}.buffer instead of just sys.stdin/stdout
  • Must supply filenames as bytes to various functions. See PEP 0471 for this comment: “Like the other functions in the os module, scandir() accepts either a bytes or str object for the path parameter, and returns the DirEntry.name and DirEntry.path attributes with the same type as path. However, it is strongly recommended to use the str type, as this ensures cross-platform support for Unicode filenames. (On Windows, bytes filenames have been deprecated since Python 3.3).” So if you want to be cross-platform, it’s even worse, because you can’t use str on Unix nor bytes on Windows.
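As a rough sketch of what such a bytes-only path can look like on Unix, the following lists a directory as bytes, recovers command-line arguments as bytes with os.fsencode, and writes them to the binary stdout stream. The helper names `argv_as_bytes` and `write_name` are invented for this example, and it assumes a Unix system where Python decoded the arguments with the surrogateescape error handler.

```python
import os
import sys


def argv_as_bytes(args):
    """Recover the original bytes of command-line arguments."""
    return [os.fsencode(a) for a in args]


def write_name(name: bytes, out=None) -> None:
    """Write a raw filename plus newline to a binary stream."""
    out = sys.stdout.buffer if out is None else out
    out.write(name + b"\n")


if __name__ == "__main__":
    # Passing bytes to os.listdir() makes it return bytes entries.
    for entry in os.listdir(b"."):
        write_name(entry)
    for arg in argv_as_bytes(sys.argv[1:]):
        write_name(arg)
```

Note that nothing here ever calls print() on a filename; the moment a name passes through a text stream, the encoding errors come back.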

Update: Would you like to receive filenames on the command line? I’ll hand you this fine mess. And the environment? It’s not even clear.

Fixing the Problems with Docker Images

I recently wrote about the challenges in securing Docker container contents, and in particular with keeping up-to-date with security patches from all over the Internet.

Today I want to fix that.

Besides security, there is a second problem: the common way of running things in Docker pretends to provide a traditional POSIX API and environment, but really doesn’t. This is a big deal.

Before diving into that, I want to explain something: I have often heard it said that Docker provides single-process containers. This is unambiguously false in almost every case. Any time you have a shell script inside Docker that calls cp or even ls, you are running a second process. Web servers from Apache to whatever else use processes or threads of various types to service multiple connections at once. Many Docker containers are single-application, but a process is a core part of the POSIX API, and very little software would work if it was limited to a single process. So this is my little plea for more precise language. OK, soapbox mode off.

Now then, in a traditional Linux environment, besides your application, there are other key components of the system. These are usually missing in Docker containers.

So today, I will fix this also.

In my docker-debian-base images, I have prepared a system that still has only 11MB RAM overhead, makes minimal changes on top of Debian, and yet provides a very complete environment and API. Here’s what you get:

  • A real init system, capable of running standard startup scripts without modification, and solving the nasty Docker zombie reaping problem.
  • Working syslog, which can either export all logs to Docker’s logging infrastructure, or keep them within the container, depending on your preferences.
  • Working real schedulers (cron, anacron, and at), plus at least the standard logrotate utility to help prevent log files inside the container from becoming huge.

The above goes into my “minimal” image. Additional images add layers on top of it, and here are some of the features they add:

  • A real SMTP agent (exim4-daemon-light) so that cron and friends can actually send you mail
  • SSH client and server (optionally exposed to the Internet)
  • Automatic security patching via unattended-upgrades and needrestart

All of the above, including the optional features, has an 11MB overhead on start. Not bad for so much, right?

From here, you can layer on top all your usual Dockery things. You can still run one application per container. But you can now make sure your disk doesn’t fill up from logs, run your database vacuuming commands at will, have your blog download its RSS feeds every few minutes, etc — all from within the container, as it should be. Furthermore, you don’t have to reinvent the wheel, because Debian already ships with things to take care of a lot of this out of the box — and now those tools will just work.

There is some popular work done in this area already by phusion’s baseimage-docker. However, I made my own for these reasons:

  • I wanted something based on Debian rather than Ubuntu
  • By using sysvinit rather than runit, the OS default init scripts can be used unmodified, reducing the administrative burden on container builders
  • Phusion’s system is, for some reason, not auto-built on the Docker hub. Mine is, so it will be automatically rebuilt whenever the underlying Debian system, or the GitHub repository, changes.

Finally a word on the choice to use sysvinit. It would have been simpler to use systemd here, since it is the default in Debian these days. Unfortunately, systemd requires you to poke some holes in the Docker security model, as well as mount a cgroups filesystem from the host. I didn’t consider this acceptable, and sysvinit ran without these workarounds, so I went with it.

With all this, Docker becomes a viable replacement for KVM for various services on my internal networks. I’ll be writing about that later.

Agile Is Dead (Long Live Agility)

In an intriguing post, PragDave laments how empty the word “agile” has become. To paraphrase, I might say he’s put words to a nagging feeling I’ve had: that there are entire books about agile, conferences about agile, hallway conversations I’ve heard about whether somebody is doing this-or-that agile practice correctly.

Which, when it comes down to it, means that they’re not being agile. If process and tools, even if they’re labeled as “agile” processes and tools, are king, then we’ve simply replaced one productivity-impairing dictator with another.

And he makes this bold statement:

Here is how to do something in an agile fashion:

What to do:

  • Find out where you are
  • Take a small step towards your goal
  • Adjust your understanding based on what you learned
  • Repeat

How to do it:

When faced with two or more alternatives that deliver roughly the same value, take the path that makes future change easier.

Those four lines and one practice encompass everything there is to know about effective software development.

He goes on to dive into that a bit, of course, but I think this man has a rare gift of expressing something complicated so succinctly. I am inclined to believe he is right.

Voice Keying with bash, sox, and aplay

There are plenty of times where it is nice to have Linux transmit things out a radio. One obvious example is the digital communication modes, where software acts as a sort of modem. A prominent example of this in Debian is fldigi.

Sometimes, it is nice to transmit voice instead of a digital signal. This is called voice keying. When operating a contest, for instance, a person might call CQ over and over, with just some brief gaps.

Most people that interface a radio with a computer use a sound card interface of some sort. The more modern of these have a simple USB cable that connects to the computer and acts as a USB sound card. So, at a certain level, all that you have to do is play sound out a specific device.

But it’s not quite so easy, because there is one other wrinkle: you have to engage the radio’s transmitter. This is obviously not something that is part of typical sound card APIs. There are all sorts of ways to do it, ranging from dedicated serial or parallel port circuits involving asserting voltage on certain pins, to voice-activated (VOX) circuits.

I have used two of these interfaces: the basic Signalink USB and the more powerful RigExpert TI-5. The Signalink USB integrates a VOX circuit and provides cabling to engage the transmitter when VOX is tripped. The TI-5, on the other hand, emulates three USB serial ports, and if you raise RTS on one of them, it will keep the transmitter engaged as long as RTS is high. This is a more accurate and precise approach.

VOX-based voice keying with the Signalink USB

But let’s first look at the Signalink USB case. The problem here is that its VOX circuit is really tuned for digital transmissions, which tend to be either really loud or completely silent. Human speech rises and falls in volume, and it tends to rapidly assert and drop PTT (Push-To-Talk, the name for the control that engages the radio’s transmitter) when used with VOX.

The solution I hit on was to add a constant, loud tone to the transmitted audio, but one which is outside the range of frequencies that the radio will transmit (which is usually no higher than 3kHz). This can be done using sox and aplay, the ALSA player. Here’s my script to call cq with Signalink USB:

#!/bin/bash
# NOTE: use alsamixer and set playback gain to 99
set -e

# Seconds of silence between transmissions
DELAY=1.5

# Mix the message in $1 with a 20kHz tone so VOX stays tripped, then play it
playcmd () {
        sox -V0 -m "$1" \
           "| sox -V0 -r 44100 $1 -t wav -c 1 -   synth sine 20000 gain -1" \
            -t wav - | \
           aplay -q  -D default:CARD=CODEC
}

echo -n "Started at: "
date

STARTTIME=`date +%s`
while true; do
        printf "\r"
        echo -n $(( (`date +%s`-$STARTTIME) / 60))
        printf "m/${DELAY}s: TRANSMIT"
        playcmd ~/audio/cq/cq.wav
        printf "\r"
        echo -n $(( (`date +%s`-$STARTTIME) / 60))
        printf "m/${DELAY}s: off         "
        sleep $DELAY
done

Run this, and it will continuously play your message, with a 1.5s gap in between during which the transmitter is not keyed.

The screen will look like this:

Started at: Fri Aug 24 21:17:47 CDT 2012
2m/1.5s: off

The 2m is how long it’s been going this time, and the 1.5s shows the configured gap.

The sox commands are really two nested ones. The -m causes sox to merge the .wav file in $1 with the 20kHz sine wave being generated, and the entire thing is piped to the ALSA player.
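For readers who want to see what the mixing step amounts to, here is a minimal pure-Python sketch that adds a constant sine tone to a list of 16-bit mono samples. It is not a reimplementation of what sox does (sox handles resampling, file formats, and level balancing itself); the function name `mix_tone` and the `tone_level` parameter are assumptions made for this example.

```python
import math


def mix_tone(samples, rate=44100, freq=20000, tone_level=0.5):
    """Add a constant sine tone to 16-bit mono samples, clipping to range."""
    # A tone at or above half the sample rate cannot be represented.
    if freq >= rate / 2:
        raise ValueError("tone must be below the Nyquist frequency")
    peak = 32767
    mixed = []
    for n, s in enumerate(samples):
        tone = tone_level * peak * math.sin(2 * math.pi * freq * n / rate)
        # Clamp to the signed 16-bit range after adding the tone.
        mixed.append(max(-peak - 1, min(peak, int(round(s + tone)))))
    return mixed
```

The Nyquist check is why the 44100 sample rate matters: a 20kHz tone fits under the 22.05kHz limit, so it can keep the VOX circuit tripped while staying above what the radio will actually transmit.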

Tweaks for RigExpert TI-5

This is actually a much simpler case. We just replace playcmd as follows:

playcmd () {
        ~/bin/raiserts /dev/ttyUSB1 'aplay -q -D default:CARD=CODEC' < "$1"
}

Where raiserts is a program that simply keeps RTS asserted on the serial port while the given command executes. Here's its source, which I modified a bit from a program I found online:

/* modified from
 * */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <termios.h>
#include <sys/ioctl.h>

static struct termios oldterminfo;

/* Restore the saved terminal settings and close the port. */
void closeserial(int fd)
{
    tcsetattr(fd, TCSANOW, &oldterminfo);
    if (close(fd) < 0)
        perror("closeserial()");
}

/* Open the serial device; returns the fd, or 0 on failure. */
int openserial(char *devicename)
{
    int fd;
    struct termios attr;

    if ((fd = open(devicename, O_RDWR)) == -1) {
        perror("openserial(): open()");
        return 0;
    }
    if (tcgetattr(fd, &oldterminfo) == -1) {
        perror("openserial(): tcgetattr()");
        return 0;
    }
    attr = oldterminfo;
    attr.c_cflag |= CRTSCTS | CLOCAL;
    attr.c_oflag = 0;
    if (tcflush(fd, TCIOFLUSH) == -1) {
        perror("openserial(): tcflush()");
        return 0;
    }
    if (tcsetattr(fd, TCSANOW, &attr) == -1) {
        perror("initserial(): tcsetattr()");
        return 0;
    }
    return fd;
}

/* Raise (level != 0) or drop (level == 0) RTS; returns 1 on success. */
int setRTS(int fd, int level)
{
    int status;

    if (ioctl(fd, TIOCMGET, &status) == -1) {
        perror("setRTS(): TIOCMGET");
        return 0;
    }
    status &= ~TIOCM_DTR;   /* ALWAYS clear DTR */
    if (level)
        status |= TIOCM_RTS;
    else
        status &= ~TIOCM_RTS;
    if (ioctl(fd, TIOCMSET, &status) == -1) {
        perror("setRTS(): TIOCMSET");
        return 0;
    }
    return 1;
}

int main(int argc, char *argv[])
{
    int fd, retval;
    char *serialdev;

    if (argc < 3) {
        printf("Syntax: raiserts /dev/ttyname 'command to run while RTS held'\n");
        return 5;
    }
    serialdev = argv[1];
    fd = openserial(serialdev);
    if (!fd) {
        fprintf(stderr, "Error while initializing %s.\n", serialdev);
        return 1;
    }

    /* Key the transmitter only while the command runs */
    setRTS(fd, 1);
    retval = system(argv[2]);
    setRTS(fd, 0);

    return retval;
}

This compiles to an executable less than 10K in size. I love it when that happens.

So these examples support voice keying both with VOX circuits and with serial-controlled PTT. raiserts.c could be trivially modified to control other serial pins as well, should you have an interface which uses different ones.
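The same RTS toggling can be sketched in Python with the standard fcntl and termios modules, which may be handy for scripting. Treat this as an untested-on-hardware sketch for Linux: the function names `with_rts` and `set_rts` are invented here, and it mirrors only the setRTS() logic above, not the terminal setup done in openserial().

```python
import fcntl
import struct
import termios


def with_rts(status: int, level: bool) -> int:
    """Return modem-control bits with DTR cleared and RTS set per level."""
    status &= ~termios.TIOCM_DTR      # always clear DTR, as raiserts.c does
    if level:
        status |= termios.TIOCM_RTS
    else:
        status &= ~termios.TIOCM_RTS
    return status


def set_rts(fd: int, level: bool) -> None:
    """Read, modify, and write back the modem-control lines of fd."""
    buf = fcntl.ioctl(fd, termios.TIOCMGET, struct.pack("i", 0))
    (status,) = struct.unpack("i", buf)
    fcntl.ioctl(fd, termios.TIOCMSET,
                struct.pack("i", with_rts(status, level)))
```

Usage would be along the lines of opening the serial device with os.open(), calling set_rts(fd, True), playing the audio, and then calling set_rts(fd, False) in a finally block so the transmitter is never left keyed.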

How to get started programming?

I have been asked for advice from several people recently on how to get started programming, or how to further develop a nascent interest in coding or software engineering. The people asking the questions range in age from about 10 years old to older than me. These are people that, for various reasons, are not very easily able to take computer science courses right now.

One would think that, since I’ve been doing this for somewhere around a quarter century (oh I do feel old now), that I’d be ready to offer up some great advice. And offer some suggestions I have. But I’m not convinced they’re good ones.

I have two main tensions. The first is that I, like many in the communities I tend to hang out in such as Debian’s, have a personality that leads me to take a deep dive into details of anything that holds my interest. Whether it’s Linux, Haskell, or amateur radio, I want to do more than skim the surface if I’m having fun with it. Many people are not like that. They may have a lot of fun programming in Visual Basic, not really caring that other languages are out there. Or some people are not like this yet. I feel unqualified to provide good advice to people that are different from me in that way. To put it a different way: most people don’t want to wait 4 years to be useful, and want to start out right away and get better over time (and I was the same way too.)

The second is related. I learned programming at a time when, other than BASIC, interpreted languages were not really available to me. (Yes, they were available, but not to me.) I cut my teeth on BASIC, Pascal, and C. Although I rarely use C anymore, I can still drop into it at a moment’s notice and be perfectly comfortable. I feel it was a fundamentally valuable experience, and that it would be very hard to become a great programmer without ever having lived and breathed something like C, where memory and pointers must be managed manually. Having said that, it is probably possible to become a good coder without ever having touched C.

Here, then, is an edited version of some rambly advice I sent to someone recently, where learning OOP was particularly mentioned. I would welcome your comments and suggestions. I may point people that ask to this post in the future.

For simply learning how to write code, Dive Into Python has long been a decent resource, though it may assume more experience than some have. I haven’t read them myself, but I’ve also heard good things about the How to Think Like a Computer Scientist series from Green Tea Press. They’re all available as free PDF downloads, too!

Eric S. Raymond’s The Art of Unix Programming is another work I’ve heard good things about, despite having never read it myself. A quick glance at the table of contents makes me think that even if people don’t wind up working on Unix, the lessons and philosophy should be informative.

It seems that many Computer Science programs are using Java for the core of their instruction, or even almost exclusively. Whether that is good or bad, I’m not completely sure. It certainly gets people into OOP more deeply, but I’m a “right tool for the job” kind of person. Despite the hype, OO — like everything else — isn’t the right tool for every job.

It is fine for people to dive straight into OO and become good programmers/engineers. However, I think it would be difficult to become a great programmer/engineer without ever having a solid understanding of a more low-level language, such as C in particular. I did my CS work when it was mostly based in C, and am glad for it. If someone never has to manage memory or pointers, I suspect they will be at a disadvantage in the long run for not being able to understand or work with the system at a more fundamental level. If a person knows C, plus some concepts of OO and Functional Programming (FP), it should be easy to pick up just about any other language out there.

I used to think Python was a great first language, but during the 2.x series they added so much fluff and so many special cases that I’m less enthusiastic now, though I don’t know how much of that got cleaned up in 3.x. I am not too keen on Java as a first language, because too many things that should be simple aren’t. I have a fondness for Haskell, and its close relationship to mathematics could make it a great first language — or maybe a poor one, depending on your perspective.

One other thing – I think it’s important for good programmers to have experience with all three major models of programming (procedural, OO, functional.) Even if a person winds up working mostly in one universe, knowledge of and experience with the others is important and informative and, in my experience, leads to better algorithms and architecture all around.

  • Procedural languages: Obviously C, but also Unix shell
  • OO languages: Python, Java, plenty of other fine choices
  • Functional: Lisp, Scheme, Haskell (also the only lazy and pure language on this list)

Having said all that, more important than a choice of book or language is experience. I have heard people suggest that it takes 10,000 hours of practice to become a superstar at something, whatever that “something” is, and I wouldn’t doubt it. Seth Godin discusses that a bit, with some criticism of the idea too.

So that leads to the most important piece of advice: dive in to whatever your interest is. Experiment, write code, put theory into practice in a way that holds interest and excitement. People that try to do things they don’t enjoy don’t seem to stick with them as long or execute as well, and thus will never become great.

Shell Scripts For Preschoolers

It probably comes as no surprise to anybody that Jacob has had a computer since he was 3. Jacob and I built it from spare parts, together.

It may come as something of a surprise that it has no graphical interface, and Jacob uses the command line and loves it — and did even before he could really read.

A few months ago, I wrote about the fun Jacob had with speakers and a microphone, and posted a copy of the cheat sheet he has with his computer. Lately, Jacob has really enjoyed playing with the speech synthesizer — both trying to make it say real words and nonsense words. Sometimes he does that for an hour.

I was asked for a copy of the scripts I wrote. They are really simple. I gave them names that would be easy for a preschooler to remember and spell, even if they conflicted with existing Unix/Linux commands. I put them in /usr/local/bin, which occurs first on the PATH, so it doesn’t matter if they conflict.

First, for speech synthesis, /usr/local/bin/talk:

echo "Press Ctrl-C to stop."
espeak -v en-us -s 150

espeak comes from the espeak package. It seemed to give the most consistently useful response.

Now, on to the sound-related programs. Here’s /usr/local/bin/ssl, the “sound steam locomotive”. It starts playing a train sound if one isn’t already playing:

pgrep mpg321 > /dev/null || mpg321 -q /usr/local/trainsounds/main.mp3 &
sl "$@"

And then there’s /usr/local/bin/record:

cd $HOME/recordings
echo "Now recording. Press Ctrl-C to stop."
DATE=`date +%Y-%m-%dT%H-%M-%S`
FILENAME="$DATE.wav"
# Protect earlier recordings from being overwritten
chmod a-w *.wav
exec arecord -f S16_LE -c 1 -r 44100 "$FILENAME"

This simply records in a timestamped file. Then, its companion, /usr/local/bin/play. Sorry about the indentation; for whatever reason, it is being destroyed by the blog, but you get the idea.

case "$1" in
train) mpg321 /usr/local/trainsounds/main.mp3 ;;
song) /usr/bin/play /usr/local/trainsounds/traindreams.flac ;;
*) cd $HOME/recordings
exec aplay `ls -tr| tail -n 1` ;;
esac

So, Jacob can run just “play”, which will play back his most recent recording. As something of a bonus, the history of recordings is saved for us to listen to later. If he types “play train”, there is the sound of a train passing. And, finally, “play song” plays Always a Train in My Dreams by Steve Gillette (I heard it on the radio once and bought the CD).

Some of these commands kick off sound playing in the background, so here is /usr/local/bin/bequiet:

killall mpg321 &> /dev/null
killall play &> /dev/null
killall aplay &> /dev/null
killall cw &> /dev/null

Geeks, Hobbies, and Free/Open Source: Feedback Wanted

I’ve been thinking lately about ways to improve ways in which I interact with Free Software projects, and ways in which they interact with me. Before I proceed to take steps or make suggestions, I’d like to see if others share my traits and observations.

Here are some questions I have been thinking of. If you’d like to help give me anecdotal evidence, please post a comment below this post. Identify the question numbers you are answering. It helps me if you can give specific examples, but if you don’t have the time or memory for that, no problem.

I will post my own answers in a day or two, but the point of this post is listening, not talking, so I’ll not post them immediately.

Hobbies (General – any geeks)

  • H1: To what degree do you like your hobbies to be challenging vs. easy? If something isn’t challenging, does that make it a good, bad, or indifferent candidate for a hobby?
  • H2: To what degree do you like your hobbies to be educational or enlightening?
  • H3: How do you pick up new hobbies? Do you go looking for them? Do you stumble upon them? What excites you to commit time and/or money to them at the beginning?
  • H4: How does your interest wane? What causes you to lose interest in hobbies?
  • H5: For how long do you tend to maintain hobbies? Sub-hobbies?
  • H6: Are your hobbies or sub-hobbies cyclical? In other words, do you lose interest in a hobby for a time, then regain interest for a time, then lose it again? What is the length of time of these cycles, if any?
  • H7: Do you prefer social hobbies or solitary hobbies? (Note that many hobbies, including programming, video gaming, reading, knitting, etc. could be either social or solitary, depending on the inclination of individuals.)
  • H8: Have you ever felt guilt about wanting to stop a hobby or sub-hobby? (For instance, from stopping supporting users of your software project, readers of your e-zine, etc) Did the guilt keep you going? Was that a good thing?

Examples: video games might be a challenging hobby (depending on the person) but in most cases aren’t educational.

A hobby might be “video game playing” or “being a Debian developer.” A sub-hobby might be “playing GTA IV”, “playing RPGs”, or “maintaining mutt”.

Free/Open Source Hobbies

  • F1: Considering your answers above, do your FLOSS activities follow the same general pattern as your other hobbies/interests, or are there differences? If there are differences, what are they?
  • F2: Has concern for being expected to support software longer than you will have an interest in it ever been a factor in a decision whether to release source code publicly, or how public to make a release?
  • F3: Has concern over the long-term interest of a submitter in maintaining their patch/contribution ever caused you to consider rejecting it? (Or caused you to avoid using software over the same concern about its author?)
  • F4: In general, do you find requirements FLOSS projects place on first-time contributors to be too stringent, not stringent enough, or about right?
  • F5: Have you ever continued contributing to a project past the point where your interest would otherwise motivate you to do so? If so, what caused you to do this? Do you believe that cause is a general positive or negative force for members of the FLOSS community?
  • F6: Have there ever been factors that caused you to stop contributing to a project even though you still had an active interest in doing so? What were they?
  • F7: Have you ever wanted to be able to take a break as a contributor or maintainer of a project, and be able to return to contributing to it later? If so, have you found it easy to do so?
  • F8: What is your typical length of engagement with FLOSS projects (such as Debian) and sub-projects (such as maintaining a particular package)?
  • F9: Does a change in social group ever encourage or discourage you from changing hobbies or sub-hobbies?
  • F10: Have you ever wanted to stop working on a project/sub-project because the problems involved were no longer challenging or educational to you?
  • F11: Have you ever wanted to stop working on a project/sub-project because of issues with the people involved?

Examples on F9: If, say, you are a long-time Perl user and have gone to Perl conferences, but now you are interested in Ruby, would your involvement with the Perl community cause you to avoid taking up the Ruby programming hobby? Or would it cause you to cut your ties with Perl less quickly than your changing interest might dictate? (This is a completely arbitrary example and isn’t meant to start a $LANGUAGE thread.)

Changes over time

  • C1: Do you believe that your answers to any of the above questions have changed over time? If yes, then:
  • C2: What kinds of changes have happened?
  • C3: What caused the change?
  • C4: Do you believe the changes produced positive results for you? For the community?

Time to learn a new language

I have something of an informal goal of learning a new programming language every few years. It’s not so much a goal as it is something of a discomfort. There are so many programming languages out there, with so many niches and approaches to problems, that I get uncomfortable with my lack of knowledge of some of them after a while. This tends to happen every few years.

The last major language I learned was Haskell, which I started working with in 2004. I still enjoy Haskell and don’t see anything displacing it as my primary day-to-day workhorse.

Yet there are some languages that I’d like to learn. I have an interest in cross-platform languages; one of my few annoyances with Haskell is that it can’t (at least with production quality) be compiled into something like Java bytecode or something else that isn’t architecture-dependent. I have long had a soft spot for functional languages. I haven’t had such a soft spot for static type checking, but Haskell’s type inference changed that for me. Also I have an interest in writing Android apps, which means some sort of Java tie-in would be needed.

Here are my current candidates:

  • JavaScript. I have never learned the language but dislike it intensely based on everything I have learned about it (especially the diverging standards of implementation). Nevertheless, there are certain obvious reasons to try it — the fact that most computers and mobile phones can run it out of the box is an important one.
  • Scheme. Of somewhat less interest since I learned Common Lisp quite a while back. I’m probably pretty rusty at it, but I’m not sure Scheme would offer me anything novel that I can’t find in Haskell, except for the ability to compile to JVM bytecode.
  • Lua — it sounds interesting, but I’m not sure if it’s general-purpose enough to merit long-term interest.
  • Scala sounds interesting: an OOP and FP language that compiles to JVM bytecode.
  • Smalltalk. Seems sad I’ve never learned this one.
  • There are some amazing webapps written using Cappuccino. The Github issue interface is where I hear about this one.
  • Eclipse. I guess it’s mostly not a programming language but an IDE, but then there’s some libraries (RCP?) or something with it — so to be honest, I don’t know what it is. Some people seem very excited about it. I tried it once, couldn’t figure out how to just open a file and start editing already. Made me feel like I was working for Initech and wouldn’t get to actually compile something until my TPS coversheets were in order. Dunno, maybe it’s not that bad, but I never really understood the appeal of something other than emacs/vi+make.
  • A Haskell web infrastructure such as HSP or hApps. Not a new language, but might as well be…

Of some particular interest to me is that Haskell has interpreters for Scheme, Lua, and JavaScript as well as code generators for some of these languages (though not generic Haskell-to-foo compilers).

Languages not in the running because I already know them include: OCaml, POSIX shell, Python, Perl, Java, C, C++, Pascal, BASIC, Common Lisp, Prolog, SQL. Languages I have no interest in learning right now include Ruby (not different enough from what I already know, plus bad experiences with it), any assembly, anything steeped in the Microsoft monoculture (C#, VB, etc.), or anything that is hard to work with outside of an Emacs or vim environment. (If your language requires or strongly encourages me to use your IDE or proprietary compiler, I’m not interested; that means you, Flash.)

Brief Reviews of Languages I Have Used

To give you a bit of an idea of where I’m coming from:

  • C: Not much to say there. I think its pros and cons are well-known. I consider it to be too unwieldy for general-purpose use these days, though find myself writing code in it every few years for some reason or other.
  • Perl: The first major interpreted language I learned. Stopped using it entirely after learning Python (didn’t see any advantage of Perl over Python, and plenty of disadvantages.)
  • Python: Used to really like it. Both the language and I have changed. It is no longer the “clean” language it was in the 1.5 days. Too many lists-that-aren’t-lists, __underscorethings__, etc. Still cleaner than Perl. It didn’t scale up to large projects well, and the interpreted dynamic nature left me not all that happy to use it for some tasks. Haven’t looked at Python 3, but also it probably isn’t ready for prime time yet anyhow.
  • Java: Better than C++ for me. Cross-platform bytecode features. That’s about all that I have to say about it that’s kind. The world’s biggest source of carpal tunnel referrals in my book. Mysteriously manages to have web apps that require 2GB of RAM just to load. Dunno where that came from; Apache with PHP and mod_python takes less than 100M.
  • Haskell: A pretty stellar mix of a lot of nice features, type inference being one of the very nice ones in my book. My language of choice for almost all tasks. Laziness can be hard for even experienced Haskellers to understand at times, and the libraries are sometimes in a bit of flux (which should calm down soon). Still a very interesting language and, in fact, a decent candidate for my time as there is some of it I’ve never picked up, including some modules and web toolkits.
  • OCaml: Tried it, eventually discarded it. An I/O library that makes you go through all sorts of contortions to be able to open a file read/write isn’t cool in my book.

Review: Free Software Project Hosting

I asked for suggestions a few days ago. I got several good ones, and investigated them. You can find my original criteria at the link above. Here’s what I came up with:

Google Code

Its very simple interface appeals to me. It has an issue tracker, a wiki, a download area. But zero integration with git. That’s not necessarily a big problem; I can always keep on hosting git repos at It is a bit annoying, though, since I wouldn’t get to nicely link commit messages to automatic issue closing.
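
For anyone who hasn’t used this kind of integration: a commit hook typically scans each commit message for issue references and closes or annotates the matching issues. Here is a minimal sketch in Python, assuming a “closes #N” / “fixes #N” keyword convention. The keywords and the function name are only an illustration, not the API of any of the sites reviewed here.

```python
import re

# Assumed convention (hypothetical, not taken from any specific hosting site):
# "closes #12" or "fixes #7" closes an issue; a bare "#3" merely references it.
CLOSE_RE = re.compile(r"\b(?:close[sd]?|fix(?:e[sd])?)\s+#(\d+)", re.IGNORECASE)
REF_RE = re.compile(r"#(\d+)")


def issues_in_commit(message):
    """Return (closed, referenced) issue numbers found in a commit message."""
    closed = sorted({int(n) for n in CLOSE_RE.findall(message)})
    mentioned = {int(n) for n in REF_RE.findall(message)}
    # Anything mentioned but not closed is only a reference.
    referenced = sorted(mentioned - set(closed))
    return closed, referenced
```

A hook would then call something like `issues_in_commit("Handle empty input, closes #12; see also #7")` and hand the two lists to the issue tracker.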

A big requirement of mine is being able to upload tarballs or ZIP files from the command line in an automated fashion. I haven’t yet checked to see if Google Code exports an API for this. Google Code also has a lifetime limit of 25 project creations, though rumor has it they may lift the limit if you figure out where to ask and ask nicely.



Gitorious

Gitorious is one of the two Git-based sites that put a strong emphasis on community. Like Github, Gitorious tries to make it easy for developers to fork projects, submit pull requests to maintainers, and work together. This aspect of it does hold some appeal to me, though I have never worked with one of these sites, so I am somewhat unsure of how I would use it.

The downside of Gitorious or Github is that they tie me to Git. While I’m happy with Git and have no plans to change now, I’ve changed VCSs many times over the years when better tools show up; I’ve used, in approximately this order, CVS, Subversion, Arch/tla, baz, darcs, Mercurial, and Git, with a brief use of Perforce at a job that required it. I may use Git for another 3 years, but after 5 years will Git still be the best VCS out there? I don’t know.

Gitorious fails several of my requirements, though. It has no issue tracker and no downloads area.

It can spontaneously create a tar.gz file from the head of any branch, but not a zip file. It is possible to provide a download of a specific revision, but this is not very intuitive for the end user.

Potential workarounds include using Lighthouse for bug tracking (they do support git integration for changelog messages) and my own server to host tarballs and ZIP files — which I could trivially upload via scp.



Github

At first glance, this is a more powerful version of Gitorious. It has similar community features and a wiki, but adds an issue tracker, download area, home page capability, and a bunch of other features. It has about a dozen pre-built commit hooks that do everything from integrating with Lighthouse to popping commit notices into Jabber.

But there are surprising drawbacks, limitations, and even outright bugs all throughout. And it all starts with the user interface.

On the main project page, the user gets both a download button and a download tab. But they don’t do the same thing. Talk about confusing!

The download button will make a ZIP or tarball out of any tag in the repo. The download tab will also do the same, though presented in a different way; but the tab can also offer downloads for files that the maintainer has manually uploaded. Neither one lets you limit the set of tags presented, so if you have an old project with lots of checkpoints, the poor end user has to sift through hundreds of tags to find the desired version. It is possible to make a tarball out of a given branch, so a link to the latest revision could be easy, but still.

Even worse, there’s a long-standing issue where several of the tabs get hidden under other on-screen elements. The wiki tab, project administration tab, and sometimes even the download tab are impacted. It’s been open since February with no apparent fix.

And on top of that, uploading arbitrary tarballs requires — yes — Flash. Despite requests to make it scriptable, they reply that there is no option but Flash and they may make some other option sometime.

The issue tracker is nice and simple. But it doesn’t support attachments. So users can’t attach screenshots, debug logs, or diffs.

I really wanted to like Github. It has so many features for developers. But all these surprising limitations make it a pain both for developers (I keep having to “view source” to find the link to the wiki or the project admin page) and for users (confusing download options, lack of issue attachments). In the end, I think the display bug is a showstopper for me. I could work around some of the others by having a wiki page with links to downloads and revisions and giving that out as the home page perhaps. But that’s a lot of manual maintenance that I would rather avoid.



Launchpad

Launchpad is the project management service operated by Canonical, the company behind Ubuntu. While Launchpad can optionally integrate well with Ubuntu, that isn’t required, so non-developers like me can work with it fine.

Launchpad does offer issue tracking, but no wiki. It has a forum of sorts though (the “Answers” section). It has some other features, such as blueprints, that would likely only be useful for projects larger than the ones I would plan to use it for.

It does have a downloads area, and they say they have a Python API. I haven’t checked it out, but if it supports scriptable uploads, that would work for me.

Besides the lack of a wiki, Launchpad is also tied to the bzr VCS. bzr was one of the early players in DVCS, written as a better-designed successor to tla/Arch and baz, but has no compelling features over Git or Mercurial for me today. I have no intention of switching to or using it any time soon.

Launchpad does let you “import” branches from another VCS such as Git or svn. I set up an “import” branch for a test project yesterday. 12 hours later, it still hasn’t imported anything; it’s just sitting at “pending review.” I have no idea if it ever will, or why setting up a bzr branch requires no review but a git branch requires review. So I am unable to test the integration between it and the changesets, which is really annoying.

So, some possibilities here, but the bzr-only thing really bugs me. And having to have my git trees reviewed really goes against the “quick and simple” project setup that I would have preferred to see.



Indefero

Indefero is explicitly a Google Code clone, but aims to be a better Google Code than Google Code. The interface is similar to Google’s: very simple and clean. Unlike Google Code, Indefero does support Git. It supports a wiki, downloads area, and issue tracker. You can download the PHP-based code and run it yourself, or you can get hosting from the Indefero site.

I initially was favorably impressed by Indefero, but as I looked into it more, I am not very impressed right now. Although it does integrate with Git, and you can refer to an issue number in a Git commit, a Git commit can’t close an issue. Git developers use git over ssh to interact with it, but it supports only one ssh key per user — so this makes it very annoying if I wish to push changes from all three of the machines I regularly do development with. Despite the fact that this is a “high priority” issue, it hasn’t been touched by the maintainer in almost a month, even though patches have been offered.

Indefero can generate files based on any revision in git, or based on the latest on any branch, but only in ZIP format (no tar.gz).

Although the program looks very nice and the developer clueful, Indefero has only one main active developer or committer, and he is a consultant that also works on other projects. That makes me nervous about putting too many eggs into the Indefero basket.



Trac

Trac is perhaps the gold standard of lightweight project management apps. It has a wiki, downloads, issue tracking, and VCS integration (SVN only in the base version, quite a few others with 3rd-party plugins). I ran Trac myself for a while.

It also has quite a few failings. Chief among them is that you must run a completely separate Trac instance for each project. So there is no possible way to go to some dashboard and see all bugs assigned to you from all projects, for instance. That is what drove me away from it initially. That and the serious performance problems that most of its VCS backends have.



Redmine

Redmine is designed to be a better Trac than Trac. It uses the same lightweight philosophy in general, has a wiki, issue tracker, VCS integration, downloads area, etc. But it supports multiple projects in a sane and nice way. It’s what I currently use over on

Redmine has no API to speak of, though I have managed to cobble together an automatic uploader using curl. It was unpleasant and sometimes breaks on new releases, but it generally gets the job done.

I have two big problems with Redmine. One is performance. It’s slow. And when web spiders hit it, it sometimes has been so slow that it takes down my entire server. Because of the way it structures its URLs, it is not possible to craft a robots.txt that does the right thing — and there is no plan to completely fix it. There is, however, a 3rd-party plugin that may help.
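
To illustrate the robots.txt problem, here is a small Python sketch. It assumes the classic robots.txt semantics, where a Disallow rule is a plain path prefix, and a hypothetical URL layout of the form /projects/NAME/repository/... in which the project name comes before the expensive part. Both the paths and the helper name are made up for the example; crawlers that honor wildcard extensions would behave differently.

```python
def blocked_by_prefix(path, disallow_prefixes):
    """Classic robots.txt matching: blocked if the path starts with any Disallow value."""
    return any(path.startswith(prefix) for prefix in disallow_prefixes)


# Hypothetical URLs: the wiki is cheap to serve, repository diffs are expensive.
paths = [
    "/projects/alpha/wiki",
    "/projects/alpha/repository/revisions/1234/diff",
    "/projects/beta/repository/revisions/99/diff",
]

# One rule per project does the right thing, but has to be kept up by hand.
per_project = ["/projects/alpha/repository", "/projects/beta/repository"]

# A single generic prefix cannot single out the repository pages:
# it blocks the wiki along with everything else.
too_broad = ["/projects/"]

for p in paths:
    print(p, blocked_by_prefix(p, per_project), blocked_by_prefix(p, too_broad))
```

With prefix-only rules, the variable project name in the middle of the path is what gets in the way: either every project gets its own rule, or the rule is broad enough to hide the pages I do want indexed.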

The bigger problem relates to maintaining and upgrading Redmine. This is the first Ruby on Rails app I have ever used, and let me say it has made me want to run away screaming from Ruby on Rails. I’ve had such incredible annoyances installing and upgrading this thing that I can’t even describe what was wrong. All sorts of undocumented requirements for newer software, GEMS that are supposed to work with it but don’t, having to manually patch things so they actually work, conflicts with what’s on the system, and nobody in the Redmine, Rails, or Ruby communities being able to help. I upgrade rarely because it is such a hassle and breaks in such spectacular ways. I don’t think this is even Redmine’s fault; I think it’s a Rails and Ruby issue, but nevertheless, I am stuck with it. My last upgrade was a real mess — bugs in the PostgreSQL driver — the newer one that the newer GEM that the newer Redmine required — were sending invalid SQL to it. Finally patched it myself, and this AFTER the whole pain that is installing gems in Ruby.

I’d take a CGI script written in Bash over Ruby on Rails after this.

That said, Redmine has the most complete set of the features I want of all the programs I’ve mentioned on this page.



Savannah

Savannah is operated by the Free Software Foundation, and runs a fork of the SourceForge software. Its fork does support Git, but lacks a wiki. It has the standard *forge issue tracker, download area, home page support, integrated mailing lists, etc. It also has the standard *forge over-complexity.

There is a command-line SourceForge uploader in Debian that could potentially be hacked to work with Savannah, but I haven’t checked.


Appears to be another *forge clone. Similar to Savannah, but with a wiki, ugly page layout, and intrusive ads.



Used to be the gold standard of project hosting. Now looks more like a back alley in a trashy neighborhood. Ads all over the place, and intrusive and ugly ones at that. The ads make it hard to use the interface and difficult to navigate, especially for newbies. No thanks.


The four options that look most interesting to me are: Indefero, Github, Gitorious, and staying with Redmine. The community features of Github, Gitorious, and Launchpad all sound interesting, but I don’t have the experience to evaluate how well they work in practice — and how well they encourage “drive by commits” for small projects.

Gitorious + Lighthouse and my own download server merits more attention. Indefero still makes me nervous due to the level of development activity and single main developer. Github has a lot of promise, but an interface that is too confusing and buggy for me to throw at end users. That leaves me with Redmine, despite all the Rails aggravations. Adding the bot blocking plugin may just get me what I want right now, and is certainly the path of least resistance.

I am trying to find ways to build communities around these projects. If I had more experience with Github or Gitorious, and thought their community features could make a difference for small projects, I would try them.