Category Archives: Debian

Managing an External Display on Linux Shouldn’t Be This Hard

I first started using Linux and FreeBSD on laptops in the late 1990s. Back then, there were all sorts of hassles and problems, from hangs on suspend to outright failure to boot. I still worry a bit about suspend on unknown hardware, but by and large, the picture of Linux on laptops has improved dramatically in recent years. So much so that now I can complain about what would once have been a minor nit: dealing with external monitors.

I have a USB-C dock that provides both power and a Thunderbolt display output over the single cable to the laptop. I think I am similar to most people in wanting the following behavior from the laptop:

  • When the lid is closed, suspend if no external monitor is connected. If an external monitor is connected, shut off the built-in display and use the external one exclusively, but do not suspend.
  • Lock the screen automatically after a period of inactivity.
  • While locked, all connected displays should be powered down.
  • When an external display is connected, begin using it automatically.
  • When an external display is disconnected, stop using it. If the lid is closed when the external display is disconnected, go into suspend mode.

This sounds so simple. But somehow on Linux we’ve split up these things into a dozen tiny bits:

  • In /etc/systemd/logind.conf, there are settings about what to do when the lid is opened or closed (see the snippet after this list).
  • Various desktop environments have overlapping settings covering the same things.
  • Then there are the display managers (gdm3, lightdm, etc) that also get in on the act, and frequently have DIFFERENT settings, set in different places, from the desktop environments. And, what’s more, they tend to be involved with locking these days.
  • Then there are screensavers (gnome-screensaver, xscreensaver, etc.) that also enter the picture, and also have settings in these areas.
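
As an illustration of just the first layer, here is roughly what the relevant lid-handling options in /etc/systemd/logind.conf look like. This is only a sketch: the option names are real logind options, but the values shown are one possible choice, not necessarily what you want.

# Suspend when the lid closes and nothing external is involved
HandleLidSwitch=suspend
HandleLidSwitchExternalPower=suspend
# But ignore the lid when docked or an external display is attached
HandleLidSwitchDocked=ignore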

Problems I’ve Seen

My problems don’t even begin with laptops, but with my desktop, running XFCE with xmonad and lightdm. My desktop is hooked to a display that has multiple inputs. This scenario (reproducible in both buster and bullseye) leaves the desktop’s display unusable until a reboot:

  1. Be logged in and using the desktop
  2. Without locking the desktop screen, switch the display input to another device
  3. Keep the display input on another device long enough for the desktop screen to auto-lock
  4. At this point, it is impossible to re-awaken the desktop screen.

I should note here that the problems aren’t limited to Debian; they extend to Ubuntu and various hardware as well.

Lightdm: which greeter?

At some point while troubleshooting things after upgrading my laptop to bullseye, I noticed that while both machines were running lightdm, I had different settings and a different appearance between the two. Upon further investigation, I realized that one had slick-greeter and lightdm-settings installed, while the other had lightdm-gtk-greeter and lightdm-gtk-greeter-settings installed. Very strange.

XFCE: giving up

I eventually gave up on making lightdm work. No combination of settings or greeters would make things work reliably when changing screen configurations. I installed xscreensaver. It doesn’t hang, but it does sometimes take a few tries before it figures out what device to display on.

Worse, since updating from buster to bullseye, XFCE no longer automatically switches audio output when the docking station is plugged in, and there seems to be no easy way to convince Pulseaudio to do this.

X-Based Gnome and derivatives… sigh.

I also tried Gnome, Mate, and Cinnamon, and none of them could be configured to behave the way I laid out above.

I’ve long not been a fan of Gnome’s way of hiding things from the user. It now has a Windows-like situation of three distinct settings programs (settings, tweaks, and dconf editor), which overlap in strange ways and interact with systemd in even stranger ways. Gnome 3 makes it quite non-intuitive to make app icons from various programs work, and so forth.

Trying Wayland

I recently decided to set up an older laptop that I hadn’t used in awhile. After reading up on Wayland, I decided to try Gnome 3 under Wayland. Both the Debian and Arch wikis note that KDE is buggy on Wayland, which leaves Gnome as the only desktop environment that supports it, unless I want to go with Sway. Sway has some appeal to this xmonad user, but I’ve read of incompatibilities of Wayland software when Gnome’s not available, so I opted to try Gnome.

Well, it’s better. Not perfect, but better. After finding settings buried in a ton of different Settings and Tweaks boxes, I had it mostly working, except gdm3 would never shut off power to the external display. Eventually I found /etc/gdm3/greeter.dconf-defaults, and added:

sleep-inactive-ac-timeout=60
sleep-inactive-ac-type='blank'
sleep-inactive-battery-timeout=120
sleep-inactive-battery-type='suspend'

Of course, these overlap with but are distinct from the same kinds of things in Gnome settings.
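
For comparison, the user-session equivalents live in gsettings under the org.gnome.settings-daemon.plugins.power schema. A sketch of setting them from the command line (the timeouts are illustrative):

gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-timeout 60
gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-ac-type 'blank'
gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-battery-timeout 120
gsettings set org.gnome.settings-daemon.plugins.power sleep-inactive-battery-type 'suspend'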

Sway?

Running without Gnome seems like a challenge; Gnome is switching audio output appropriately, for instance. I am looking at some of the Gnome Shell tiling window manager extensions and hope that some of them may work for me.

Excellent Experience with Debian Bullseye

I’ve appreciated the bullseye upgrade, like most Debian upgrades. I’m not quite sure how, since I was already running a backports kernel, but somehow the entire system is snappier. Maybe newer X or something? I’m really pleased with it. Hardware integration is even nicer now, particularly the automatic driverless support for scanners in addition to the existing support for printers.

All in all, a very nice upgrade, and pretty painless.

I experienced a few odd situations.

For one, I had been using Gnome Flashback. Since xmonad-log-applet didn’t compile there (due to bitrot in the log applet, not flashback), and I had been finding Gnome Flashback to be a rather dusty and forgotten corner of Gnome for a long time, I decided to try Mate.

Mate just seemed utterly unable to handle a laptop with an external monitor very well. I want to use only the external monitor when the laptop lid is closed, and it just couldn’t remember how to do the right thing – external monitor on, laptop monitor off, laptop not put into suspend. gdm3 didn’t seem to be able to put the external monitor to sleep either, causing a few nights of wasted power.

So off I went to XFCE, which I had been using for years on my workstation anyhow. Lots more settings available in XFCE, plus things Just Worked there. Odd that XFCE, the thin and light DE, is now the one that has the most relevant settings. It seems the Gnome “let’s remove a bunch of features” approach has extended to MATE as well.

When I switched to XFCE, I also removed gdm3 from my system, leaving lightdm as the only DM on it. That matched what my desktop machine was using, and also what task-xfce-desktop called for. But strangely, the XFCE settings for lightdm were completely different between the laptop and the desktop. It turns out that with lightdm, you can have the lightdm-gtk-greeter and the accompanying lightdm-gtk-greeter-settings, or slick-greeter and the accompanying lightdm-settings. One machine had one greeter and settings, and the other had the other. Why, I don’t know. But lightdm-gtk-greeter-settings had the necessary options for putting monitors to sleep on the login screen, so I went with it.

This does highlight a bit of a weakness in Debian upgrades. There is SO MUCH choice in Debian, which I highly value. At some point, almost certainly without my conscious choice, one machine got one greeter and another got the other. Despite both having task-xfce-desktop installed, they got different desktop experiences. There isn’t a great way to say “OK, I know I had a bunch of things installed before, but NOW I want the default bullseye experience”.

But overall, it is an absolutely fantastic distribution. It is great to see this nonprofit community distribution continue to have such quality on such an immense scale. And hard to believe I’ve been a Debian developer for 25 years. That seems almost impossible!

Airgapped / Asynchronous Backups with ZFS over NNCP

In my previous articles in the series on asynchronous communication with the modern NNCP tool, I talked about its use for asynchronous, potentially airgapped, backups. The first article, How & Why To Use Airgapped Backups, laid out the foundations for this. Now let’s dig into the details.

Today’s post will cover ZFS, because it has a lot of features that make it very easy to support in this setup. Non-ZFS backups will be covered later.

The setup is actually about as simple as it is for SSH, but since people are less familiar with this kind of communication, I’m going to try to go into more detail here.

Assumptions

I am assuming a setup where:

  • The machines being backed up run ZFS
  • The disk(s) that hold the backups are also running ZFS
  • zfs send / receive is desired as an efficient way to transport the backups
  • The machine that holds the backups may have no network connection whatsoever
  • Backups will be sent encrypted over some sort of network to a spooling machine, which temporarily holds them until they are transported to the destination backup system and ingested there. This system will be unable to decrypt the data streams it temporarily stores.

Hardware

Let’s start with hardware for the machine to hold the backups. I initially considered a Raspberry Pi 4 with 8GB of RAM. That would probably have been a suitable machine, at least for smaller backup sets. However, none of the Raspberry Pi machines support hardware AES encryption acceleration, and my Pi4 benchmarks at about 60MB/s for AES encryption. I want my backups to be encrypted, and decided this would just be too slow for my purposes. Again, if you don’t need encrypted backups or don’t care that much about performance — many people probably fall into this category — you can have a fully-functional Raspberry Pi 4 system for under $100 that would make a fantastic backup server.
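
If you want to sanity-check your own hardware before deciding, a quick benchmark along these lines gives a rough number; this is a general approach, not an exact transcript of the benchmark mentioned above:

# Rough AES throughput via OpenSSL
openssl speed -evp aes-256-cbc
# Or, if the backup disks will use LUKS/dm-crypt
cryptsetup benchmark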

I wound up purchasing a Qotom-Q355G4 micro PC with a Core i5 for about $315. It has USB 3 ports and is designed as a rugged, long-lasting system. I have been using one of their older Celeron-based models as my router/firewall for a number of years now and it’s been quite reliable.

For backup storage, you can get a USB 3 external drive. My own preference is to get a USB 3 “toaster” (device that lets me plug in SATA drives) so that I have more control over the underlying medium and can save the expense and hassle of a bunch of power supplies. In a future post, I will discuss drive rotation so you always have an offline drive.

Then, there is the question of transport to the backup machine. A simple solution would be to have a heavily-firewalled backup system that has no incoming ports open but makes occasional outgoing connections to one specific NNCP daemon on the spooling machine. However, for airgapped operation, it would also be very simple to use nncp-xfer to transport the data across on a USB stick or some such. You could set up automounting for a specific USB stick – plug it in, all the spooled data is moved over, then plug it in to the backup system and it’s processed, and any outbound email traffic or whatever is copied to the USB stick at that point too. The NNCP page has some more commentary about this kind of setup.
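
A rough sketch of the nncp-xfer approach, assuming the stick is mounted at /mnt/usb (a hypothetical mount point):

# On the spooler: write any queued outbound packets onto the stick
nncp-xfer -mkdir /mnt/usb
# Carry the stick to the backup system, then ingest and process there
nncp-xfer /mnt/usb
nncp-toss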

Both are fairly easy to set up, and NNCP is designed to be transport-agnostic, so in this article I’m going to focus on how to integrate ZFS with NNCP.

Operating System

Of course, it should be no surprise that I set this up on Debian.

As an added step, I did all the configuration in Ansible stored in a local git repo. This adds a lot of work, but it means that it is trivial to periodically wipe and reinstall if any security issue is suspected. The git repo can be copied off to another system for storage and takes the system from freshly-installed to ready-to-use state.

Security

There is, of course, nothing preventing you from running NNCP as root. The zfs commands, obviously, need to be run as root. However, from a privilege separation standpoint, I have chosen to run everything relating to NNCP as a nncp user. NNCP already does encryption, but if you prefer to have zero knowledge of the data even to NNCP, it’s trivial to add gpg to the pipeline as well, and in fact I’ll be demonstrating that in a future post for other reasons.

Software

Besides NNCP, there needs to be a system that generates the zfs send streams. For this project, I looked at quite a few. Most were designed to inspect the list of snapshots on a remote end, compare it to a list on the local end, and calculate a difference from there. This, of course, won’t work for this situation.

I realized my own simplesnap project was very close to being able to do this. It already used an algorithm of using specially-named snapshots on the machine being backed up, so never needed any communication about what snapshots were present where. All it needed was a few more options to permit sending to a stream instead of zfs receive. I made those changes and they are available in simplesnap 2.0.0 or above. That version has also been uploaded to sid, and will work fine as-is on buster as well.

Preparing NNCP

I’m going to assume three hosts in this setup:

  • laptop is the machine being backed up. Of course, you may have quite a few of these.
  • spooler holds the backup data until the backup system picks it up
  • backupsvr holds the backups

The basic NNCP workflow documentation covers the basic steps. You’ll need to run nncp-cfgnew on each machine. This generates a basic configuration, along with public and private keys for that machine. You’ll copy the public key sets to the configurations of the other machines as usual. On the laptop, you’ll add a via line like this:

backupsvr: {
  id: ....
  exchpub: ...
  signpub: ...
  noisepub: ...
  via: ["spooler"]

This tells NNCP that data destined for backupsvr should always be sent via spooler first.

You can then arrange for the nncp-daemon to run on the spooler, and nncp-caller or nncp-call on the backupsvr. Or, alternatively, airgapped between the two with nncp-xfer.
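
In rough terms, that amounts to something like this; 5400 is NNCP’s customary port, and you’d adjust paths to wherever your NNCP binaries live:

# On spooler: listen for incoming NNCP connections
nncp-daemon -bind "[::]:5400"
# On backupsvr: poll the spooler whenever convenient
nncp-call spooler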

Generating Backup Data

Now, on the laptop, install simplesnap (2.0.0 or above). Although you won’t be backing up to the local system, simplesnap still maintains a hostlock in ZFS. Prepare a dataset for it:

zfs create tank/simplesnap
zfs set org.complete.simplesnap:exclude=on tank/simplesnap

Then, create a script /usr/local/bin/runsimplesnap like this:

#!/bin/bash

set -e

simplesnap --store tank/simplesnap --setname backups --local --host `hostname` \
   --receivecmd /usr/local/bin/simplesnap-queue \
   --noreap

su nncp -c '/usr/local/nncp/bin/nncp-toss -noprogress -quiet'

if ip addr | grep -q 192.168.65.64; then
  su nncp -c '/usr/local/nncp/bin/nncp-call -noprogress -quiet -onlinedeadline 1 spooler'
fi

The call to simplesnap sets it up to send the data to simplesnap-queue, which we’ll create in a moment. The --receivecmd option, plus --noreap, sets it up to run without ZFS on the local system.

The call to nncp-toss will process any previously-received inbound NNCP packets, if there are any. Then, in this example, we do a very basic check to see if we’re on the LAN (checking 192.168.65.64), and if so, will establish a connection to the spooler to transmit the data. Of course, you could also do this over the Internet, with tor, or whatever, but in my case, I don’t want to automatically do this in case I’m tethered to mobile. I figure if I want to send backups in that case, I can fire up nncp-call myself. You can also use nncp-caller to set up automated connections on other schedules; there are a lot of options.
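
For the automated variant, nncp-caller reads a calls section from the node’s entry in the NNCP configuration; a minimal sketch, with the schedule being only an example:

spooler: {
  ...
  calls: [
    {
      cron: "*/30 * * * *"
    }
  ]
}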

Now, here’s what /usr/local/bin/simplesnap-queue looks like:

#!/bin/bash

set -e
set -o pipefail

DEST="`echo $1 | sed 's,^tank/simplesnap/,,'`"

echo "Processing $DEST" >&2
# stdin piped to this
su nncp -c "/usr/local/nncp/bin/nncp-exec -nice B -noprogress backupsvr zfsreceive '$DEST'" >&2
echo "Queued for $DEST" >&2

This is a pretty simple script. simplesnap will call it with a path based on the --store, with the hostname after; so, for instance, tank/simplesnap/laptop/root or some such. This script strips off the leading tank/simplesnap (which is a local fragment), leaving the host and dataset paths. Then it just pipes it to nncp-exec. -nice B classifies it as low-priority bulk data (so if you have some more important interactive data, it would be sent first), then passes it to whatever the backupsvr defines as zfsreceive.

Receiving ZFS backups

In the NNCP configuration on the recipient’s side, in the laptop section, we define what command it’s allowed to run as zfsreceive:

      exec: {
        zfsreceive: ["/usr/bin/sudo", "-H", "/usr/local/bin/nncp-zfs-receive"]
      }

We authorize the nncp user to run this under sudo in /etc/sudoers.d/local--nncp:

Defaults env_keep += "NNCP_SENDER"
nncp ALL=(root) NOPASSWD: /usr/local/bin/nncp-zfs-receive

The NNCP_SENDER is the public key ID of the sending node when nncp-toss processes the incoming data. We can use that for sanity checking later.

Now, here’s a basic nncp-zfs-receive script:

#!/bin/bash
set -e
set -o pipefail

STORE=backups/simplesnap
DEST="$1"

# now process stdin
zfs receive -o readonly=on -x mountpoint "$STORE/$DEST"

And there you have it — all the basics are in place.

Update 2020-12-30: An earlier version of this article had “zfs receive -F” instead of “zfs receive -o readonly=on -x mountpoint”. These changed arguments are more robust.
Update 2021-01-04: I am now recommending “zfs receive -u -o readonly=on”; see my successor article for more.

Enhancements

You could enhance the nncp-zfs-receive script to improve logging and error handling. For instance:

#!/bin/bash

set -e
set -o pipefail

STORE=backups/simplesnap
# $1 will be the host/dataset

DEST="$1"
HOST="`echo "$1" | sed 's,/.*,,g'`"
if [ -z "$HOST" ]; then
   echo "Malformed command line"
   exit 5
fi

# Log a message
logit () {
   logger -p info -t "`basename "$0"`[$$]" "$1"
}

# Log an error message
logerror () {
   logger -p err -t "`basename "$0"`[$$]" "$1"
}

# Log stdin with the given code.  Used normally to log stderr.
logstdin () {
   logger -p info -t "`basename "$0"`[$$/$1]"
}

# Run command, logging stderr and exit code
runcommand () {
   logit "Running $*"
   if "$@" 2> >(logstdin "$1") ; then
      logit "$1 exited successfully"
      return 0
   else
       RETVAL="$?"
       logerror "$1 exited with error $RETVAL"
       return "$RETVAL"
   fi
}
exiterror () {
   logerror "$1"
   echo "$1" 1>&2
   exit 10
}

# Sanity check

if [ "$HOST" = "laptop" ]; then
  if [ "$NNCP_SENDER" != "12345678" ]; then
    exiterror "Host $HOST doesn't match sender $NNCP_SENDER"
  fi
else
  exiterror "Unknown host $HOST"
fi

runcommand zfs receive -o readonly=on -x mountpoint "$STORE/$DEST"

Now you’ll capture the ZFS receive output in syslog in a friendly way, so you can look back later at why things failed, if they did.

Further notes on NNCP

nncp-toss will examine the exit code from an invocation. If it is nonzero, it will keep the command (and associated stdin) in the queue and retry it on the next invocation. NNCP does not guarantee order of execution, so it is possible in some cases that ZFS streams may be received in the wrong order. That is fine here; zfs receive will exit with an error, and nncp-toss will just run it again after the dependent snapshots have been received. For non-ZFS backups, a simple sequence number can handle this issue.
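
A minimal sketch of what I mean by a sequence number for non-ZFS backups: the receiving command refuses anything out of order with a nonzero exit, so nncp-toss simply keeps it queued and retries later. The paths and state file here are hypothetical, and the state file must be seeded (for example, with echo 0 > /var/lib/backups/last-seq) before first use:

#!/bin/bash
set -e
SEQFILE=/var/lib/backups/last-seq
EXPECTED=$(( $(cat "$SEQFILE") + 1 ))
if [ "$1" -ne "$EXPECTED" ]; then
   # Out of order; fail so nncp-toss retries this packet on a later run
   echo "Got sequence $1, expected $EXPECTED; deferring" >&2
   exit 1
fi
# In order; store the stream from stdin and record the new sequence number
cat > "/srv/backups/stream.$1"
echo "$1" > "$SEQFILE"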

Tips for Upgrading to, And Securing, Debian Buster

Wow.  Once again, a Debian release impresses me — a guy that’s been using Debian for more than 20 years.  For the first time I can ever recall, buster not only supported suspend-to-disk out of the box on my laptop, but it did so on an encrypted volume atop LVM.  Very impressive!

For those upgrading from previous releases, I have a few tips to enhance the experience with buster.

AppArmor

AppArmor is a new line of defense against malicious software.  The release notes indicate it’s now enabled by default in buster.  For desktops, I recommend installing apparmor-profiles-extra apparmor-notify.  The latter will provide an immediate GUI indication when something is blocked by AppArmor, so you can diagnose strange behavior.  You may also need to add yourself to the adm group with adduser username adm.
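
In other words, roughly this, run as root and substituting your own username:

apt install apparmor-profiles-extra apparmor-notify
adduser username adm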

Security

I recommend installing these packages and taking note of these items, some of which are different in buster:

  • unattended-upgrades will automatically install security updates for you.  New in buster, the default config file will also apply stable updates in addition to security updates.
  • needrestart will detect what processes need a restart after a library update and, optionally, restart them. Beginning in buster, it will not automatically restart them when in noninteractive (unattended-upgrades) mode. This can be changed by editing /etc/needrestart/needrestart.conf (or, better, putting a .conf file in /etc/needrestart/conf.d) and setting $nrconf{restart} = 'a' (a minimal example follows this list). Edit: If you have an Intel CPU, installing iucode-tool intel-microcode will let needrestart also check on your CPU microcode.
  • debian-security-support will warn you of gaps in security support for packages you are installing or running.
  • package-update-indicator is useful for desktops that won’t be running unattended-upgrades. I believe Gnome 3 has this built in, but for other desktops, this adds an icon when updates are available.
  • You can harden apt with seccomp.
  • You can enable UEFI secure boot.
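
As mentioned in the needrestart item above, a minimal example of the drop-in file; the filename is arbitrary:

# /etc/needrestart/conf.d/local.conf
# Restart affected services automatically, even in noninteractive mode
$nrconf{restart} = 'a';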

Tuning

If you hadn’t noticed, many of these items are links into the buster release notes. It’s a good document to read over, even for a new buster install.

Goodbye to a 15-year-old Debian server

It was October of 2003 that the server I’ve called “glockenspiel” was born. It was the early days of Linux-based VM hosting, using a VPS provider called memset, running under, of all things, User Mode Linux. Over the years, it has been migrated around, sometimes running on the metal and sometimes in a VM. The operating system has been upgraded in-place using standard Debian upgrades over the years, and is now happily current on stretch (albeit with a 32-bit userland). But it has never been reinstalled. When I’d migrate hosting providers, I’d use tar or rsync to stream glockenspiel across the Internet to its new home.

A lot of people reinstall an OS when a new version comes out. I’ve been doing Debian upgrades with apt for ages, and this one is a case in point. It lingers.

Root’s .profile was last modified in November 2004, and its .bashrc was last modified in December 2004. My own home directory still has a .pinerc, .gopherrc, and .arch-params file. I last edited my .vimrc in 2003 and my .emacs dates back to 2002 (having been copied over from a pre-glockenspiel FreeBSD server).

drwxr-xr-x  3 jgoerzen jgoerzen      4096 Dec  3  2003 irclogs
-rw-r--r--  1 jgoerzen jgoerzen       373 Dec  3  2003 .vimrc
-rw-r--r--  1 jgoerzen jgoerzen       651 Nov 27  2003 .reportbugrc
drwx------  3 jgoerzen jgoerzen      4096 Sep  2  2003 .arch-params
-rw-r--r--  1 jgoerzen jgoerzen      1115 Aug 23  2003 .gopherrc
drwxr-xr-x  3 jgoerzen jgoerzen      4096 Jul 18  2003 .subversion
-rw-r--r--  1 jgoerzen jgoerzen     15317 Jun 21  2003 .pinerc

Poking around /etc on glockenspiel is like a trip back in time. Various apache sites still have configuration files around, but have long since been disabled. Over the years, glockenspiel has hosted source code repositories using Subversion, arch, tla, darcs, mercurial and git. It’s hosted websites using Drupal, WordPress, Serendipity, and so forth. It’s hosted gopher sites, websites or mailing lists for various Free Software projects (such as Freeciv), and any number of local charitable organizations. Remnants of an FTP configuration still exist, when people used web design software to build websites for those organizations on their PCs and then upload them to glockenspiel.

-rw-r--r--   1 root  root                      268 Dec 25  2005 libnet.cfg
-rw-r-----   1 root  root                     1305 Nov 11  2004 mrtg.cfg
-rw-r--r--   1 root  root                      552 Jul 31  2004 pam.conf

All this has been replaced by a set of Docker containers running my docker-debian-base software. They’re all in git, I can rebuild one of the containers in a few seconds or a few minutes by typing “make”, and there is no cruft from 2002. There are a lot of benefits to this.

And yet, there is a part of me that feels it’s all so… cold. Servers having “personalities” was always a distinctly dubious thing, but these days as we work through more and more layers of virtualization and indirection and become more distant from the hardware, we lose an appreciation for what we have and the many shoulders of giants upon which we stand.

And, so with that, the final farewell to this server that’s been running since 2003:

glockenspiel:/etc# shutdown -P now
Shared connection to glockenspiel.complete.org closed.

A (Partial) Defense of Debian

I was sad to read on his blog that Michael Stapelberg is winding down his Debian involvement. In his post, he outlined some critiques of Debian. In this post, I want to acknowledge that he is on point with some of them, but also push back on others. Some of this is also a response to some of the comments on Hacker News.

I’d first like to discuss some of the assumptions I believe his post rests on: namely that “we’ve always done it this way” isn’t a good reason to keep doing something. I completely agree. However, I would also say that “this thing is newer, so it’s better and we should use it” is also poor reasoning. Newer is not always better. Sometimes it is, sometimes it’s not, but deeper thought is generally required.

Also, when thinking about why things are a certain way or why people prefer certain approaches, we must often ask “why does that make sense to them?” So let’s dive in.

Debian’s Perspective: Stability

Stability, of course, can mean software that tends not to crash. That’s important, but there’s another aspect of it that is also important: software continuing to act the same over time. For instance, if you wrote a C program in 1985, will that program still compile and run today? Granted, that’s a bit of an extreme example, but the point is: to what extent can you count on software you need continuing to operate without forced change?

People that have been sysadmins for a long period of time will instantly recognize the value of this kind of stability. Change is expensive and difficult, and often causes outages and incidents as bugs are discovered when software is adapted to a new environment. Being able to keep up-to-date with security patches while also expecting little or no breaking changes is a huge win. Maintaining backwards compatibility for old software is also important.

Even from a developer’s perspective, lack of this kind of stability is why I have handed over maintainership of most of my Haskell software to others. Some of my Haskell projects were basically “done”, and every so often I’d get bug reports that it no longer compiles due to some change in the base library. Occasionally I’d have patches with those bug reports, but they would inevitably break compatibility with older versions (even though the language has plenty of good support for something akin to a better version of #ifdefs to easily deal with this.) The culture of stability was not there.

This is not to say that this kind of stability is always good or always bad. In the Haskell case, there is value to be had in fixing designs that are realized to be poor and removing cruft. Some would say that strcpy() should be removed from libc for this reason. People that want the latest versions of gimp or whatever are probably not going to be running Debian stable. People that want to install a machine and not really be burdened by it for a couple of years are.

Debian has, for pretty much its entire life, had a large proportion of veteran sysadmins and programmers as part of the organization. Many of us have learned the value of this kind of stability from the school of hard knocks – over and over again. We recognize the value of something that just works, that is so stable that things like unattended-upgrades are safe and reliable. With many other distros, something like this isn’t even possible; when your answer to a security bug is to “just upgrade to the latest version”, just trusting a cron job to do it isn’t going to work because of the higher risk.

Recognizing Personal Preference

Writing about Debian’s bug-tracking tool, Michael says “It is great to have a paper-trail and artifacts of the process in the form of a bug report, but the primary interface should be more convenient (e.g. a web form).” This is representative of a personal preference. A web form might be more convenient for Michael — I have no reason to doubt this — but is it more convenient for everyone? I’d say no.

In his linked post, Michael also writes: “Recently, I was wondering why I was pushing off accepting contributions in Debian for longer than in other projects. It occurred to me that the effort to accept a contribution in Debian is way higher than in other FOSS projects. My remaining FOSS projects are on GitHub, where I can just click the “Merge” button after deciding a contribution looks good. In Debian, merging is actually a lot of work: I need to clone the repository, configure it, merge the patch, update the changelog, build and upload. “

I think that’s fair for someone that wants a web-based workflow. Allow me to present the opposite: for me, I tend to push off contributions that only come through Github, and the reason is that, for me, they’re less convenient. It’s also harder for me to contribute to Github projects than Debian ones. Let’s look at this – say I want to send in a small patch for something. If it’s Github, it’s going to look like this:

  1. Go to the website for the thing, click fork
  2. Now clone that fork or add it to my .git/config, hack, and commit
  3. Push the commit, go back to the website, and submit a PR
  4. Github’s email integration is so poor that I basically have to go back to the website for most parts of the conversation. I can do little from the comfort of mu4e.
  5. Remember to clean up my fork after the patch is accepted or rejected.

Compare that to how I’d contribute with Debian:

  1. Hack (and commit if I feel like it)
  2. Type “reportbug foo”, attach my patch
  3. Followup conversation happens directly in email where it’s convenient to reply

How about as the developer? Github constantly forces me to their website. I can’t very well work on bug reports, etc. without a strong Internet connection. And it’s designed to push people into using their tools and their interface, which is inferior in a lot of ways to a local interface – but then the process to pull down someone else’s set of patches involves a lot of typing and clicking, much more than would be involved with a simple git format-patch. In short, I don’t have my shortcut keys, my environment, etc. for reviewing things – the roadblocks are there to make me use theirs.

If I get a contribution from someone in debbugs, it’s so much easier. It’s usually just git apply or patch -p1 and boom, I can see exactly what’s changed and review it. A review comment is just a reply to an email. I don’t have to ever fire up a web browser. So much more convenient.
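
Concretely, reviewing an emailed patch is a sketch like this, with a hypothetical filename:

git apply --stat 0001-fix-typo.patch    # summarize which files would change
git apply --check 0001-fix-typo.patch   # verify it applies cleanly
git apply 0001-fix-typo.patch           # apply it for local review and testing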

I don’t write this to say Michael is wrong about what’s more convenient for him. I write it to say he’s wrong about what’s more convenient for me (or others). It may well be the case that debbugs is so inconvenient that it pushes him to leave while github is so inconvenient for others that it pushes them to avoid it.

I will note before leaving this conversation that there are some command-line tools available for Github and a web interface to debbugs, but it is still clear that debbugs is a lot easier to work with from within my own mail reader and tooling, and Github is a lot easier to work with from within a web browser.

The case for reportbug

I remember the days before we had reportbug. Over and over and over again, I would get bug reports from users that wouldn’t have the basic information needed to investigate. reportbug gathers information from the system: package versions, configurations, versions of dependencies, etc. A simple web form can’t do this because it doesn’t have a local agent. From a developer’s perspective, trying to educate users on how to do this over and over is an unending, frustrating, and counter-productive task. Even if it’s clearly documented, the battle will be fought over and over. From a user’s perspective, having your bug report ignored, or being told you’re doing it wrong, is frustrating too.

So I think reportbug is much nicer than having some github-esque web-based submission form. Could it be better? Sure. I think a mode to submit the reportbug report via HTTPS instead of email would make sense, since a lot of machines no longer have local email configured.

Where Debian Should Improve

I agree that there are areas where Debian should improve.

Michael rightly identifies the “strong maintainer” concept as a source of trouble. I agree. Though we’ve been making slow progress over time with things like low-threshold NMU and maintainer teams, the core assumption that a maintainer has a lot of power over particular packages is one that needs to be thrown out.

Michael, and commentators on HN, both identify things that boil down to documentation problems. I have heard so many times that it’s terribly hard to package something up for Debian. That’s really not the case for most things; dh_make and similar tools will do the right thing for many packages, and all you have to do is add some package descriptions and such. I wrote a “concise guide” to packaging for my workplace that ran to only about 2 pages. But it is true that the documentation on debian.org doesn’t clearly offer this simple path, so people are put off and not aware of it.

Then there were the comments about how hard it is to become a Debian developer, and how easy it is to submit a patch to NixOS or some such. The fact is, these are different things; one does not need to be a Debian Developer to contribute to Debian. A DD is effectively the same as a patch approver elsewhere; these are the people that can ultimately approve software for insertion into the OS, and you DO want an element of trust there. Debian could do more to offer concise guides for drive-by contributions and the building of packages that follow standard language community patterns, both of which can be done without much knowledge of packaging tools and inner workings of the project.

Finally, I have distanced myself from conversations in Debian for some time, due to lack of time to participate in what I would call excessive bikeshedding. This is hardly unique to Debian, but I am glad to see the project putting more effort into expecting good behavior from conversations of late.


How are you handling building local Debian/Ubuntu packages?

I’m in the middle of some conversations about Debian/Ubuntu repositories, and I’m curious how others are handling this.

How are people maintaining repos for an organization? Are you integrating them with a git/CI (github/gitlab, jenkins/travis, etc) workflow? How do packages propagate into repos? How do you separate prod from testing? Is anyone running buildd locally, or integrating with more common CI tools?

I’m also interested in how people handle local modifications of packages — anything from newer versions of C libraries to newer interpreters. Do you just use the regular Debian toolchain, packaging them up for (potentially) the multiple distros/versions that you have in production? Pin them in apt?
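
For reference, the sort of apt pinning I’m alluding to is a sketch like this, with a hypothetical origin label for an internal repo:

# /etc/apt/preferences.d/local-packages
Package: *
Pin: release o=MyOrgRepo
Pin-Priority: 1001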

Just curious what’s out there.

Some Googling has so far turned up just one relevant hit: Michael Prokop’s DebConf15 slides, “Continuous Delivery of Debian packages”. Looks very interesting, and discusses jenkins-debian-glue.

Some tangentially-related but interesting items:

Edit 2018-02-02: I should have also mentioned BuildStream

Switching to xmonad + Gnome – and ditching a Mac

I have been using XFCE with xmonad for years now. I’m not sure exactly how many, but at least 6 years, if not closer to 10. Today I threw in the towel and switched to Gnome.

More recently, at a new job, I was given a MacBook Pro. I wasn’t entirely sure what to think of this, but I thought I’d give it a try. I found MacOS to be extremely frustrating and confining. It had no real support for a tiling window manager, and although projects like amethyst tried to approximate what xmonad can do on Linux, they were just too limited by the platform and were clunky. Moreover, the entire UI was surprisingly sluggish; maybe that was an induced effect from animations, but I don’t think that explains it. A Debian stretch install, even on inferior hardware, was snappy in a way that MacOS never was. So I have requested to swap for a laptop that will run Debian. The strange use of Command instead of Control for things, combined with the overall lack of configurability of keybindings, meant that I was going to always be fighting muscle memory moving from one platform to another. Not only that, but being back in the world of a Free Software OS means a lot.

Now then, back to xmonad and XFCE situation. XFCE once worked very well with xmonad. Over the years, this got more challenging. Around the jessie (XFCE 4.10) time, I had to be very careful about when I would let it save my session, because it would easily break. With stretch, I had to write custom scripts because the panel wouldn’t show up properly, and even some application icons would be invisible, if things were started in a certain order. This took much trial and error and was still cumbersome.

Gnome 3, with its tightly-coupled Gnome Shell, has never been compatible with other window managers — at least not directly. A person could have always used MATE with xmonad — but a lot of people that run XFCE tend to have some Gnome 3 apps (for instance, evince) anyhow. Cinnamon also wouldn’t work with xmonad, because it is simply another tightly-coupled shell instead of Gnome Shell. And then today I discovered gnome-flashback. gnome-flashback is a Gnome 3 environment that uses the traditional X approach with a separate window manager (metacity of yore by default). Sweet.

It turns out that Debian’s xmonad has built-in support for it. If you know the secret: apt-get install gnome-session-flashback (OK, it’s not so secret; it’s even in xmonad’s README.Debian these days). Install that, plus gnome and gdm3, and things are nice. Configure xmonad with GNOME support and poof – goodness right out of the box, selectable from the gdm sessions list.
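
The xmonad side is a tiny configuration. A minimal sketch using gnomeConfig from xmonad-contrib; your own bindings and tweaks would layer on top of this:

import XMonad
import XMonad.Config.Gnome

-- Use the Gnome-aware defaults so gnome-flashback and xmonad cooperate
main :: IO ()
main = xmonad gnomeConfig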

I still have some gripes about Gnome’s configurability (or lack thereof). But I’ve got to say: This environment is the first one I’ve ever used that got external display switching very nearly right without any configuration, and I include MacOS in that. Plug in an external display, and poof – it’s configured and set up. You can hit a toggle key (Windows+P by default) to change the configurations, or use the Display section in gnome-control-center. Unplug it, and it instantly reconfigures itself to put everything back on the laptop screen. Yessss! I used to have scripts to do this in the wheezy/jessie days. XFCE in stretch had numerous annoying failures in this area which rendered the internal display completely dark until the next reboot – very frustrating. With Gnome, it just works. And, even if you have “suspend on lid closed” turned on, if the system is powered up and hooked up to an external display, it will keep running even if the lid is closed, figuring you must be using it on the external screen. Another thing the Mac wouldn’t do well.

All in all, some pretty good stuff here. I continue to be impressed by stretch. It is darn impressive to put this OS on generic hardware and have it outshine the closed-ecosystem Mac!

Silent Data Corruption Is Real

Here’s something you never want to see:

ZFS has detected a checksum error:

   eid: 138
 class: checksum
  host: alexandria
  time: 2017-01-29 18:08:10-0600
 vtype: disk

This means there was a data error on the drive. But it’s worse than a typical data error — this is an error that was not detected by the hardware. Unlike most filesystems, ZFS and btrfs write a checksum with every block of data (both data and metadata) written to the drive, and the checksum is verified at read time. Most filesystems don’t do this, because theoretically the hardware should detect all errors. But in practice, it doesn’t always, which can lead to silent data corruption. That’s why I use ZFS wherever I possibly can.

As I looked into this issue, I saw that ZFS repaired about 400KB of data. I thought, “well, that was unlucky” and just ignored it.

Then a week later, it happened again. Pretty soon, I noticed it happened every Sunday, and always to the same drive in my pool. It so happens that the highest I/O load on the machine happens on Sundays, because I have a cron job that runs zpool scrub on Sundays. This operation forces ZFS to read and verify the checksums on every block of data on the drive, and is a nice way to guard against unreadable sectors in rarely-used data.
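
The cron job in question is nothing fancy; something along these lines, with the pool name being whatever yours is called:

# /etc/cron.d/zfs-scrub (illustrative)
# Scrub the pool early every Sunday morning
0 2 * * 0  root  /sbin/zpool scrub tank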

I finally swapped out the drive, but to my frustration, the new drive now exhibited the same issue. The SATA protocol does include a CRC32 checksum, so it seemed (to me, at least) that the problem was unlikely to be a cable or chassis issue. I suspected motherboard.

It so happened I had a 9211-8i SAS card. I had purchased it off eBay awhile back when I built the server, but could never get it to see the drives. I wound up not filling it up with as many drives as planned, so the on-board SATA did the trick. Until now.

As I poked at the 9211-8i, noticing that even its configuration utility didn’t see any devices, I finally started wondering if the SAS/SATA breakout cables were a problem. And sure enough – I realized I had a “reverse” cable and needed a “forward” one. $14 later, I had the correct cable and things are working properly now.

One other note: RAM errors can sometimes cause issues like this, but this system uses ECC DRAM and the errors would be unlikely to always manifest themselves on a particular drive.

So over the course of this, had I not been using ZFS, I would have had several megabytes of reads with undetected errors. Thanks to using ZFS, I know my data integrity is still good.

I’m switching from git-annex to Syncthing

I wrote recently about using git-annex for encrypted sync, but due to a number of issues with it, I’ve opted to switch to Syncthing.

I’d been using git-annex with real but noncritical data. Among the first issues I noticed was occasional but persistent high CPU usage spikes, which once started, would persist apparently forever. I had an issue where git-annex tried to replace files I’d removed from its repo with broken symlinks, but the real final straw was a number of issues with the gcrypt remote repos. git-remote-gcrypt appears to have a number of issues with possible race conditions on the remote, and at least one of them somehow caused encrypted data to appear in a packfile on a remote repo. Why there was data in a packfile there, I don’t know, since git-annex is supposed to keep the data out of packfiles.

Anyhow, git-annex is still an awesome tool with a lot of use cases, but I’m concluding that live sync to an encrypted git remote isn’t quite there yet for me.

So I looked for alternatives. My main criteria were supporting live sync (via inotify or similar) and not requiring the files to be stored unencrypted on a remote system (my local systems all use LUKS). I found Syncthing met these requirements.

Syncthing is pretty interesting in that, like git-annex, it doesn’t require a centralized server at all. Rather, it forms basically a mesh between your devices. Its concept is somewhat similar to the proprietary Bittorrent Sync — basically, all the nodes communicate about what files and chunks of files they have, and the changes that are made, and immediately propagate as much as possible. Unlike, say, Dropbox or Owncloud, Syncthing can actually support simultaneous downloads from multiple remotes for optimum performance when there are many changes.

Combined with syncthing-inotify or syncthing-gtk, it has immediate detection of changes and therefore very quick propagation of them.

Syncthing is particularly adept at figuring out ways for the nodes to communicate with each other. It begins by broadcasting on the local network, so known nearby nodes can be found directly. The Syncthing folks also run a discovery server (though you can use your own if you prefer) that lets nodes find each other on the Internet. Syncthing will attempt to use UPnP to configure firewalls to let it out, but if that fails, the last resort is a traffic relay server — again, a number of volunteers host these online, but you can run your own if you prefer.

Each node in Syncthing has an RSA keypair, and what amounts to part of the public key is used as a globally unique node ID. The initial link between nodes is accomplished by pasting the globally unique ID from one node into the “add node” screen on the other; the user of the first node then must accept the request, and from that point on, syncing can proceed. The data is all transmitted encrypted, of course, so interception will not cause data to be revealed.

Really my only complaint about Syncthing so far is that, although it binds to localhost, the web GUI does not require authentication by default.
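
If that bothers you, a GUI username and password can be set, either from the web interface itself or in the gui section of Syncthing’s config.xml. A rough sketch of the latter; the password is stored as a bcrypt hash, so setting it via the GUI is the easier route:

<gui enabled="true" tls="true">
    <address>127.0.0.1:8384</address>
    <user>someuser</user>
    <password>(bcrypt hash goes here)</password>
</gui>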

There is an ITP open for Syncthing in Debian, but until then, their apt repo works fine. For syncthing-gtk, the trusty version of the webupd8 PPA works in jessie (though be sure to pin it to a low priority if you don’t want it replacing some unrelated Debian packages).