Fast, Ordered Unixy Queues over NNCP and Syncthing with Filespooler

It seems that lately I’ve written several shell implementations of a simple queue that enforces ordered execution of jobs that may arrive out of order. After writing this for the nth time in bash, I decided it was time to do it properly. But first, a word on the why of it all.

Why did I bother?

My needs arose primarily from handling Backups over Asynchronous Communication methods – in this case, NNCP. When backups contain incrementals that are unpacked on the destination, they must be applied in the correct order.

In some cases, like ZFS, the receiving side will detect an out-of-order backup file and exit with an error. In those cases, processing in random order is acceptable but can be slow if, say, hundreds or thousands of hourly backups have stacked up over a period of time. The same goes for using gitsync-nncp to synchronize git repositories. In both cases, a best effort based on creation date is sufficient to produce a significant performance improvement.

With other cases, such as tar or dar backups, the receiving cannot detect out of order incrementals. In those situations, the incrementals absolutely must be applied with strict ordering. There are many other situations that arise with these needs also. Filespooler is the answer to these.

Existing Work

Before writing my own program, I of course looked at what was out there already. I looked at celeary, gearman, nq, rq, cctools work queue, ts/tsp (task spooler), filequeue, dramatiq, GNU parallel, and so forth.

Unfortunately, none of these met my needs at all. They all tended to have properties like:

  • An extremely complicated client/server system that was incompatible with piping data over existing asynchronous tools
  • A large bias to processing of small web requests, resulting in terrible inefficiency or outright incompatibility with jobs in the TB range
  • An inability to enforce strict ordering of jobs, especially if they arrive in a different order from how they were queued

Many also lacked some nice-to-haves that I implemented for Filespooler:

  • Support for the encryption and cryptographic authentication of jobs, including metadata
  • First-class support for arbitrary compressors
  • Ability to use both stream transports (pipes) and filesystem-like transports (eg, rclone mount, S3, Syncthing, or Dropbox)

Introducing Filespooler

Filespooler is a tool in the Unix tradition: that is, do one thing well, and integrate nicely with other tools using the fundamental Unix building blocks of files and pipes. Filespooler itself doesn’t provide transport for jobs, but instead is designed to cooperate extremely easily with transports that can be written to as a filesystem or piped to – which is to say, almost anything of interest.

Filespooler is written in Rust and has an extensive Filespooler Reference as well as many tutorials on its homepage. To give you a few examples, here are some links:

Basics of How it Works

Filespooler is intentionally simple:

  • The sender maintains a sequence file that includes a number for the next job packet to be created.
  • The receiver also maintains a sequence file that includes a number for the next job to be processed.
  • fspl prepare creates a Filespooler job packet and emits it to stdout. It includes a small header (<100 bytes in most cases) that includes the sequence number, creation timestamp, and some other useful metadata.
  • You get to transport this job packet to the receiver in any of many simple ways, which may or may not involve Filespooler’s assistance.
  • On the receiver, Filespooler (when running in the default strict ordering mode) will simply look at the sequence file and process jobs in incremental order until it runs out of jobs to process.

The name of job files on-disk matches a pattern for identification, but the content of them is not significant; only the header matters.

You can send job data in three ways:

  1. By piping it to fspl prepare
  2. By setting certain environment variables when calling fspl prepare
  3. By passing additional command-line arguments to fspl prepare, which can optionally be passed to the processing command at the receiver.

Data piped in is added to the job “payload”, while environment variables and command-line parameters are encoded in the header.

Basic usage

Here I will excerpt part of the Using Filespooler over Syncthing tutorial; consult it for further detail. As a bit of background, Syncthing is a FLOSS decentralized directory synchronization tool akin to Dropbox (but with a much richer feature set in many ways).

Preparation

First, on the receiver, you create the queue (passing the directory name to -q):

sender$ fspl queue-init -q ~/sync/b64queue

Now, we can send a job like this:

sender$ echo Hi | fspl prepare -s ~/b64seq -i - | fspl queue-write -q ~/sync/b64queue

Let’s break that down:

  • First, we pipe “Hi” to fspl prepare.
  • fspl prepare takes two parameters:
    • -s seqfile gives the path to a sequence file used on the sender side. This file has a simple number in it that increments a unique counter for every generated job file. It is matched with the nextseq file within the queue to make sure that the receiver processes jobs in the correct order. It MUST be separate from the file that is in the queue and should NOT be placed within the queue. There is no need to sync this file, and it would be ideal to not sync it.
    • The -i option tells fspl prepare to read a file for the packet payload. -i - tells it to read stdin for this purpose. So, the payload will consist of three bytes: “Hi\n” (that is, including the terminating newline that echo wrote)
  • Now, fspl prepare writes the packet to its stdout. We pipe that into fspl queue-write:
    • fspl queue-write reads stdin and writes it to a file in the queue directory in a safe manner. The file will ultimately match the fspl-*.fspl pattern and have a random string in the middle.

At this point, wait a few seconds (or however long it takes) for the queue files to be synced over to the recipient.

On the receiver, we can see if any jobs have arrived yet:

receiver$ fspl queue-ls -q ~/sync/b64queue
ID                   creation timestamp          filename
1                    2022-05-16T20:29:32-05:00   fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl

Let’s say we’d like some information about the job. Try this:

receiver$ $ fspl queue-info -q ~/sync/b64queue -j 1
FSPL_SEQ=1
FSPL_CTIME_SECS=1652940172
FSPL_CTIME_NANOS=94106744
FSPL_CTIME_RFC3339_UTC=2022-05-17T01:29:32Z
FSPL_CTIME_RFC3339_LOCAL=2022-05-16T20:29:32-05:00
FSPL_JOB_FILENAME=fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl
FSPL_JOB_QUEUEDIR=/home/jgoerzen/sync/b64queue
FSPL_JOB_FULLPATH=/home/jgoerzen/sync/b64queue/jobs/fspl-7b85df4e-4df9-448d-9437-5a24b92904a4.fspl

This information is intentionally emitted in a format convenient for parsing.

Now let’s run the job!

receiver$ fspl queue-process -q ~/sync/b64queue --allow-job-params base64
SGkK

There are two new parameters here:

  • --allow-job-params says that the sender is trusted to supply additional parameters for the command we will be running.
  • base64 is the name of the command that we will run for every job. It will:
    • Have environment variables set as we just saw in queue-info
    • Have the text we previously prepared – “Hi\n” – piped to it

By default, fspl queue-process doesn’t do anything special with the output; see Handling Filespooler Command Output for details on other options. So, the base64-encoded version of our string is “SGkK”. We successfully sent a packet using Syncthing as a transport mechanism!

At this point, if you do a fspl queue-ls again, you’ll see the queue is empty. By default, fspl queue-process deletes jobs that have been successfully processed.

For more

See the Filespooler homepage.


This blog post is also available as a permanent, periodically-updated page.

Tools for Communicating Offline and in Difficult Circumstances

Note: this post is also available on my website, where it will be updated periodically.

When things are difficult – maybe there’s been a disaster, or an invasion (this page is being written in 2022 just after Russia invaded Ukraine), or maybe you’re just backpacking off the grid – there are tools that can help you keep in touch, or move your data around. This page aims to survey some of them, roughly in order from easiest to more complex.

Simple radios

Handheld radios shouldn’t be forgotten. They are cheap, small, and easy to operate. Their range isn’t huge – maybe a couple of miles in rural areas, much less in cities – but they can be a useful place to start. They tend to have no actual encryption features (the “privacy” features really aren’t.) In the USA, options are FRS/GMRS and CB.

Syncthing

With Syncthing, you can share files among your devices or with your friends. Syncthing essentially builds a private mesh for file sharing. Devices will auto-discover each other when on the same LAN or Wifi network, and opportunistically sync.

I wrote more about offline uses of Syncthing, and its use with NNCP, in my blog post A simple, delay-tolerant, offline-capable mesh network with Syncthing (+ optional NNCP). Yes, it is a form of a Mesh Network!

Homepage: https://syncthing.net/

Briar

Briar is an instant messaging service based around Android. It’s IM with a twist: it can use a mesh of Bluetooh devices. Or, if Internet is available, Tor. It has even been extended to support the use of SD cards and USB sticks to carry your messages.

Like some others here, it can relay messages for third parties as well.

Homepage: https://briarproject.org/

Manyverse and Scuttlebutt

Manyverse is a client for Scuttlebutt, which is a sort of asynchronous, offline-friendly social network. You can use it to keep in touch with your family and friends, and it supports syncing over Bluetooth and Wifi even in the absence of Internet.

Homepages: https://www.manyver.se/ and https://scuttlebutt.nz/

Yggdrasil

Yggdrasil is a self-healing, fully end-to-end Encrypted Mesh Network. It can work among local devices or on the global Internet. It has network services that can egress onto things like Tor, I2P, and the public Internet. Yggdrasil makes a perfect companion to ad-hoc wifi as it has auto peer discovery on the local network.

I talked about it in more detail in my blog post Make the Internet Yours Again With an Instant Mesh Network.

Homepage: https://yggdrasil-network.github.io/

Ad-Hoc Wifi

Few people know about the ad-hoc wifi mode. Ad-hoc wifi lets devices in range talk to each other without an access point. You just all set your devices to the same network name and password and there you go. However, there often isn’t DHCP, so IP configuration can be a bit of a challenge. Yggdrasil helps here.

NNCP

Moving now to more advanced tools, NNCP lets you assemble a network of peers that can use Asynchronous Communication over sneakernet, USB drives, radios, CD-Rs, Internet, tor, NNCP over Yggdrasil, Syncthing, Dropbox, S3, you name it . NNCP supports multi-hop file transfer and remote execution. It is fully end-to-end encrypted. Think of it as the offline version of ssh.

Homepage: https://nncp.mirrors.quux.org/

Meshtastic

Meshtastic uses long-range, low-power LoRa radios to build a long-distance, encrypted, instant messaging system that is a Mesh Network. It requires specialized hardware, about $30, but will tend to get much better range than simple radios, and with very little power.

Homepages: https://meshtastic.org/ and https://meshtastic.letstalkthis.com/

Portable Satellite Communicators

You can get portable satellite communicators that can send SMS from anywhere on earth with a clear view of the sky. The Garmin InReach mini and Zoleo are two credible options. Subscriptions range from about $10 to $40 per month depending on usage. They also have global SOS features.

Telephone Lines

If you have a phone line and a modem, UUCP can get through just about anything. It’s an older protocol that lacks modern security, but will deal with slow and noisy serial lines well. XBee SX radios also have a serial mode that can work well with UUCP.

Additional Suggestions

It is probably useful to have a Linux live USB stick with whatever software you want to use handy. Debian can be installed from the live environment, or you could use a security-focused distribution such as Tails or Qubes.

References

This page originated in my Mastodon thread and incorporates some suggestions I received there.

It also formed a post on my blog.

KDE: A Nice Tiling Envieonment and a Surprisingly Awesome DE

I recently wrote that managing an external display on Linux shouldn’t be this hard. I went down a path of trying out some different options before finally landing at an unexpected place: KDE. I say “unexpected” because I find tiling window managers are just about a necessity.

Background: xmonad

Until a few months ago, I’d been using xmonad for well over a decade. Configurable, minimal, and very nice; it suited me well.

However, xmonad is getting somewhat long in the tooth. xmobar, which is commonly used with it, barely supports many modern desktop environments. I prefer DEs for the useful integrations they bring: everything from handling mount of USB sticks to display auto-switching and sound switching. xmonad itself can’t run with modern Gnome (whether or not it runs well under KDE 5 seems to be a complicated question, according to wikis, but in any case, there is no log applet for KDE 5). So I was left with XFCE and such, but the isues I identified in the “shouldn’t be this hard” article were bad enough that I just could not keep going that way.

An attempt: Gnome and PaperWM

So I tried Gnome under Wayland, reasoning that Wayland might stand a chance of doing things well where X couldn’t. There are several tiling window extensions available for the Gnome 3 shell. Most seemed to be rather low-quality, but an exception was PaperWM and I eventually decided on it. I never quite decided if I liked its horizontal tape of windows or not; it certainly is unique in any case.

I was willing to tolerate my usual list of Gnome problems for the sake of things working. For instance:

  • The Windows-like “settings are spread out in three different programs and some of them require editing the registry[dconf]”. Finding all the options for keybindings and power settings was a real chore, but done.
  • Some file dialog boxes (such as with the screenshot-taking tool) just do not let me type in a path to save a file, insisting that I first navigate to a directory and then type in a name.
  • General lack of available settings or hiding settings from people.
  • True focus-follows-mouse was incompatible with keyboard window switching (PaperWM or no); with any focus-follows-mouse enabled, using Alt-Tab or any other method to switch to other windows would instantly have focus returned to whatever the mouse was over.

Under Wayland, I found a disturbing lack of logs. There was nothing like /var/log/Xorg.0.log, nothing like ~/.xsession-errors, just nothing. Searching for answers on this revealed a lot of Wayland people saying “it’s a Gnome issue” and the trail going cold at that point.

And there was a weird problem that I just could not solve. After the laptop was suspended and we-awakened, I would be at a lock screen. I could type in my password, but when hitting Enter, the thing would then tend to freeze. Why, I don’t know. It seemed related to Gnome shell; when I switched Gnome from Wayland to X11, it would freeze but eventually return to the unlock screen, at which point I’d type in my password and it would freeze again. I spent a long time tracking down logs to see what was happening, but I couldn’t figure it out. All those hard resets were getting annoying.

Enter KDE

So I tried KDE. I had seen mentions of kwin-tiling, a KDE extension for tiling windows. I thought I’d try this setup.

I was really impressed by KDE’s quality. Not only did it handle absolutely every display-related interaction correctly by default, with no hangs ever, all relevant settings were clearly presented in one place. The KDE settings screens were a breath of fresh air – lots of settings available, all at one place, and tons of features I hadn’t seen elsewhere.

Here are some of the things I was pleasantly surprised by with KDE:

  • Applications can declare classes of notifications. These can be managed Android-style in settings. Moreover, you can associate a shell command to run with a notification in any class. People use this to do things like run commands when a display locks and so forth.
  • KDE Connect is a seriously impressive piece of software. It integrates desktops with Android devices in a way that’s reminiscent of non-free operating systems – and with 100% Free Software (the phone app is even in F-Droid!). Notifications from the phone can appear on the desktop, and their state is synchronized; dismiss it on the desktop and it dismisses on the phone, too. Get a SMS or Signal message on the phone? You can reply directly from the desktop. Share files in both directions, mount a directory tree from the phone on the desktop, “find my phone”, use the phone as a presentation remote for the desktop, shared clipboard, sending links between devices, control the phone media player from the desktop… Really, really impressive.
  • The shortcut settings in KDE really work and are impressive. Unlike Gnome, if you try to assign the same shortcut to multiple things, you are warned and prevented from doing this. As with Gnome, you can also bind shorcuts to arbitrary actions.
  • This shouldn’t be exciting, but I was just using Gnome, so… The panel! I can put things wherever I want them! I can put it at the top of the screen, the bottom, or even the sides! It lets all my regular programs (eg, Nextcloud) put their icons up there without having to install two different extensions, each of which handles a different set of apps! I shouldn’t be excited about all this, because Gnome actually used to have these features years ago… [gripe gripe]
  • Initially I was annoyed that Firefox notifications weren’t showing up in the notification history as they did in Gnome… but that was, of course, a setting, easily fixed!
  • There is a Plasma Integration plugin for Firefox (and other browsers including Chrome). It integrates audio and video playback, download status, etc. with the rest of KDE and KDE Connect. Result: if you like, when a call comes in to your phone, Youtube is paused. Or, you can right-click to share a link to your phone via KDE Connect, and so forth. You can right click on a link, and share via Bluetooth, Nextcloud (it must have somehow registered with KDE), KDE Connect, email, etc.

Tiling

So how about the tiling system, kwin-tiling? The out of the box experience is pretty nice. There are fewer built-in layouts than with xmonad, but the ones that are there are doing a decent job for me, and in some cases are more configurable (those that have a large window pane are configurable on its location, not forcing it to be on the left as with many systems.) What’s more, thanks to the flexibility in the KDE shortcut settings, I can configure it to be nearly keystroke-identical to xmonad!

Issues Encountered

I encountered a few minor issues:

  • There appears to be no way to tell it to “power down the display immediately after it is locked, every time” instead of waiting for some timeout to elapse. This is useful when I want to switch monitor inputs to something else.
  • Firefox ESR seems to have some rendering issues under KDE for some reason, but switching to the latest stable release direct from Mozilla seems to fix that.

In short, I’m very impressed.

Make the Internet Yours Again With an Instant Mesh Network

I’m going to lead with the technical punch line, and then explain it:

Yggdrasil Network is an opportunistic mesh that can be deployed privately or as part of a global-scale network. Each node gets a stable IPv6 address (or even an entire /64) that is derived from its public key and is bound to that node as long as the node wants it (of course, it can generate a new keypair anytime) and is valid wherever the node joins the mesh. All traffic is end-to-end encrypted.

Yggdrasil will automatically discover peers on a LAN via broadcast beacons, and requires zero configuration to peer in such a way. It can also run as an overlay network atop the public Internet. Public peers serve as places to join the global network, and since it’s a mesh, if one device on your LAN joins the global network, the others will automatically have visibility on it also, thanks to the mesh routing.

It neatly solves a lot of problems of portability (my ssh sessions stay live as I move networks, for instance), VPN (incoming ports aren’t required since local nodes can connect to a public peer via an outbound connection), security, and so forth.

Now on to the explanation:

The Tyranny of IP rigidity

Every device on the Internet, at one time, had its own globally-unique IP address. This number was its identifier to the world; with an IP address, you can connect to any machine anywhere. Even now, when you connect to a computer to download a webpage or send a message, under the hood, your computer is talking to the other one by IP address.

Only, now it’s hard to get one. The Internet protocol we all grew up with, version 4 (IPv4), didn’t have enough addresses for the explosive growth we’ve seen. Internet providers and IT departments had to use a trick called NAT (Network Address Translation) to give you a sort of fake IP address, so they could put hundreds or thousands of devices behind a single public one. That, plus the mobility of devices — changing IPs whenever they change locations — has meant that a fundamental rule of the old Internet is now broken:

Every participant is an equal peer. (Well, not any more.)

Nowadays, you can’t you host your own website from your phone. Or share files from your house. (Without, that is, the use of some third-party service that locks you down and acts as an intermediary.)

Back in the 90s, I worked at a university, and I, like every other employee, had a PC on my desk with an unfirewalled public IP. I installed a webserver, and poof – instant website. Nowadays, running a website from home is just about impossible. You may not have a public IP, and if you do, it likely changes from time to time. And even then, your ISP probably blocks you from running servers on it.

In short, you have to buy your way into the resources to participate on the Internet.

I wrote about these problems in more detail in my article Recovering Our Lost Free Will Online.

Enter Yggdrasil

I already gave away the punch line at the top. But what does all that mean?

  • Every device that participates gets an IP address that is fully live on the Yggdrasil network.
  • You can host a website, or a mail server, or whatever you like with your Yggdrasil IP.
  • Encryption and authentication are smaller (though not nonexistent) worries thanks to the built-in end-to-end encryption.
  • You can travel the globe, and your IP will follow you: onto a plane, from continent to continent, wherever. Yggdrasil will find you.
  • I’ve set up /etc/hosts on my laptop to use the Yggdrasil IPs for other machines on my LAN. Now I can just “ssh foo” and it will work — from home, from a coffee shop, from a 4G tether, wherever. Now, other tools like tinc can do this, obviously. And I could stop there; I could have a completely closed, private Yggdrasil network.

    Or, I can join the global Yggdrasil network. Each device, in addition to accepting peers it finds on the LAN, can also be configured to establish outbound peering connections or accept inbound ones over the Internet. Put a public peer or two in your configuration and you’ve joined the global network. Most people will probably want to do that on every device (because why not?), but you could also do that from just one device on your LAN. Again, there’s no need to explicitly build routes via it; your other machines on the LAN will discover the route’s existence and use it.

    This is one of many projects that are working to democratize and decentralize the Internet. So far, it has been quite successful, growing to over 2000 nodes. It is the direct successor to the earlier cjdns/Hyperboria and BATMAN networks, and aims to be a proof of concept and a viable tool for global expansion.

    Finally, think about how much easier development is when you don’t have to necessarily worry about TLS complexity in every single application. When you don’t have to worry about port forwarding and firewall penetration. It’s what the Internet should be.

    Managing an External Display on Linux Shouldn’t Be This Hard

    I first started using Linux and FreeBSD on laptops in the late 1990s. Back then, there were all sorts of hassles and problems, from hangs on suspend to pure failure to boot. I still worry a bit about suspend on unknown hardware, but by and large, the picture of Linux on laptops has dramatically improved over the last years. So much so that now I can complain about what would once have been a minor nit: dealing with external monitors.

    I have a USB-C dock that provides both power and a Thunderbolt display output over the single cable to the laptop. I think I am similar to most people in wanting the following behavior from the laptop:

    • When the lid is closed, suspend if no external monitor is connected. If an external monitor is connected, shut off the built-in display and use the external one exclusively, but do not suspend.
    • Lock the screen automatically after a period of inactivity.
    • While locked, all connected displays should be powered down.
    • When an external display is connected, begin using it automatically.
    • When an external display is disconnected, stop using it. If the lid is closed when the external display is disconnected, go into suspend mode.

    This sounds so simple. But somehow on Linux we’ve split up these things into a dozen tiny bits:

    • In /etc/systemd/logind.conf, there are settings about what to do when the lid is opened or closed.
    • Various desktop environments have overlapping settings covering the same things.
    • Then there are the display managers (gdm3, lightdm, etc) that also get in on the act, and frequently have DIFFERENT settings, set in different places, from the desktop environments. And, what’s more, they tend to be involved with locking these days.
    • Then there are screensavers (gnome-screensaver, xscreensaver, etc.) that also enter the picture, and also have settings in these areas.

    Problems I’ve Seen

    My problems don’t even begin with laptops, but with my desktop, running XFCE with xmonad and lightdm. My desktop is hooked to a display that has multiple inputs. This scenario (reproducible in both buster and bullseye) causes the display to be unusable until a reboot on the desktop:

    1. Be logged in and using the desktop
    2. Without locking the desktop screen, switch the display input to another device
    3. Keep the display input on another device long enough for the desktop screen to auto-lock
    4. At this point, it is impossible to re-awaken the desktop screen.

    I should not here that the problems aren’t limited to Debian, but also extend to Ubuntu and various hardware.

    Lightdm: which greeter?

    At some point while troubleshooting things after upgrading my laptop to bullseye, I noticed that while both were running lightdm, I had different settings and a different appearance between the two. Upon further investigation, I realized that one hat slick-greeter and lightdm-settings installed, while the other had lightdm-gtk-greeter and lightdm-gtk-greeter-settings installed. Very strange.

    XFCE: giving up

    I eventually gave up on making lightdm work. No combination of settings or greeters would make things work reliably when changing screen configurations. I installed xscreensaver. It doesn’t hang, but it does sometimes take a few tries before it figures out what device to display on.

    Worse, since updating from buster to bullseye, XFCE no longer automatically switches audio output when the docking station is plugged in, and there seems to be no easy way to convince Pulseaudio to do this.

    X-Based Gnome and derivatives… sigh.

    I also tried Gnome, Mate, and Cinnamon, and all of them had various inabilities to configure things to act the way I laid out above.

    I’ve long not been a fan of Gnome’s way of hiding things from the user. It now has a Windows-like situation of three distinct settings programs (settings, tweaks, and dconf editor), which overlap in strange ways and interact with systemd in even stranger ways. Gnome 3 make it quite non-intuitive to make app icons from various programs work, and so forth.

    Trying Wayland

    I recently decided to set up an older laptop that I hadn’t used in awhile. After reading up on Wayland, I decided to try Gnome 3 under Wayland. Both the Debian and Arch wikis note that KDE is buggy on Wayland. Gnome is the only desktop environment that supports it then, unless I want to go with Sway. There’s some appeal to Sway to this xmonad user, but I’ve read of incompatibilities of Wayland software when Gnome’s not available, so I opted to try Gnome.

    Well, it’s better. Not perfect, but better. After finding settings buried in a ton of different Settings and Tweaks boxes, I had it mostly working, except gdm3 would never shut off power to the external display. Eventually I found /etc/gdm3/greeter.dconf-defaults, and aadded:

    sleep-inactive-ac-timeout=60
    sleep-inactive-ac-type='blank'
    sleep-inactive-battery-timeout=120
    sleep-inactive-battery-type='suspend'
    

    Of course, these overlap with but are distinct from the same kinds of things in Gnome settings.

    Sway?

    Running without Gnome seems like a challenge; Gnome is switching audio output appropriately, for instance. I am looking at some of the Gnome Shell tiling window manager extensions and hope that some of them may work for me.

    Facebook’s Blocking Decisions Are Deliberate – Including Their Censorship of Mastodon

    In the aftermath of my report of Facebook censoring mentions of the open-source social network Mastodon, there was a lot of conversation about whether or not this was deliberate.

    That conversation seemed to focus on whether a human speficially added joinmastodon.org to some sort of blacklist. But that’s not even relevant.

    OF COURSE it was deliberate, because of how Facebook tunes its algorithm.

    Facebook’s algorithm is tuned for Facebook’s profit. That means it’s tuned to maximize the time people spend on the site — engagement. In other words, it is tuned to keep your attention on Facebook.

    Why do you think there is so much junk on Facebook? So much anti-vax, anti-science, conspiracy nonsense from the likes of Breitbart? It’s not because their algorithm is incapable of surfacing the good content; we already know it can because they temporarily pivoted it shortly after the last US election. They intentionally undid its efforts to make high-quality news sources more prominent — twice.

    Facebook has said that certain anti-vax disinformation posts violate its policies. It has an extremely cumbersome way to report them, but it can be done and I have. These reports are met with either silence or a response claiming the content didn’t violate their guidelines.

    So what algorithm is it that allows Breitbart to not just be seen but to thrive on the platform, lets anti-vax disinformation survive even a human review, while banning mentions of Mastodon?

    One that is working exactly as intended.

    We may think this algorithm is busted. Clearly, Facebook does not. If their goal is to maximize profit by maximizing engagement, the algorithm is working exactly as designed.

    I don’t know if joinmastodon.org was specifically blacklisted by a human. Nor is it relevant.

    Facebook’s choice to tolerate and promote the things that service its greed for engagement and money, even if they are the lowest dregs of the web, is deliberate. It is no accident that Breitbart does better than Mastodon on Facebook. After all, which of these does its algorithm detect keep people engaged on Facebook itself more?

    Facebook removes the ban

    You can see all the screenshots of the censorship in my original post. Now, Facebook has reversed course:

    We also don’t know if this reversal was human or algorithmic, but that still is beside the point.

    The point is, Facebook intentionally chooses to surface and promote those things that drive engagement, regardless of quality.

    Clearly many have wondered if tens of thousands of people have died unnecessary deaths over COVID as a result. One whistleblower says “I have blood on my hands” and President Biden said “they’re killing people” before “walking back his comments slightly”. I’m not equipped to verify those statements. But what do they think is going to happen if they prioritize engagement over quality? Rainbows and happiness?

    Update 2024-04-06: It’s happened again. Facebook censored my post about the Marion County Record because the same site was critical of Facebook censoring environmental stories.

    Facebook Is Censoring People For Mentioning Open-Source Social Network Mastodon

    Update: Facebook has reversed itself over this censorship, but I maintain that whether the censorship was algorithmic or human, it was intentional either way. Details in my new post.

    Last November, I made a brief post to Facebook about Mastodon. Mastodon is an open-source and open social network, which is decentralized and all about user control instead of corporate control. I’ve blogged about Mastodon and the dangers of Facebook before, but rarely mentioned Mastodon on Facebook itself.

    Today, I received this notice that Facebook had censored my post about Mastodon:

    Facebook censoring a post

    Wonder with me for a second what this one-off post I composed myself might have done to trip Facebook’s filter…. and it is probably obvious that what tripped the filter was the mention of an open source competitor, even though Facebook is much more enormous than Mastodon. I have been a member of Facebook for many years, and this is the one and only time anything like that has happened.

    Why they decided today to take down that post – I have no idea.

    In case you wondered about their sincerity towards stamping out misinformation — which, on the rare occasions they do something about, they “deprioritize” rather than remove as they did here — this probably answers your question. Or, are they sincere about thinking they’re such a force for good by “connecting the world’s people?” Well, only so long as the world’s people don’t say nice things about alternatives to Facebook, I guess.

    “Well,” you might be wondering, “Why not appeal, since they obviously made a mistake?” Because, of course, you can’t:

    Indeed I did tick a box that said I disagreed, but there was no place to ask why or to question their action.

    So what would cause a non-controversial post from a long-time Facebook member that has never had anything like this happen, to disappear?

    Greed. Also fear.

    Maybe I’d feel sorry for them if they weren’t acting like a bully.

    Edit: There are reports from several others on Mastodon of the same happening this week. I am trying to gather more information. It sounds like it may be happening on Twitter as well.

    Edit 2: And here are some other reports from both Facebook and Twitter. Definitely not just me.

    Edit 3: While trying to reply to someone on Facebook, that was trying to defend Facebook, I mentioned joinmastodon.org and got this:

    Anyone else seeing it?

    Edit 4: It is far more than just me, clearly. More reports are out there; for instance, this one and that one.

    Excellent Experience with Debian Bullseye

    I’ve appreciated the bullseye upgrade, like most Debian upgrades. I’m not quite sure how, since I was already running a backports kernel, but somehow the entire system is snappier. Maybe newer X or something? I’m really pleased with it. Hardware integration is even nicer now, particularly the automatic driverless support for scanners in addition to the existing support for printers.

    All in all, a very nice upgrade, and pretty painless.

    I experienced a few odd situations.

    For one, I had been using Gnome Flashback. Since xmonad-log-applet didn’t compile there (due to bitrot in the log applet, not flashback), and I had been finding Gnome Flashback to be a rather dusty and forgotten corner of Gnome for a long time, I decided to try Mate.

    Mate just seemed utterly unable to handle a situation with a laptop and an external monitor very well. I want to use only the external monitor with the laptop lid is closed, and it just couldn’t remember how to do the right thing – external monitor on, laptop monitor off, laptop not put into suspend. gdm3 also didn’t seem to be able to put the external monitor to sleep, either, causing a few nights of wasted power.

    So off I went to XFCE, which I had been using for years on my workstation anyhow. Lots more settings available in XFCE, plus things Just Worked there. Odd that XFCE, the thin and light DE, is now the one that has the most relevant settings. It seems the Gnome “let’s remove a bunch of features” approach has extended to MATE as well.

    When I switched to XFCE, I also removed gdm3 from my system, leaving lightdm as the only DM on it. That matched what my desktop machine was using, and also what task-xfce-desktop called for. But strangely, the XFCE settings for lightdm were completely different between the laptop and the desktop. It turns out that with lightdm, you can have the lightdm-gtk-greeter and the accompanying lightdm-gtk-greeter-settings, or slick-greeter and the accompanying lightdm-settings. One machine had one greeter and settings, and the other had the other. Why, I don’t know. But lightdm-gtk-greeter-settings had the necessary options for putting monitors to sleep on the login screen, so I went with it.

    This does highlight a bit of a weakness in Debian upgrades. There is SO MUCH choice in Debian, which I highly value. At some point, almost certainly without my conscious choice, one machine got one greeter and another got the other. Despite both having task-xfce-desktop installed, they got different desktop experiences. There isn’t a great way to say “OK, I know I had a bunch of things installed before, but NOW I want the default bullseye experience”.

    But overall, it is an absolutely fantastic distribution. It is great to see this nonprofit community distribution continue to have such quality on such an immense scale. And hard to believe I’ve been a Debian developer for 25 years. That seems almost impossible!

    Distributed, Asynchronous Git Syncing with NNCP

    I have a problem.

    I have a directory that I use with org-mode and org-roam. I want it to be synced across multiple machines. I also want to keep the history with git. And, I want to use end-to-end encryption (no storing a plain git repo on a remote server), have a serverless setup, not require any two machines to be up simultaneously, and be resilient in the face of races and conflicts.

    Whew.

    I’ve tried a number of setups – git-remote-gcrypt on a remote server (fragile), some complicated scripts around a separate repo in syncthing (requires one machine to be “in charge”), etc. They all were subpar.

    Then NNCP introdoced asynchronous multicast and I was intrigued.

    So, I wrote gitsync-nncp, which uses NNCP to distribute git bundles to all the participating machines. The comprehensive documentation for gitsync-nncp goes into a lot more detail about how it works and what problems it solves. It’s working quite well for me!

    Roundup of Unique Data/Storage Hosting Options

    Recently I have been taking another look at the services at rsync.net and it got me thinking: what would I do with a lot of storage? What might I want to run with it, if it were fairly cheap?

    • Backups are an obvious place to start. Borgbackup makes a pretty compelling option: very bandwidth-efficient thanks to block-level rolling hash dedup, encryption fully on the client side, etc. Borg can run over ssh, though does need a server-side program.
    • Nextcloud is another option. With Google Photos getting quite expensive now, if you could have a TB of storage that you control, what might you do with it? Nextcloud also includes IM, video chat, and online document editing similar to Google Docs.
    • I’ve written before about the really neat properties of Syncthing: distributed synchronization that needs no server component. It also supports untrusted nodes in the mesh, where all content is encrypted before it reaches them. Sometimes an intermediary node is useful; for instance, if nodes A and C are to sync but are rarely online at the same time, an untrusted node B that is always online can facilitate synchronization. A server with some space could help with this.
    • A relay for NNCP or UUCP.
    • More broadly, you could self-host your photo or video cllection.

    Let’s start taking a look at what’s out there. I’m going to try to focus on things that are unique for some reason: pricing, features, etc. Incidentally, good reviews are hard to find due to the proliferation of affiliate links. I have no affiliate relationships with anyone mentioned here and there are no affiliate links in this post.

    I’ll start with the highest-end community and commercial options (though both are quite competitive on price for what they are), and then move on to the cheaper options.

    Community option: SDF

    SDF is somewhat hard to define. “What is SDF?” could prompt answers like:

    • A community-run network offering free Unix shells to the public
    • A diverse community of people that connect with unique tools. A social network in the 80s sense, sort of.
    • A provider of… let me see… VPN, DSL, and even dialup access.
    • An organization that runs various Open Source social network services, including Mastodon, Pixelfed (image sharing), PeerTube (video sharing), WordPress, even Minecraft.
    • A provider of various services for a nominal charge: $3/mo gets you access to the MetaArray with 800GB of storage space which you have shell access to, and can store stuff on with Nextcloud, host public webpages, etc.
    • Thriving communities around amateur radio, musicians, Plan 9, and even – brace yourself – TOPS-20, a DEC operating system first released in 1976 and not updated since 1988.
    • There’s even a Wikipedia article about SDF.

    There’s a lot there. SDF lets you use things for yourself, of course, but you can also join a community. It’s not a commercial service backed by SLAs — it’s best-effort — but it’s been around more than 30 years and has a great track record.

    Top commercial option for backup storage: rsync.net

    rsync.net offers storage broadly over SSH: sftp, rsync, scp, borg, rclone, restic, git-annex, git, and such. You do not get a shell, but you do get to run a few noninteractive commands via ssh. You can, for instance, run git clone on the rsync server.

    The rsync special sauce is in ZFS. They run raidz3 on their arrays (and also offer dual location setups for an additional fee), offer both free and paid ZFS snapshots, etc. The service is designed to be extremely reliable, particularly for backups, and it seems to me to meet those goals.

    Basic storage is $0.025 per GB/mo, but with certain account types such as borg, can be had for $0.015 per GB/mo. The minimum size is 400GB or $10/mo. There are no bandwidth charges. This makes it quite economical even compared to, say, S3. Additional discounts start at 10TB, so 10TB with rsync.net would cost $204.80/mo or $81.92 on the borg plan.

    You won’t run Nextcloud on this thing, but for backups that must be reliable, or even a photo collection or something, it makes perfect sense.

    When you look into other options, you’ll find that other providers are a lot more vague about their storage setup than rsync.net.

    Various offerings from Hetzner

    Hetzner is one of Europe’s large hosting companies, and they have several options of interest.

    Their Storage Box competes directly with the rsync.net service. Their per-GB storage cost is lower than rsync.net, and although they do include a certain amount of free bandwidth with each account, bandwidth is not unlimited and could result in charges. Still, if you don’t drive 2x or more your storage usage in bandwidth each month, it would be cheaper than rsync. The Storage Box also uses ZFS with some kind of redundancy, though they don’t specifcy details.

    What differentiates them from rsync.net is the protocol support. They support sftp, scp, Borg, ssh, rsync, etc. just as rsync.net does. But then they also throw in Samba/CIFS, FTPS, HTTPS, and WebDAV – all optionally enabled or disabled by you. Although things like sshfs exist, they aren’t particularly optimal for some use cases, and CIFS support may just be what you need in some situations.

    10TB with Hetzner would cost EUR 39.90/mo, or about $48.84/mo. (This figure is higher for Europeans, who also have to pay VAT.)

    Hetzner also offers a Storage Share, which is a private Nextcloud instance. 10TB of that is exactly the same cost as 10TB of the Storage Box. You can add your own users, groups, etc. to this as your are the Nextcloud admin of your instance. Hetzner throws in automatic updates (which is great, as updates have been a pain in my side for a long time). Nextcloud is ideal for things like photo sharing, even has email and chat built in, etc. For about the same price at 2TB of Google One, you can have 2TB of Nextcloud with all those services for yourself. Not bad. You can also mount a Nextcloud instance with WebDAV.

    Interestingly, Nextcloud supports “external storages” as backend for the data. It supports another Nextcloud instance, OpenStack or S3 object storage, and SFTP, SMB/CIFS, and WebDAV. If you’re thinking you’d like both SFTP and Nextcloud access to a pool of storage, I imagine you could always get a large Storage Box from Hetzner (internal transfer is free), pair it with a small Nextcloud instance, and link the two with Nextcloud external storage.

    Dedicated Servers

    If you want a more DIY approach, you can find some interesting deals on actual dedicated server hardware – you get the entire machine to yourself. I’ve been using OVH’s SoYouStart for a number of years, with good experienaces, and they have a number of server configurations available. For instance, for $45.99, you can get a Xeon box with 4x2TB drives and 32GB RAM. With RAID5 or raidz1, that’s 6TB of available space – and cheaper than the 6TB from rsync.net (though less redundant) plus you get the whole box to yourself too. OVH directly has some more storage servers; for instance, you can get a box with 4x4TB + 1x500GB SSD for $86.75/mo, giving you 12TB available with RAID5/raidz1, plus a 16GB server to do what you want with.

    Hetzner also has some larger options available, for instance 2x4TB at EUR39 or 2x8TB at EUR54, both with 64GB of RAM.

    Bargain Corner

    Yes, you can find 10TB for $25/mo. It’s hosted on ceph, by what appears to be mostly a single person (though with a lot of experience and a fair bit of transparency). You’re not going to have the round-the-clock support experience as with rsync.net, nor its raidz3 level of redundancy – but if you don’t need that, there are quite a few options.

    Let’s start with Lima Labs. Yes, 10TB is $25/mo, and they support sftp, rsync, borg, and even NFS mounts on storage backed by Ceph. The owner, Sam, seems to be a nice guy but the service isn’t going to be on the scale of rsync.net or Hetzner. That may or may not be OK for your needs – I mean, you can even get 1TB for $5/mo, so there are some fantastic deals to be had here.

    BorgBase does Borg hosting and borg hosting only. You can get 1TB for $6.67/mo or, for instance, 10TB for $53.46. They don’t say much about their infrastructure and it’s hard to get a read on the company, but for Borg backups, it could be a nice option.

    Bargain Corner Part 2: Seedboxes

    There’s a market out there of companies offering BitTorrent seeding and downloading services. Typically, these services offer you Unix ssh access to a shell, give you a bunch of space on completely non-redundant drives (theory being that the data on them is transient), lots of bandwidth, for a low price. Some people use them for BitTorrent, others for media serving and such.

    If you are willing to take the lowest in drive redundancy, there are some deals to be had. Whatbox is a popular leader here, and has an extensive wiki with info. Or you can find some seedbox.io “shared storage” plans – for instance, 12TB for $32.49/mo. But it’s completely non-redundant drives.

    Seedbox has a partner company, Walker Servers, with some interesting deals; for instance, 4x8TB for EUR 52.45. Not bad for 24TB usable with RAID5 – but Walker Servers is completely unknown to me and doesn’t publish a phone number. So, YMMV.

    Conclusion

    I’m sure I’ve left out many quality options here, but hopefully this is enough to lay out a general lay of the land. Leave other suggestions in the comments.