I last reviewed email services in 2019. That review focused a lot of attention on privacy. At the time, I selected mailbox.org as my provider, and I have been using them for the five years since. However, both their service and their support have gone significantly downhill, so it is time for me to look at other options.
Here I am focusing strongly on email. Some of the providers mentioned here provide other services (IM, video calls, groupware, etc.), and to the extent they do, I am ignoring them.
What Matters in 2024
I want to start off by acknowledging that what you need in email probably depends on your circumstances and the country in which you live. For me, the starting point is recognizing that the largest threat most of us face isn’t from state actors but from criminals: hackers, ransomware gangs, etc. It is important to take as many steps as possible to secure one’s account against that. Privacy and security are both part of the mix. I still value privacy, but I am acknowledging, as Migadu does, that “Email as we know it and encryption are incompatible.” Although some of these services strongly protect parts of the conversation, the reality is that most people will be emailing people using plain old email services which don’t. For stronger security, something like Signal would be needed. (I also wrote about Signal in 2021.)
Interestingly, OpenPGP support seems to have become something of a standard feature among the providers I reviewed. All or almost all of them provide integration with browser-based encryption, as well as server-side encryption if you prefer that.
Although mailbox.org can automatically PGP-encrypt every message that arrives in plaintext, for general use, this is unwieldy; there isn’t good tooling for searching mailboxes where every message is encrypted, etc. So I never enabled that feature at Mailbox. I still value security and privacy, but a pragmatic approach addresses the most pressing threats first.
My criteria
The basic requirements for an email service include:
- Ability to use my own domains
- Strong privacy policy
- Ability for me to use my own IMAP and SMTP clients on both desktop and mobile
- It must be extremely reliable
- It must not be free
- It must have excellent support for those rare occasions when it is needed
- Support for basic aliases
Why do I say it must not be free? Because if someone is providing a service of the quality I’m talking about here, and not charging for it, it implies something is fishy: either they are unscrupulous, they are financially unstable, or the real product is something else, such as advertising. I am not aware of any provider that matches the other criteria with a free account anyhow. These providers range from about $30 to $90 per year, so cheaper than a Netflix subscription.
Immediately, this rules out several options:
- Proton doesn’t let me use my own clients on mobile (their bridge is desktop-only)
- Tuta also doesn’t let me use my own clients
- Posteo doesn’t let me use my own domain
- mxroute.com lacks a strong privacy policy, and its policy has numerous causes for concern (for instance, “If you repeatedly send email to invalid/unroutable recipients, they may be published on our GitHub”)
I will have a bit more to say about a couple of these providers below.
There are some additional criteria that are strongly desired but not absolutely required:
- Ability to set individual access passwords for every device/app
- Support for two-factor authentication (2FA/TFA/TOTP) for web-based access
- Support for basics in filtering: ability to filter on envelope recipient (so if I get BCC’d, I can still filter), and ability to execute more than one action on filter match (eg, deliver to two folders, or deliver to a folder and forward to someone else)
IMAP and SMTP don’t really support 2FA, so by setting individual passwords for every device, you can at least limit the blast radius and cut off a specific device if something is (or might be) compromised.
The candidates
I considered these providers: Startmail, Mailfence, Runbox, Fastmail, Kolab, Mailbox.org, and Migadu. I’ll review each, and highlight the pricing of the plan I would most likely use. Each provider offers multiple plans; some may be more expensive and some may be cheaper than the one I reviewed. I included a link to each provider’s full pricing information so you can compare for your needs.
I set up trials with each of these (except Mailbox.org, with which I already had a paid account). It so happened that I had actual questions for support for each one, which gave me an opportunity to see how support responded. I did not fabricate questions, and would not have contacted support if I didn’t have real ones. (This means that I asked different questions of each provider, because they were the REAL questions I had.) I’ll jump to the spoiler right now: I eventually chose Migadu, with Fastmail and Mailfence as close seconds.
I looked for providers myself, and also solicited recommendations in a Mastodon thread.
Mailbox.org
I begin with Mailbox, as it was my top choice in 2019 and the incumbent.
Until this year, I had been quite happy with it. I had cause to reach their support less than once a year on average, and each time they replied the same day or next day. Now, however, they are failing on reliability and on support.
Their spam filter has become overly aggressive. It has blocked quite a bit of legitimate mail. When I contacted their support about an issue earlier this year, they initially took 4 days to reply, and then 6 days for the reply after that. Ouch. They had me disable some spam settings.
It didn’t really help. I continue to lose mail. I don’t know how much, because they block a lot of it before it even hits the spam folder. One of my friends texted to say mail was dropping. I raised a new ticket with mailbox, which took them 5 days to reply to. Their reply was unhelpful. “As the Internet is not a static system, unforeseen events can always occur.” Well yes, that’s true, and I get it, false positives exist with email. But this was from an ISP’s mail system with an address that had been established for years, and it was part of a larger pattern of rejecting quite a bit of legit mail. And no recent interaction with them has resulted in them actually doing anything to resolve anything. It’s just a paragraph or two of reply that does nothing and helps nothing.
When I complained that it took 5 days to reply, they said “We have not been able to reply sooner as we are currently experiencing a high volume of customer enquiries.” Even though their SLA for my account is a not-great “48 business hour” turnaround, they still missed it and their reason is “we’re busy.” I finally asked what RBL had caught the blocked email, since when I checked, the sender wasn’t on any RBL. Mailbox’s reply: they only keep their logs for 7 days, so next time I should contact them within 7 days. Which, of course, I DID; it was them that kept delaying. Ugh! It’s like they’ve become a cable company.
Even worse is how they have been blocking mail from GrapheneOS’s discussion forum. See their thread about it. In short, Graphene’s mail server has a clean reputation and Mailbox has no problem with it. But because one of Graphene’s IPv6 webservers has an IPv6 allocation of a size Mailbox doesn’t like, they drop mail. It’s ridiculous, and Mailbox was dismissive of this well-known and well-regarded Open Source project. So if the likes of GrapheneOS can’t get a good-faith effort to deliver their mail, what chance does an individual like me have?
I’m sorry, but I’m literally paying you to deliver email for me and provide good support. If you can’t do either of those, you don’t get to push that problem down onto me. Hire appropriate staff.
On the technical side, they support aliases and my own clients, and have a reasonable privacy policy. Their 2FA support covers the web interface (though, oddly, not the support site), and the implementation is somewhat unusual. They do not support app passwords.
A somewhat unique feature is the @secure.mailbox.org domain. If you try to receive mail at that address, mailbox.org will block it unless it uses TLS. Same for sending. This isn’t E2EE, but it does at least require things not be in plaintext for the last hop to Mailbox.
Verdict: not recommended due to poor reliability and support.
Mailbox.Org summary:
- Website: https://mailbox.org/en/
- Reliability: iffy due to over-aggressive spam filtering
- Support: Poor; takes 4-6 days for a reply and replies are unhelpful
- Individual access passwords: No
- 2FA: Yes, but with a PIN instead of a password as the other factor
- Filtering: Full SIEVE feature set and GUI editor
- Spam settings: greylisting on/off, reject some/all spam, etc. But they’re insufficient to address Mailbox’s overzealousness, which support says I cannot work around within the interface.
- Server storage location: Germany
- Plan as reviewed: standard [pricing link]
- Cost per year: EUR 30 (about $33)
- Mail storage included: 10GB
- Limits on send/receive volume: none
- Aliases: 50 on your domain name, 25 on mailbox.org
- Additional mailboxes: Available; each one at the same fee as the primary mailbox
Startmail
I really wanted to like Startmail. Its “vault” is an interesting idea and should contribute to the security and privacy of an account. They clearly care about privacy.
It falls down in filtering. They have no way to filter on envelope recipient (BCC or similar). Their support confirmed this to me and that’s a showstopper.
Startmail support was also as slow as Mailbox, taking 5 days to respond to me.
Two showstoppers right there.
Verdict: Not recommended due to slow support responsiveness and weak filtering.
Startmail summary:
- Website: https://www.startmail.com/
- Reliability: Seems to be fine
- Support: Mediocre; took 5 days for a reply, but the reply was helpful
- Individual app access passwords: Yes
- 2FA: Yes
- Filtering: Poor; cannot filter on envelope recipient, and can’t build filters with multiple actions
- Spam settings: None
- Server storage location: The Netherlands
- Plan as reviewed: Custom domain (trial was Personal), [pricing link]
- Cost per year: $70
- Mail storage included: 20GB
- Limits on send/receive volume: none
- Aliases: unlimited, with lots of features: can set expiration, etc.
- Additional mailboxes: not available
Kolab
Kolab Now is mainly positioned as a full groupware service, but they do have an email-only option, which I investigated. There isn’t much documentation about it compared to other providers, and also not much in the way of settings. You can turn greylisting on or off. And… that’s it.
It has a full suite of filtering options. They set an X-Envelope-To header which you can use with the arbitrary header match to do the right thing even for BCC situations. Filters can have multiple conditions and multiple actions. It is SIEVE-based and you can download your SIEVE definitions.
If you enable 2FA, you disable IMAP and SMTP; not great.
Verdict: Not an impressive enough email featureset to justify going with it.
Kolab Now summary:
- Website: https://kolabnow.com/
- Reliability: Seems to be fine
- Support: Fine responsiveness (next day)
- Individual app passwords: no
- 2FA: Yes, but if you enable it, they disable IMAP and SMTP
- Filtering: Excellent
- Spam settings: Only greylisting on/off
- Server storage location: Switzerland; they have lots of details on their setup
- Plan as reviewed: “Just email” [pricing link]
- Cost per year: CHF 60, about $66
- Mail storage included: 5GB
- Limitations on send/receive volume: None
- Aliases: Yes. Not sure if there are limits.
- Additional mailboxes: Yes if you set up a group account. “Flexible pricing based on user count” is not documented anywhere I could find.
Mailfence
Mailfence is another option, somewhat similar to Startmail but without the unique vault. I had some questions about filters, and support was quite responsive, responding in a couple of hours.
Some of their copy on their website is a bit misleading, but support clarified when I asked them. They do not offer encryption at rest (like most of the entries here).
Mailfence’s filtering system is the kind I’d like to see. It allows multiple conditions and multiple actions for each rule, and has some unique actions as well (notify by SMS or XMPP). Support says that “Recipients” matches envelope recipients. However, one omission is that I can’t match on arbitrary headers, only on the canned list of headers they provide.
They have only two spam settings:
- spam filter on/off
- whitelist
Given some recent complaints about their spam filter being overly aggressive, I find this lack of control somewhat concerning. (However, I discount complaints about people begging for more features in free accounts; free won’t provide the kind of service I’m looking for with any provider.) There are generally just very few settings for email as well.
Verdict: Responsive and helpful support; filtering has the right structure but lacks arbitrary header match. Could be a good option.
Mailfence summary:
- Website: https://mailfence.com/
- Reliability: Seems to be fine
- Support: Excellent responsiveness and helpful replies (after some initial confusion about my question of greylisting)
- Individual app access passwords: No. You can set a per-service password (eg, an IMAP password), but those will be shared with all devices speaking that protocol.
- 2FA: Yes
- Filtering: Good; only misses the ability to filter on arbitrary headers
- Spam settings: Very few
- Server storage location: Belgium
- Plan as reviewed: Entry [pricing link]
- Cost per year: $42
- Mail storage included: 10GB, with a maximum of 50,000 messages
- Limits on send/receive volume: none
- Aliases: 50. Aliases can’t be deleted once created (there may be an exception to this for aliases on your own domain rather than mailfence.com)
- Additional mailboxes: Their page on this is a bit confusing, and the pricing page lacks the information promised. It looks like you can pay the same $42/year for additional mailboxes, with a limit of up to 2 additional paid mailboxes and 2 additional free mailboxes tied to the account.
Runbox
This one came recommended in a Mastodon thread. I had some questions about it, and support response was fantastic – I heard from two people that were co-founders of the company! Even within hours, on a weekend. Incredible! This kind of response was only surpassed by Migadu.
I initially wrote to Runbox with questions about the incoming and outgoing message limits, which I hadn’t seen elsewhere, as well as the bandwidth limit. They said the bandwidth limit is no longer enforced on paid accounts. The incoming and outgoing limits are enforced, and all email (even spam) counts towards the limit. Notably, the outgoing limit is counted per recipient, so if you send 10 messages to your 50-recipient family group, that’s 500 recipients and you’ve hit the daily limit. However, they also indicated a willingness to reset the limit if something happens. Unfortunately, hitting the limit results in a hard bounce (SMTP 5xx) rather than a temporary failure (SMTP 4xx), so it can result in lost mail. This means I’d be worried about some attack or other weirdness causing me to lose mail.
Their filter is a pain point. Here are the challenges:
- You can’t directly match on a BCC recipient. Support advised to use a “headers” match, which will search for something anywhere in the headers. This works and is probably “good enough” since this data is in the Received: headers, but it is a little more imprecise.
- They only have a “contains”, not an “equals” operator. So, for instance, a pattern searching for “test@example.com” would also match “newtest@example.com”. Support advised to put the email address in angle brackets to avoid this. That will work… mostly. Angle brackets aren’t always required in headers.
- There is no way to have multiple actions on the filter (there is just no way to file an incoming message into two folders). This was the ultimate showstopper for me.
Support advised they are planning to upgrade the filter system in the future, but these are the limitations today.
Verdict: A good option if you don’t need much from the filtering system. Lots of privacy emphasis.
Runbox summary:
- Website: https://runbox.com/
- Reliability: Seems to be fine, except returning 5xx codes if per-day limits are exceeded
- Support: Excellent responsiveness and replies from founders
- Individual app passwords: Yes
- 2FA: Yes
- Filtering: Poor
- Spam settings: Very few
- Server storage location: Norway
- Plan as reviewed: Mini [pricing link]
- Cost per year: $35
- Mail storage included: 10GB
- Limits on send/receive volume: Receive 5000 messages/day, Send 500 recipients/day
- Aliases: 100 on runbox.com; unlimited on your own domain
- Additional mailboxes: $15/yr each, also with 10GB non-shared storage per mailbox
Fastmail
Fastmail came recommended to me by a friend I’ve known for decades.
Here’s the thing about Fastmail, compared to all the services listed above: It all just works. Everything. Filtering, spam prevention, it is all there, all feature-complete, and all just does the right thing as you’d hope. Their filtering system has a canned dropdown for “To/Cc/Bcc”, it supports multiple conditions and multiple actions, and just does the right thing. (Delivering to multiple folders is a little cumbersome but possible.) It has a particularly strong feature set around administering multiple accounts, including things like whether users can prevent admins from reading their mail.
The not-so-great part of the picture is around privacy. Fastmail is based in Australia, where the government has extensive power around spying on data, even to the point of forcing companies to add wiretap capabilities. Fastmail’s privacy policy states user data may be held in Australia, USA, India, and Netherlands. By default, they share data with unidentified “spam companies”, though you can disable this in settings. On the other hand, they do make a good effort towards privacy.
I contacted support with some questions and got back a helpful response in three hours. However, one of the questions was about in which countries my particular data would be stored, and the support response said they would have to get back to me on that. It’s been several days and no word back.
Verdict: A featureful option that “just works”, with a lot of features for managing family accounts and the like, but lacking in the privacy area.
Fastmail summary:
- Website: https://www.fastmail.com/
- Reliability: Seems to be fine
- Support: Good response time on most questions; dropped the ball on one that required research
- Individual app access passwords: Yes
- 2FA: Yes
- Filtering: Excellent
- Spam settings: Can set filter aggressiveness, decide whether to share spam data with “spam-fighting companies”, configure how to handle backscatter spam, and evaluate the personal learning filter.
- Server storage locations: Australia, USA, India, and The Netherlands. Legal jurisdiction is Australia.
- Plan as reviewed: Individual [pricing link]
- Cost per year: $60
- Mail storage included: 50GB
- Limits on send/receive volume: 300/hour
- Aliases: Unlimited from what I can see
- Additional mailboxes: No; requires a different plan for that
Migadu
Migadu was a service I’d never heard of, but came recommended to me on Mastodon.
I listed Migadu last because it is in a class of its own compared to all the other options. Every other service is basically a webmail interface with a few extra settings tacked on.
Migadu has a full-featured email admin console in addition. By that I mean you can:
- View usage graphs (incoming, outgoing, storage) over time
- Manage DNS (if you want Migadu to run your nameservers)
- Manage multiple domains, and cross-domain relationships with mailboxes
- View a limited set of logs
- Configure accounts, reset their passwords if needed/authorized, etc.
- Configure email address rewrite rules with wildcards and so forth
Basically, if you were the sort of person that ran your own mail servers back in the day, here is Migadu giving you most of that functionality. Effectively you have a web interface to do all the useful stuff, and they handle the boring and annoying bits. This is a really attractive model.
Migadu support has been fantastic. They are quick to respond, and went above and beyond. I pointed out that their X-Envelope-To header, which is needed for filtering by BCC, wasn’t being added on emails I sent myself. They replied 5 hours later indicating they had added the feature to add X-Envelope-To even for internal mails! Wow! I am impressed.
With Migadu, you buy a pool of resources: storage space and incoming/outgoing traffic. What you do within that pool is up to you. You can set up users (“mailboxes”), aliases, domains, whatever you like. It all just shares the pool. You can restrict users further so that an individual user has access to only a subset of the pool resources.
I was initially concerned about Migadu’s daily send/receive message count limits, but in visiting with support and reading the documentation, what really comes out is that Migadu is a service with a personal touch. Hitting the incoming traffic limit will cause an SMTP temporary failure (4xx) response, so you won’t lose legit mail – and support will work with you if it’s a problem for legit uses. In other words, the restrictions are “soft” and they are interpreted reasonably.
One interesting thing about Migadu is that they do not offer accounts under their domain. That is, you MUST bring your own domain. That’s pretty easy and cheap, of course. It also puts you in a position of power, because it is easy to migrate email from one provider to another if you own the domain.
Filtering is done via SIEVE. There is a GUI editor which lets you accomplish most things, though it has an odd blind spot where you can’t file a message into multiple folders. However, you can edit a SIEVE ruleset directly and you get the full SIEVE featureset, which is extensive (and does support filing a message into multiple folders). I note that the SIEVE :envelope match doesn’t work, but Migadu adds an X-Envelope-To header which is just as good.
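To illustrate, here is a small SIEVE rule of the sort I have in mind (a sketch only; the address and folder names are invented, and the exact set of SIEVE extensions enabled varies by provider). It matches on the envelope recipient via that X-Envelope-To header and files the message into two folders, which is exactly the case the GUI editor can’t express but the raw SIEVE editor can:

require ["fileinto", "copy"];

# X-Envelope-To carries the actual envelope recipient, so this matches even
# when my address was only BCC'd.
if header :contains "X-Envelope-To" "lists@example.com" {
    fileinto :copy "Lists";       # :copy files a copy without canceling the implicit keep
    fileinto "Archive/Lists";     # a second fileinto: the message lands in both folders
}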
I particularly love a company that tells you all the reasons you might not want to use them. Migadu’s pro/con list is an honest drawbacks list (of course, their homepage highlights all the features!).
Verdict: Fantastically powerful, excellent support, and good privacy. I chose this one.
Migadu summary:
- Website: https://migadu.com/
- Reliability: Excellent
- Support: Fantastic. Good response times and they added a feature (or fixed a bug?) a few hours after I requested it.
- Individual access passwords: Yes. Create “identities” to support them.
- 2FA: Yes, on both the admin interface and the webmail interface
- Filtering: Excellent, based on SIEVE. GUI editor doesn’t support multiple actions when filing into a folder, but full SIEVE functionality is exposed.
- Spam settings:
- On the domain level, filter aggressiveness, Greylisting on/off, black and white lists
- On the mailbox level, filter aggressiveness, black and whitelists, action to take with spam; compatible with filters.
- Server storage location: France; legal jurisdiction Switzerland
- Plan as reviewed: mini [pricing link]
- Cost per year: $90
- Mail storage included: 30GB (“soft” quota)
- Limits on send/receive volume: 1000 messages in/day, 100 messages out/day (“soft” quotas)
- Aliases: Unlimited on an unlimited number of domains
- Additional mailboxes: Unlimited and free; uses pooled quotas, but individual quotas can be set
Others
Here are a few others that I didn’t think worthy of getting a trial:
- mxroute was recommended by several. Lots of concerning things in their policy, such as:
- if you repeatedly send mail to unroutable recipients, they may publish the addresses on GitHub
- they will terminate your account if they think you are “rude” or want to contest a charge
- they reserve the right to cancel your service at any time for any (or no) reason.
- Proton keeps coming up, and I will not consider it so long as I am locked into their client on mobile.
- Skiff comes up sometimes, but they were acquired by Notion.
- Disroot comes up; this discussion highlights a number of reasons why I avoid them. Their Terms of Service (ToS) is inconsistent with a general-purpose email account (I guess for targeting nonprofits and activists, that could make sense). Particularly laughable is that they claim to be friends of Open Source, but then would take down your account if you upload “copyrighted” material. News flash: for an Open Source license to be meaningful, the underlying work must be copyrighted. It is perfectly legal to upload copyrighted material when you wrote it or have a license to do so!
Conclusions
There are a lot of good options for email hosting today, and in particular I appreciate the excellent personal support from companies like Migadu and Runbox. Support small businesses!
Try the Last Internet Kermit Server
$ grep kermit /etc/services
kermit 1649/tcp
What is this mysterious protocol? Who uses it and what is its story?
This story is a winding one, beginning in 1981. Kermit is, to the best of my knowledge, the oldest actively-maintained software package with an original developer still participating. It is also a scripting language, an Internet server, a (scriptable!) SSH client, and a file transfer protocol.
And my first use of it was talking to my HP-48GX calculator over a 9600bps serial link. Yes, that calculator had a Kermit server built in.
But let’s back up and talk about serial ports and Modems.
Serial Ports and Modems
In my piece The PC & Internet Revolution in Rural America, I recently talked about getting a modem – and what an exciting thing it was to get one! I realize that many people today have never used a serial line or a modem, so let’s briefly discuss.
Before Ethernet and Wifi took off in a big way in the 1990s-2000s, two computers would talk to each other over a serial line and a modem. By modern standards, these were slow; 300bps was a common early speed. They also (at least in the beginning) had no kind of error checking. Characters could be dropped or changed. Sometimes even those speeds were faster than the receiving device could handle. Some serial links were 7-bit, and wouldn’t even pass all 7-bit characters; for instance, sending a Ctrl-S could lock up a remote until you sent Ctrl-Q.
And computers back in the 1970s and 1980s weren’t as uniform as they are now. They used different character sets, different line endings, and even had different notions of what a file is. Today’s notion of a file as whatever set of binary bytes an application wants it to be was by no means universal; some systems treated a file as a set of fixed-length records, for instance.
So there were a lot of challenges in reliably moving files between systems. Kermit was introduced to reliably move files between systems using serial lines, automatically working around the varieties of serial lines, detecting errors and retransmitting, managing transmit speeds, and adapting between architectures as appropriate. Quite a task! And perhaps this explains why it was supported on a calculator with a primitive CPU by today’s standards.
Serial communication, by the way, is still commonplace, though now it isn’t prominent in everyone’s home PC setup. It’s used a lot in industrial equipment, avionics, embedded systems, and so forth.
The key point about serial lines is that they aren’t inherently multiplexed or packetized. Whereas an Ethernet network is designed to let many dozens of applications use it at once, a serial line typically runs only one (unless it is something like PPP, which is designed to do multiplexing over the serial line).
So it became useful to be able to both log in to a machine and transfer files with it. That is, incidentally, still useful today.
Kermit and XModem/ZModem
I wondered: why did we end up with two diverging sets of protocols, created at about the same time? The Kermit website has the answer: essentially, BBSs could assume 8-bit clean connections, so XModem and ZModem had much less complexity to worry about. Kermit, on the other hand, was highly flexible. Although ZModem came out a few years before Kermit had its performance optimizations, by about 1993 Kermit was on par with or faster than ZModem.
Beyond serial ports
As LANs and the Internet came to be popular, people started to use telnet (and later ssh) to connect to remote systems, rather than serial lines and modems. FTP was an early way to transfer files across the Internet, but it had its challenges. Kermit added telnet support, as well as later support for ssh (as a wrapper around the ssh command you already know). Now you could easily log in to a machine and exchange files with it without missing a beat.
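For example, a session might look roughly like this (a sketch from memory of the C-Kermit documentation; the host and file names are made up):

C-Kermit> ssh jane@remote.example.com      ; open an ssh connection from inside Kermit
remote$ kermit -x                          ; on the remote end, enter Kermit server mode
                                           ; now press Ctrl-\ then C to get back locally
C-Kermit> get photos.tar.gz                ; fetch a file over the same connection
C-Kermit> finish                           ; shut down the remote Kermit server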
And so it was that the Internet Kermit Service Daemon (IKSD) came into existence. It allows a person to set up a Kermit server, which can authenticate against local accounts or present anonymous access akin to FTP.
And so I established the quux.org Kermit Server, which runs the Unix IKSD (part of the Debian ckermit package).
Trying Out the quux.org Kermit Server
There are more instructions on the quux.org Kermit Server page! You can connect to it using either telnet or the kermit program. I won’t duplicate all of the information here, but here’s what it looks like to connect:
$ kermit
C-Kermit 10.0 Beta.08, 15 Dec 2022, for Linux+SSL (64-bit)
Copyright (C) 1985, 2022,
Trustees of Columbia University in the City of New York.
Open Source 3-clause BSD license since 2011.
Type ? or HELP for help.
(/tmp/t/) C-Kermit>iksd /user:anonymous kermit.quux.org
DNS Lookup... Trying 135.148.101.37... Reverse DNS Lookup... (OK)
Connecting to host glockenspiel.complete.org:1649
Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------
>>> Welcome to the Internet Kermit Service at kermit.quux.org <<<
To log in, use 'anonymous' as the username, and any non-empty password
Internet Kermit Service ready at Fri Aug 4 22:32:17 2023
C-Kermit 10.0 Beta.08, 15 Dec 2022
kermit
Enter e-mail address as Password: [redacted]
Anonymous login.
You are now connected to the quux kermit server.
Try commands like HELP, cd gopher, dir, and the like. Use INTRO
for a nice introduction.
(~/) IKSD>
You can even recursively download the entire Kermit mirror: over 1GB of files!
Conclusions
So, have fun. Enjoy this experience from the 1980s.
And note that Kermit also makes a better ssh client than ssh in a lot of ways; see ideas on my Kermit page.
This page also has a permanent home on my website, where it may be periodically updated.
Recommendations for Tools for Backing Up and Archiving to Removable Media
I have several TB worth of family photos, videos, and other data. This needs to be backed up — and archived.
Backups and archives are often thought of as similar. And indeed, they may be done with the same tools at the same time. But the goals differ somewhat:
Backups are designed to recover from a disaster that you can fairly rapidly detect.
Archives are designed to survive for many years, protecting against disaster not only impacting the original equipment but also the original person that created them.
Reflecting on this, it implies that while a nice ZFS snapshot-based scheme that supports twice-hourly backups may be fantastic for that purpose, if you think about things like family members being able to access it if you are incapacitated, or accessibility in a few decades’ time, it becomes much less appealing for archives. ZFS doesn’t have the wide software support that NTFS, FAT, UDF, ISO-9660, etc. do.
This post isn’t about the pros and cons of the different storage media, nor is it about the pros and cons of cloud storage for archiving; these conversations can readily be found elsewhere. Let’s assume, for the point of conversation, that we are considering BD-R optical discs as well as external HDDs, both of which are too small to hold the entire backup set.
What would you use for archiving in these circumstances?
Establishing goals
The goals I have are:
- Archives can be restored using Linux or Windows (even though I don’t use Windows, this requirement will ensure the broadest compatibility in the future)
- The archival system must be able to accommodate periodic updates consisting of new files, deleted files, moved files, and modified files, without requiring a rewrite of the entire archive dataset
- Archives can ideally be mounted on any common OS and the component files directly copied off
- Redundancy must be possible. In the worst case, one could manually copy one drive/disc to another. Ideally, the archiving system would automatically track making n copies of data.
- While a full restore may be a goal, simply finding one file or one directory may also be a goal. Ideally, an archiving system would be able to quickly tell me which discs/drives contain a given file.
- Ideally, preserves as much POSIX metadata as possible (hard links, symlinks, modification date, permissions, etc). However, for the archiving case, this is less important than for the backup case, with the possible exception of modification date.
- Must be easy enough to do, and sufficiently automatable, to allow frequent updates without error-prone or time-consuming manual hassle
I would welcome your ideas for what to use. Below, I’ll highlight different approaches I’ve looked into and how they stack up.
Basic copies of directories
The initial approach might be one of simply copying directories across. This would work well if the data set to be archived is smaller than the archival media. In that case, you could just burn or rsync a new copy with every update and be done. Unfortunately, that isn’t possible with data of the size I’m dealing with, since no single disc or drive can hold the whole set. With some datasets, you could manually design some rsyncs to store individual directories on individual devices, but that gets unwieldy fast and isn’t scalable.
You could use something like my datapacker program to split the data across multiple discs/drives efficiently. However, updates will be a problem; you’d have to re-burn the entire set to get a consistent copy, or rely on external tools like mtree to reflect deletions. Not very convenient in any case.
So I won’t be using this.
tar or zip
While you can split tar and zip files across multiple media, they have a lot of issues. GNU tar’s incremental mode is clunky and buggy; zip is even worse. tar files can’t be read randomly, making it extremely time-consuming to extract just certain files out of a tar file.
The only thing going for these formats (and especially zip) is the wide compatibility for restoration.
dar
Here we start to get into the more interesting tools. Dar is, in my opinion, one of the best Linux tools that few people know about. Since I first wrote about dar in 2008, it’s added some interesting new features; among them, binary deltas and cloud storage support. So, dar has quite a few interesting features that I make use of in other ways, and could also be quite helpful here:
- Dar can both read and write files sequentially (streaming, like tar), or with random-access (quick seek to extract a subset without having to read the entire archive)
- Dar can apply compression to individual files, rather than to the archive as a whole, facilitating both random access and resilience (corruption in one file doesn’t invalidate all subsequent files). Dar also supports numerous compression algorithms including gzip, bzip2, xz, lzo, etc., and can omit compressing already-compressed files.
- The end of each dar file contains a central directory (dar calls this a catalog). The catalog contains everything necessary to extract individual files from the archive quickly, as well as everything necessary to make a future incremental archive based on this one. Additionally, dar can make and work with “isolated catalogs” — a file containing the catalog only, without data.
- Dar can split the archive into multiple pieces called slices. This can best be done with fixed-size slices (the --slice and --first-slice options), which let the catalog record the slice number and preserve random access capabilities. With the --execute option, dar can easily wait for a given slice to be burned, etc.
- Dar normally stores an entire new copy of a modified file, but can optionally store an rdiff binary delta instead. This has the potential to be far smaller (think of a case of modifying metadata for a photo, for instance).
Additionally, dar comes with a dar_manager program. dar_manager makes a database out of dar catalogs (or archives). This can then be used to identify the precise archive containing a particular version of a particular file.
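Here is a rough sketch of how this might look on the command line (paths, names, and slice sizes are invented; the option spellings are from my reading of the dar and dar_manager manpages, so check them against your version before relying on this):

# Full archive of /data in 24GiB slices (sized for BD-R), per-file gzip compression.
dar -c /archives/family-full -R /data -s 24G -zgzip:6

# Isolate the catalog into its own tiny file.
dar -C /archives/family-full-cat -A /archives/family-full

# Later: an incremental archive of only what changed, using the isolated
# catalog (not the whole archive) as the reference.
dar -c /archives/family-2024-12 -R /data -A /archives/family-full-cat -s 24G -zgzip:6

# Build a dar_manager database from the archives, then ask which archive
# contains a particular file.
dar_manager -C family.dmd
dar_manager -B family.dmd -A /archives/family-full
dar_manager -B family.dmd -A /archives/family-2024-12
dar_manager -B family.dmd -f Photos/2019/beach/img_1234.jpg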
All this combines to make a useful system for archiving. Isolated catalogs are tiny, and it would be easy enough to include the isolated catalogs for the entire set of archives that came before (or even the dar_manager database file) with each new incremental archive. This would make restoration of a particular subset easy.
The main thing to address with dar is that you do need dar to extract the archive. Every dar release comes with source code and a win64 build. dar also supports building a statically-linked Linux binary. It would therefore be easy to include win64 binary, Linux binary, and source with every archive run. dar is also a part of multiple Linux and BSD distributions, which are archived around the Internet. I think this provides a reasonable future-proofing to make sure dar archives will still be readable in the future.
The other challenge is user ability. While dar is highly portable, it is fundamentally a CLI tool and will require CLI abilities on the part of users. I suspect, though, that I could write up a few pages of instructions to include and make that a reasonably easy process. Not everyone can use a CLI, but I would expect a person that could follow those instructions could be readily-enough found.
One other benefit of dar is that it could easily be used with tapes. The LTO series is liked by various hobbyists, though it could pose formidable obstacles to non-hobbyists trying to access data in future decades. Additionally, since the archive is a big file, it lends itself to working with par2 to provide redundancy against certain amounts of data corruption.
git-annex
git-annex is an interesting program that is designed to facilitate managing large sets of data and moving it between repositories. git-annex has particular support for offline archive drives and tracks which drives contain which files.
The idea would be to store the data to be archived in a git-annex repository. Then git-annex commands could generate filesystem trees on the external drives (or trees to be burned to read-only media).
In a post about using git-annex for blu-ray backups, an earlier thread about DVD-Rs was mentioned.
This has a few interesting properties. For one, with due care, the files can be stored on archival media as regular files. There are some different options for how to generate the archives; some of them would place the entire git-annex metadata on each drive/disc. With that arrangement, one could access the individual files without git-annex. With git-annex, one could reconstruct the final (or any intermediate) state of the archive appropriately, handling deletions, renames, etc. You would also easily be able to know where copies of your files are.
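A sketch of the kind of workflow this enables (repository locations, remote names, and file paths here are invented for illustration):

# One-time setup: a source repo, plus an external archive drive holding a clone
cd ~/annex && git annex init "workstation"
git annex numcopies 2                        # track at least two copies of everything

git clone ~/annex /mnt/arch1/annex
(cd /mnt/arch1/annex && git annex init "archive drive 1")
git remote add arch1 /mnt/arch1/annex

# Each update run: send content that doesn't yet have enough copies, then sync
# so every repository records where the copies live
git annex copy --to=arch1 --not --copies=2
git annex sync arch1

# Later: which drive(s) hold a given file?
git annex whereis Photos/2024/img_0001.jpg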
The practice is somewhat more challenging. Hundreds of thousands of files — what I would consider a medium-sized archive — can pose some challenges, running into hours-long execution if used in conjunction with the directory special remote (but only minutes-long with a standard git-annex repo).
Ruling out the directory special remote, I had thought I could maybe just work with my files in git-annex directly. However, I ran into some challenges with that approach as well. I am uncomfortable with git-annex mucking about with hard links in my source data. While it does try to preserve timestamps in the source data, these are lost on the clones. I wrote up my best effort to work around all this.
In a forum post, the author of git-annex comments that “I don’t think that CDs/DVDs are a particularly good fit for git-annex, but it seems a couple of users have gotten something working.” The page he references is Managing a large number of files archived on many pieces of read-only medium. Some of that discussion is a bit dated (for instance, the directory special remote has the importtree feature that implements what was being asked for there), but has some interesting tips.
git-annex supplies win64 binaries, and git-annex is included with many distributions as well. So it should be nearly as accessible as dar in the future. Since git-annex would be required to restore a consistent recovery image, similar caveats as with dar apply; CLI experience would be needed, along with some written instructions.
Bacula and BareOS
Although primarily tape-based archivers, these do also nominally support drives and optical media. However, they are much more tailored as backup tools, especially with the ability to pull from multiple machines. They require a database and extensive configuration, making them a poor fit for both the creation and future extractability of this project.
Conclusions
I’m going to spend some more time with dar and git-annex, testing them out, and hope to write some future posts about my experiences.
Easily Accessing All Your Stuff with a Zero-Trust Mesh VPN
Probably everyone is familiar with a regular VPN. The traditional use case is to connect to a corporate or home network from a remote location, and access services as if you were there.
But these days, the notion of “corporate network” and “home network” are less based around physical location. For instance, a company may have no particular office at all, may have a number of offices plus a number of people working remotely, and so forth. A home network might have, say, a PVR and file server, while highly portable devices such as laptops, tablets, and phones may want to talk to each other regardless of location. For instance, a family member might be traveling with a laptop, another at a coffee shop, and those two devices might want to communicate, in addition to talking to the devices at home.
And, in both scenarios, there might be questions about giving limited access to friends. Perhaps you’d like to give a friend access to part of your file server, or as a company, you might have contractors working on a limited project.
Pretty soon you wind up with a mess of VPNs, forwarded ports, and tricks to make it all work. With the increasing prevalence of CGNAT, a lot of times you can’t even open a port to the public Internet. Each application or device probably has its own gateway just to make it visible on the Internet, some of which you pay for.
Then you add on the question of: should you really trust your LAN anyhow? With possibilities of guests using it, rogue access points, etc., the answer is probably “no”.
We can move the responsibility for dealing with NAT, fluctuating IPs, encryption, and authentication, from the application layer further down into the network stack. We then arrive at a much simpler picture for all.
So this page is fundamentally about making the network work, simply and effectively.
How do we make the Internet work in these scenarios?
We’re going to combine three concepts:
- A VPN, providing fully encrypted and authenticated communication and stable IPs
- Mesh Networking, in which devices automatically discover optimal paths to reach each other
- Zero-trust networking, in which we do not need to trust anything about the underlying LAN, because all our traffic uses the secure systems in points 1 and 2.
By combining these concepts, we arrive at some nice results:
- You can ssh hostname, where hostname is one of your machines (server, laptop, whatever), and as long as hostname is up, you can reach it, wherever it is, wherever you are.
- Combined with mosh, these sessions will be durable even across moving to other host networks.
- You could just as well use telnet, because the underlying network should be secure.
- You don’t have to mess with encryption keys, certs, etc., for every internal-only service. Since IPs are now trustworthy, that’s all you need. hosts.allow could make a comeback! (See the sketch just after this list.)
- You have a way of transiting out of extremely restrictive networks. Every tool discussed here has a way of falling back on routing things via a broker (relay) on TCP port 443 if all else fails.
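As a tiny example of that hosts.allow idea: assuming your mesh uses addresses in Yggdrasil’s 200::/7 range, and that sshd on your system is still built against tcp wrappers (many modern distros’ builds are not), something like this would restrict ssh to mesh peers:

# /etc/hosts.allow -- allow ssh only from mesh addresses
sshd: [200::]/7

# /etc/hosts.deny -- refuse everything else
sshd: ALL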
There might sometimes be tradeoffs. For instance:
- On LANs faster than 1Gbps, performance may degrade due to encryption and encapsulation overhead. However, these tools should let hosts discover the locality of each other and not send traffic over the Internet if the devices are local.
- With some of these tools, hosts local to each other (on the same LAN) may be unable to find each other if they can’t reach the control plane over the Internet (Internet is down or provider is down)
Some other features that some of the tools provide include:
- Easy sharing of limited access with friends/guests
- Taking care of everything you need, including SSL certs, for exposing a certain on-net service to the public Internet
- Optional routing of your outbound Internet traffic via an exit node on your network. Useful, for instance, if your local network is blocking tons of stuff.
Let’s dive in.
Types of Mesh VPNs
I’ll go over several types of meshes in this article:
- Fully decentralized with automatic hop routing: This model has no special central control plane. Nodes discover each other in various ways, and establish routes to each other. These routes can be direct connections over the Internet, or via other nodes. This approach offers the greatest resilience. Examples I’ll cover include Yggdrasil and tinc.
- Automatic peer-to-peer with centralized control: In this model, nodes, by default, communicate by establishing direct links between them. A regular node never carries traffic on behalf of other nodes. Special-purpose relays are used to handle cases in which NAT traversal is impossible. This approach tends to offer simple setup. Examples I’ll cover include Tailscale, Zerotier, Nebula, and Netmaker.
- Roll your own and hybrid approaches: This is a “grab bag” of other ideas; for instance, running Yggdrasil over Tailscale.
Terminology
For the sake of consistency, I’m going to use common language to discuss things that have different terms in different ecosystems:
- Every tool discussed here has a way of dealing with NAT traversal. It may assist with establishing direct connections (eg, STUN), and if that fails, it may simply relay traffic between nodes. I’ll call such a relay a “broker”. This may or may not be the same system that is a control plane for a tool.
- All of these systems operate over lower layers that are unencrypted. Those lower layers may be a LAN (wired or wireless, which may or may not have Internet access), or the public Internet (IPv4 and/or IPv6). I’m going to call the unencrypted lower layer, whatever it is, the “clearnet”.
Evaluation Criteria
Here are the things I want to see from a solution:
- Secure, with all communications end-to-end encrypted and authenticated, and prevention of traffic from untrusted devices.
- Flexible, adapting to changes in network topology quickly and automatically.
- Resilient, without single points of failure, and with devices local to each other able to communicate even if cut off from the Internet or other parts of the network.
- Private, minimizing leakage of information or metadata about me and my systems
- Able to traverse CGNAT without having to use a broker whenever possible
- A lesser requirement for me, but still a nice to have, is the ability to include others via something like Internet publishing or inviting guests.
- Fully or nearly fully Open Source
- Free or very cheap for personal use
- Wide operating system support, including headless Linux on x86_64 and ARM.
Fully Decentralized VPNs with Automatic Hop Routing
Two systems fit this description: Yggdrasil and Tinc. Let’s dive in.
Yggdrasil
I’ll start with Yggdrasil because I’ve written so much about it already. It featured in prior posts such as:
- Make the Internet Yours Again With an Instant Mesh Network, which described the tyranny of IP rigidity and using Yggdrasil as a global mesh overlay.
- Using Yggdrasil As an Automatic Mesh Fabric to Connect All Your Docker Containers, VMs, and Servers is, in a significant sense, a more specific implementation of the ideas contained here; it’s a private Yggdrasil mesh providing the communications layer for dispersed Docker containers.
- Recovering Our Lost Free Will Online: Tools and Techniques That Are Available Now features Yggdrasil.
Yggdrasil can be a private mesh VPN, or something more
Yggdrasil can be a private mesh VPN, just like the other tools covered here. It’s unique, however, in that a key goal of the project is to also make it useful as a planet-scale global mesh network. As such, Yggdrasil is a testbed of new ideas in distributed routing designed to scale up to massive sizes and all sorts of connection conditions. As of 2023-04-10, the main global Yggdrasil mesh has over 5000 nodes in it. You can choose whether or not to participate.
Every node in a Yggdrasil mesh has a public/private keypair. Each node then has an IPv6 address (in a private address space) derived from its public key. Using these IPv6 addresses, you can communicate right away.
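Getting to that point is quick; roughly like this (commands from memory of the Yggdrasil docs, with an invented config path and ping target; details vary by version and platform):

# Generate a config (which includes a fresh keypair) and start the daemon
yggdrasil -genconf > /etc/yggdrasil.conf
yggdrasil -useconffile /etc/yggdrasil.conf &

# Ask the running node for its mesh IPv6 address, derived from its public key
yggdrasilctl getSelf

# Any other node in the mesh can then reach this one at that address, e.g.
ping 200:1234:5678:9abc:def0:1234:5678:9abc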
Yggdrasil differs from most of the other tools here in that it does not necessarily seek to establish a direct link on the clearnet between, say, host A and host G for them to communicate. It will prefer such a direct link if it exists, but it is perfectly happy if it doesn’t.
The reason is that every Yggdrasil node is also a router in the Yggdrasil mesh. Let’s sit with that concept for a moment. Consider:
- If you have a bunch of machines on your LAN, but only one of them can peer over the clearnet, that’s fine; all the other machines will discover this route to the world and use it when necessary.
- All you need to run a broker is just a regular node with a public IP address. If you are participating in the global mesh, you can use one (or more) of the free public peers for this purpose.
- It is not necessary for every node to know about the clearnet IP address of every other node (improving privacy). In fact, it’s not even necessary for every node to know about the existence of all the other nodes, so long as it can find a route to a given node when it’s asked to.
- Yggdrasil can find one or more routes between nodes, and it can use this knowledge of multiple routes to aggressively optimize for varying network conditions, including combinations of, say, downloads and low-latency ssh sessions.
Behind the scenes, Yggdrasil calculates optimal routes between nodes as necessary, using a mesh-wide DHT for initial contact and then deriving more optimal paths. (You can also read more details about the routing algorithm.)
One final way that Yggdrasil is different from most of the other tools is that there is no separate control server. No node is “special”, in charge, the sole keeper of metadata, or anything like that. The entire system is completely distributed and auto-assembling.
Meeting neighbors
There are two ways that Yggdrasil knows about peers:
- By broadcast discovery on the local LAN
- By listening on a specific port (or being told to connect to a specific host/port)
Sometimes this might lead to multiple ways to connect to a node; Yggdrasil prefers the connection auto-discovered by broadcast first, then the lowest-latency of the defined path. In other words, when your laptops are in the same room as each other on your local LAN, your packets will flow directly between them without traversing the Internet.
Unique uses
Yggdrasil is uniquely suited to network-challenged situations. As an example, in a post-disaster situation, Internet access may be unavailable or flaky, yet there may be many local devices – perhaps ones that had never known of each other before – that could share information. Yggdrasil meets this situation perfectly. The combination of broadcast auto-detection, distributed routing, and so forth, basically means that if there is any physical path between two nodes, Yggdrasil will find and enable it.
Ad-hoc wifi is rarely used because it is a real pain. Yggdrasil actually makes it useful! Its broadcast discovery doesn’t require any IP address provisioned on the interface at all (it just uses the IPv6 link-local address), so you don’t need to figure out a DHCP server or some such. And, Yggdrasil will tend to perform routing along the contours of the RF path. So you could have a laptop in the middle of a long span relaying communications between people farther out on either side, because it can see both. Or even a chain of such things.
Yggdrasil: Security and Privacy
Yggdrasil’s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep unauthorized traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously.
I’ll discuss firewalling more at the end of this article. Basically, you’ll almost certainly want to do this if you participate in the public mesh, because doing so is akin to having a globally-routable public IP address direct to your device.
If you want to restrict who can talk to your mesh, you just disable the broadcast feature on all your nodes (an empty MulticastInterfaces section in the config), and avoid telling any of your nodes to connect to a public peer. You can set a list of authorized public keys that can connect to your nodes’ listening interfaces, which you’ll probably want to do. You will probably want to either open up some inbound ports (if you can) or set up a node with a known clearnet IP somewhere like a $5/mo VPS to help with NAT traversal (again, setting AllowedPublicKeys as appropriate). Yggdrasil doesn’t allow filtering multicast clients by public key, only by network interface, which is why we disable broadcast discovery. You can easily enough teach Yggdrasil about static internal LAN IPs of your nodes and have things work that way. (Or, set up an internal “gateway” node or two that the clients just connect to when they’re local.) But fundamentally, you need to put a bit more thought into this with Yggdrasil than with the other tools here, which only operate as closed meshes.
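Concretely, a private-mesh node’s config might contain something like this (a sketch only; the key names are as I recall from Yggdrasil 0.4’s generated HJSON config, and the peer URI and public key are placeholders):

{
  # Connect outbound only to our own broker on a VPS; no public peers.
  Peers: ["tls://vps.example.com:443"]

  # (On the broker itself, you would instead listen for incoming peerings:)
  # Listen: ["tls://0.0.0.0:443"]

  # Accept incoming peerings only from nodes presenting these public keys.
  AllowedPublicKeys: [
    "0000000000000000000000000000000000000000000000000000000000000000"
  ]

  # Disable LAN broadcast discovery so strangers on the same network can't peer.
  MulticastInterfaces: []
}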
Compared to some of the other tools here, Yggdrasil is better about information leakage; nodes only know details, such as clearnet IPs, of directly-connected peers. You can obtain the list of directly-connected peers of any known node in the mesh – but that list is the public keys of the directly-connected peers, not the clearnet IPs.
Some of the other tools contain a limited integrated firewall of sorts (with limited ACLs and such). Yggdrasil does not, but is fully compatible with on-host firewalls. I recommend these anyway even with many other tools.
Yggdrasil: Connectivity and NAT traversal
Compared to the other tools, Yggdrasil is an interesting mix. It provides a fully functional mesh and facilitates connectivity in situations in which no other tool can. Yet its NAT traversal, while it exists and does work, results in using a broker under some of the more challenging CGNAT situations more often than some of the other tools, which can impede performance.
Yggdrasil’s underlying protocol is TCP-based. Before you run away screaming that it must be slow and unreliable like OpenVPN over TCP – it’s not, and it is even surprisingly good around bufferbloat. I’ve found its performance to be on par with the other tools here, and it works as well as I’d expect even on flaky 4G links.
Overall, the NAT traversal story is mixed. On the one hand, you can run a node that listens on port 443 – and Yggdrasil can even make it speak TLS (even though that’s unnecessary from a security standpoint), so you can likely get out of most restrictive firewalls you will ever encounter. If you join the public mesh, know that plenty of public peers do listen on port 443 (and other well-known ports like 53, plus random high-numbered ones).
If you connect your system to multiple public peers, there is a chance – though a very small one – that some public transit traffic might be routed via it. In practice, public peers hopefully are already peered with each other, preventing this from happening (you can verify this with yggdrasilctl debug_remotegetpeers key=ABC...). I have never experienced a problem with this. Also, since latency is a factor in routing for Yggdrasil, it is highly unlikely that random connections we use are going to be competitive with datacenter peers.
Yggdrasil: Sharing with friends
If you’re open to participating in the public mesh, this is one of the easiest things of all. Have your friend install Yggdrasil, point them to a public peer, give them your Yggdrasil IP, and that’s it. (Well, presumably you also open up your firewall – you did follow my advice to set one up, right?)
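As a rough sketch of what that looks like on a Debian-style system (the package name and config path are assumptions that vary by packaging; pick a real peer URI from the public peers list):
sudo apt install yggdrasil
# edit /etc/yggdrasil.conf and add one public peer URI to the Peers: [] list
sudo systemctl restart yggdrasil
yggdrasilctl getSelf    # shows this node's 200::/7 address; that's what you exchange with your friend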
If your friend is visiting at your location, they can just hop on your wifi, install Yggdrasil, and it will automatically discover a route to you. Yggdrasil even has a zero-config mode for ephemeral nodes such as certain Docker containers.
Yggdrasil doesn’t directly support publishing to the clearnet, but it is certainly possible to proxy (or even NAT) to/from the clearnet, and people do.
Yggdrasil: DNS
There is no particular extra DNS in Yggdrasil. You can, of course, run a DNS server within Yggdrasil, just as you can anywhere else. Personally I just add relevant hosts to /etc/hosts and leave it at that, but it’s up to you.
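For example (the addresses below are made up; substitute each node’s real address, which Yggdrasil assigns from 200::/7):
# /etc/hosts
200:6a32:4f1b:91c2:8d0e:aa31:77f2:c1d   nas.ygg
200:1b2c:3d4e:5f60:7182:93a4:b5c6:d7e8  laptop.ygg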
Yggdrasil: Source code, pricing, and portability
Yggdrasil is fully open source (LGPLv3 plus additional permissions in an exception) and highly portable. It is written in Go, and has prebuilt binaries for all major platforms (including a Debian package which I made).
There is no charge for anything with Yggdrasil. Listed public peers are free and run by volunteers. You can run your own peers if you like; they can be public and unlisted, public and listed (just submit a PR to get it listed), or private (accepting connections only from certain nodes’ keys). A “peer” in this case is just a node with a known clearnet IP address.
Yggdrasil encourages use in other projects. For instance, NNCP integrates a Yggdrasil node for easy communication with other NNCP nodes.
Yggdrasil conclusions
Yggdrasil is tops in reliability (having no single point of failure) and flexibility. It will maintain opportunistic connections between peers even if the Internet is down. The unique added feature of being able to be part of a global mesh is a nice one. The tradeoffs include being more prone to need to use a broker in restrictive CGNAT environments. Some other tools have clients that override the OS DNS resolver to also provide resolution of hostnames of member nodes; Yggdrasil doesn’t, though you can certainly run your own DNS infrastructure over Yggdrasil (or, for that matter, let public DNS servers provide Yggdrasil answers if you wish).
There is also a need to pay more attention to firewalling or maintaining separation from the public mesh. However, as I explain below, many other options have potential impacts if the control plane, or your account for it, is compromised, meaning you ought to firewall those, too. Still, it may be a more immediate concern with Yggdrasil.
Although Yggdrasil is listed as experimental, I have been using it for over a year and have found it to be rock-solid. They did change how mesh IPs were calculated when moving from 0.3 to 0.4, causing a global renumbering, so just be aware that this is a possibility while it is experimental.
tinc
tinc is the oldest tool on this list; version 1.0 came out in 2003! You can think of tinc as something akin to “an older Yggdrasil without the public option.”
I will be discussing tinc 1.0.36, the latest stable version, which came out in 2019. The development branch, 1.1, has been going since 2011 and had its latest release in 2021. The last commit to the Github repo was in June 2022.
Tinc is the only tool here to support both tun and tap style interfaces. I go into the difference more in the Zerotier review below. Tinc actually provides a better tap implementation than Zerotier, with various sane options for broadcasts, but I still think the call for an Ethernet, as opposed to IP, VPN is small.
To configure tinc, you generate a per-host configuration and then distribute it to every tinc node. It contains a host’s public key. Therefore, adding a host to the mesh means distributing its key everywhere; de-authorizing it means removing its key everywhere. This makes it rather unwieldy.
tinc can do LAN broadcast discovery and mesh routing, but generally speaking you must manually teach it where to connect initially. Somewhat confusingly, the examples all mention listing a public address for a node. This doesn’t make sense for a laptop, and I suspect you’d just omit it. I think that address is used for something akin to a Yggdrasil peer with a clearnet IP.
Unlike all of the other tools described here, tinc has no tool to inspect the running state of the mesh.
Some of the properties of tinc made it clear I was unlikely to adopt it, so this review wasn’t as thorough as that of Yggdrasil.
tinc: Security and Privacy
As mentioned above, every host in the tinc mesh is authenticated based on its public key. However, to be more precise, this key is validated only at the point it connects to its next hop peer. (To be sure, this is also the same as how the list of allowed pubkeys works in Yggdrasil.) Since IPs in tinc are not derived from their key, and any host can assign itself whatever mesh IP it likes, this implies that a compromised host could impersonate another.
It is unclear whether packets are end-to-end encrypted when using a tinc node as a router. The fact that they can be routed at the kernel level by the tun interface implies that they may not be.
tinc: Connectivity and NAT traversal
I was unable to find much information about NAT traversal in tinc, other than that it does support it. tinc can run over UDP or TCP and auto-detects which to use, preferring UDP.
tinc: Sharing with friends
tinc has no special support for this, and the difficulty of configuration makes it unlikely you’d do this with tinc.
tinc: Source code, pricing, and portability
tinc is fully open source (GPLv2). It is written in C and generally portable. It supports some very old operating systems. Mobile support is iffy.
tinc does not seem to be very actively maintained.
tinc conclusions
I haven’t called out performance in my other reviews (see the section at the end of this post), but tinc’s is so poor that it managed only about 300Mbps on my 2.5Gbps network. That’s 1/3 the speed of Yggdrasil or Tailscale. Combine that with the unwieldiness of adding hosts and some uncertainties in security, and I’m not going to be using tinc.
Automatic Peer-to-Peer Mesh VPNs with centralized control
These tend to be the options that are frequently discussed. Let’s talk about the options.
Tailscale
Tailscale is a popular choice in this type of VPN. To use Tailscale, you first sign up on tailscale.com. Then, you install the tailscale client on each machine. On first run, it prints a URL for you to click on to authorize the client to your mesh (“tailnet”). Tailscale assigns a mesh IP to each system. The Tailscale client lets the Tailscale control plane gather IP information about each node, including all detectable public and private clearnet IPs.
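The first-run flow looks roughly like this on Linux (a sketch; these subcommands exist in current clients, though output details vary by version):
sudo tailscale up    # prints a login URL; open it to authorize this node to your tailnet
tailscale ip -4      # show the mesh IP the control plane assigned to this node
tailscale status     # list peers and whether they're reached directly or via a DERP relay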
When you attempt to contact a node via Tailscale, the client will fetch the known contact information from the control plane and attempt to establish a link. If it can contact over the local LAN, it will (it doesn’t have broadcast autodetection like Yggdrasil; the information must come from the control plane). Otherwise, it will try various NAT traversal options. If all else fails, it will use a broker to relay traffic; Tailscale calls a broker a DERP relay server. Unlike Yggdrasil, a Tailscale node never relays traffic for another; all connections are either direct P2P or via a broker.
Tailscale, like several others, is based around Wireguard; though wireguard-go rather than the in-kernel Wireguard.
Tailscale has a number of somewhat unique features in this space:
- Funnel, which lets you expose ports on your system to the public Internet via the VPN.
- Exit nodes, which automate the process of routing your public Internet traffic over some other node in the network. This is possible with every tool mentioned here, but Tailscale makes switching it on or off a couple of quick commands away.
- Node sharing, which lets you share a subset of your network with guests
- A fantastic set of documentation, easily the best of the bunch.
Funnel, in particular, is interesting. With a couple of “tailscale serve”-style commands, you can expose a directory tree (or a development webserver) to the world. Tailscale gives you a public hostname, obtains a cert for it, and proxies inbound traffic to you. This is subject to some unspecified bandwidth limits, and you can only choose from three public ports, so it’s not really a production solution – but as a quick and easy way to demonstrate something cool to a friend, it’s a neat feature.
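For illustration, here is a hedged sketch of the exit-node workflow. The --advertise-exit-node and --exit-node flags are documented; Funnel itself is driven by the tailscale serve and tailscale funnel subcommands, whose argument syntax has changed across releases, so check tailscale serve --help on your version rather than trusting any example:
# On the node that should carry Internet traffic for others:
sudo tailscale up --advertise-exit-node
# (then approve the exit node in the admin console)
# On a client that should route its Internet traffic through it:
sudo tailscale up --exit-node=my-exit-host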
Tailscale: Security and Privacy
With Tailscale, as with the other tools in this category, one of the main threats to consider is the control plane. What are the consequences of a compromise of Tailscale’s control plane, or of the credentials you use to access it?
Let’s begin with the credentials used to access it. Tailscale operates no identity system itself, instead relying on third parties. For individuals, this means Google, Github, or Microsoft accounts; Okta and other SAML and similar identity providers are also supported, but this runs into complexity and expense that most individuals aren’t willing to take on. Unfortunately, all three of those types of accounts often have saved auth tokens in a browser. Personally I would rather have a separate, very secure, login.
If a person does compromise your account or the Tailscale servers themselves, they can’t directly eavesdrop on your traffic because it is end-to-end encrypted. However, assuming an attacker obtains access to your account, they could:
- Tamper with your Tailscale ACLs, permitting new actions
- Add new nodes to the network
- Forcibly remove nodes from the network
- Enable or disable optional features
Of note is that they cannot just commandeer an existing IP. I would say the riskiest possibility here is that they could add new nodes to the mesh. Because they could also tamper with your ACLs, they could then proceed to attempt to access all your internal services. They could even turn on service collection and have Tailscale tell them what and where all the services are.
Therefore, as with other tools, I recommend a local firewall on each machine with Tailscale. More on that below.
Tailscale has a new alpha feature called tailnet lock which helps with this problem. It requires existing nodes in the mesh to sign a request for a new node to join. Although this doesn’t address ACL tampering and some of the other things, it does represent real help with the most significant concern. However, tailnet lock is in alpha, only available on the Enterprise plan, and has a waitlist, so I have been unable to test it.
Any Tailscale node can request the IP addresses belonging to any other Tailscale node. The Tailscale control plane captures, and exposes to you, this information about every node in your network: the OS hostname, IP addresses and port numbers, operating system, creation date, last seen timestamp, and NAT traversal parameters. You can optionally enable service data capture as well, which sends data about open ports on each node to the control plane.
Tailscale likes to highlight their key expiry and rotation feature. By default, all keys expire after 180 days, and traffic to and from the expired node is interrupted until the key is renewed (basically, you re-login with your provider and do a renew operation). Unfortunately, the only mention I can see of warning of impending expiration is in the Windows client, and even there you need to edit a registry key to get the warning more than the default 24 hours in advance. In short, it seems likely to cut off communications when it’s most important. You can disable key expiry on a per-node basis in the admin console web interface, and I mostly do, due to not wanting to lose connectivity at an inopportune time.
Tailscale: Connectivity and NAT traversal
When thinking about reliability, the primary consideration here is being able to reach the Tailscale control plane. While it is possible in limited circumstances to reach nodes without the Tailscale control plane, it is “a fairly brittle setup” and notably will not survive a client restart. So if you use Tailscale to reach other nodes on your LAN, that won’t work unless your Internet is up and the control plane is reachable.
Assuming your Internet is up and Tailscale’s infrastructure is up, there is little to be concerned with. Your own comfort level with cloud providers and your Internet should guide you here.
Tailscale wrote a fantastic article about NAT traversal and they, predictably, do very well with it. Tailscale prefers UDP but falls back to TCP if needed. Broker (DERP) servers step in as a last resort, and Tailscale clients automatically select the best ones. I’m not aware of anything that is more successful with NAT traversal than Tailscale. This maximizes the situations in which a direct P2P connection can be used without a broker.
I have found Tailscale to be a bit slow to notice changes in network topology compared to Yggdrasil, and it sometimes needs a kick in the form of restarting the client process to re-establish communications after a network change. However, it’s possible (maybe even probable) that if I’d waited a bit longer, it would have sorted this all out.
Tailscale: Sharing with friends
I touched on the funnel feature earlier. The sharing feature lets you give an invite to an outsider. By default, a person accepting a share can make only outgoing connections to the network they’re invited to, and cannot receive incoming connections from that network – this makes sense. When sharing an exit node, you get a checkbox that lets you share access to the exit node as well. Of course, the person accepting the share needs to install the Tailscale client. The combination of funnel and sharing makes Tailscale the best for ad-hoc sharing.
Tailscale: DNS
Tailscale’s DNS is called MagicDNS. It runs as a layer atop your standard DNS – taking over /etc/resolv.conf on Linux – and provides resolution of mesh hostnames and some other features. The concept is pretty slick.
It is also a bit flaky on Linux; dueling programs want to write to /etc/resolv.conf. I can’t really say this is entirely Tailscale’s fault; they document the problem and some workarounds.
I would love to be able to add custom records to this service; for instance, to override the public IP for a service to use the in-mesh IP. Unfortunately, that’s not yet possible. However, MagicDNS can query existing nameservers for certain domains in a split DNS setup.
Tailscale: Source code, pricing, and portability
Tailscale is almost fully open source and the client is highly portable. The client is open source (BSD 3-clause) on open source platforms, and closed source on closed source platforms. The DERP servers are open source. The coordination server is closed source, although there is an open source coordination server called Headscale (also BSD 3-clause) made available with Tailscale’s blessing and informal support. It supports most, but not all, features in the Tailscale coordination server.
Tailscale’s pricing (which does not apply when using Headscale) provides a free plan for 1 user with up to 20 devices. A Personal Pro plan expands that to 100 devices for $48 per year - not a bad deal at $4/mo. A “Community on Github” plan also exists, and then there are more business-oriented plans as well. See the pricing page for details.
As a small note, I appreciated Tailscale’s install script. It properly added Tailscale’s apt key in a way that it can only be used to authenticate the Tailscale repo, rather than as a systemwide authenticator. This is a nice touch and speaks well of their developers.
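For reference, the installer is the documented one-liner from Tailscale’s site; it detects the distro and sets up the repo with that scoped apt key:
curl -fsSL https://tailscale.com/install.sh | sh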
Tailscale conclusions
Tailscale is tops in sharing and has a broad feature set and excellent documentation. Like other solutions with a centralized control plane, device communications can stop working if the control plane is unreachable, and the threat model of the control plane should be carefully considered.
Zerotier
Zerotier is a close competitor to Tailscale, and is similar to it in a lot of ways. So rather than duplicate all of the Tailscale information here, I’m mainly going to describe how it differs from Tailscale.
The primary difference between the two is that Zerotier emulates an Ethernet network via a Linux tap interface, while Tailscale emulates a TCP/IP network via a Linux tun interface.
However, Zerotier has a number of things that make it a somewhat imperfect Ethernet emulator. For one, it has a problem with broadcast amplification: the machine sending a broadcast transmits a copy to every other node that should receive it (up to a set maximum). I wouldn’t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECnet across Zerotier, I’m not really convinced there’s much call for that these days, and Zerotier is clearly IP-focused, since it allocates IP addresses and such anyhow. Zerotier provides special support for emulated ARP (IPv4) and NDP (IPv6). While you could theoretically run Zerotier as a bridge, doing so abandons the zero-trust principle, and Tailscale supports subnet routers, which provide much of the same feature set anyhow.
A somewhat obscure feature, but possibly useful, is Zerotier’s built-in support for multipath WAN for the public interface. This actually lets you do a somewhat basic kind of channel bonding for WAN.
Zerotier: Security and Privacy
The picture here is similar to Tailscale, with the difference that you can create a Zerotier-local account rather than relying on cloud authentication. I was unable to find as much detail about Zerotier as I could about Tailscale - notably I couldn’t find anything about how “sticky” an IP address is. However, the configuration screen lets me delete a node and assign additional arbitrary IPs within a subnet to other nodes, so I think the assumption here is that if your Zerotier account (or the Zerotier control plane) is compromised, an attacker could remove a legit device, add a malicious one, and assign the previous IP of the legit device to the malicious one. I’m not sure how to mitigate against that risk, as firewalling specific IPs is ineffective if an attacker can simply take them over. Zerotier also lacks anything akin to Tailnet Lock.
For this reason, I didn’t proceed much further in my Zerotier evaluation.
Zerotier: Connectivity and NAT traversal
Like Tailscale, Zerotier has NAT traversal with STUN. However, it looks like it’s more limited than Tailscale’s, and in particular is incompatible with double NAT that is often seen these days. Zerotier operates brokers (“root servers”) that can do relaying, including TCP relaying. So you should be able to connect even from hostile networks, but you are less likely to form a P2P connection than with Tailscale.
Zerotier: Sharing with friends
I was unable to find any special features relating to this in the Zerotier documentation. Therefore, it would be at the same level as Yggdrasil: possible, maybe even not too difficult, but without any specific help.
Zerotier: DNS
Unlike Tailscale, Zerotier does not support automatically adding DNS entries for your hosts. Therefore, your options are approximately the same as Yggdrasil, though with the added option of pushing configuration pointing to your own non-Zerotier DNS servers to the client.
Zerotier: Source code, pricing, and portability
The client ZeroTier One is available on Github under a custom “business source license” which prevents you from using it in certain settings. This license would preclude it being included in Debian. Their library, libzt, is available under the same license. The pricing page mentions a community edition for self hosting, but the documentation is sparse and it was difficult to understand what its feature set really is.
The free plan lets you have 1 user with up to 25 devices. Paid plans are also available.
Zerotier conclusions
Frankly I don’t see much reason to use Zerotier. The “virtual Ethernet” model seems to be a weird hybrid that doesn’t bring much value. I’m concerned about the implications of a compromise of a user account or the control plane, and it lacks a lot of Tailscale features (MagicDNS and sharing). The only thing it may offer in particular is multipath WAN, but that’s esoteric enough – and also solvable at other layers – that it doesn’t seem all that compelling to me. Add to that the strange license and, to me anyhow, I don’t see much reason to bother with it.
Netmaker
Netmaker is one of the projects that is making noise these days. Netmaker is the only one here that is a wrapper around in-kernel Wireguard, which can make a performance difference when talking to peers on a 1Gbps or faster link. Also, unlike other tools, it has an ingress gateway feature that lets people that don’t have the Netmaker client, but do have Wireguard, participate in the VPN. I believe I also saw a reference somewhere to nodes as routers as with Yggdrasil, but I’m failing to dig it up now.
The project is in a bit of an early state; you can sign up for an “upcoming closed beta” with a SaaS host, but really you are generally pointed to self-hosting using the code in the github repo. There are community and enterprise editions, but it’s not clear how to actually choose. The server has a bunch of components: binary, CoreDNS, database, and web server. It also requires elevated privileges on the host, in addition to a container engine. Contrast that to the single binary that some others provide.
It looks like releases are frequent, but sometimes break things, and have a somewhat more laborious upgrade process than most.
I don’t want to spend a lot of time managing my mesh. Because of the server’s heavy requirements, the labor-intensive upgrades, and its takeover of iptables and such on the host, I didn’t proceed with a more in-depth evaluation of Netmaker. It has a lot of promise, but for me, it doesn’t seem to be in a state that will meet my needs yet.
Nebula
Nebula is an interesting mesh project that originated within Slack, seems to still be primarily sponsored by Slack, but is also being developed by Defined Networking (though their product looks early right now). Unlike the other tools in this section, Nebula doesn’t have a web interface at all. Defined Networking looks likely to provide something of a SaaS service, but for now, you will need to run a broker (“lighthouse”) yourself; perhaps on a $5/mo VPS.
Due to the poor firewall traversal properties, I didn’t do a full evaluation of Nebula, but it still has a very interesting design.
Nebula: Security and Privacy
Since Nebula lacks a traditional control plane, the root of trust in Nebula is a CA (certificate authority). The documentation gives this example of setting it up:
./nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
./nebula-cert sign -name "laptop" -ip "192.168.100.2/24" -groups "laptop,home,ssh"
./nebula-cert sign -name "server1" -ip "192.168.100.9/24" -groups "servers"
./nebula-cert sign -name "host3" -ip "192.168.100.10/24"
So the cert contains your IP, hostname, and group allocation. Each host in the mesh gets your CA certificate, plus its own cert and key generated by one of the commands above.
This leads to a really nice security model. Your CA is the gatekeeper to what is trusted in your mesh. You can even have it airgapped or something to make it exceptionally difficult to breach the perimeter.
Nebula contains an integrated firewall. Because the ability to keep out unwanted nodes is so strong, I would say this may be the one mesh VPN you might consider using without bothering with an additional on-host firewall.
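A hedged fragment of Nebula’s config.yml showing what that integrated firewall looks like; the rule values are illustrative, and the group name corresponds to a group baked into the certs above:
firewall:
  outbound:
    - port: any
      proto: any
      host: any        # allow all outbound
  inbound:
    - port: 22
      proto: tcp
      group: laptop    # only certs carrying the "laptop" group may reach ssh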
You can define static mappings from a Nebula mesh IP to a clearnet IP. I haven’t found information on this, but theoretically if NAT traversal isn’t required, these static mappings may allow Nebula nodes to reach each other even if Internet is down. I don’t know if this is truly the case, however.
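Those static mappings live in the static_host_map section of config.yml; a sketch, with placeholder addresses:
static_host_map:
  # Nebula mesh IP -> routable underlay address(es) for that node
  "192.168.100.1": ["203.0.113.10:4242"]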
Nebula: Connectivity and NAT traversal
This is a weak point of Nebula. Nebula sends all traffic over a single UDP port; there is no provision for using TCP. This is an issue at certain hotel and other public networks which open only TCP egress ports 80 and 443.
I couldn’t find a lot of detail on what Nebula’s NAT traversal is capable of, but according to a certain Github issue, this has been a sore spot for years and isn’t as capable as Tailscale.
You can designate nodes in Nebula as brokers (relays). The concept is the same as Yggdrasil, but it’s less versatile. You have to manually designate what relay to use. It’s unclear to me what happens if different nodes designate different relays. Keep in mind that this always happens over a UDP port.
Nebula: Sharing with friends
There is no particular support here.
Nebula: DNS
Nebula has experimental DNS support. In contrast with Tailscale, which has an internal DNS server on every node, Nebula only runs a DNS server on a lighthouse. This means that it can’t forward requests to a DNS server that’s upstream for your laptop’s particular current location. Actually, Nebula’s DNS server doesn’t forward at all. It also doesn’t resolve its own name.
The Nebula documentation makes reference to using multiple lighthouses, which you may want to do for DNS redundancy or performance, but it’s unclear to me if this would make each lighthouse form a complete picture of the network.
Nebula: Source code, pricing, and portability
Nebula is fully open source (MIT). It consists of a single Go binary and configuration. It is fairly portable.
Nebula conclusions
I am attracted to Nebula’s unique security model. I would probably be more seriously considering it if not for the lack of support for TCP and poor general NAT traversal properties. Its datacenter connectivity heritage does show through.
Roll your own and hybrid
Here is a grab bag of ideas:
Running Yggdrasil over Tailscale
One possibility would be to use Tailscale for its superior NAT traversal, then allow Yggdrasil to run over it. (You will need a firewall to prevent Tailscale from trying to run over Yggdrasil at the same time!) This creates a closed network with all the benefits of Yggdrasil, yet getting the NAT traversal from Tailscale.
Drawbacks might be the overhead of the double encryption and double encapsulation. A good Yggdrasil peer may wind up being faster than this anyhow.
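A sketch of what the Yggdrasil side might look like, peering only across the tailnet; the 100.x addresses and port are placeholders standing in for the nodes’ Tailscale IPs, and field names should be checked against your Yggdrasil version:
{
  Peers: [
    "tcp://100.101.102.103:12345"   # the other node's Tailscale IP
  ]
  Listen: [
    "tcp://100.101.102.104:12345"   # bind only to this node's own Tailscale IP
  ]
  MulticastInterfaces: []           # no LAN discovery; everything rides the tailnet
}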
Public VPN provider for NAT traversal
A public VPN provider such as Mullvad will often offer incoming port forwarding and nodes in many cities. This could be an attractive way to solve a bunch of NAT traversal problems: just use one of those services to get you an incoming port, and run whatever you like over that.
Be aware that a number of public VPN clients have a “kill switch” to prevent any traffic from egressing without using the VPN; see, for instance, Mullvad’s. You’ll need to disable this if you are running a mesh atop it.
Other
Combining with local firewalls
For most of these tools, I recommend using a local firewall in conjunction with them. I have been using firehol and find it to be quite nice. This means you don’t have to trust the mesh, the control plane, or whatever. The catch is that you do need your mesh VPN to provide a strong association between IP address and node. Most, but not all, do.
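As an illustration of that approach, here is a hedged firehol.conf fragment that default-denies the mesh interface and allows ssh only from two known mesh IPs; the interface name and addresses are placeholders, so adapt them to whichever tool you run:
version 6

interface tailscale0 mesh
    policy drop
    protection strong
    server ssh accept src "100.101.102.103 100.101.102.104"
    client all accept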
Performance
I tested some of these for performance using iperf3 on a 2.5Gbps LAN. Here are the results. All speeds are in Mbps.
| Tool | iperf3 (default) | iperf3 -P 10 | iperf3 -R |
|---|---|---|---|
| Direct (no VPN) | 2406 | 2406 | 2764 |
| Wireguard (kernel) | 1515 | 1566 | 2027 |
| Yggdrasil | 892 | 1126 | 1105 |
| Tailscale | 950 | 1034 | 1085 |
| Tinc | 296 | 300 | 277 |
You can see that Wireguard was significantly faster than the other options. Tailscale and Yggdrasil were roughly comparable, and Tinc was terrible.
IP collisions
When you are communicating over a network such as these, you need to trust that the IP address you are communicating with belongs to the system you think it does. This protects against two malicious actor scenarios:
- Someone compromises one machine on your mesh and reconfigures it to impersonate a more important one
- Someone connects an unauthorized system to the mesh, taking over a trusted IP, and uses the privileges of the trusted IP to access resources
To summarize the state of play as highlighted in the reviews above:
- Yggdrasil derives IPv6 addresses from a public key
- tinc allows any node to set any IP
- Tailscale IPs aren’t user-assignable, but the assignment algorithm is unknown
- Zerotier allows any IP to be allocated to any node at the control plane
- I don’t know what Netmaker does
- Nebula IPs are baked into the cert and signed by the CA, but I haven’t verified the enforcement algorithm
So this discussion really only applies to Yggdrasil and Tailscale. tinc and Zerotier lack detailed IP security, while Nebula expects IP allocations to be handled outside of the tool and baked into the certs (therefore enforcing rigidity at that level).
So the question for Yggdrasil and Tailscale is: how easy is it to commandeer a trusted IP?
Yggdrasil has a brief discussion of this. In short, Yggdrasil offers you both a dedicated IP and a rarely-used /64 prefix which you can delegate to other machines on your LAN. Obviously by taking the dedicated IP, a lot more bits are available for the hash of the node’s public key, making “collisions technically impractical, if not outright impossible.” However, if you use the /64 prefix, a collision may be more possible. Yggdrasil’s hashing algorithm includes some optimizations to make this more difficult. Yggdrasil includes a genkeys tool that uses more CPU cycles to generate keys that are maximally difficult to collide with.
Tailscale doesn’t document their IP assignment algorithm, but I think it is safe to say that the larger subnet you use, the better. If you try to use a /24 for your mesh, it is certainly conceivable that an attacker could remove your trusted node, then just manually add the 240 or so machines it would take to get that IP reassigned. It might be a good idea to use a purely IPv6 mesh with Tailscale to minimize this problem as well.
So, I think the risk is low in the default configurations of both Yggdrasil and Tailscale (certainly lower than with tinc or Zerotier). You can drive the risk even lower with both.
Final thoughts
For my own purposes, I suspect I will remain with Yggdrasil in some fashion. Maybe I will just take the small performance hit that using a relay node implies. Or perhaps I will get clever and use an incoming VPN port forward or go over Tailscale.
Tailscale was the other option that seemed most interesting. However, living in a region with Internet that goes down more often than I’d like, I would like to just be able to send as much traffic over a mesh as possible, trusting that if the LAN is up, the mesh is up.
I have one thing that really benefits from performance in excess of Yggdrasil or Tailscale: NFS. That’s between two machines that never leave my LAN, so I will probably just set up a direct Wireguard link between them. Heck of a lot easier than trying to do Kerberos!
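For that kind of fixed two-host link, a plain wg-quick config is about as simple as it gets; a sketch with placeholder keys, addresses, and port:
# /etc/wireguard/wg0.conf on the NFS server; bring up with `wg-quick up wg0`
[Interface]
PrivateKey = <server private key>
Address = 10.44.0.1/24
ListenPort = 51820

[Peer]
PublicKey = <client public key>
AllowedIPs = 10.44.0.2/32
The client’s config mirrors this, with an Endpoint line pointing at the server’s LAN address.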
Finally, I wrote this intending to be useful. I dealt with a lot of complexity and under-documentation, so it’s possible I got something wrong somewhere. Please let me know if you find any errors.
This blog post is a copy of a page on my website. That page may be periodically updated.
The PC & Internet Revolution in Rural America
Inspired by several others (such as Alex Schroeder’s post and Szczeżuja’s prompt), as well as a desire to get this down for my kids, I figure it’s time to write a bit about living through the PC and Internet revolution where I did: outside a tiny town in rural Kansas. And, as I’ve been back in that same area for the past 15 years, I reflect some on the challenges that continue to play out.
Although the stories from the others were primarily about getting online, I want to start by setting some background. Those of you that didn’t grow up in the same era as I did probably never realized that a typical business PC setup might cost $10,000 in today’s dollars, for instance. So let me start with the background.
Nothing was easy
This story begins in the 1980s. Somewhere around my Kindergarten year of school, around 1985, my parents bought a TRS-80 Color Computer 2 (aka CoCo II). It had 64K of RAM and used a TV for display and sound.
That price got you just the computer: no disk drive, no joysticks (which a number of games required). With no persistent storage, whenever the system powered down, or it hung and you had to power cycle it – a frequent event – you’d lose whatever you were doing and would have to re-enter the program, literally by typing it in.
The floppy drive for the CoCo II cost more than the computer, and it was quite common for people to buy the computer first and then the floppy drive later when they’d saved up the money for that.
I particularly want to mention that computers then didn’t come with a modem. That would be like buying a laptop or a tablet without wifi today. A modem, which I’ll talk about in a bit, was another expensive accessory. To cobble together a system in the 80s that was capable of talking to others – with persistent storage (floppy or hard drive), screen, keyboard, and modem – would be quite expensive. Adjusted for inflation, if you’re talking a PC-style device (a clone of the IBM PC that ran DOS), this would easily be more expensive than the Macbook Pros of today.
Few people back in the 80s had a computer at home. And the portion of those that had even the capability to get online in a meaningful way was even smaller.
Eventually my parents bought a PC clone with 640K RAM and dual floppy drives. This was primarily used for my mom’s work, but I did my best to take it over whenever possible. It ran DOS and, despite its monochrome screen, was generally a more capable machine than the CoCo II. For instance, it supported lowercase. (I’m not even kidding; the CoCo II pretty much didn’t.) A while later, they purchased a 32MB hard drive for it – what luxury!
Just getting a machine to work wasn’t easy. Say you’d bought a PC, and then bought a hard drive, and a modem. You didn’t just plug in the hard drive and have it work. You would have to fight it every step of the way. The BIOS and DOS partition tables of the day used a cylinder/head/sector method of addressing the drive, and various parts of those addresses had too few bits to work with the “big” drives of the day above 20MB. So you would have to lie to the BIOS and fdisk in various ways, and sort of work out how to do it for each drive. For each peripheral – serial port, sound card (in later years), etc. – you’d have to set jumpers for DMA and IRQs, hoping not to conflict with anything already in the system. Perhaps you can now start to see why USB and PCI were so welcomed.
Sharing and finding resources
Despite the two computers in our home, it wasn’t as if software written on one machine just ran on another. A lot of software for PC clones assumed a CGA color display. The monochrome HGC in our PC wasn’t particularly compatible. You could find a TSR program to emulate the CGA on the HGC, but it wasn’t particularly stable, and there’s only so much you can do when a program that assumes color runs on a monitor that can only show black, dark amber, or light amber.
So I’d periodically get to use other computers – most commonly at an office in the evening when it wasn’t being used.
There were some local computer clubs that my dad took me to periodically. Software was swapped back then; disks copied, shareware exchanged, and so forth. For me, at least, there was no “online” to download software from, and selling software over the Internet wasn’t a thing at all.
Three Different Worlds
There were sort of three different worlds of computing experience in the 80s:
- Home users. Initially using a wide variety of software from Apple, Commodore, Tandy/RadioShack, etc., but eventually coming to be mostly dominated by IBM PC clones
- Small and mid-sized business users. Some of them had larger minicomputers or small mainframes, but most that I had contact with by the early 90s were standardized on DOS-based PCs. More advanced ones had a network running Netware, most commonly. Networking hardware and software was generally too expensive for home users to use in the early days.
- Universities and large institutions. These are the places that had the mainframes, the earliest implementations of TCP/IP, the earliest users of UUCP, and so forth.
The differences between the home computing experience and the large institution experience were vast. Not only in terms of dollars – the large institution hardware could easily cost anywhere from tens of thousands to millions of dollars – but also in terms of sheer resources required (large rooms, enormous power circuits, support staff, etc). Nothing was in common between them; not operating systems, not software, not experience. I was never much aware of the third category until the differences started to collapse in the mid-90s, and even then I was only exposed to it once the collapse was well underway.
You might say to me, “Well, Google certainly isn’t running what I’m running at home!” And, yes of course, it’s different. But fundamentally, most large datacenters are running on x86_64 hardware, with Linux as the operating system, and a TCP/IP network. It’s a different scale, obviously, but at a fundamental level, the hardware and operating system stack are pretty similar to what you can readily run at home. Back in the 80s and 90s, this wasn’t the case. TCP/IP wasn’t even available for DOS or Windows until much later, and when it was, it was a clunky beast that was difficult to set up and use.
One of the things Kevin Driscoll highlights in his book The Modem World – see my short post about it – is that the history of the Internet we usually receive is focused on case 3: the large institutions. In reality, the Internet was and is literally a network of networks. Gateways to and from the Internet existed for all three kinds of users for years, and while TCP/IP ultimately won the battle of the internetworking protocol, the other two streams of users also shaped the Internet as we now know it. Like many, I had no access to the large institution networks, but as I’ve been reflecting on my experiences, I’ve found a new appreciation for the way that those of us who grew up with primarily home PCs also shaped the evolution of today’s online world.
An Era of Scarcity
I should take a moment to comment about the cost of software back then. A newspaper article from 1985 comments that WordPerfect, then the most powerful word processing program, sold for $495 (or $219 if you could score a mail order discount). That’s $1360/$600 in 2022 money. Other popular software, such as Lotus 1-2-3, was up there as well. If you were to buy a new PC clone in the mid to late 80s, it would often cost $2000 in 1980s dollars. Now add a printer – a low-end dot matrix for $300 or a laser for $1500 or even more. A modem: another $300. So the basic system would be $3600, or $9900 in 2022 dollars. If you wanted a nice printer, you’re now pushing well over $10,000 in 2022 dollars.
You start to see one barrier here, and also why things like shareware and piracy – if it was indeed even recognized as such – were common in those days.
So you can see that going from a home computer setup (TRS-80, Commodore C64, Apple ][, etc.) to a business-class PC setup was an order-of-magnitude increase in cost. From there to the high-end minis/mainframes was another order of magnitude (at least!) increase. Eventually there was price pressure on the higher end and things all got better, which is probably why the non-DOS PCs lasted until the early 90s.
Increasing Capabilities
My first exposure to computers in school was in the 4th grade, when I would have been about 9. There was a single Apple ][ machine in that room. I primarily remember playing Oregon Trail on it. The next year, the school added a computer lab. Remember, this is a small rural area, so each graduating class might have about 25 people in it; this lab was shared by everyone in the K-8 building. It was full of some flavor of IBM PS/2 machines running DOS and Netware. There was a dedicated computer teacher too, though I think she was a regular teacher that was given somewhat minimal training on computers. We were going to learn typing that year, but I did so well on the very first typing program that we soon worked out that I could do programming instead. I started going to school early – these machines were far more powerful than the XT at home – and worked on programming projects there.
Eventually my parents bought me a Gateway 486SX/25 with a VGA monitor and hard drive. Wow! This was a whole different world. It may have come with Windows 3.0 or 3.1 on it, but I mainly remember running OS/2 on that machine. More on that below.
Programming
That CoCo II came with a BASIC interpreter in ROM. It came with a large manual, which served as a BASIC tutorial as well. The BASIC interpreter was also the shell, so literally you could not use the computer without at least a bit of BASIC.
Once I had access to a DOS machine, it also had a BASIC interpreter: GW-BASIC. There was a fair bit of software written in BASIC at the time, but most of the more advanced software wasn’t. I wondered how these .EXE and .COM programs were written. I could find vague references to DEBUG.EXE, assemblers, and such. But it wasn’t until I got a copy of Turbo Pascal that I was able to do that sort of thing myself. Eventually I got Borland C++ and taught myself C as well. A few years later, I wanted to try writing GUI programs for Windows, and bought Watcom C++ – much cheaper than the competition, and it could target Windows, DOS (and I think even OS/2).
Notice that, aside from BASIC, none of this was free, and none of it was bundled. You couldn’t just download a C compiler, or Python interpreter, or whatnot back then. You had to pay for the ability to write any kind of serious code on the computer you already owned.
The Microsoft Domination
Microsoft came to dominate the PC landscape, and then even the computing landscape as a whole. IBM very quickly lost control over the hardware side of PCs as Compaq and others made clones, but Microsoft has managed – in varying degrees even to this day – to keep a stranglehold on the software, and especially the operating system, side. Yes, there was occasional talk of things like DR-DOS, but by and large the dominant platform came to be the PC, and if you had a PC, you ran DOS (and later Windows) from Microsoft.
For awhile, it looked like IBM was going to challenge Microsoft on the operating system front; they had OS/2, and when I switched to it sometime around the version 2.1 era in 1993, it was unquestionably more advanced technically than the consumer-grade Windows from Microsoft at the time. It had Internet support baked in, could run most DOS and Windows programs, and had introduced a replacement for the by-then terrible FAT filesystem: HPFS, in 1988. Microsoft wouldn’t introduce a better filesystem for its consumer operating systems until Windows XP in 2001, 13 years later. But more on that story later.
Free Software, Shareware, and Commercial Software
I’ve covered the high cost of software already. Obviously $500 software wasn’t going to sell in the home market. So what did we have?
Mainly, these things:
- Public domain software. It was free to use, and if implemented in BASIC, probably had source code with it too.
- Shareware
- Commercial software (some of it from small publishers was a lot cheaper than $500)
Let’s talk about shareware. The idea with shareware was that a company would release a useful program, sometimes limited. You were encouraged to “register”, or pay for, it if you liked it and used it. And, regardless of whether you registered it or not, you were told “please copy!” Sometimes shareware was fully functional, and registering it got you nothing more than printed manuals and an easy conscience (guilt trips for not registering weren’t necessarily very subtle). Sometimes unregistered shareware would have a “nag screen” – a delay of a few seconds while it told you to register. Sometimes it would be limited in some way; you’d get more features if you registered. With games, it was popular to have a trilogy, and release the first episode – inevitably ending with a cliffhanger – as shareware, with the subsequent episodes requiring registration. In any event, a lot of software people used in the 80s and 90s was shareware. Also pirated commercial software, though in the earlier days of computing, I think some people didn’t even know the difference.
Notice what’s missing: Free Software / FLOSS in the Richard Stallman sense of the word. Stallman lived in the big institution world – after all, he worked at MIT – and what he was doing with the Free Software Foundation and GNU project beginning in 1983 never really filtered into the DOS/Windows world at the time. I had no awareness of it even existing until into the 90s, when I first started getting some hints of it as a port of gcc became available for OS/2. The Internet was what really brought this home, but I’m getting ahead of myself.
I want to say again: FLOSS never really entered the DOS and Windows 3.x ecosystems. You’d see it make a few inroads here and there in later versions of Windows, and moreso now that Microsoft has been sort of forced to accept it, but still, reflect on its legacy. What is the software market like in Windows compared to Linux, even today?
Now it is, finally, time to talk about connectivity!
Getting On-Line
What does it even mean to get on line? Certainly not connecting to a wifi access point. The answer is, unsurprisingly, complex. But for everyone except the large institutional users, it begins with a telephone.
The telephone system
By the 80s, there was one communication network that already reached into nearly every home in America: the phone system. Virtually every household (note I don’t say every person) was uniquely identified by a 10-digit phone number. You could, at least in theory, call up virtually any other phone in the country and be connected in less than a minute.
But I’ve got to talk about cost. The way things worked in the USA, you paid a monthly fee for a phone line. Included in that monthly fee was unlimited “local” calling. What is a local call? That was an extremely complex question. Generally it meant, roughly, calling within your city. But of course, as you deal with things like suburbs and cities growing into each other (eg, the Dallas-Ft. Worth metroplex), things got complicated fast. But let’s just say for simplicity you could call others in your city.
What about calling people not in your city? That was “long distance”, and you paid – often hugely – by the minute for it. Long distance rates were difficult to figure out, but were generally most expensive during business hours and cheapest at night or on weekends. Prices eventually started to come down when competition was introduced for long distance carriers, but even then you often were stuck with a single carrier for long distance calls outside your city but within your state. Anyhow, let’s just leave it at this: local calls were virtually free, and long distance calls were extremely expensive.
Getting a modem
I remember getting a modem that ran at either 1200bps or 2400bps. Either way, quite slow; you could often read even plain text faster than the modem could display it. But what was a modem?
A modem hooked up to a computer with a serial cable, and to the phone system. By the time I got one, modems could automatically dial and answer. You would send a command like ATDT5551212 and it would dial 555-1212. Modems had speakers, because often things wouldn’t work right, and the telephone system was oriented around speech, so you could hear what was happening. You’d hear it wait for dial tone, then dial, then – hopefully – the remote end would ring, a modem there would answer, you’d hear the screeching of a handshake, and eventually your terminal would say CONNECT 2400. Now your computer was bridged to the other; anything going out your serial port was encoded as sound by your modem and decoded at the other end, and vice-versa.
But what, exactly, was “the other end?”
It might have been another person at their computer. Turn on local echo, and you could see what they typed. Maybe you’d send files to each other. But in my case, the answer was different: PC Magazine.
PC Magazine and CompuServe
Starting around 1986 (so I would have been about 6 years old), I got to read PC Magazine. My dad would bring home copies that were being discarded at his office for me to read, and I think eventually bought me a subscription directly. This was not just a standard magazine; it ran something like 350-400 pages an issue, and came out every other week. This thing was a monster. It had reviews of hardware and software, descriptions of upcoming technologies, and pages and pages of ads (which often had some informative value of their own). And they had sections on programming. Many issues would talk about BASIC or Pascal programming, and there’d be a utility in most issues. What do I mean by a “utility in most issues”? Did they include a floppy disk with software?
No, of course not. There was a literal program listing printed in the magazine. If you wanted the utility, you had to type it in. And a lot of them were written in assembler, so you had to have an assembler. An assembler, of course, was not free, and I didn’t have one. Or maybe they wrote it in Microsoft C, and I had Borland C, and (of course) they weren’t compatible. Sometimes they would list the program sort of in binary: line after line of a BASIC program, with lines like “64, 193, 253, 0, 53, 0, 87” that you would type in for hours, hopefully correctly. Running the BASIC program would, if you got it correct, emit a .COM file that you could then run. They did have a rudimentary checksum system built in, but it wasn’t even a CRC, so you’d never notice something like swapping two numbers until the program mysteriously hung.
Eventually they teamed up with CompuServe to offer a limited slice of CompuServe for the purpose of downloading PC Magazine utilities. This was called PC MagNet. I am foggy on the details, but I believe that for a time you could connect to the limited PC MagNet part of CompuServe “for free” (after the cost of the long-distance call, that is) rather than paying for CompuServe itself (because, OF COURSE, that also charged you by the minute). So in the early days, I would get special permission from my parents to place a long distance call, and after some nerve-wracking minutes in which we were aware every minute was racking up charges, I could navigate the menus, download what I wanted, and log off immediately.
I still, incidentally, mourn what PC Magazine became. As with computing generally, it followed the mass market. It lost its deep technical chops, cut its programming columns, stopped talking about things like how SCSI worked, and so forth. By the time it stopped printing in 2009, it was no longer a square-bound 400-page behemoth, but rather looked more like a copy of Newsweek, but with less depth.
Continuing with CompuServe
CompuServe was a much larger service than just PC MagNet. Eventually, our family got a subscription. It was still an expensive and scarce resource; I’d call it only after hours when the long-distance rates were cheapest. Everyone had a numerical username separated by commas; mine was 71510,1421. CompuServe had forums, and files. Eventually I would use TapCIS to queue up things I wanted to do offline, to minimize time spent on the phone.
CompuServe eventually added a gateway to the Internet. For the sum of somewhere around $1 a message, you could send or receive an email from someone with an Internet email address! I remember the thrill of one time, as a kid of probably 11 years, sending a message to one of the editors of PC Magazine and getting a kind, if brief, reply back!
But inevitably I had…
The Godzilla Phone Bill
Yes, one month I became lax in tracking my time online. I ran up my parents’ phone bill. I don’t remember how high, but I remember it was hundreds of dollars, a hefty sum at the time. As I watched Jason Scott’s BBS Documentary, I realized how common an experience this was. I think this was the end of CompuServe for me for awhile.
Toll-Free Numbers
I lived near a town with a population of 500. Not even IN town, but near town. The calling area included another town with a population of maybe 1500, so all told, there were maybe 2000 people total I could talk to with a local call – though far fewer numbers, because remember, telephones were allocated by the household. There were, as far as I know, zero modems that were a local call (aside from one that belonged to a friend I met around 1992). So basically everything was long-distance.
But there was a special feature of the telephone network: toll-free numbers. Normally when calling long-distance, you, the caller, paid the bill. But with a toll-free number, beginning with 1-800, the recipient paid the bill. These numbers almost inevitably belonged to corporations that wanted to make it easy for people to call. Sales and ordering lines, for instance. Some of these companies started to set up modems on toll-free numbers. There were few of these, but they existed, so of course I had to try them!
One of them was a company called PennyWise that sold office supplies. They had a toll-free line you could call with a modem to order stuff. Yes, online ordering before the web! I loved office supplies. And, because I lived far from a big city, if the local K-Mart didn’t have it, I probably couldn’t get it. Of course, the interface was entirely text, but you could search for products and place orders with the modem. I had loads of fun exploring the system, and actually ordered things from them – and probably actually saved money doing so. With the first order they shipped a monster full-color catalog. That thing must have been 500 pages, like the Sears catalogs of the day. Every item had a part number, which streamlined ordering through the modem.
Inbound FAXes
By the 90s, a number of modems became able to send and receive FAXes as well. For those that don’t know, a FAX machine was essentially a special modem. It would scan a page and digitally transmit it over the phone system, where it would – at least in the early days – be printed out in real time (because the machines didn’t have the memory to store an entire page as an image). Eventually, PC modems integrated FAX capabilities.
There still wasn’t anything useful I could do locally, but there were ways I could get other companies to FAX something to me. I remember two of them.
One was for US Robotics. They had an “on demand” FAX system. You’d call up a toll-free number, which was an automated IVR system. You could navigate through it and select various documents of interest to you: spec sheets and the like. You’d key in your FAX number, hang up, and US Robotics would call YOU and FAX you the documents you wanted. Yes! I was talking to a computer (of a sorts) at no cost to me!
The New York Times also ran a service for awhile called TimesFax. Every day, they would FAX out a page or two of summaries of the day’s top stories. This was pretty cool in an era in which I had no other way to access anything from the New York Times. I managed to sign up for TimesFax – I have no idea how, anymore – and for awhile I would get a daily FAX of their top stories. When my family got its first laser printer, I could then even print these FAXes complete with the gothic New York Times masthead. Wow! (OK, so technically I could print it on a dot-matrix printer also, but graphics on a 9-pin dot matrix is a kind of pain that is a whole other article.)
My own phone line
Remember how I discussed that phone lines were allocated per household? This was a problem for a lot of reasons:
- Anybody that tried to call my family while I was using my modem would get a busy signal (unable to complete the call)
- If anybody in the house picked up the phone while I was using it, that would degrade the quality of the ongoing call and either mess up or disconnect the call in progress. In many cases, that could cancel a file transfer (which wasn’t necessarily easy or possible to resume), prompting howls of annoyance from me.
- Generally we all had to work around each other
So eventually I found various small jobs and used the money I made to pay for my own phone line and my own long distance costs. Eventually I upgraded to a 28.8Kbps US Robotics Courier modem even! Yes, you heard it right: I got a job and a bank account so I could have a phone line and a faster modem. Uh, isn’t that why every teenager gets a job?
Now my local friend and I could call each other freely – at least on my end (I can’t remember if he had his own phone line too). We could exchange files using HS/Link, which had the added benefit of allowing split-screen chat even while a file transfer was in progress. I’m sure we spent hours chatting keyboard-to-keyboard while sharing files with each other.
Technology in Schools
By this point in the story, we’re in the late 80s and early 90s. I’m still using PC-style OSs at home; OS/2 in the later years of this period, DOS or maybe a bit of Windows in the earlier years. I mentioned that they let me work on programming at school starting in 5th grade. It was soon apparent that I knew more about computers than anybody on staff, and I started getting pulled out of class to help teachers or administrators with vexing school problems. This continued until I graduated from high school, incidentally – often to my enjoyment, and the annoyance of one particular teacher who, I must say, I was fine with annoying in this way.
That’s not to say that there was institutional support for what I was doing. It was, after all, a small school. Larger schools might have introduced BASIC or maybe Logo in high school. But I had already taught myself BASIC, Pascal, and C by the time I was somewhere around 12 years old. So I wouldn’t have had any use for that anyhow.
There were programming contests occasionally held in the area. Schools would send teams. My school didn’t really “send” anybody, but I went as an individual. One of them was run by a local college (but for jr. high or high school students). Years later, I met one of the professors that ran it. He remembered me, and that day, better than I did. The programming contest had problems one could solve in BASIC or Logo. I knew nothing about what to expect going into it, but I had lugged my computer and screen along, and asked him, “Can I write my solutions in C?” He was, apparently, stunned, but said sure, go for it. I took first place that day, leading to some rather confused teams from much larger schools.
The Netware network that the school had was, as these generally were, itself isolated. There was no link to the Internet or anything like it. Several schools across three local counties eventually invested in a fiber-optic network linking them together. This built a larger, but still closed, network. Its primary purpose was to allow students to be exposed to a wider variety of classes at high schools. Participating schools had an “ITV room”, outfitted with cameras and mics. So students at any school could take classes offered over ITV at other schools. For instance, only my school taught German classes, so people at any of those participating schools could take German. It was an early “Zoom room.” But alongside the TV signal, there was enough bandwidth to run some Netware frames. By about 1995 or so, this let one of the schools purchase some CD-ROM software that was made available on a file server and could be accessed by any participating school. Nice! But Netware was mainly about file and printer sharing; there wasn’t even a facility like email, at least not on our deployment.
BBSs
My last hop before the Internet was the BBS. A BBS was a computer program, usually run by a hobbyist like me, on a computer with a modem connected. Callers would call it up, and they’d interact with the BBS. Most BBSs had discussion groups like forums and file areas. Some also had games. I, of course, continued to have that most vexing of problems: they were all long-distance.
There were some ways to help with that, chiefly QWK and BlueWave. These, somewhat like TapCIS in the CompuServe days, let me download new message posts for reading offline, and queue up my own messages to send later. QWK and BlueWave didn’t help with file downloading, though.
BBSs get networked
BBSs were an interesting thing. You’d call up one, and inevitably somewhere in the file area would be a BBS list. Download the BBS list and you’ve suddenly got a list of phone numbers to try calling. All of them were long distance, of course. You’d try calling them at random and have a success rate of maybe 20%. The other 80% would be defunct; you might get the dreaded “this number is no longer in service” or the even more dreaded angry human answering the phone (and of course a modem can’t talk to a human, so they’d just get silence for probably the nth time that week). The phone company cared nothing about BBSs and recycled their numbers just as fast as any others.
To talk to various people, or participate in certain discussion groups, you’d have to call specific BBSs. That’s annoying enough in the general case, but even more so for someone paying long distance for it all, because it takes a few minutes to establish a connection to a BBS: handshaking, logging in, menu navigation, etc.
But BBSs started talking to each other. The earliest successful such effort was FidoNet, and for the duration of the BBS era, it remained by far the largest. FidoNet was analogous to the UUCP that the institutional users had, but ran on the much cheaper PC hardware. Basically, BBSs that participated in FidoNet would relay email, forum posts, and files between themselves overnight. Eventually, as with UUCP, by hopping through this network, messages could reach around the globe, and forums could have worldwide participation – asynchronously, long before they could link to each other directly via the Internet. It was almost entirely volunteer-run.
Running my own BBS
At age 13, I eventually chose to set up my own BBS. It ran on my single phone line, so of course when I was dialing up something else, nobody could dial up me. Not that this was a huge problem; in my town of 500, I probably had a good 1 or 2 regular callers in the beginning.
In the PC era, there was a big difference between a server and a client. Server-class software was expensive and rare. Maybe in later years you had an email client, but an email server would be completely unavailable to you as a home user. But with a BBS, I could effectively run a server. I even ran serial lines in our house so that the BBS could be connected from other rooms! Since I was running OS/2, the BBS didn’t tie up the computer; I could continue using it for other things.
FidoNet had an Internet email gateway. This one, unlike CompuServe’s, was free. Once I had a BBS on FidoNet, you could reach me from the Internet using the FidoNet address. This didn’t support attachments, but then email of the day didn’t really, either.
Various others outside Kansas ran FidoNet distribution points. I believe one of them was mgmtsys; my memory is quite vague, but I think they offered a direct gateway and I would call them to pick up Internet mail via FidoNet protocols, but I’m not at all certain of this.
Pros and Cons of the Non-Microsoft World
As mentioned, Microsoft was and is the dominant operating system vendor for PCs. But I left that world in 1993, and here, nearly 30 years later, have never really returned. I got an operating system with more technical capabilities than the DOS and Windows of the day, but the tradeoff was a much smaller software ecosystem. OS/2 could run DOS programs, but it ran OS/2 programs a lot better. So if I were to run a BBS, I wanted one that had a native OS/2 version – limiting me to a small fraction of available BBS server software. On the other hand, as a fully 32-bit operating system, there started to be OS/2 ports of certain software with a Unix heritage; most notably for me at the time, gcc. At some point, I eventually came across the RMS essays and started to be hooked.
Internet: The Hunt Begins
I certainly was aware that the Internet was out there and interesting. But the first problem was: how the heck do I get connected to the Internet?
Learning Link and Gopher
ISPs weren’t really a thing; the first one in my area (though still a long-distance call) started in, I think, 1994. One service that one of my teachers got me hooked up with was Learning Link. Learning Link was a nationwide collaboration of PBS stations and schools, designed to build on the educational mission of PBS. The nearest Learning Link station was more than a 3-hour drive away… but critically, they had a toll-free access number, and my teacher convinced them to let me use it. I connected via a terminal program and a modem, like with most other things. I don’t remember much about it, but I do remember a very important thing it had: Gopher. That was my first experience with Gopher.
Learning Link ran on a Unix derivative (Xenix), but it didn’t exactly give everyone a shell. I seem to recall it didn’t have open FTP access either. The Gopher client had FTP access at some point; I don’t recall for sure if it did then. If it did, then when a Gopher server referred to an FTP server, I could get to it. (I am unclear at this point if I could key in an arbitrary FTP location, or knew how, at that time.) I also had email access there, but I don’t recall exactly how; probably Pine. If that’s correct, that would have dated my Learning Link access as no earlier than 1992.
I think my access time to Learning Link was limited. And, since the only way to get out on the Internet from there was Gopher and Pine, I was somewhat limited in terms of technology as well. I believe that telnet services, for instance, weren’t available to me.
Computer labs
There was one place that tended to have Internet access: colleges and universities. In 7th grade, I participated in a program that resulted in me being invited to visit Duke University, and in 8th grade, I participated in National History Day, resulting in a trip to visit the University of Maryland. I probably sought out computer labs at both of those. My most distinct memory was finding my way into a computer lab at one of those universities, and it was full of NeXT workstations. I had never seen or used NeXT before, and had no idea how to operate it. I had brought a box of floppy disks, unaware that the DOS disks probably weren’t compatible with NeXT.
Closer to home, a small college had a computer lab that I could also visit. I would go there with my stack of floppies in summer, or whenever the lab wasn’t otherwise in use. I remember downloading disk images of FLOSS operating systems: FreeBSD, Slackware, or Debian, at the time. The hash marks from the DOS-based FTP client would creep across the screen as the 1.44MB disk images would slowly download. telnet was also available on those machines, so I could telnet to things like public-access Archie servers and libraries – though not Gopher. Still, FTP and telnet access opened up a lot, and I learned quite a bit in those years.
Continuing the Journey
At some point, I got a copy of the Whole Internet User’s Guide and Catalog, published in 1994. I still have it. If I hadn’t already figured it out by then, I certainly became aware from it that Unix was the dominant operating system on the Internet. The examples in Whole Internet covered FTP, telnet, gopher – all assuming the user somehow got to a Unix prompt. The web was introduced about 300 pages in; clearly viewed as something that wasn’t page 1 material. And it covered the command-line www client before introducing the graphical Mosaic. Even then, though, the book highlighted Mosaic’s utility as a front-end for Gopher and FTP, and even the ability to launch telnet sessions by clicking on links. But having a copy of the book didn’t equate to having any way to run Mosaic. The machines in the computer lab I mentioned above all ran DOS and were incapable of running a graphical browser. I had no SLIP or PPP (both ways to run Internet traffic over a modem) connectivity at home. In short, the Web was something for the large institutional users at the time.
CD-ROMs
As CD-ROMs came out, with their huge (for the day) 650MB capacity, various companies started collecting software that could be downloaded on the Internet and selling it on CD-ROM. The two most popular ones were Walnut Creek CD-ROM and Infomagic. One could buy extensive Shareware and gaming collections, and then even entire Linux and BSD distributions. Although not exactly an Internet service per se, it was a way of bringing what may ordinarily only be accessible to institutional users into the home computer realm.
Free Software Jumps In
As I mentioned, by the mid 90s, I had come across RMS’s writings about free software – most probably his 1992 essay Why Software Should Be Free. (Please note, this is not a commentary on the more recently-revealed issues surrounding RMS, but rather his writings and work as I encountered them in the 90s.) The notion of a Free operating system – not just in cost but in openness – was incredibly appealing. Not only could I tinker with it to a much greater extent due to having source for everything, but it included so much software that I’d otherwise have to pay for. Compilers! Interpreters! Editors! Terminal emulators! And, especially, server software of all sorts. There’d be no way I could afford or run Netware, but with a Free Unixy operating system, I could do all that. My interest was obviously piqued. Add to that the fact that I could actually participate and contribute – I was about to become hooked on something that I’ve stayed hooked on for decades.
But then the question was: which Free operating system? Eventually I chose FreeBSD to begin with; that would have been sometime in 1995. I don’t recall the exact reasons for that. I remember downloading Slackware install floppies, and probably the fact that Debian wasn’t yet at 1.0 scared me off for a time. FreeBSD’s fantastic Handbook – far better than anything I could find for Linux at the time – was no doubt also a factor.
The de Raadt Factor
Why not NetBSD or OpenBSD? The short answer is Theo de Raadt. Somewhere in this time, when I was between 14 and 16 years old, I asked some questions comparing NetBSD to the other two free BSDs. This was on a NetBSD mailing list, but for some reason Theo saw it and got a flame war going, which CC’d me. Now keep in mind that even if NetBSD had a web presence at the time, it would have been minimal, and I would have – not all that unusually for the time – had no way to access it. I was certainly not aware of the, shall we say, acrimony between Theo and NetBSD. While I had certainly seen an online flamewar before, this took on a different and more disturbing tone; months later, Theo randomly emailed me under the subject “SLIME” saying that I was, well, “SLIME”. I seem to recall periodic emails from him thereafter reminding me that he hates me and that he had blocked me. (Disclaimer: I have poor email archives from this period, so the full details are lost to me, but I believe I am accurately conveying these events from over 25 years ago.)
This was a surprise, and an unpleasant one. I was trying to learn, and while it is possible I didn’t understand some aspect or other of netiquette (or Theo’s personal hatred of NetBSD) at the time, still that is not a reason to flame a 16-year-old (though he would have had no way to know my age). This didn’t leave any kind of scar, but did leave a lasting impression; to this day, I am particularly concerned with how FLOSS projects handle poisonous people. Debian, for instance, has come a long way in this over the years, and even Linus Torvalds has turned over a new leaf. I don’t know if Theo has.
In any case, I didn’t use NetBSD then. I did try it periodically in the years since, but never found it compelling enough to justify a large switch from Debian. I never tried OpenBSD for various reasons, but one of them was that I didn’t want to join a community that tolerates behavior such as Theo’s from its leader.
Moving to FreeBSD
Moving from OS/2 to FreeBSD was final. That is, I didn’t have enough hard drive space to keep both. I also didn’t have the backup capacity to back up OS/2 completely. My BBS, which ran Virtual BBS (and at some point also AdeptXBBS) was deleted and reincarnated in a different form. My BBS was a member of both FidoNet and VirtualNet; the latter was specific to VBBS, and had to be dropped. I believe I may have also had to drop the FidoNet link for a time. This was the biggest change of computing in my life to that point. The earlier experiences hadn’t literally destroyed what came before. OS/2 could still run my DOS programs. Its command shell was quite DOS-like. It ran Windows programs. I was going to throw all that away and leap into the unknown.
I wish I had saved a copy of my BBS; I would love to see the messages I exchanged back then, or see its menu screens again. I have little memory of what it looked like. But other than that, I have no regrets. Pursuing Free, Unixy operating systems brought me a lot of enjoyment and a good career.
That’s not to say it was easy. All the problems of not being in the Microsoft ecosystem were magnified under FreeBSD and Linux. In a day before EDID, monitor timings had to be calculated manually – and you risked destroying your monitor if you got them wrong. Word processing and spreadsheet software was pretty much not there for FreeBSD or Linux at the time; I was therefore forced to learn LaTeX and actually appreciated that. Software like PageMaker or CorelDraw was certainly nowhere to be found for those free operating systems either. But I got a ton of new capabilities.
I mentioned the BBS didn’t shut down, and indeed it didn’t. I ran what was surely a supremely unique oddity: a free, dialin Unix shell server in the middle of a small town in Kansas. I’m sure I provided things such as pine for email and some help text and maybe even printouts for how to use it. The set of callers slowly grew over the time period, in fact.
And then I got UUCP.
Enter UUCP
Even throughout all this, there was no local Internet provider and things were still long distance. I had Internet Email access via assorted strange routes, but they were all… strange. And, I wanted access to Usenet. In 1995, it happened.
The local ISP I mentioned offered UUCP access. Though I couldn’t afford the dialup shell (or later, SLIP/PPP) that they offered due to long-distance costs, UUCP’s very efficient batched processes looked doable. I believe I established that link when I was 15, so in 1995.
I worked to register my domain, complete.org, as well. At the time, the process was a bit lengthy and involved downloading a text file form, filling it out in a precise way, sending it to InterNIC, and probably mailing them a check. Well I did that, and in September of 1995, complete.org became mine. I set up sendmail on my local system, as well as INN to handle the limited Usenet newsfeed I requested from the ISP. I even ran Majordomo to host some mailing lists, including some that were surprisingly high-traffic for a few-times-a-day long-distance modem UUCP link!
The modem client programs for FreeBSD were somewhat less advanced than for OS/2, but I believe I wound up using Minicom or Seyon to continue dialing out to BBSs and, probably, to keep using Learning Link. So all the while I was setting up my local BBS, I continued to have access to the text Internet, consisting chiefly of Gopher for me.
Switching to Debian
I switched to Debian sometime in 1995 or 1996, and have been using Debian as my primary OS ever since. I continued to offer shell access, but added the WorldVU Atlantis menuing BBS system. This provided a return of a more BBS-like interface (by default; shell was still an option) as well as some BBS door games such as LoRD and TradeWars 2002, running under DOS emulation.
I also continued to run INN, and ran ifgate to allow FidoNet echomail to be presented into INN Usenet-like newsgroups, and netmail to be gated to Unix email. This worked pretty well. The BBS continued to grow in these days, peaking at about two dozen total user accounts, and maybe a dozen regular users.
Dial-up access availability
I believe it was in 1996 that dial up PPP access finally became available in my small town. What a thrill! FINALLY! I could now FTP, use Gopher, telnet, and the web all from home. Of course, it was at modem speeds, but still.
(Strangely, I have a memory of accessing the Web using WebExplorer from OS/2. I don’t know exactly why; it’s possible that by this time, I had upgraded to a 486 DX2/66 and was able to reinstall OS/2 on the old 25MHz 486, or maybe something was wrong with the timeline from my memories from 25 years ago above. Or perhaps I made the occasional long-distance call somewhere before I ditched OS/2.)
Gopher sites still existed at this point, and I could access them using Netscape Navigator – which likely became my standard Gopher client at that point. I don’t recall using the UMN text-mode gopher client locally at that time, though it’s certainly possible I did.
The city
Starting when I was 15, I took computer science classes at Wichita State University. The first one was a class in the summer of 1995 on C++. I remember being worried about being good enough for it – I had, after all, just finished my HS freshman year and had never taken the prerequisite C class. I loved it and got an A! By 1996, I was taking more classes.
In 1996 or 1997 I stayed in Wichita during the day due to having more than one class. So, what would I do then but… enjoy the computer lab? The CS dept. had two of them: one that had NCD X terminals connected to a pair of SunOS servers, and another one running Windows. I spent most of the time in the Unix lab with the NCDs; I’d use Netscape or pine, write code, enjoy the University’s fast Internet connection, and so forth.
In 1997 I had graduated high school and that summer I moved to Wichita to attend college. As was so often the case, I shut down the BBS at that time. It would be 5 years until I again dealt with Internet at home in a rural community.
By the time I moved to my apartment in Wichita, I had stopped using OS/2 entirely. I have no memory of ever having OS/2 there. Along the way, I had bought a Pentium 166, and then the most expensive piece of computing equipment I have ever owned: a DEC Alpha, which, of course, ran Linux.
ISDN
I must have used dialup PPP for a time, but I eventually got a job working for the ISP I had used for UUCP, and then PPP. While there, I got a 128Kbps ISDN line installed in my apartment, and they gave me a discount on the service for it. That was around 3x the speed of a modem, and crucially was always on and gave me a public IP. No longer did I have to use UUCP; now I got to host my own things! By at least 1998, I was running a web server on www.complete.org, and I had an FTP server going as well.
Even Bigger Cities
In 1999 I moved to Dallas, and there got my first broadband connection: an ADSL link at, I think, 1.5Mbps! Now that was something! But it had some reliability problems. I eventually put together a server and had it hosted at an acquaintance’s place who had SDSL in his apartment. Within a couple of years, I had switched to various kinds of proper hosting for it, but that is a whole other article.
In Indianapolis, I got a cable modem for the first time, with higher speeds but tighter restrictions – including prohibitions on running “servers” on it. Yuck.
Challenges
Being non-Microsoft continued to have challenges. Until the advent of Firefox, a web browser was one of the biggest. While Netscape supported Linux on i386, it didn’t support Linux on Alpha. I hobbled along with various attempts at emulators, old versions of Mosaic, and so forth. And, until StarOffice was open-sourced as Open Office, reading Microsoft file formats was also a challenge, though WordPerfect was briefly available for Linux.
Over the years, I have become used to the Linux ecosystem. Perhaps I use Gimp instead of Photoshop and digikam instead of – well, whatever somebody would use on Windows. But I get ZFS, and containers, and so much that isn’t available there.
Yes, I know Apple never went away and is a thing, but for most of the time period I discuss in this article, at least after the rise of DOS, it was niche compared to the PC market.
Back to Kansas
In 2002, I moved back to Kansas, to a rural home near a different small town in the county next to where I grew up. Over there, it was back to dialup at home, but I had faster access at work. I didn’t much care for this, and thus began a 20+-year effort to get broadband in the country. At first, I got a wireless link, which worked well enough in the winter, but had serious problems in the summer when the trees leafed out. Eventually DSL became available locally – highly unreliable, but still, it was something. Then I moved back to the community where I grew up, a few miles from my childhood home. Again I got DSL – a bit better. But after some years, being at the end of the run of DSL meant I had poor speeds and reliability problems. I eventually switched to various wireless ISPs, which continues to the present day; while people in cities can get Gbps service, I can get, at best, about 50Mbps. Long-distance fees are gone, but the speed disparity remains.
Concluding Reflections
I am glad I grew up where I did; the strong community has a lot of advantages I don’t have room to discuss here. In a number of very real senses, having no local services made things a lot more difficult than they otherwise would have been. However, perhaps I could say that I also learned a lot through the need to come up with inventive solutions to those challenges. To this day, I think a lot about computing in remote environments: partially because I live in one, and partially because I enjoy visiting places that are remote enough that they have no Internet, phone, or cell service whatsoever. I have written articles like Tools for Communicating Offline and in Difficult Circumstances based on my own personal experience. I instinctively think about making protocols robust in the face of various kinds of connectivity failures because I experience various kinds of connectivity failures myself.
(Almost) Everything Lives On
In 2002, Gopher turned 10 years old. It had probably been about 9 or 10 years since I had first used Gopher, which was the first way I got on live Internet from my house. It was hard to believe. By that point, I had an always-on Internet link at home and at work. I had my Alpha, and probably also at least PCMCIA Ethernet for a laptop (many laptops had modems by the 90s also). Despite its popularity in the early 90s, Gopher was mostly forgotten less than 10 years after it came on the scene and started to unify the Internet.
And it was at that moment that I decided to try to resurrect it. The University of Minnesota finally released it under an Open Source license. I wrote the first new gopher server in years, pygopherd, and introduced gopher to Debian. Gopher lives on; there are now quite a few Gopher clients and servers out there, newly started post-2002. The Gemini protocol can be thought of as something akin to Gopher 2.0, and it too has a small but blossoming ecosystem.
Archie, the old FTP search tool, is dead though. Same for WAIS and a number of the other pre-web search tools. But still, even FTP lives on today.
And BBSs? Well, they didn’t go away either. Jason Scott’s fabulous BBS documentary looks back at the history of the BBS, while Back to the BBS from last year talks about the modern BBS scene. FidoNet somehow is still alive and kicking. UUCP still has its place and has inspired a whole string of successors. Some, like NNCP, are clearly direct descendants of UUCP. Filespooler lives in that ecosystem, and you can even see UUCP concepts in projects as far afield as Syncthing and Meshtastic. Usenet still exists, and you can now run Usenet over NNCP just as I ran Usenet over UUCP back in the day (which you can still do as well). Telnet, of course, has been largely supplanted by ssh, but the concept is more popular now than ever, as Linux has made ssh available on everything from Raspberry Pi to Android.
And I still run a Gopher server, looking pretty much like it did in 2002.
This post also has a permanent home on my website, where it may be periodically updated.
Tools for Communicating Offline and in Difficult Circumstances
Note: this post is also available on my website, where it will be updated periodically.
When things are difficult – maybe there’s been a disaster, or an invasion (this page is being written in 2022 just after Russia invaded Ukraine), or maybe you’re just backpacking off the grid – there are tools that can help you keep in touch, or move your data around. This page aims to survey some of them, roughly in order from easiest to more complex.
Simple radios
Handheld radios shouldn’t be forgotten. They are cheap, small, and easy to operate. Their range isn’t huge – maybe a couple of miles in rural areas, much less in cities – but they can be a useful place to start. They tend to have no actual encryption features (the “privacy” features really aren’t.) In the USA, options are FRS/GMRS and CB.
Syncthing
With Syncthing, you can share files among your devices or with your friends. Syncthing essentially builds a private mesh for file sharing. Devices will auto-discover each other when on the same LAN or Wifi network, and opportunistically sync.
I wrote more about offline uses of Syncthing, and its use with NNCP, in my blog post A simple, delay-tolerant, offline-capable mesh network with Syncthing (+ optional NNCP). Yes, it is a form of a Mesh Network!
Homepage: https://syncthing.net/
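Since Syncthing exposes a local REST API, you can even script simple checks of that mesh. Below is a minimal sketch, not from the original post, that asks a running Syncthing instance which of its peers are currently reachable. It assumes the default API address of 127.0.0.1:8384 and an API key copied from the GUI settings; both are assumptions you will need to adjust.

```python
# Minimal sketch: list which Syncthing peers are currently reachable.
# Assumes the default API address (127.0.0.1:8384) and an API key copied
# from the Syncthing GUI; both are assumptions, adjust for your setup.
import json
import urllib.request

API_KEY = "REPLACE-WITH-YOUR-API-KEY"   # placeholder
BASE = "http://127.0.0.1:8384"

req = urllib.request.Request(
    BASE + "/rest/system/connections",
    headers={"X-API-Key": API_KEY},
)
with urllib.request.urlopen(req) as resp:
    data = json.load(resp)

for device_id, info in data.get("connections", {}).items():
    state = "connected" if info.get("connected") else "not reachable right now"
    print(f"{device_id[:7]}…  {state}")
```

Peers that show up as unreachable simply catch up the next time they see each other on a LAN or via a relay.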
Briar
Briar is an instant messaging service based around Android. It’s IM with a twist: it can use a mesh of Bluetooth devices. Or, if Internet is available, Tor. It has even been extended to support the use of SD cards and USB sticks to carry your messages.
Like some others here, it can relay messages for third parties as well.
Homepage: https://briarproject.org/
Manyverse and Scuttlebutt
Manyverse is a client for Scuttlebutt, which is a sort of asynchronous, offline-friendly social network. You can use it to keep in touch with your family and friends, and it supports syncing over Bluetooth and Wifi even in the absence of Internet.
Homepages: https://www.manyver.se/ and https://scuttlebutt.nz/
Yggdrasil
Yggdrasil is a self-healing, fully end-to-end Encrypted Mesh Network. It can work among local devices or on the global Internet. It has network services that can egress onto things like Tor, I2P, and the public Internet. Yggdrasil makes a perfect companion to ad-hoc wifi as it has auto peer discovery on the local network.
I talked about it in more detail in my blog post Make the Internet Yours Again With an Instant Mesh Network.
Homepage: https://yggdrasil-network.github.io/
Ad-Hoc Wifi
Few people know about the ad-hoc wifi mode. Ad-hoc wifi lets devices in range talk to each other without an access point. You just all set your devices to the same network name and password and there you go. However, there often isn’t DHCP, so IP configuration can be a bit of a challenge. Yggdrasil helps here.
NNCP
Moving now to more advanced tools, NNCP lets you assemble a network of peers that communicate asynchronously over sneakernet, USB drives, radios, CD-Rs, the Internet, Tor, Yggdrasil, Syncthing, Dropbox, S3, you name it. NNCP supports multi-hop file transfer and remote execution. It is fully end-to-end encrypted. Think of it as the offline version of ssh.
Homepage: https://nncp.mirrors.quux.org/
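To make the sneakernet workflow a little more concrete, here is a rough, hypothetical Python sketch of driving NNCP’s command-line tools: queue a file for a peer, shuttle packets via a USB drive, and process whatever came back. The node name, file, and mount point are invented, and the exact invocations should be checked against the NNCP documentation rather than taken from here.

```python
# Hypothetical sketch of driving NNCP's CLI tools from Python.
# Node name "remote", file path, and USB mount point are invented examples;
# check the NNCP documentation for the exact options you need.
import subprocess

def queue_file_for_peer(path: str, node: str) -> None:
    # Queue an encrypted copy of `path` for the peer named `node`.
    subprocess.run(["nncp-file", path, f"{node}:"], check=True)

def exchange_via_usb(mountpoint: str) -> None:
    # Copy outbound packets onto the drive and pick up any inbound ones.
    subprocess.run(["nncp-xfer", mountpoint], check=True)
    # Process whatever arrived in the local spool.
    subprocess.run(["nncp-toss"], check=True)

if __name__ == "__main__":
    queue_file_for_peer("backup.tar.zst", "remote")
    exchange_via_usb("/media/usbdrive")
```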
Meshtastic
Meshtastic uses long-range, low-power LoRa radios to build a long-distance, encrypted, instant messaging system that is a Mesh Network. It requires specialized hardware, about $30, but will tend to get much better range than simple radios, and with very little power.
Homepages: https://meshtastic.org/ and https://meshtastic.letstalkthis.com/
Portable Satellite Communicators
You can get portable satellite communicators that can send SMS from anywhere on earth with a clear view of the sky. The Garmin InReach mini and Zoleo are two credible options. Subscriptions range from about $10 to $40 per month depending on usage. They also have global SOS features.
Telephone Lines
If you have a phone line and a modem, UUCP can get through just about anything. It’s an older protocol that lacks modern security, but will deal with slow and noisy serial lines well. XBee SX radios also have a serial mode that can work well with UUCP.
Additional Suggestions
It is probably useful to have a Linux live USB stick with whatever software you want to use handy. Debian can be installed from the live environment, or you could use a security-focused distribution such as Tails or Qubes.
References
This page originated in my Mastodon thread and incorporates some suggestions I received there.
It also formed a post on my blog.
Make the Internet Yours Again With an Instant Mesh Network
I’m going to lead with the technical punch line, and then explain it:
Yggdrasil Network is an opportunistic mesh that can be deployed privately or as part of a global-scale network. Each node gets a stable IPv6 address (or even an entire /64) that is derived from its public key and is bound to that node as long as the node wants it (of course, it can generate a new keypair anytime) and is valid wherever the node joins the mesh. All traffic is end-to-end encrypted.
Yggdrasil will automatically discover peers on a LAN via broadcast beacons, and requires zero configuration to peer in such a way. It can also run as an overlay network atop the public Internet. Public peers serve as places to join the global network, and since it’s a mesh, if one device on your LAN joins the global network, the others will automatically have visibility on it also, thanks to the mesh routing.
It neatly solves a lot of problems of portability (my ssh sessions stay live as I move networks, for instance), VPN (incoming ports aren’t required since local nodes can connect to a public peer via an outbound connection), security, and so forth.
Now on to the explanation:
The Tyranny of IP rigidity
Every device on the Internet, at one time, had its own globally-unique IP address. This number was its identifier to the world; with an IP address, you can connect to any machine anywhere. Even now, when you connect to a computer to download a webpage or send a message, under the hood, your computer is talking to the other one by IP address.
Only, now it’s hard to get one. The Internet protocol we all grew up with, version 4 (IPv4), didn’t have enough addresses for the explosive growth we’ve seen. Internet providers and IT departments had to use a trick called NAT (Network Address Translation) to give you a sort of fake IP address, so they could put hundreds or thousands of devices behind a single public one. That, plus the mobility of devices — changing IPs whenever they change locations — has meant that a fundamental rule of the old Internet is now broken:
Every participant is an equal peer. (Well, not any more.)
Nowadays, you can’t host your own website from your phone. Or share files from your house. (Without, that is, the use of some third-party service that locks you down and acts as an intermediary.)
Back in the 90s, I worked at a university, and I, like every other employee, had a PC on my desk with an unfirewalled public IP. I installed a webserver, and poof – instant website. Nowadays, running a website from home is just about impossible. You may not have a public IP, and if you do, it likely changes from time to time. And even then, your ISP probably blocks you from running servers on it.
In short, you have to buy your way into the resources to participate on the Internet.
I wrote about these problems in more detail in my article Recovering Our Lost Free Will Online.
Enter Yggdrasil
I already gave away the punch line at the top. But what does all that mean?
I’ve set up /etc/hosts on my laptop to use the Yggdrasil IPs for other machines on my LAN. Now I can just “ssh foo” and it will work — from home, from a coffee shop, from a 4G tether, wherever. Now, other tools like tinc can do this, obviously. And I could stop there; I could have a completely closed, private Yggdrasil network.
Or, I can join the global Yggdrasil network. Each device, in addition to accepting peers it finds on the LAN, can also be configured to establish outbound peering connections or accept inbound ones over the Internet. Put a public peer or two in your configuration and you’ve joined the global network. Most people will probably want to do that on every device (because why not?), but you could also do that from just one device on your LAN. Again, there’s no need to explicitly build routes via it; your other machines on the LAN will discover the route’s existence and use it.
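One small practical note: Yggdrasil addresses come out of the otherwise-unused 200::/7 range, so it is easy to sanity-check what you are pasting into /etc/hosts. The sketch below uses made-up addresses standing in for the ones you would read from your nodes’ status output.

```python
# Sketch: build /etc/hosts-style lines for LAN machines using their Yggdrasil
# addresses, and sanity-check that each address falls in Yggdrasil's 200::/7
# range. The names and addresses below are invented examples.
import ipaddress

YGG_RANGE = ipaddress.ip_network("200::/7")

hosts = {
    "foo": "200:1234:5678:9abc:def0:1234:5678:9abc",   # hypothetical address
    "bar": "201:abcd:ef01:2345:6789:abcd:ef01:2345",   # hypothetical address
}

for name, addr in hosts.items():
    ip = ipaddress.ip_address(addr)
    assert ip in YGG_RANGE, f"{addr} does not look like a Yggdrasil address"
    print(f"{ip}\t{name}")
```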
This is one of many projects that are working to democratize and decentralize the Internet. So far, it has been quite successful, growing to over 2000 nodes. It is the direct successor to the earlier cjdns/Hyperboria and BATMAN networks, and aims to be a proof of concept and a viable tool for global expansion.
Finally, think about how much easier development is when you don’t have to necessarily worry about TLS complexity in every single application. When you don’t have to worry about port forwarding and firewall penetration. It’s what the Internet should be.
Facebook’s Blocking Decisions Are Deliberate – Including Their Censorship of Mastodon
In the aftermath of my report of Facebook censoring mentions of the open-source social network Mastodon, there was a lot of conversation about whether or not this was deliberate.
That conversation seemed to focus on whether a human specifically added joinmastodon.org to some sort of blacklist. But that’s not even relevant.
OF COURSE it was deliberate, because of how Facebook tunes its algorithm.
Facebook’s algorithm is tuned for Facebook’s profit. That means it’s tuned to maximize the time people spend on the site — engagement. In other words, it is tuned to keep your attention on Facebook.
Why do you think there is so much junk on Facebook? So much anti-vax, anti-science, conspiracy nonsense from the likes of Breitbart? It’s not because their algorithm is incapable of surfacing the good content; we already know it can because they temporarily pivoted it shortly after the last US election. They intentionally undid its efforts to make high-quality news sources more prominent — twice.
Facebook has said that certain anti-vax disinformation posts violate its policies. It has an extremely cumbersome way to report them, but it can be done and I have. These reports are met with either silence or a response claiming the content didn’t violate their guidelines.
So what algorithm is it that allows Breitbart to not just be seen but to thrive on the platform, lets anti-vax disinformation survive even a human review, while banning mentions of Mastodon?
One that is working exactly as intended.
We may think this algorithm is busted. Clearly, Facebook does not. If their goal is to maximize profit by maximizing engagement, the algorithm is working exactly as designed.
I don’t know if joinmastodon.org was specifically blacklisted by a human. Nor is it relevant.
Facebook’s choice to tolerate and promote the things that serve its greed for engagement and money, even if they are the lowest dregs of the web, is deliberate. It is no accident that Breitbart does better than Mastodon on Facebook. After all, which of these does its algorithm think keeps people engaged on Facebook itself more?
Facebook removes the ban
You can see all the screenshots of the censorship in my original post. Now, Facebook has reversed course:
We also don’t know if this reversal was human or algorithmic, but that still is beside the point.
The point is, Facebook intentionally chooses to surface and promote those things that drive engagement, regardless of quality.
Clearly many have wondered if tens of thousands of people have died unnecessary deaths over COVID as a result. One whistleblower says “I have blood on my hands” and President Biden said “they’re killing people” before “walking back his comments slightly”. I’m not equipped to verify those statements. But what do they think is going to happen if they prioritize engagement over quality? Rainbows and happiness?
Update 2024-04-06: It’s happened again. Facebook censored my post about the Marion County Record because the same site was critical of Facebook censoring environmental stories.
Facebook Is Censoring People For Mentioning Open-Source Social Network Mastodon
Update: Facebook has reversed itself over this censorship, but I maintain that whether the censorship was algorithmic or human, it was intentional either way. Details in my new post.
Last November, I made a brief post to Facebook about Mastodon. Mastodon is an open-source and open social network, which is decentralized and all about user control instead of corporate control. I’ve blogged about Mastodon and the dangers of Facebook before, but rarely mentioned Mastodon on Facebook itself.
Today, I received this notice that Facebook had censored my post about Mastodon:
Wonder with me for a second what this one-off post I composed myself might have done to trip Facebook’s filter…. and it is probably obvious that what tripped the filter was the mention of an open source competitor, even though Facebook is much more enormous than Mastodon. I have been a member of Facebook for many years, and this is the one and only time anything like that has happened.
Why they decided today to take down that post – I have no idea.
In case you wondered about their sincerity towards stamping out misinformation — which, on the rare occasions they do something about, they “deprioritize” rather than remove as they did here — this probably answers your question. Or, are they sincere about thinking they’re such a force for good by “connecting the world’s people?” Well, only so long as the world’s people don’t say nice things about alternatives to Facebook, I guess.
“Well,” you might be wondering, “Why not appeal, since they obviously made a mistake?” Because, of course, you can’t:
Indeed I did tick a box that said I disagreed, but there was no place to ask why or to question their action.
So what would cause a non-controversial post from a long-time Facebook member that has never had anything like this happen, to disappear?
Greed. Also fear.
Maybe I’d feel sorry for them if they weren’t acting like a bully.
Edit: There are reports from several others on Mastodon of the same happening this week. I am trying to gather more information. It sounds like it may be happening on Twitter as well.
Edit 2: And here are some other reports from both Facebook and Twitter. Definitely not just me.
Edit 3: While trying to reply to someone on Facebook who was defending Facebook, I mentioned joinmastodon.org and got this:
Anyone else seeing it?
Edit 4: It is far more than just me, clearly. More reports are out there; for instance, this one and that one.
Roundup of Unique Data/Storage Hosting Options
Recently I have been taking another look at the services at rsync.net and it got me thinking: what would I do with a lot of storage? What might I want to run with it, if it were fairly cheap?
- Backups are an obvious place to start. Borgbackup makes a pretty compelling option: very bandwidth-efficient thanks to block-level rolling hash dedup, encryption fully on the client side, etc. Borg can run over ssh, though does need a server-side program; see the sketch after this list.
- Nextcloud is another option. With Google Photos getting quite expensive now, if you could have a TB of storage that you control, what might you do with it? Nextcloud also includes IM, video chat, and online document editing similar to Google Docs.
- I’ve written before about the really neat properties of Syncthing: distributed synchronization that needs no server component. It also supports untrusted nodes in the mesh, where all content is encrypted before it reaches them. Sometimes an intermediary node is useful; for instance, if nodes A and C are to sync but are rarely online at the same time, an untrusted node B that is always online can facilitate synchronization. A server with some space could help with this.
- A relay for NNCP or UUCP.
- More broadly, you could self-host your photo or video collection.
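Here is that Borg sketch: a hedged example of what backups over ssh to such a storage service might look like, driven from Python. The host, repository path, source directory, and retention flags are all placeholders of mine, not recommendations; the Borg documentation is the authority on the real options.

```python
# Hedged sketch of a Borg-over-ssh backup. Host, repo path, and source
# directory are placeholders; flags and retention policy are illustrative only.
import subprocess

REPO = "ssh://user@storage.example.com/./backups/laptop"   # hypothetical

def run(args):
    subprocess.run(args, check=True)

# One-time setup: create the repository with client-side encryption.
run(["borg", "init", "--encryption=repokey", REPO])

# Daily: create a deduplicated, encrypted archive named after the timestamp.
run(["borg", "create", "--stats", f"{REPO}::home-{{now}}", "/home/user"])

# Keep a bounded history.
run(["borg", "prune", "--keep-daily=7", "--keep-weekly=4", REPO])
```

The appeal for rented storage is that the encryption happens on the client, so the provider only ever sees ciphertext.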
Let’s start taking a look at what’s out there. I’m going to try to focus on things that are unique for some reason: pricing, features, etc. Incidentally, good reviews are hard to find due to the proliferation of affiliate links. I have no affiliate relationships with anyone mentioned here and there are no affiliate links in this post.
I’ll start with the highest-end community and commercial options (though both are quite competitive on price for what they are), and then move on to the cheaper options.
Community option: SDF
SDF is somewhat hard to define. “What is SDF?” could prompt answers like:
- A community-run network offering free Unix shells to the public
- A diverse community of people that connect with unique tools. A social network in the 80s sense, sort of.
- A provider of… let me see… VPN, DSL, and even dialup access.
- An organization that runs various Open Source social network services, including Mastodon, Pixelfed (image sharing), PeerTube (video sharing), WordPress, even Minecraft.
- A provider of various services for a nominal charge: $3/mo gets you access to the MetaArray with 800GB of storage space which you have shell access to, and can store stuff on with Nextcloud, host public webpages, etc.
- Thriving communities around amateur radio, musicians, Plan 9, and even – brace yourself – TOPS-20, a DEC operating system first released in 1976 and not updated since 1988.
- There’s even a Wikipedia article about SDF.
There’s a lot there. SDF lets you use things for yourself, of course, but you can also join a community. It’s not a commercial service backed by SLAs — it’s best-effort — but it’s been around more than 30 years and has a great track record.
Top commercial option for backup storage: rsync.net
rsync.net offers storage broadly over SSH: sftp, rsync, scp, borg, rclone, restic, git-annex, git, and such. You do not get a shell, but you do get to run a few noninteractive commands via ssh. You can, for instance, run git clone on the rsync server.
The rsync special sauce is in ZFS. They run raidz3 on their arrays (and also offer dual location setups for an additional fee), offer both free and paid ZFS snapshots, etc. The service is designed to be extremely reliable, particularly for backups, and it seems to me to meet those goals.
Basic storage is $0.025 per GB/mo, but with certain account types such as borg, can be had for $0.015 per GB/mo. The minimum size is 400GB or $10/mo. There are no bandwidth charges. This makes it quite economical even compared to, say, S3. Additional discounts start at 10TB, so 10TB with rsync.net would cost $204.80/mo or $81.92 on the borg plan.
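If you want to play with the numbers, the arithmetic is simple enough to script. The per-GB rates used for the 10TB figures below are my back-calculations from the totals quoted above (about $0.02 and $0.008 per GB at that discount tier), not published prices.

```python
# Back-of-the-envelope storage cost helper. The base rate comes from the text
# above; the 10TB rates are back-calculated from the quoted totals.
def monthly_cost(gb: float, per_gb: float) -> float:
    return gb * per_gb

print(monthly_cost(400, 0.025))        # the 400GB minimum -> $10.00/mo
print(monthly_cost(10 * 1024, 0.02))   # 10TB at the implied discount -> $204.80/mo
print(monthly_cost(10 * 1024, 0.008))  # 10TB on the discounted borg plan -> $81.92/mo
```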
You won’t run Nextcloud on this thing, but for backups that must be reliable, or even a photo collection or something, it makes perfect sense.
When you look into other options, you’ll find that other providers are a lot more vague about their storage setup than rsync.net.
Various offerings from Hetzner
Hetzner is one of Europe’s large hosting companies, and they have several options of interest.
Their Storage Box competes directly with the rsync.net service. Their per-GB storage cost is lower than rsync.net’s, and although they do include a certain amount of free bandwidth with each account, bandwidth is not unlimited and could result in charges. Still, if you don’t use 2x or more of your storage amount in bandwidth each month, it would be cheaper than rsync.net. The Storage Box also uses ZFS with some kind of redundancy, though they don’t specify details.
What differentiates them from rsync.net is the protocol support. They support sftp, scp, Borg, ssh, rsync, etc. just as rsync.net does. But then they also throw in Samba/CIFS, FTPS, HTTPS, and WebDAV – all optionally enabled or disabled by you. Although things like sshfs exist, they aren’t particularly optimal for some use cases, and CIFS support may just be what you need in some situations.
10TB with Hetzner would cost EUR 39.90/mo, or about $48.84/mo. (This figure is higher for Europeans, who also have to pay VAT.)
Hetzner also offers a Storage Share, which is a private Nextcloud instance. 10TB of that is exactly the same cost as 10TB of the Storage Box. You can add your own users, groups, etc. to this, as you are the Nextcloud admin of your instance. Hetzner throws in automatic updates (which is great, as updates have been a pain in my side for a long time). Nextcloud is ideal for things like photo sharing, even has email and chat built in, etc. For about the same price as 2TB of Google One, you can have 2TB of Nextcloud with all those services for yourself. Not bad. You can also mount a Nextcloud instance with WebDAV.
Interestingly, Nextcloud supports “external storages” as backend for the data. It supports another Nextcloud instance, OpenStack or S3 object storage, and SFTP, SMB/CIFS, and WebDAV. If you’re thinking you’d like both SFTP and Nextcloud access to a pool of storage, I imagine you could always get a large Storage Box from Hetzner (internal transfer is free), pair it with a small Nextcloud instance, and link the two with Nextcloud external storage.
Dedicated Servers
If you want a more DIY approach, you can find some interesting deals on actual dedicated server hardware – you get the entire machine to yourself. I’ve been using OVH’s SoYouStart for a number of years, with good experiences, and they have a number of server configurations available. For instance, for $45.99, you can get a Xeon box with 4x2TB drives and 32GB RAM. With RAID5 or raidz1, that’s 6TB of available space – and cheaper than the 6TB from rsync.net (though less redundant) plus you get the whole box to yourself too. OVH directly has some more storage servers; for instance, you can get a box with 4x4TB + 1x500GB SSD for $86.75/mo, giving you 12TB available with RAID5/raidz1, plus a server with 16GB of RAM to do what you want with.
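For the capacity math on these boxes, RAID5 and raidz1 both give up roughly one drive’s worth of space to parity, so a quick sketch like the following (ignoring filesystem and ZFS overhead) reproduces the usable figures above.

```python
# Rough usable capacity for RAID5/raidz1: one drive's worth goes to parity.
# Ignores filesystem and ZFS overhead, so real numbers will be a bit lower.
def raid5_usable_tb(num_drives: int, drive_tb: float) -> float:
    return (num_drives - 1) * drive_tb

print(raid5_usable_tb(4, 2))   # SoYouStart 4x2TB -> ~6 TB usable
print(raid5_usable_tb(4, 4))   # OVH 4x4TB        -> ~12 TB usable
```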
Hetzner also has some larger options available, for instance 2x4TB at EUR39 or 2x8TB at EUR54, both with 64GB of RAM.
Bargain Corner
Yes, you can find 10TB for $25/mo. It’s hosted on Ceph, by what appears to be mostly a single person (though with a lot of experience and a fair bit of transparency). You’re not going to get the round-the-clock support experience of rsync.net, nor its raidz3 level of redundancy – but if you don’t need that, there are quite a few options.
Let’s start with Lima Labs. Yes, 10TB is $25/mo, and they support sftp, rsync, borg, and even NFS mounts on storage backed by Ceph. The owner, Sam, seems to be a nice guy but the service isn’t going to be on the scale of rsync.net or Hetzner. That may or may not be OK for your needs – I mean, you can even get 1TB for $5/mo, so there are some fantastic deals to be had here.
BorgBase does Borg hosting and borg hosting only. You can get 1TB for $6.67/mo or, for instance, 10TB for $53.46. They don’t say much about their infrastructure and it’s hard to get a read on the company, but for Borg backups, it could be a nice option.
Bargain Corner Part 2: Seedboxes
There’s a market out there of companies offering BitTorrent seeding and downloading services. Typically, these services offer you Unix ssh access to a shell, give you a bunch of space on completely non-redundant drives (theory being that the data on them is transient), lots of bandwidth, for a low price. Some people use them for BitTorrent, others for media serving and such.
If you are willing to take the lowest in drive redundancy, there are some deals to be had. Whatbox is a popular leader here, and has an extensive wiki with info. Or you can find some seedbox.io “shared storage” plans – for instance, 12TB for $32.49/mo. But it’s completely non-redundant drives.
Seedbox has a partner company, Walker Servers, with some interesting deals; for instance, 4x8TB for EUR 52.45. Not bad for 24TB usable with RAID5 – but Walker Servers is completely unknown to me and doesn’t publish a phone number. So, YMMV.
Conclusion
I’m sure I’ve left out many quality options here, but hopefully this is enough to give a general lay of the land. Leave other suggestions in the comments.