Tag Archives: Debian

Censorship Is Complicated: What Internet History Says about Meta/Facebook

January 8, 2025Freedom, Online Lifebbs, censorship, Debian, facebook, free speech, Internet, usenetJohn Goerzen

In light of this week’s announcement by Meta (Facebook, Instagram, Threads, etc), I have been pondering this question: Why am I, a person that has long been a staunch advocate of free speech and encryption, leery of sites that talk about being free speech-oriented? And, more to the point, why an I — a person that has been censored by Facebook for mentioning the Open Source social network Mastodon — not cheering a “lighter touch”?

The answers are complicated, and take me back to the early days of social networking. Yes, I mean the 1980s and 1990s.

Before digital communications, there were barriers to reaching a lot of people. Especially money. This led to a sort of self-censorship: it may be legal to write certain things, but would a newspaper publish a letter to the editor containing expletives? Probably not.

As digital communications started to happen, suddenly people could have their own communities. Not just free from the same kinds of monetary pressures, but free from outside oversight (parents, teachers, peers, community, etc.) When you have a community that the majority of people lack the equipment to access — and wouldn’t understand how to access even if they had the equipment — you have a place where self-expression can be unleashed.

And, as J. C. Herz covers in what is now an unintentional history (her book Surfing on the Internet was published in 1995), self-expression WAS unleashed. She enjoyed the wit and expression of everything from odd corners of Usenet to the text-based open world of MOOs and MUDs. She even talks about groups dedicated to insults (flaming) in positive terms.

But as I’ve seen time and again, if there are absolutely no rules, then whenever a group gets big enough — more than a few dozen people, say — there are troublemakers that ruin it for everyone. Maybe it’s trolling, maybe it’s vicious attacks, you name it — it will arrive and it will be poisonous.

I remember the debates within the Debian community about this. Debian is one of the pillars of the Internet today, a nonprofit project with free speech in its DNA. And yet there were inevitably the poisonous people. Debian took too long to learn that allowing those people to run rampant was causing more harm than good, because having a well-worn Delete key and a tolerance for insults became a requirement for being a Debian developer, and that drove away people that had no desire to deal with such things. (I should note that Debian strikes a much better balance today.)

But in reality, there were never absolutely no rules. If you joined a BBS, you used it at the whim of the owner (the “sysop” or system operator). The sysop may be a 16-yr-old running it from their bedroom, or a retired programmer, but in any case they were letting you use their resources for free and they could kick you off for any or no reason at all. So if you caused trouble, or perhaps insulted their cat, you’re banned. But, in all but the smallest towns, there were other options you could try.

On the other hand, sysops enjoyed having people call their BBSs and didn’t want to drive everyone off, so there was a natural balance at play. As networks like Fidonet developed, a sort of uneasy approach kicked in: don’t be excessively annoying, and don’t be easily annoyed. Like it or not, it seemed to generally work. A BBS that repeatedly failed to deal with troublemakers could risk removal from Fidonet.

On the more institutional Usenet, you generally got access through your university (or, in a few cases, employer). Most universities didn’t really even know they were running a Usenet server, and you were generally left alone. Until you did something that annoyed somebody enough that they tracked down the phone number for your dean, in which case real-world consequences would kick in. A site may face the Usenet Death Penalty — delinking from the network — if they repeatedly failed to prevent malicious content from flowing through their site.

Some BBSs let people from minority communities such as LGBTQ+ thrive in a place of peace from tormentors. A lot of them let people be themselves in a way they couldn’t be “in real life”. And yes, some harbored trolls and flamers.

The point I am trying to make here is that each BBS, or Usenet site, set their own policies about what their own users could do. These had to be harmonized to a certain extent with the global community, but in a certain sense, with BBSs especially, you could just use a different one if you didn’t like what the vibe was at a certain place.

That this free speech ethos survived was never inevitable. There were many attempts to regulate the Internet, and it was thanks to the advocacy of groups like the EFF that we have things like strong encryption and a degree of freedom online.

With the rise of the very large platforms — and here I mean CompuServe and AOL at first, and then Facebook, Twitter, and the like later — the low-friction option of just choosing a different place started to decline. You could participate on a Fidonet forum from any of thousands of BBSs, but you could only participate in an AOL forum from AOL. The same goes for Facebook, Twitter, and so forth. Not only that, but as social media became conceived of as very large sites, it became impossible for a person with enough skill, funds, and time to just start a site themselves. Instead of neading a few thousand dollars of equipment, you’d need tens or hundreds of millions of dollars of equipment and employees.

All that means you can’t really run Facebook as a nonprofit. It is a business. It should be absolutely clear to everyone that Facebook’s mission is not the one they say it is — “[to] give people the power to build community and bring the world closer together.” If that was their goal, they wouldn’t be creating AI users and AI spam and all the rest. Zuck isn’t showing courage; he’s sucking up to Trump and those that will pay the price are those that always do: women and minorities.

Really, the point of any large social network isn’t to build community. It’s to make the owners their next billion. They do that by convincing people to look at ads on their site. Zuck is as much a windsock as anyone else; he will adjust policies in whichever direction he thinks the wind is blowing so as to let him keep putting ads in front of eyeballs, and stomp all over principles — even free speech — doing it. Don’t expect anything different from any large commercial social network either. Bluesky is going to follow the same trajectory as all the others.

The problem with a one-size-fits-all content policy is that the world isn’t that kind of place. For instance, I am a pacifist. There is a place for a group where pacifists can hang out with each other, free from the noise of the debate about pacifism. And there is a place for the debate. Forcing everyone that signs up for the conversation to sign up for the debate is harmful. Preventing the debate is often also harmful. One company can’t square this circle.

Beyond that, the fact that we care so much about one company is a problem on two levels. First, it indicates how succeptible people are to misinformation and such. I don’t have much to offer on that point. Secondly, it indicates that we are too centralized.

We have a solution there: Mastodon. Mastodon is a modern, open source, decentralized social network. You can join any instance, easily migrate your account from one server to another, and so forth. You pick an instance that suits you. There are thousands of others you can choose from. Some aggressively defederate with instances known to harbor poisonous people; some don’t.

And, to harken back to the BBS era, if you have some time, some skill, and a few bucks, you can run your own Mastodon instance.

Personally, I still visit Facebook on occasion because some people I care about are mainly there. But it is such a terrible experience that I rarely do. Meta is becoming irrelevant to me. They are on a path to becoming irrelevant to many more as well. Maybe this is the moment to go “shrug, this sucks” and try something better.

(And when you do, feel free to say hi to me at @jgoerzen@floss.social on Mastodon.)

Backing up every few minutes with simplesnap

February 13, 2014Linuxbackups, Debian, filesystems, zfsJohn Goerzen

I’ve written a lot lately about ZFS, and one of its very nice features is the ability to make snapshots that are lightweight, space-efficient, and don’t hurt performance (unlike, say, LVM snapshots).

ZFS also has “zfs send” and “zfs receive” commands that can send the content of the snapshot, or a delta between two snapshots, as a data stream – similar in concept to an amped-up tar file. These can be used to, for instance, very efficiently send backups to another machine. Rather than having to stat() every single file on a filesystem as rsync has to, it sends effectively an intelligent binary delta — which is also intelligent about operations such as renames.

Since my last search for backup tools, I’d been using BackupPC for my personal systems. But since I switched them to ZFS on Linux, I’ve been wanting to try something better.

There are a lot of tools out there to take ZFS snapshots and send them to another machine, and I summarized them on my wiki. I found zfSnap to work well for taking and rotating snapshots, but I didn’t find anything that matched my criteria for sending them across the network. It seemed par for the course for these tools to think nothing of opening up full root access to a machine from others, whereas I would much rather lock it down with command= in authorized_keys.

So I wrote my own, called simplesnap. As usual, I wrote extensive documentation for it as well, even though it is very simple to set up and use.

So, with BackupPC, a backup of my workstation took almost 8 hours. (Its “incremental” might take as few as 3 hours) With ZFS snapshots and simplesnap, it takes 25 seconds. 25 seconds!

So right now, instead of backing up once a day, I back up once an hour. There’s no reason I couldn’t back up every 5 minutes, in fact. The data consumes less space, is far faster to manage, and doesn’t require a nightly hours-long cleanup process like BackupPC does — zfs destroy on a snapshot just takes a few seconds.

I use a pair of USB disks for backups, and rotate them to offsite storage periodically. They simply run ZFS atop dm-crypt (for security) and it works quite well even on those slow devices.

Although ZFS doesn’t do file-level dedup like BackupPC does, and the lz4 compression I’ve set ZFS to use is less efficient than the gzip-like compression BackupPC uses, still the backups are more space-efficient. I am not quite sure why, but I suspect it’s because there is a lot less metadata to keep track of, and perhaps also because BackupPC has to store a new copy of a file if even a byte changes, whereas ZFS can store just the changed blocks.

Incidentally, I’ve packaged both zfSnap and simplesnap for Debian and both are waiting in NEW.

Why and how to run ZFS on Linux

January 23, 2014Debian, LinuxDebian, filesystems, zfsJohn Goerzen

I’m writing a bit about ZFS these days, and I thought I’d write a bit about why I am using it, why it might or might not be interesting for you, and what you might do about it.

ZFS Features and Background

ZFS is not just a filesystem in the traditional sense, though you can use it that way. It is an integrated storage stack, which can completely replace the need for LVM, md-raid, and even hardware RAID controllers. This permits quite a bit of flexibility and optimization not present when building a stack involving those components. For instance, if a drive in a RAID fails, it needs only rebuild the parts that have actual data stored on them.

Let’s look at some of the features of ZFS:

Full checksumming of all data and metadata, providing protection against silent data corruption. The only other Linux filesystem to offer this is btrfs.
ZFS is a transactional filesystem that ensures consistent data and metadata.
ZFS is copy-on-write, with snapshots that are cheap to create and impose virtually undetectable performance hits. Compare to LVM snapshots, which make writes notoriously slow and require an fsck and mount to get to a readable point.
ZFS supports easy rollback to previous snapshots.
ZFS send/receive can perform incremental backups much faster than rsync, particularly on systems with many unmodified files. Since it works from snapshots, it guarantees a consistent point-in-time image as well.
Snapshots can be turned into writeable “clones”, which simply use copy-on-write semantics. It’s like a cp -r that completes almost instantly and takes no space until you change it.
The datasets (“filesystems” or “logical volumes” in LVM terms) in a zpool (“volume group”, to use LVM terms) can shrink or grow dynamically. They can have individual maximum and minimum sizes set, but unlike LVM, where if, say, /usr gets bigger than you thought, you have to manually allocate more space to it, ZFS datasets can use any space available in the pool.
ZFS is designed to run well in big iron, and scales to massive amounts of storage. It supports SSDs as L2 cache and ZIL (intent log) devices.
ZFS has some built-in compression methods that are quite CPU-efficient and can yield not just space but performance benefits in almost all cases involving compressible data.
ZFS pools can host zvols, a block device under /dev that stores its data in the zpool. zvols support TRIM/DISCARD, so are ideal for storing VM images, as they can instantly release space released by the guest OS. They can also be snapshotted and backed up like the rest of ZFS.

Although it is often considered a server filesystem, ZFS has been used in plenty of other situations for some time now, with ports to FreeBSD, Linux, and MacOS. I find it particularly useful:

To have faith that my photos, backups, and paperwork archives are intact. zpool scrub at any time will read the entire dataset and verify the integrity of every bit.
I can create snapshots of my system before running apt-get dist-upgrade, making it easy to track down issues or roll back to a known-good configuration. Ideal for people tracking sid or testing. One can also easily simply boot from a previous snapshot.
Many scripts exist that make frequent snapshots, and retain the for a period of time as a way of protecting work in progress against an accidental rm. There is no reason not to snapshot /home every 5 minutes, for instance. It’s almost as good as storing / in git.

The added level of security in having cheap snapshots available is almost worth it by itself.

ZFS drawbacks

Compared to other Linux filesystems, there are a few drawbacks of ZFS:

CDDL will prevent it from ever being part of the Linus kernel tree
It is more RAM-hungry than most, although with tuning it can even run on the Raspberry Pi.
A 64-bit kernel is strongly preferred, even in low-memory situations.
Performance on many small files may be less than ext4
The ZFS cache does not shrink and expand in response to changing RAM usage conditions on the system as well as the normal Linux cache does.
Compared to btrfs, ZFS lacks some features of btrfs, such as being able to shrink an existing pool or easily change storage allocation on the fly. On the other hand, the features in ZFS have never caused me a kernel panic, and half the things I liked about btrfs seem to have.
ZFS is already quite stable on Linux. However, the GRUB, init, and initramfs code supporting booting from a ZFS root and /boot is less stable. If you want to go 100% ZFS, be prepared to tweak your system to get it to boot properly. Once done, however, it is quite stable.

Converting to ZFS

I have written up an extensive HOWTO on converting an existing system to use ZFS. It covers workarounds for all the boot-time bugs I have encountered as well as documenting all steps needed to make it happen. It works quite well.

Additional Hints

If setting up zvols to be used by VirtualBox or some such system, you might be interested in managing zvol ownership and permissions with udev.

Debian-Live Rescue image with ZFS On Linux; Ditched btrfs

January 22, 2014Linuxbtrfs, Debian, filesystems, zfsJohn Goerzen

I’m a geek. I enjoy playing with different filesystems, version control systems, and, well, for that matter, radios.

I have lately started to worry about the risks of silent data corruption, and as such, looked to switch my personal systems to either ZFS or btrfs, both of which offer built-in checksumming of all data and metadata. I initially opted for btrfs, because of its tighter integration into the Linux kernel and ability to shrink an existing btrfs filesystem.

However, as I wrote last month, that experiment was not a success. I had too many serious performance regressions and one too many kernel panics and decided it wasn’t worth it. And that the SuSE people got it wrong, deeply wrong, when they declared btrfs ready for production. I never lost any data, to its credit. But it simply reduces uptime too much.

That left ZFS. Before I build a system, I always want to make sure I can repair it. So I started with the Debian Live rescue image, and added the zfsonlinux.org repository to it, along with some key packages to enable the ZFS kernel modules, GRUB support, and initramfs support. The resulting image is described, and can be downloaded from, my ZFS Rescue Disc wiki page, which also has a link to my source tree on github.

In future blog posts in the series, I will describe the process of converting existing Debian installations to use ZFS, of getting them to boot from ZFS, some bugs I encountered along the way, and some surprising performance regressions in ZFS compared to ext4 and btrfs.

Administering Dozens of Debian Servers

December 16, 2008DebianDebian, security, systemadminJohn Goerzen

At work, we have quite a few Debian servers. We have a few physical machines, then a number of virtual machines running under Xen. These servers are split up mainly along task-oriented lines: DNS server, LDAP server, file server, print server, mail server, several web app servers, ERP system, and the like.

In the past, we had fewer virtual instances and combined more services into a single OS install. This led to some difficulties, especially with upgrades. If we wanted to upgrade the OS for, say, the file server, we’d have to upgrade the web apps and test them along with it at the same time. This was not a terribly sustainable approach, hence the heavier reliance on smaller virtual environments.

All these virtual environments have led to their own issues. One of them is getting security patches installed. At present, that’s a mainly manual task. In the past, I used cron-apt a bit, but it seemed to be rather fragile. I’m wondering what people are using to get security updates onto servers in an automated fashion these days.

The other issue is managing the configuration of these things. We have some bits of configuration that are pretty similar between servers — the mail system setup, for instance. Most of them are just simple SMTP clients that need to be able to send out cron reports and the like. We had tried using cfengine2 for this, but it didn’t work out well. I don’t know if it was our approach or not, but we found that hacking cfengine2 after making changes on systems was too time-consuming, and so that task slipped and eventually cfengine2 wasn’t doing what it should anymore. And that even with taking advantage of it being able to do things like put the local hostname in the right places.

I’ve thought a bit about perhaps providing some locally-built packages that establish these config files, or load them up with our defaults. That approach has worked out well for me before, though it also means that pushing out changes isn’t a simple hack of a config file somewhere anymore.

It seems like a lot of the cfengine2/bcfg tools are designed for environments where servers are more homogenous than ours. bcfg2, in particular, goes down that road; it makes it difficult to be able to log on to a web server, apt-get install a few PHP modules that we need for a random app, and just proceed.

Any suggestions?

New version of datapacker

September 25, 2008Softwaredatapacker, Debian, haskellJohn Goerzen

I wrote before about datapacker, but I didn’t really describe what it is or how it’s different from other similar programs.

So, here’s the basic problem the other day. I have a bunch of photos spanning nearly 20 years stored on my disk. I wanted to burn almost all of them to DVDs. I can craft rules with find(1) to select the photos I want, and then I need to split them up into individual DVDs. There are a number of tools that did that, but not quite powerful enough for what I want.

When you think about splitting things up like this, there are a lot of ways you can split things. Do you want to absolutely minimize the number of DVDs? Or do you keep things in a sorted order, and just start a new DVD when the first one fills up? Maybe you are adding an index to the first DVD, and need a different size for it.

Well, datapacker 1.0.1 can solve all of these problems. As its manpage states, “datapacker is a tool in the traditional Unix style; it can be used in pipes and call other tools.” datapacker accepts lists of files to work on as command-line parameters, piped in from find, piped in from find -print0. It can also output its results in various parser-friendly formats, call other programs directly in a manner similar to find -exec, or create hardlink or symlink forests for ease of burning to DVD (or whatever you’ll be doing with it).

So, what I did was this:

find Pictures -type f -and -not -iwholename "Pictures/2001/*.tif" -and \ -not -wholename "Pictures/Tabor/*" -print0 | \ datapacker -0Dp -s 4g --sort -a hardlink -b ~/bins/%03d -

So I generate a list of photos to process with find. Then datapacker is told to read the list of files to process in a null-separated way (-0), generate bins that mimic the source directory structure (-D), organize into bins preserving order (-p), use a 4GB size per bin (-s 4g), sort the input prior to processing (–sort), create hardlinks for the files (-a hardlink), and then name the bins with a 3-digit number under ~/bins, and finally, read the list of files from stdin (-). By using –sort and -p, the output will be sorted by year (Pictures/2000, Pictures/2001, etc), so that photos from all years aren’t all mixed in on the discs.

This generates 13 DVD-sized bins in a couple of seconds. A simple for loop then can use mkisofs or growisofs to burn them.

The datapacker manpage also contains an example for calling mkisofs directly for each bin, generating ISOs without even an intermediate hardlink forest.

So, when I wrote about datapacker last time, people asked how it differed from other tools. Many of them had different purposes in mind. So I’m not trying to say one tool or the other is better, just highlighting differences. Most of these appear to not have anything like datapacker –deep-links.

gaffiter: No xargs-convenient output, no option to pass results directly to shell commands. Far more complex source (1671 lines vs. 228 lines)

dirsplit: Park of mkisofs package. Uses a random iterative approach, few options.

packcd: Similar packing algorithm options, but few input/input options. No ability to read a large file list from stdin. Could have issues with command line length.

Thoughtfulness on the OpenSSL bug

May 14, 2008DebianDebian, securityJohn Goerzen

By now, I’m sure you all have read about the OpenSSL bug discovered in Debian.

There’s a lot being written about it. There’s a lot of misinformation floating about, too. First thing to do is read this post, which should clear up some of that.

Now then, I’d like to think a little about a few things people have been saying.

People shouldn’t try to fix bugs they don’t understand.

At first, that sounds like a fine guideline. But when I thought about it a bit, I think it’s actually more along the lines of useless.

First of all, there is this problem: how do you know whether or not you understand it? Obviously, sometimes you know you don’t understand code well. But there are times when you think you do, but don’t. Especially when we’re talking about C and its associated manual memory management and manual error handling. I’d say that, for a C program of any given size, very few people really understand it. Especially since you may be dealing with functions that call other functions 5 deep, and one of those functions modifies what you thought was an input-only parameter in certain rare cases. Maybe it’s documented to do that, maybe not, but of course documentation cannot always be trusted either.

I’d say it’s more useful to say that people should get peer review of code whenever possible. Which, by the way, did occur here.

The Debian maintainer of this package {is an idiot, should be fired, should be banned}

I happen to know that the Debian programmer that made this patch is a very sharp individual. I have worked with him on several occasions and I would say that kicking him out of maintaining OpenSSL would be a quite stupid thing to do.

He is, like the rest of us, human. We might find that other people are considerably less perfect than he.

Nobody that isn’t running Debian or Ubuntu has any need to worry. This is all Debian’s fault.

I guess you missed the part of the advisory that mentioned that it also fixed an OpenSSL upstream bug (that *everyone* is vulnerable to) that permitted arbitrary code execution in a certain little-used protocol? OpenSSL has a history of security bugs over the years.

Of course, the big keygen bug is a Debian-specific thing.

Debian should send patches upstream

This is general practice in Debian. It happens so often, in fact, that the Debian bug-tracking system has had — for probably more than a decade — a feature that lets a Debian developer record that a bug reported to Debian has been forwarded to an upstream developer or bug-tracking system.

It is routine to send both bug reports and patches upstream. Some Debian developers are more closely aligned with upstream than others. In some cases, Debian developers are part of the upstream team. In others, upstream may be friendly and responsive enough that Debian developers run any potential patches to upstream code by them before committing them to Debian. (I tend to do this for Bacula). In some cases, upstream is busy and doesn’t respond fast or reliably or helpfully enough to permit Debian to make security updates or other important fixes in a timely manner. And sometimes, upstream is plain AWOL.

Of course, it benefits Debian developers to send patches upstream, because then they have a smaller diff to maintain when each new version comes out.

In this particular case, communication with upstream happened, but the end result just fell through the cracks.

Debian shouldn’t patch security-related stuff itself, ever

Well, that’s not a very realistic viewpoint. Every Linux distribution does this, for several reasons. First, a given stable release of a distribution may be older than the current state of the art upstream software, and some upstreams are not interested in patching old versions, while the new upstream versions introduce changes too significant to go into a security update. Secondly, some upstreams do not respond in a timely manner, and Debian wants to protect its users ASAP. Finally, some upstreams are simply bad at security, and having smart folks from Debian — and other distributions — write security patches is a benefit to the community.

Debian Developers 7 Years Ago

May 23, 2007DebianDebianJohn Goerzen

Today while looking for something else, I stumbled across a DVD with the “last archive” of my old personal website. On it were a number of photos from the 2000 Annual Linux Conference in Atlanta, and the Debian developers that were there. These were posted in public for several years.

I’ve now posted all of them on flickr, preserving the original captions.

Here’s the obligatory sample:

That’s Joey Hess, using what I think was his Vaio. Most acrobatic keyboardist ever. Probably the only person that could write Perl with one hand comfortably.

What else can you see? The best of show award that Debian won that is now in my basement due to a complicated series of events, the Debian machines that were being shown off at the show, Sean Perry and Manoj, the photo with long-term corrupted caption, and of course, numerous shots of Branden.

I know the size stinks. It was scanned at a web resolution for 2000. I do still have the negatives somewhere and will post the rest of them, in higher res, when I find them.

Click here to view the full set.

HP Officially Supports Debian

August 14, 2006DebianDebian, hpJohn Goerzen

Yes, it’s true.

I have two words for this: Woohoo! Finally.

How to solve “The following packages cannot be authenticated”

June 15, 2006Debianapt, Debian, gnupg, gpgJohn Goerzen

Users of Debian’s testing or unstable distributions may be noticing messages from apt saying things like:

WARNING: The following packages cannot be authenticated!
  foo bar baz
Install these packages without verification [y/N]?

I noticed today that google doesn’t turn up good hits for the fix. The fix is really simple:

apt-get install debian-archive-keyring
apt-get update

That’s it. You now have secure packages from Debian. Nice, eh?

The Changelog

Comments on family, technology, and society