The Hidden Drawbacks of P2P (And a Defense of Signal)

Not long ago, I posted a roundup of secure messengers with off-the-grid capabilities. Some conversation followed, which led me to consider some of the problems with P2P protocols.

P2P and Privacy

Brave adopting IPFS has driven a lot of buzz lately. IPFS is essentially a decentralized, distributed web. This concept has a lot of promise. But take a look at the IPFS privacy document. Some things to highlight:

  • “Nodes announce a variety of information essential to the DHT’s function β€” including their unique node identifiers (PeerIDs) and the CIDs of data that they’re providing β€” and because of this, information about which nodes are retrieving and/or reproviding which CIDs is publicly available.”
  • “those DHT queries happen in public. Because of this, it’s possible that third parties could be monitoring this traffic to determine what CIDs are being requested, when, and by whom.”
  • “nodes’ unique identifiers are themselves public…your PeerID is still a long-lived, unique identifier for your node. Keep in mind that it’s possible to do a DHT lookup on your PeerID and, particularly if your node is regularly running from the same location (like your home), find your IP address…Additionally, longer-term monitoring of the public IPFS network could yield information about what CIDs your node is requesting and/or reproviding and when.”

So in this case, you have traded giving information about what you request to specific sites to giving it to potentially hundreds of untrusted peers, some of which may be logging this for nefarious purposes. Worse, you have a durable PeerID that can be used for tracking and tied to your IP address — a data collector’s dream. This PeerID, combined with DHT requests and the CIDs (Content ID) of the things you host (implying you viewed them in the past), can be used to establish a picture of what you are requesting now and requested recently.

Similar can be said from everything like Scuttlebutt to GNU Jami; any service that operates on a P2P basis will likely reveal your IP, and tie your identity to it (and your IP address history). In some cases, as with Jami, this would be limited to friends you add; in others, as with Scuttlebutt and IPFS, it could be revealed to anyone.

The advantages of P2P are undeniable and profound, but few are effectively addressing the privacy implications. The one I know of that is, Briar, routes all traffic over Tor; every node is reached by a Tor onion service.

Federation: somewhat better

In a federated model, every client connects to a server, and there are many servers participating in a federation with each other. Matrix and Mastodon are examples of a federated model. In this scenario, only one server — your own homeserver — can track you by IP. End-to-end encryption is certainly possible in a federated model, and Matrix supports it. This does give a third party (the specific server you use) knowledge of your IP, but that knowledge can be significantly limited.

A downside of this approach is that if your particular homeserver is down, you are unable to communicate. Truly decentralized P2P solutions don’t have that problem — thought they do have a related one, which is that clients communicating with each other must both be online simultaneously in order for messages to be transmitted, and this can be a real challenge for mobile devices.

Centralization and Signal

Signal is centralized; it has one central server farm, and if it is down, you can’t communicate or choose any other server, either. We saw it go down recently after Elon Musk mentioned it.

Still, I recommend Signal for the general public. Here’s why.

Signal brings encryption and privacy to meet people where they’re at, not the other way around. People don’t have to choose a server, it can automatically recognize contacts that use Signal, it has emojis, attachments, secure voice and video calling, and (aside from the Musk incident), it all just works. It feels like, and is, a polished, modern experience with the bells and whistles people are used to.

I’m a huge fan of Matrix (aka Element) and even run my own instance. It has huge promise. But it is Not. There. Yet. Why do I saw this about Matrix?

  • Synapse, the only currently viable Matrix server, is not ready. My Matrix instance hosts ONE person, me. Synapse uses many GB of RAM and 10+GB of disk space. Despite extensive tuning, nothing helped much. It’s caused OOMs more than once. It can’t be hosted on a Raspberry Pi or even one of the cheaper VPSs.
  • Now then, how about choosing a Matrix instance? Well, you could just tell a person to use matrix.org. But then it spent a good portion of last year unable to federate with other popular nodes due to Synapse limitations. Or you could pick a random node, but will it be up when someone needs to say “my car broke down?” Some are run from a dorm computer, some by a team in a datacenter, some by one person with EC2, and you can’t really know. Will your homeserver be stable and long-lived? Hard to say.
  • Voice and video calling are not there yet in Matrix. Matrix has two incompatible video calling methods (Jitsi and built-in), neither work consistently well, both are hard to manage, and both have NAT challenges.
  • Matrix is so hard to set up on a server that there is matrix-docker-ansible-deploy. This makes it much better, but it is STILL terribly hard to deploy, and very simple things like “how do I delete a user” or “let me shrink down this 30GB database” are barely there yet, if at all.
  • Encryption isn’t mandatory in Matrix. E2EE has been getting dramatically better in the last few releases, but it is still optional, especially for what people would call “group chats” (rooms). Signal is ALWAYS encrypted. Always. (Unless, I guess, you set it as your SMS provider on Android). You’ve got to take the responsibility off the user to verify encryption status, and instead make it the one and only way to use the ecosystem.

Again, I love MAtrix. I use it every day to interact with Matrix, IRC, Slack, and Discord channels. It has a ton of promise. But would I count on it to carry a “my car’s broken down and I’m stranded” message? No.

How about some of the other options out there? I mentioned Briar above. It’s fantastic and its offline options are novel and promising. But in common usage, it can’t deliver a message unless both devices are online simultaneously, and doesn’t run on iOS (though both are being worked on). It also can’t send photos or do voice or video calling.

Some of these same limitations apply to most of the other Signal alternatives also. either that, or they are encryption-optional, or terribly hard to set up and use. I recently mentioned Status, which shows a ton of promise, but has no voice or video calling capabilities. Scuttlebutt is a fantastic protocol with extremely difficult onboarding (lengthy process, error-prone finding a pub, multi-GB initial download, etc.) And many of these leak IP addresses as discussed above.

So Signal gives people:

  • Dead-simple setup
  • Store-and-forward delivery (devices need not be online simultaneously)
  • Encrypted everything, including voice and video calls, and the ability to send photos and video encrypted

If you are going to tell someone, “it’s so EASY to get your texts away from Facebook and AT&T”, then Signal is the thing you’ve got to point them to. It may not be in two years, but for now, it is. Do not let the perfect be the enemy of the good. It advances the status quo without harming usability, which nothing else does yet.

I am aware of all of the very legitimate criticisms of Signal. They are real and they are why I am excited that there are so many alternatives with promise, some of which I use actively. Let us technical people use, debug, contribute to, and evangelize the alternatives.

And while we’re doing that, tell Grandma to contact us on Signal.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.