I’m going to lead with the technical punch line, and then explain it:
Yggdrasil Network is an opportunistic mesh that can be deployed privately or as part of a global-scale network. Each node gets a stable IPv6 address (or even an entire /64) that is derived from its public key and is bound to that node as long as the node wants it (of course, it can generate a new keypair anytime) and is valid wherever the node joins the mesh. All traffic is end-to-end encrypted.
Yggdrasil will automatically discover peers on a LAN via broadcast beacons, and requires zero configuration to peer in such a way. It can also run as an overlay network atop the public Internet. Public peers serve as places to join the global network, and since it’s a mesh, if one device on your LAN joins the global network, the others will automatically have visibility on it also, thanks to the mesh routing.
It neatly solves a lot of problems of portability (my ssh sessions stay live as I move networks, for instance), VPN (incoming ports aren’t required since local nodes can connect to a public peer via an outbound connection), security, and so forth.
Now on to the explanation:
The Tyranny of IP rigidity
Every device on the Internet, at one time, had its own globally-unique IP address. This number was its identifier to the world; with an IP address, you can connect to any machine anywhere. Even now, when you connect to a computer to download a webpage or send a message, under the hood, your computer is talking to the other one by IP address.
Only, now it’s hard to get one. The Internet protocol we all grew up with, version 4 (IPv4), didn’t have enough addresses for the explosive growth we’ve seen. Internet providers and IT departments had to use a trick called NAT (Network Address Translation) to give you a sort of fake IP address, so they could put hundreds or thousands of devices behind a single public one. That, plus the mobility of devices — changing IPs whenever they change locations — has meant that a fundamental rule of the old Internet is now broken:
Every participant is an equal peer. (Well, not any more.)
Nowadays, you can’t host your own website from your phone. Or share files from your house. (Without, that is, the use of some third-party service that locks you down and acts as an intermediary.)
Back in the 90s, I worked at a university, and I, like every other employee, had a PC on my desk with an unfirewalled public IP. I installed a webserver, and poof – instant website. Nowadays, running a website from home is just about impossible. You may not have a public IP, and if you do, it likely changes from time to time. And even then, your ISP probably blocks you from running servers on it.
In short, you have to buy your way into the resources to participate on the Internet.
I wrote about these problems in more detail in my article Recovering Our Lost Free Will Online.
Enter Yggdrasil
I already gave away the punch line at the top. But what does all that mean?
I’ve set up /etc/hosts on my laptop to use the Yggdrasil IPs for other machines on my LAN. Now I can just “ssh foo” and it will work — from home, from a coffee shop, from a 4G tether, wherever. Now, other tools like tinc can do this, obviously. And I could stop there; I could have a completely closed, private Yggdrasil network.
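As a rough sketch of that /etc/hosts setup: the hostnames and addresses below are made-up placeholders, and the snippet writes to a scratch file rather than the real /etc/hosts so it can be tried safely. Each node can report its actual address with yggdrasilctl getSelf.

```shell
# Hypothetical example: "foo" and "bar" and both addresses are placeholders,
# not real nodes. Point HOSTS_FILE at /etc/hosts (as root) once the entries
# hold your machines' real Yggdrasil addresses.
HOSTS_FILE="${HOSTS_FILE:-./hosts.yggdrasil}"

cat >> "$HOSTS_FILE" <<'EOF'
# Yggdrasil addresses stay the same wherever each machine is connected
200:1111:2222:3333:4444:5555:6666:7777  foo
200:aaaa:bbbb:cccc:dddd:eeee:ffff:1     bar
EOF
```

With entries like these in place, "ssh foo" resolves to the same address from any network the laptop happens to be on.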
Or, I can join the global Yggdrasil network. Each device, in addition to accepting peers it finds on the LAN, can also be configured to establish outbound peering connections or accept inbound ones over the Internet. Put a public peer or two in your configuration and you’ve joined the global network. Most people will probably want to do that on every device (because why not?), but you could also do that from just one device on your LAN. Again, there’s no need to explicitly build routes via it; your other machines on the LAN will discover the route’s existence and use it.
This is one of many projects that are working to democratize and decentralize the Internet. So far, it has been quite successful, growing to over 2000 nodes. It is the direct successor to the earlier cjdns/Hyperboria and BATMAN networks, and aims to be a proof of concept and a viable tool for global expansion.
Finally, think about how much easier development is when you don’t have to necessarily worry about TLS complexity in every single application. When you don’t have to worry about port forwarding and firewall penetration. It’s what the Internet should be.
@jgoerzen I read it, it’s very interesting. I tried Yggdrasil myself and I ended up wondering: ok nice, but what is this thing for? Your article clarifies that. Kudos
2/ One other cool thing: All you need to talk to a peer on #Yggdrasil is a switch. You can take 5 laptops, plug them into your switch, and boom – they talk to each other securely. No need for DHCP, for radvd, statically setting IPs, or whatever. Same could be done with wifi – just set out an access point, doesn’t even need to be plugged into anything, and they can talk. And if just one of those devices has outside access – all are instantly a part of the global net.
@AbbieNormal Thanks! Glad to help. I’m really excited about technologies like this.
@jgoerzen Thanks for that informative primer!
3/ Heck, you don’t even need an access point. I forgot to mention you could just set your wifi to ad-hoc mode! Yggdrasil is a perfect solution for some of the annoyances there.
@jgoerzen I’ve heard mumblings about Yggdrasil recently but other than in ancient myths (one norse based, one linux based), I didn’t know what it was. Now I actually know that it is something I want to look into! Thanks!
@jgoerzen very cool. Thanks. Will probably try. Big question: interoperability with current Internet? If I put a website up on Yggdrasil can I point random people not running Yggdrasil themselves at it? (curious)
@jgoerzen how many users does it have today? I’m looking for alternatives to i2p… also, how does DNS work there?
i’m curious: are yggdrasil IPs publicly routable? you mentioned those being accessible over 4G and so on, but does that require running a client on all your boxes?
in other words, if it’s an overlay network, is there a gateway in or out?
could i use this as a IPv6 tunnel for my IPv4-less uplink?
Yggdrasil IPs are not visible on the “clearnet” (the “default” Internet). Since an Yggdrasil IP is derived from the node’s public key, rather than its provider or host network, that does mean that it doesn’t use the conventional routing model either.
The usual approach is to run a Yggdrasil client on each box. That way, each box can roam wherever (eg, home, coffee shop, work/school, etc.) and they will have connectivity with each other regardless of physical location. Since Yggdrasil can discover peers on the LAN automatically, this is particularly nice and doesn’t even require Internet for LAN connectivity.
It can, of course, overlay over the clearnet (both IPv4 and IPv6). On the services page https://yggdrasil-network.github.io/services.html you can find a SOCKS proxy from Yggdrasil to clearnet, as well as various I2P, IPFS, Tor, etc. endpoints. I am not aware of a general public proxy from clearnet into Yggdrasil, but there is nothing preventing that.
Yggdrasil also gives you a /64 of IPv6 space, so an alternative approach is to install it on a router (or some other device) and provide access to the entire network that way. However, that approach sacrifices end-to-end encryption (since then it becomes end-to-router encryption).
I’m not quite sure what you mean by “IPv6 tunnel for my IPv4-less uplink”. You can certainly route traffic over it, and it could of course be used as a VPN to link two /64s. You might also be interested in tinc, which is designed as a closed VPN despite having some mesh features like Yggdrasil.
by “IPv6 tunnel for my IPv4-less uplink” i mean stuff like the IPv6 tunnels he.net provides for us poor folks still stuck in the legacy IPv4 world. it gives you access to the full internet over IPv6.
but from your description, I understand that’s not the point of Yggdrasil. the IPv6 space is not publicly routable.
I realize that it’s already end-to-end encrypted, but how does yggdrasil interact with web services that are currently run over https (with letsencrypt certs)? Are we forced to use plain http and disable (or whitelist) things like https-everywhere in browsers, or have you figured out a good way to handle https within yggdrasil?
It doesn’t interact well with Let’s Encrypt certs, at least by default. The general convention is to not use HTTPS over Yggdrasil due to no need and the complexity of cert verification for that.
Note: this post is also available on my website, where it will be updated periodically.
When things are difficult – maybe there’s been a disaster, or an invasion (this page is being written in 2022 just after Russia invaded Ukraine), or maybe you’re just backpacking off the grid – there are tools that can help you keep in touch, or move your data around. This page aims to survey some of them, roughly in order from easiest to more complex.
Simple radios
Handheld radios shouldn’t be forgotten. They are cheap, small, and easy to operate. Their range isn’t huge – maybe a couple of miles in rural areas, much less in cities – but they can be a useful place to start. They tend to have no actual encryption features (the “privacy” features really aren’t.) In the USA, options are FRS/GMRS and CB.
Syncthing
With Syncthing, you can share files among your devices or with your friends. Syncthing essentially builds a private mesh for file sharing. Devices will auto-discover each other when on the same LAN or Wifi network, and opportunistically sync.
I wrote more about offline uses of Syncthing, and its use with NNCP, in my blog post A simple, delay-tolerant, offline-capable mesh network with Syncthing (+ optional NNCP). Yes, it is a form of a Mesh Network!
Homepage: https://syncthing.net/
Briar
Briar is an instant messaging service based around Android. It’s IM with a twist: it can use a mesh of Bluetooth devices. Or, if Internet is available, Tor. It has even been extended to support the use of SD cards and USB sticks to carry your messages.
Like some others here, it can relay messages for third parties as well.
Homepage: https://briarproject.org/
Manyverse and Scuttlebutt
Manyverse is a client for Scuttlebutt, which is a sort of asynchronous, offline-friendly social network. You can use it to keep in touch with your family and friends, and it supports syncing over Bluetooth and Wifi even in the absence of Internet.
Homepages: https://www.manyver.se/ and https://scuttlebutt.nz/
Yggdrasil
Yggdrasil is a self-healing, fully end-to-end Encrypted Mesh Network. It can work among local devices or on the global Internet. It has network services that can egress onto things like Tor, I2P, and the public Internet. Yggdrasil makes a perfect companion to ad-hoc wifi as it has auto peer discovery on the local network.
I talked about it in more detail in my blog post Make the Internet Yours Again With an Instant Mesh Network.
Homepage: https://yggdrasil-network.github.io/
Ad-Hoc Wifi
Few people know about the ad-hoc wifi mode. Ad-hoc wifi lets devices in range talk to each other without an access point. You just all set your devices to the same network name and password and there you go. However, there often isn’t DHCP, so IP configuration can be a bit of a challenge. Yggdrasil helps here.
NNCP
Moving now to more advanced tools, NNCP lets you assemble a network of peers that can use Asynchronous Communication over sneakernet, USB drives, radios, CD-Rs, Internet, Tor, NNCP over Yggdrasil, Syncthing, Dropbox, S3, you name it. NNCP supports multi-hop file transfer and remote execution. It is fully end-to-end encrypted. Think of it as the offline version of ssh.
Homepage: https://nncp.mirrors.quux.org/
Meshtastic
Meshtastic uses long-range, low-power LoRa radios to build a long-distance, encrypted, instant messaging system that is a Mesh Network. It requires specialized hardware, about $30, but will tend to get much better range than simple radios, and with very little power.
Homepages: https://meshtastic.org/ and https://meshtastic.letstalkthis.com/
Portable Satellite Communicators
You can get portable satellite communicators that can send SMS from anywhere on earth with a clear view of the sky. The Garmin InReach mini and Zoleo are two credible options. Subscriptions range from about $10 to $40 per month depending on usage. They also have global SOS features.
Telephone Lines
If you have a phone line and a modem, UUCP can get through just about anything. It’s an older protocol that lacks modern security, but will deal with slow and noisy serial lines well. XBee SX radios also have a serial mode that can work well with UUCP.
Additional Suggestions
It is probably useful to have a Linux live USB stick with whatever software you want to use handy. Debian can be installed from the live environment, or you could use a security-focused distribution such as Tails or Qubes.
References
This page originated in my Mastodon thread and incorporates some suggestions I received there.
It also formed a post on my blog.
@jvalleroy I keep thinking of more things to say after I write “/end”. So #Yggdrasil also works on a hyperlocal scale. Take 5 laptops and connect them to an ad-hoc wifi network, and with #Yggdrasil, they’ll auto-discover each other and communicate – even if they don’t even have IPs assigned! If just one of those nodes can also reach the Internet, then all of a sudden all of them can talk to global Yggdrasil also, because they automatically discover the route. Very cool. /end
@netopwibby Now in the “each site gets a /64” scenario, you have the one Yggdrasil gateway, so the nodes on the network don’t have the classic #Yggdrasil benefits of IP portability and such. But, all your stuff from printers to cameras can just use the network nicely. You’ve basically replaced the current Internet backbone (or layered atop it, depending). 5/
@netopwibby So let’s say you have a company with offices in five cities. You want the various networks to all see each other, seamlessly, and securely. A classic approach might involve VPNs. But then you quickly get into topology questions: who connects to whom? What happens if one site goes down – can all the others keep communicating between themselves? #tinc or #Yggdrasil can address this. 6/
@netopwibby With #Yggdrasil in this scenario, you could establish links from each site to each other site (if you wish). If a backhoe accident takes one of those links down, Yggdrasil will automatically figure out how to route traffic between A and B via, say, C. You can build up whatever topology you like, and you don’t have to teach Yggdrasil about it – it will /discover/ it, and also discover and adapt to changes in it (such as outages). 7/
@netopwibby So compared to VPNs and leased lines, this is a lot easier to manage. Still, in my use cases, I haven’t (yet) used the /64 because I have generally put Yggdrasil directly on each machine I want to use with it. But there are all sorts of options. Basically, #Yggdrasil lets you build your own #Internet, how you like, without all the expense and complexity. Pretty nifty. /end
@jgoerzen This was incredibly enlightening, thank you for taking the time to ELI5 haha!
Why not zerotier or tailgate? I can’t see any difference in general except pure IPv6
They’re targeting fundamentally different things (I assume you meant Tailscale rather than Tailgate). Their free tier is limited to around 20 devices; Yggdrasil’s global mesh has thousands and you can join it, or build a private mesh to whatever scale you want too.
Update 2023-04: The version of this page on my public website has some important updates, including how to use broadcast detection in Docker, Yggdrasil zero-config for ephemeral containers, and more. See it for the most current information.
Sometimes you might want to run Docker containers on more than one host. Maybe you want to run some at one hosting facility, some at another, and so forth.
Maybe you’d like to run VMs at various places, and let them talk to Docker containers and bare metal servers wherever they are.
And maybe you’d like to be able to easily migrate any of these from one provider to another.
There are all sorts of very complicated ways to set all this stuff up. But there’s also a simple one: Yggdrasil.
My blog post Make the Internet Yours Again With an Instant Mesh Network explains some of the possibilities of Yggdrasil in general terms. Here I want to show you how to use Yggdrasil to solve some of these issues more specifically. Because Yggdrasil is always Encrypted, some of the security lifting is done for us.
Background
Often in Docker, we connect multiple containers to a single network that runs on a given host. That much is easy. Once you start talking about containers on multiple hosts, then you start adding layers and layers of complexity. Once you start talking multiple providers, maybe multiple continents, then the complexity can increase. And, if you want to integrate everything from bare metal servers to VMs into this – well, there are ways, but they’re not easy.
I’m a believer in the KISS principle. Let’s not make things complex when we don’t have to.
Enter Yggdrasil
As I’ve explained before, Yggdrasil can automatically form a global mesh network. This is pretty cool! As most people use it, they join it to the main Yggdrasil network. But Yggdrasil can be run entirely privately as well. You can run your own private mesh, and that’s what we’ll talk about here.
All we have to do is run Yggdrasil inside each container, VM, server, or whatever. We handle some basics of connectivity, and bam! Everything is host- and location-agnostic.
Setup in Docker
The installation of Yggdrasil on a regular system is pretty straightforward. Docker is a bit more complicated for several reasons:
It blocks IPv6 inside containers by default
The default set of permissions doesn’t permit you to set up tunnels inside a container
It doesn’t typically pass multicast (broadcast) packets
Normally, Yggdrasil could auto-discover peers on a LAN interface. However, aside from some esoteric Docker networking approaches, Docker doesn’t permit that. So my approach is going to be setting up one or more Yggdrasil “router” containers on a given Docker host. All the other containers talk directly to the “router” container and it’s all good.
Basic installation
In my Dockerfile, I have something like this:
FROM jgoerzen/debian-base-security:bullseye
# Install yggdrasil from bullseye-backports
RUN echo "deb http://deb.debian.org/debian bullseye-backports main" >> /etc/apt/sources.list && \
    apt-get --allow-releaseinfo-change update && \
    apt-get -y --no-install-recommends -t bullseye-backports install yggdrasil
...
COPY yggdrasil.conf /etc/yggdrasil/
# Restrict the config (it holds the private key) and start yggdrasil at boot
RUN set -x; \
    chown root:yggdrasil /etc/yggdrasil/yggdrasil.conf && \
    chmod 0750 /etc/yggdrasil/yggdrasil.conf && \
    systemctl enable yggdrasil
The magic parameters to docker run to make Yggdrasil work are: --cap-add=NET_ADMIN --sysctl net.ipv6.conf.all.disable_ipv6=0 --device=/dev/net/tun:/dev/net/tun
This example uses my docker-debian-base images, so if you use them as well, you’ll also need to add their parameters.
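Put together, an invocation might look roughly like the sketch below. The wrapper function, the container name, and the image name are hypothetical placeholders; only the three Yggdrasil-related flags come from the text above, and any parameters your base image needs would have to be added.

```shell
# Sketch only: start_ygg_container is a made-up helper, and the name/image
# arguments are placeholders for your own container and image.
start_ygg_container() {
    name="$1"
    image="$2"
    # NET_ADMIN + /dev/net/tun let Yggdrasil create its tunnel interface;
    # the sysctl re-enables IPv6 inside the container.
    docker run -d --name "$name" \
        --cap-add=NET_ADMIN \
        --sysctl net.ipv6.conf.all.disable_ipv6=0 \
        --device=/dev/net/tun:/dev/net/tun \
        "$image"
}

# Example (not executed here): start_ygg_container ygg-router my-ygg-image
```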
Note that it is NOT necessary to use --privileged. In fact, due to the network namespaces in use in Docker, this command does not let the container modify the host’s networking (unless you use --net=host, which I do not recommend).
The --sysctl parameter was the result of a lot of banging my head against the wall. Apparently Docker tries to disable IPv6 in the container by default. Annoying.
Configuration of the router container(s)
The idea is that the router node (or more than one, if you want redundancy) will be the only ones to have an open incoming port. Although the normal Yggdrasil case of directly detecting peers in a broadcast domain is more convenient and more robust, this can work pretty well too.
You can, of course, generate a template yggdrasil.conf with yggdrasil -genconf like usual. Some things to note for this one:
You’ll want to change Listen to something like Listen: ["tls://[::]:12345"] where 12345 is the port number you’ll be listening on.
You’ll want to disable the MulticastInterfaces entirely by just setting it to [] since it doesn’t work anyway.
If you expose the port to the Internet, you’ll certainly want to firewall it to only authorized peers. Setting AllowedPublicKeys is another useful step.
If you have more than one router container on a host, each of them will both Listen and act as a client to the others. See below.
Configuration of the non-router nodes
Again, you can start with a simple configuration. Some notes here:
You’ll want to set Peers to something like Peers: ["tls://routernode:12345"] where routernode is the Docker hostname of the router container, and 12345 is its port number as defined above. If you have more than one local router container, you can simply list them all here. Yggdrasil will then fail over nicely if any one of them goes down.
Listen should be empty.
As above, MulticastInterfaces should be empty.
Using the interfaces
At this point, you should be able to ping6 between your containers. If you have multiple hosts running Docker, you can simply set up the router nodes on each to connect to each other. Now you have direct, secure, container-to-container communication that is host-agnostic! You can also set up Yggdrasil on a bare metal server or VM using standard procedures and everything will just talk nicely!
Security notes
Yggdrasil’s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure your internal comms stay private: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously.
By disabling multicast discovery, you eliminate the chance for random machines on the LAN to join the mesh. By making sure that you firewall off (outside of Yggdrasil) who can connect to a Yggdrasil node with a listening port, you can authorize only your own machines. And, by setting AllowedPublicKeys on the nodes with listening ports, you can authenticate the Yggdrasil peers. Note that part of the benefit of the Yggdrasil mesh is normally that you don’t have to propagate a configuration change to every participatory node – that’s a nice thing in general!
You can also run a firewall inside your container (I like firehol for this purpose) and aggressively firewall the IPs that are allowed to connect via the Yggdrasil interface. I like to set a stable interface name like ygg0 in yggdrasil.conf, and then it becomes pretty easy to firewall the services. The Docker parameters that allow Yggdrasil to run are also sufficient to run firehol.
Naming Yggdrasil peers
You probably don’t want to hard-code Yggdrasil IPs all over the place. There are a few solutions:
You could run an internal DNS service
You can do a bit of scripting around Docker’s --add-host option to add things to /etc/hosts
Other hints & conclusion
Here are some other helpful use cases:
If you are migrating between hosts, you could leave your reverse proxy up at both hosts, both pointing to the target containers over Yggdrasil. The targets will be automatically found from both sides of the migration while you wait for DNS caches to update and such.
This can make services integrate with local networks a lot more painlessly than they might otherwise.
This is just an idea. The point of Yggdrasil is expanding our ideas of what we can do with a network, so here’s one such expansion. Have fun!
Note: This post also has a permanent home on my website, where it may be periodically updated.
Probably everyone is familiar with a regular VPN. The traditional use case is to connect to a corporate or home network from a remote location, and access services as if you were there.
But these days, the notion of “corporate network” and “home network” are less based around physical location. For instance, a company may have no particular office at all, may have a number of offices plus a number of people working remotely, and so forth. A home network might have, say, a PVR and file server, while highly portable devices such as laptops, tablets, and phones may want to talk to each other regardless of location. For instance, a family member might be traveling with a laptop, another at a coffee shop, and those two devices might want to communicate, in addition to talking to the devices at home.
And, in both scenarios, there might be questions about giving limited access to friends. Perhaps you’d like to give a friend access to part of your file server, or as a company, you might have contractors working on a limited project.
Pretty soon you wind up with a mess of VPNs, forwarded ports, and tricks to make it all work. With the increasing prevalence of CGNAT, a lot of times you can’t even open a port to the public Internet. Each application or device probably has its own gateway just to make it visible on the Internet, some of which you pay for.
Then you add on the question of: should you really trust your LAN anyhow? With possibilities of guests using it, rogue access points, etc., the answer is probably “no”.
We can move the responsibility for dealing with NAT, fluctuating IPs, encryption, and authentication, from the application layer further down into the network stack. We then arrive at a much simpler picture for all.
So this page is fundamentally about making the network work, simply and effectively.
How do we make the Internet work in these scenarios?
We’re going to combine three concepts:
A VPN, providing fully encrypted and authenticated communication and stable IPs
Mesh Networking, in which devices automatically discover optimal paths to reach each other
Zero-trust networking, in which we do not need to trust anything about the underlying LAN, because all our traffic uses the secure systems in points 1 and 2.
By combining these concepts, we arrive at some nice results:
You can ssh hostname, where hostname is one of your machines (server, laptop, whatever), and as long as hostname is up, you can reach it, wherever it is, wherever you are.
Combined with mosh, these sessions will be durable even across moving to other host networks.
You could just as well use telnet, because the underlying network should be secure.
You don’t have to mess with encryption keys, certs, etc., for every internal-only service. Since IPs are now trustworthy, that’s all you need. hosts.allow could make a comeback!
You have a way of transiting out of extremely restrictive networks. Every tool discussed here has a way of falling back on routing things via a broker (relay) on TCP port 443 if all else fails.
There might sometimes be tradeoffs. For instance:
On LANs faster than 1Gbps, performance may degrade due to encryption and encapsulation overhead. However, these tools should let hosts discover the locality of each other and not send traffic over the Internet if the devices are local.
With some of these tools, hosts local to each other (on the same LAN) may be unable to find each other if they can’t reach the control plane over the Internet (Internet is down or provider is down).
Some other features that some of the tools provide include:
Easy sharing of limited access with friends/guests
Taking care of everything you need, including SSL certs, for exposing a certain on-net service to the public Internet
Optional routing of your outbound Internet traffic via an exit node on your network. Useful, for instance, if your local network is blocking tons of stuff.
Let’s dive in.
Types of Mesh VPNs
I’ll go over several types of meshes in this article:
Fully decentralized with automatic hop routing
This model has no special central control plane. Nodes discover each other in various ways, and establish routes to each other. These routes can be direct connections over the Internet, or via other nodes. This approach offers the greatest resilience. Examples I’ll cover include Yggdrasil and tinc.
Automatic peer-to-peer with centralized control
In this model, nodes, by default, communicate by establishing direct links between them. A regular node never carries traffic on behalf of other nodes. Special-purpose relays are used to handle cases in which NAT traversal is impossible. This approach tends to offer simple setup. Examples I’ll cover include Tailscale, Zerotier, Nebula, and Netmaker.
Roll your own and hybrid approaches
This is a “grab bag” of other ideas; for instance, running Yggdrasil over Tailscale.
Terminology
For the sake of consistency, I’m going to use common language to discuss things that have different terms in different ecosystems:
Every tool discussed here has a way of dealing with NAT traversal. It may assist with establishing direct connections (eg, STUN), and if that fails, it may simply relay traffic between nodes. I’ll call such a relay a “broker”. This may or may not be the same system that is a control plane for a tool.
All of these systems operate over lower layers that are unencrypted. Those lower layers may be a LAN (wired or wireless, which may or may not have Internet access), or the public Internet (IPv4 and/or IPv6). I’m going to call the unencrypted lower layer, whatever it is, the “clearnet”.
Evaluation Criteria
Here are the things I want to see from a solution:
Secure, with all communications end-to-end encrypted and authenticated, and prevention of traffic from untrusted devices.
Flexible, adapting to changes in network topology quickly and automatically.
Resilient, without single points of failure, and with devices local to each other able to communicate even if cut off from the Internet or other parts of the network.
Private, minimizing leakage of information or metadata about me and my systems
Able to traverse CGNAT without having to use a broker whenever possible
A lesser requirement for me, but still a nice to have, is the ability to include others via something like Internet publishing or inviting guests.
Fully or nearly fully Open Source
Free or very cheap for personal use
Wide operating system support, including headless Linux on x86_64 and ARM.
Fully Decentralized VPNs with Automatic Hop Routing
Two systems fit this description: Yggdrasil and tinc. Let’s dive in.
Yggdrasil
I’ll start with Yggdrasil because I’ve written so much about it already. It featured in prior posts such as:
Make the Internet Yours Again With an Instant Mesh Network, which described the tyranny of IP rigidity and using Yggdrasil as a global mesh overlay.
Using Yggdrasil As an Automatic Mesh Fabric to Connect All Your Docker Containers, VMs, and Servers is, in a significant sense, a more specific implementation of the ideas contained here; it’s a private Yggdrasil mesh providing the communications layer for dispersed Docker containers.
Recovering Our Lost Free Will Online: Tools and Techniques That Are Available Now features Yggdrasil.
Yggdrasil can be a private mesh VPN, or something more
Yggdrasil can be a private mesh VPN, just like the other tools covered here. Itβs unique, however, in that a key goal of the project is to also make it useful as a planet-scale global mesh network. As such, Yggdrasil is a testbed of new ideas in distributed routing designed to scale up to massive sizes and all sorts of connection conditions. As of 2023-04-10, the main global Yggdrasil mesh has over 5000 nodes in it. You can choose whether or not to participate.
Every node in a Yggdrasil mesh has a public/private keypair. Each node then has an IPv6 address (in a private address space) derived from its public key. Using these IPv6 addresses, you can communicate right away.
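When scripting around these addresses (for firewall rules or sanity checks, say), it can help to recognize them. The helper below is a hypothetical sketch, not part of Yggdrasil: it assumes, to the best of my understanding, that Yggdrasil addresses and subnets are allocated from 200::/7, and that the address is written in the usual compressed lowercase form with no leading zero.

```shell
# Hypothetical helper: succeeds if $1 looks like an address in 200::/7,
# i.e. its first group is 200 through 3ff. Assumes compressed, lowercase
# notation; a form with a leading zero such as "0200:..." is not matched.
is_yggdrasil_address() {
    case "$1" in
        2[0-9a-f][0-9a-f]:*|3[0-9a-f][0-9a-f]:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: compare a placeholder mesh address with an ordinary global one.
for addr in 200:1111:2222:3333:4444:5555:6666:7777 2001:db8::1; do
    if is_yggdrasil_address "$addr"; then
        echo "$addr looks like a Yggdrasil address"
    else
        echo "$addr is not in 200::/7"
    fi
done
```

A node's own address can be read from yggdrasilctl getSelf and fed to a check like this.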
Yggdrasil differs from most of the other tools here in that it does not necessarily seek to establish a direct link on the clearnet between, say, host A and host G for them to communicate. It will prefer such a direct link if it exists, but it is perfectly happy if it doesn’t.
The reason is that every Yggdrasil node is also a router in the Yggdrasil mesh. Let’s sit with that concept for a moment. Consider:
If you have a bunch of machines on your LAN, but only one of them can peer over the clearnet, that’s fine; all the other machines will discover this route to the world and use it when necessary.
All you need to run a broker is just a regular node with a public IP address. If you are participating in the global mesh, you can use one (or more) of the free public peers for this purpose.
It is not necessary for every node to know about the clearnet IP address of every other node (improving privacy). In fact, it’s not even necessary for every node to know about the existence of all the other nodes, so long as it can find a route to a given node when it’s asked to.
Yggdrasil can find one or more routes between nodes, and it can use this knowledge of multiple routes to aggressively optimize for varying network conditions, including combinations of, say, downloads and low-latency ssh sessions.
Behind the scenes, Yggdrasil calculates optimal routes between nodes as necessary, using a mesh-wide DHT for initial contact and then deriving more optimal paths. (You can also read more details about the routing algorithm.)
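The "every node is a router" idea can be pictured with a plain breadth-first search over peering links. This is not Yggdrasil's routing algorithm (which uses a DHT and then optimizes paths, as described above); it is only a sketch showing that a laptop with no clearnet peering of its own can still reach a remote node through a neighbor. All node names are hypothetical.

```python
from collections import deque


def build_links(edges):
    """Turn a list of (a, b) peerings into an undirected adjacency map."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def find_route(links, src, dst):
    """Return the shortest hop-by-hop path from src to dst, or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for neighbor in sorted(links.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no physical path exists


if __name__ == "__main__":
    # Only "desktop" peers over the clearnet; "laptop" only sees the LAN.
    links = build_links([
        ("laptop", "desktop"),
        ("desktop", "public-peer"),
        ("public-peer", "vps"),
    ])
    print(find_route(links, "laptop", "vps"))
```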
One final way that Yggdrasil is different from most of the other tools is that there is no separate control server. No node is “special”, in charge, the sole keeper of metadata, or anything like that. The entire system is completely distributed and auto-assembling.
Meeting neighbors
There are two ways that Yggdrasil knows about peers:
By broadcast discovery on the local LAN
By listening on a specific port (or being told to connect to a specific host/port)
Sometimes this might lead to multiple ways to connect to a node; Yggdrasil prefers the connection auto-discovered by broadcast first, then the lowest-latency of the defined paths. In other words, when your laptops are in the same room as each other on your local LAN, your packets will flow directly between them without traversing the Internet.
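The preference order just described can be written down as a small sort key. This sketch models only that ordering as stated in this article; it is not Yggdrasil's link-selection code, and the Link type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    discovered_by_broadcast: bool  # found via LAN broadcast discovery
    latency_ms: float


def preferred_link(links):
    """Pick broadcast-discovered links first, then the lowest latency."""
    # False sorts before True, so "not discovered" pushes configured
    # links behind auto-discovered ones; latency breaks ties.
    return min(links, key=lambda l: (not l.discovered_by_broadcast, l.latency_ms))


if __name__ == "__main__":
    candidates = [
        Link("vps-peer", False, 20.0),
        Link("lan-neighbor", True, 35.0),
        Link("public-peer", False, 5.0),
    ]
    print(preferred_link(candidates).name)
```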
Unique uses
Yggdrasil is uniquely suited to network-challenged situations. As an example, in a post-disaster situation, Internet access may be unavailable or flaky, yet there may be many local devices – perhaps ones that had never known of each other before – that could share information. Yggdrasil meets this situation perfectly. The combination of broadcast auto-detection, distributed routing, and so forth, basically means that if there is any physical path between two nodes, Yggdrasil will find and enable it.
Ad-hoc wifi is rarely used because it is a real pain. Yggdrasil actually makes it useful! Its broadcast discovery doesn’t require any IP address provisioned on the interface at all (it just uses the IPv6 link-local address), so you don’t need to figure out a DHCP server or some such. And, Yggdrasil will tend to perform routing along the contours of the RF path. So you could have a laptop in the middle of a long distance relaying communications from people farther out, because it could see both. Or even a chain of such things.
Yggdrasil: Security and Privacy
Yggdrasil’s mesh is aggressively greedy. It will peer with any node it can find (unless told otherwise) and will find a route to anywhere it can. There are two main ways to make sure you keep unauthorized traffic out: by restricting who can talk to your mesh, and by firewalling the Yggdrasil interface. Both can be used, and they can be used simultaneously.
I’ll discuss firewalling more at the end of this article. Basically, you’ll almost certainly want to do this if you participate in the public mesh, because doing so is akin to having a globally-routable public IP address direct to your device.
If you want to restrict who can talk to your mesh, you just disable the broadcast feature on all your nodes (empty MulticastInterfaces section in the config), and avoid telling any of your nodes to connect to a public peer. You can set a list of authorized public keys that can connect to your nodes’ listening interfaces, which you’ll probably want to do. You will probably want to either open up some inbound ports (if you can) or set up a node with a known clearnet IP on a place like a $5/mo VPS to help with NAT traversal (again, setting AllowedPublicKeys as appropriate). Yggdrasil doesn’t allow filtering multicast clients by public key, only by network interface, so that’s why we disable broadcast discovery. You can easily enough teach Yggdrasil about static internal LAN IPs of your nodes and have things work that way. (Or, set up an internal “gateway” node or two, that the clients just connect to when they’re local). But fundamentally, you need to put a bit more thought into this with Yggdrasil than with the other tools here, which are closed-only.
Compared to some of the other tools here, Yggdrasil is better about information leakage; nodes only know details, such as clearnet IPs, of directly-connected peers. You can obtain the list of directly-connected peers of any known node in the mesh – but that list is the public keys of the directly-connected peers, not the clearnet IPs.
Some of the other tools contain a limited integrated firewall of sorts (with limited ACLs and such). Yggdrasil does not, but is fully compatible with on-host firewalls. I recommend these anyway even with many other tools.
Yggdrasil: Connectivity and NAT traversal
Compared to the other tools, Yggdrasil is an interesting mix. It provides a fully functional mesh and facilitates connectivity in situations in which no other tool can. Yet its NAT traversal, while it exists and does work, results in using a broker under some of the more challenging CGNAT situations more often than some of the other tools, which can impede performance.
Yggdrasil’s underlying protocol is TCP-based. Before you run away screaming that it must be slow and unreliable like OpenVPN over TCP – it’s not, and it is even surprisingly good around bufferbloat. I’ve found its performance to be on par with the other tools here, and it works as well as I’d expect even on flaky 4G links.
Overall, the NAT traversal story is mixed. On the one hand, you can run a node that listens on port 443 – and Yggdrasil can even make it speak TLS (even though that’s unnecessary from a security standpoint), so you can likely get out of most restrictive firewalls you will ever encounter. If you join the public mesh, know that plenty of public peers do listen on port 443 (and other well-known ports like 53, plus random high-numbered ones).
If you connect your system to multiple public peers, there is a chance – though a very small one – that some public transit traffic might be routed via it. In practice, public peers hopefully are already peered with each other, preventing this from happening (you can verify this with yggdrasilctl debug_remotegetpeers key=ABC...). I have never experienced a problem with this. Also, since latency is a factor in routing for Yggdrasil, it is highly unlikely that random connections we use are going to be competitive with datacenter peers.
Yggdrasil: Sharing with friends
If you’re open to participating in the public mesh, this is one of the easiest things of all. Have your friend install Yggdrasil, point them to a public peer, give them your Yggdrasil IP, and that’s it. (Well, presumably you also open up your firewall – you did follow my advice to set one up, right?)
If your friend is visiting at your location, they can just hop on your wifi, install Yggdrasil, and it will automatically discover a route to you. Yggdrasil even has a zero-config mode for ephemeral nodes such as certain Docker containers.
Yggdrasil doesn’t directly support publishing to the clearnet, but it is certainly possible to proxy (or even NAT) to/from the clearnet, and people do.
Yggdrasil: DNS
There is no particular extra DNS in Yggdrasil. You can, of course, run a DNS server within Yggdrasil, just as you can anywhere else. Personally I just add relevant hosts to /etc/hosts and leave it at that, but it’s up to you.
Yggdrasil: Source code, pricing, and portability
Yggdrasil is fully open source (LGPLv3 plus additional permissions in an exception) and highly portable. It is written in Go, and has prebuilt binaries for all major platforms (including a Debian package which I made).
There is no charge for anything with Yggdrasil. Listed public peers are free and run by volunteers. You can run your own peers if you like; they can be public and unlisted, public and listed (just submit a PR to get it listed), or private (accepting connections only from certain nodes’ keys). A “peer” in this case is just a node with a known clearnet IP address.
Yggdrasil encourages use in other projects. For instance, NNCP integrates a Yggdrasil node for easy communication with other NNCP nodes.
Yggdrasil conclusions
Yggdrasil is tops in reliability (having no single point of failure) and flexibility. It will maintain opportunistic connections between peers even if the Internet is down.
The unique added feature of being able to be part of a global mesh is a nice one.
The tradeoffs include being more prone to need to use a broker in restrictive CGNAT environments. Some other tools have clients that override the OS DNS resolver to also provide resolution of hostnames of member nodes; Yggdrasil doesn’t, though you can certainly run your own DNS infrastructure over Yggdrasil (or, for that matter, let public DNS servers provide Yggdrasil answers if you wish).
There is also a need to pay more attention to firewalling or maintaining separation from the public mesh. However, as I explain below, many other options have potential impacts if the control plane, or your account for it, are compromised, meaning you ought to firewall those, too. Still, it may be a more immediate concern with Yggdrasil.
Although Yggdrasil is listed as experimental, I have been using it for over a year and have found it to be rock-solid. They did change how mesh IPs were calculated when moving from 0.3 to 0.4, causing a global renumbering, so just be aware that this is a possibility while it is experimental.
tinc
tinc is the oldest tool on this list; version 1.0 came out in 2003! You can think of tinc as something akin to “an older Yggdrasil without the public option.”
I will be discussing tinc 1.0.36, the latest stable version, which came out in 2019. The development branch, 1.1, has been going since 2011 and had its latest release in 2021. The last commit to the Github repo was in June 2022.
Tinc is the only tool here to support both tun and tap style interfaces. I go into the difference more in the Zerotier review below. Tinc actually provides a better tap implementation than Zerotier, with various sane options for broadcasts, but I still think the call for an Ethernet, as opposed to IP, VPN is small.
To configure tinc, you generate a per-host configuration and then distribute it to every tinc node. It contains a hostβs public key. Therefore, adding a host to the mesh means distributing its key everywhere; de-authorizing it means removing its key everywhere. This makes it rather unwieldy.
tinc can do LAN broadcast discovery and mesh routing, but generally speaking you must manually teach it where to connect initially. Somewhat confusingly, the examples all mention listing a public address for a node. This doesn’t make sense for a laptop, and I suspect you’d just omit it. I think that address is used for something akin to a Yggdrasil peer with a clearnet IP.
Unlike all of the other tools described here, tinc has no tool to inspect the running state of the mesh.
Some of the properties of tinc made it clear I was unlikely to adopt it, so this review wasn’t as thorough as that of Yggdrasil.
tinc: Security and Privacy
As mentioned above, every host in the tinc mesh is authenticated based on its public key. However, to be more precise, this key is validated only at the point it connects to its next hop peer. (To be sure, this is also the same as how the list of allowed pubkeys works in Yggdrasil.) Since IPs in tinc are not derived from their key, and any host can assign itself whatever mesh IP it likes, this implies that a compromised host could impersonate another.
It is unclear whether packets are end-to-end encrypted when using a tinc node as a router. The fact that they can be routed at the kernel level by the tun interface implies that they may not be.
tinc: Connectivity and NAT traversal
I was unable to find much information about NAT traversal in tinc, other than that it does support it. tinc can run over UDP or TCP and auto-detects which to use, preferring UDP.
tinc: Sharing with friends
tinc has no special support for this, and the difficulty of configuration makes it unlikely you’d do this with tinc.
tinc: Source code, pricing, and portability
tinc is fully open source (GPLv2). It is written in C and generally portable. It supports some very old operating systems. Mobile support is iffy.
tinc does not seem to be very actively maintained.
tinc conclusions
I haven’t mentioned performance in my other reviews (see the section at the end of this post). But tinc’s is so poor that it runs at only about 300Mbps on my 2.5Gbps network. That’s 1/3 the speed of Yggdrasil or Tailscale. Combine that with the unwieldiness of adding hosts and some uncertainties in security, and I’m not going to be using tinc.
Automatic Peer-to-Peer Mesh VPNs with centralized control
These tend to be the options that are frequently discussed. Let’s talk about the options.
Tailscale
Tailscale is a popular choice in this type of VPN. To use Tailscale, you first sign up on tailscale.com. Then, you install the tailscale client on each machine. On first run, it prints a URL for you to click on to authorize the client to your mesh (“tailnet”). Tailscale assigns a mesh IP to each system. The Tailscale client lets the Tailscale control plane gather IP information about each node, including all detectable public and private clearnet IPs.
When you attempt to contact a node via Tailscale, the client will fetch the known contact information from the control plane and attempt to establish a link. If it can contact over the local LAN, it will (it doesn’t have broadcast autodetection like Yggdrasil; the information must come from the control plane). Otherwise, it will try various NAT traversal options. If all else fails, it will use a broker to relay traffic; Tailscale calls a broker a DERP relay server. Unlike Yggdrasil, a Tailscale node never relays traffic for another; all connections are either direct P2P or via a broker.
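The fallback order in the paragraph above can be sketched as a small decision function. This is not Tailscale's code, and the names (Path, choose_path) and the two boolean inputs are hypothetical; the real client probes many candidate endpoints rather than receiving two flags.

```python
from enum import Enum


class Path(Enum):
    LAN_DIRECT = "direct over the local LAN"
    NAT_TRAVERSAL = "direct P2P after NAT traversal"
    DERP_RELAY = "relayed through a DERP server"


def choose_path(lan_reachable: bool, nat_traversal_ok: bool) -> Path:
    """Return the first workable option, in the order described above."""
    if lan_reachable:
        return Path.LAN_DIRECT
    if nat_traversal_ok:
        return Path.NAT_TRAVERSAL
    # Last resort: a broker carries the (still end-to-end encrypted) traffic.
    return Path.DERP_RELAY


if __name__ == "__main__":
    print(choose_path(lan_reachable=False, nat_traversal_ok=False).value)
```

Note what is absent compared with the Yggdrasil sketch of mesh routing: there is no case in which another ordinary node forwards the traffic.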
Tailscale, like several others, is based around Wireguard; though wireguard-go rather than the in-kernel Wireguard.
Tailscale has a number of somewhat unique features in this space:
Funnel, which lets you expose ports on your system to the public Internet via the VPN.
Exit nodes, which automate the process of routing your public Internet traffic over some other node in the network. This is possible with every tool mentioned here, but Tailscale makes switching it on or off a couple of quick commands away.
Node sharing, which lets you share a subset of your network with guests
A fantastic set of documentation, easily the best of the bunch.
Funnel, in particular, is interesting. With a couple of “tailscale serve”-style commands, you can expose a directory tree (or a development webserver) to the world. Tailscale gives you a public hostname, obtains a cert for it, and proxies inbound traffic to you. This is subject to some unspecified bandwidth limits, and you can only choose from three public ports, so it’s not really a production solution – but as a quick and easy way to demonstrate something cool to a friend, it’s a neat feature.
Tailscale: Security and Privacy
With Tailscale, as with the other tools in this category, one of the main threats to consider is the control plane. What are the consequences of a compromise of Tailscale’s control plane, or of the credentials you use to access it?
Let’s begin with the credentials used to access it. Tailscale operates no identity system itself, instead relying on third parties. For individuals, this means Google, Github, or Microsoft accounts; Okta and other SAML and similar identity providers are also supported, but this runs into complexity and expense that most individuals aren’t wanting to take on. Unfortunately, all three of those types of accounts often have saved auth tokens in a browser. Personally I would rather have a separate, very secure, login.
If a person does compromise your account or the Tailscale servers themselves, they can’t directly eavesdrop on your traffic because it is end-to-end encrypted. However, assuming an attacker obtains access to your account, they could:
Tamper with your Tailscale ACLs, permitting new actions
Add new nodes to the network
Forcibly remove nodes from the network
Enable or disable optional features
Of note is that they cannot just commandeer an existing IP. I would say the riskiest possibility here is that they could add new nodes to the mesh. Because they could also tamper with your ACLs, they could then proceed to attempt to access all your internal services. They could even turn on service collection and have Tailscale tell them what and where all the services are.
Therefore, as with other tools, I recommend a local firewall on each machine with Tailscale. More on that below.
Tailscale has a new alpha feature called tailnet lock which helps with this problem. It requires existing nodes in the mesh to sign a request for a new node to join. Although this doesn’t address ACL tampering and some of the other things, it does represent a significant help with the most significant concern. However, tailnet lock is in alpha, only available on the Enterprise plan, and has a waitlist, so I have been unable to test it.
Any Tailscale node can request the IP addresses belonging to any other Tailscale node. The Tailscale control plane captures, and exposes to you, this information about every node in your network: the OS hostname, IP addresses and port numbers, operating system, creation date, last seen timestamp, and NAT traversal parameters. You can optionally enable service data capture as well, which sends data about open ports on each node to the control plane.
Tailscale likes to highlight their key expiry and rotation feature. By default, all keys expire after 180 days, and traffic to and from the expired node will be interrupted until they are renewed (basically, you re-login with your provider and do a renew operation). Unfortunately, the only mention I can see of warning of impending expiration is in the Windows client, and even there you need to edit a registry key to get the warning more than the default 24 hours in advance. In short, it seems likely to cut off communications when it’s most important. You can disable key expiry on a per-node basis in the admin console web interface, and I mostly do, due to not wanting to lose connectivity at an inopportune time.
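If you would rather keep expiry enabled, one workaround is to track the dates yourself. The sketch below only does the date arithmetic for the 180-day default mentioned above; it does not talk to Tailscale at all, the key creation date is something you would have to supply, and the 14-day warning window is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_KEY_LIFETIME = timedelta(days=180)  # default expiry described above


def key_expiry(created: datetime, lifetime: timedelta = DEFAULT_KEY_LIFETIME) -> datetime:
    """Return the moment a key created at `created` stops working."""
    return created + lifetime


def needs_renewal_warning(created: datetime, now: datetime,
                          warn_before: timedelta = timedelta(days=14)) -> bool:
    """True once `now` is within `warn_before` of the expiry (or past it)."""
    return now >= key_expiry(created) - warn_before


if __name__ == "__main__":
    created = datetime(2023, 1, 1, tzinfo=timezone.utc)  # hypothetical node
    now = datetime.now(timezone.utc)
    print("expires:", key_expiry(created).date())
    print("warn now?", needs_renewal_warning(created, now))
```

Run from cron against a hand-maintained list of nodes, something like this gives far more than 24 hours of notice.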
Tailscale: Connectivity and NAT traversal
When thinking about reliability, the primary consideration here is being able to reach the Tailscale control plane. While it is possible in limited circumstances to reach nodes without the Tailscale control plane, it is “a fairly brittle setup” and notably will not survive a client restart. So if you use Tailscale to reach other nodes on your LAN, that won’t work unless your Internet is up and the control plane is reachable.
Assuming your Internet is up and Tailscale’s infrastructure is up, there is little to be concerned with. Your own comfort level with cloud providers and your Internet should guide you here.
Tailscale wrote a fantastic article about NAT traversal and they, predictably, do very well with it. Tailscale prefers UDP but falls back to TCP if needed. Broker (DERP) servers step in as a last resort, and Tailscale clients automatically select the best ones. I’m not aware of anything that is more successful with NAT traversal than Tailscale. This maximizes the situations in which a direct P2P connection can be used without a broker.
I have found Tailscale to be a bit slow to notice changes in network topology compared to Yggdrasil, and it sometimes needs a kick in the form of restarting the client process to re-establish communications after a network change. However, it’s possible (maybe even probable) that if I’d waited a bit longer, it would have sorted this all out.
Tailscale: Sharing with friends
I touched on the funnel feature earlier. The sharing feature lets you give an invite to an outsider. By default, a person accepting a share can make only outgoing connections to the network they’re invited to, and cannot receive incoming connections from that network – this makes sense. When sharing an exit node, you get a checkbox that lets you share access to the exit node as well. Of course, the person accepting the share needs to install the Tailscale client. The combination of funnel and sharing makes Tailscale the best for ad-hoc sharing.
Tailscale: DNS
Tailscale’s DNS is called MagicDNS. It runs as a layer atop your standard DNS – taking over /etc/resolv.conf on Linux – and provides resolution of mesh hostnames and some other features. This is a concept that is pretty slick.
It also is a bit flaky on Linux; dueling programs want to write to /etc/resolv.conf. I can’t really say this is entirely Tailscale’s fault; they document the problem and some workarounds.
I would love to be able to add custom records to this service; for instance, to override the public IP for a service to use the in-mesh IP. Unfortunately, that’s not yet possible. However, MagicDNS can query existing nameservers for certain domains in a split DNS setup.
Tailscale: Source code, pricing, and portability
Tailscale is almost fully open source and the client is highly portable. The client is open source (BSD 3-clause) on open source platforms, and closed source on closed source platforms. The DERP servers are open source. The coordination server is closed source, although there is an open source coordination server called Headscale (also BSD 3-clause) made available with Tailscale’s blessing and informal support. It supports most, but not all, features in the Tailscale coordination server.
Tailscale’s pricing (which does not apply when using Headscale) provides a free plan for 1 user with up to 20 devices. A Personal Pro plan expands that to 100 devices for $48 per year – not a bad deal at $4/mo. A “Community on Github” plan also exists, and then there are more business-oriented plans as well. See the pricing page for details.
As a small note, I appreciated Tailscale’s install script. It properly added Tailscale’s apt key in a way that it can only be used to authenticate the Tailscale repo, rather than as a systemwide authenticator. This is a nice touch and speaks well of their developers.
Tailscale conclusions
Tailscale is tops in sharing and has a broad feature set and excellent documentation. Like other solutions with a centralized control plane, device communications can stop working if the control plane is unreachable, and the threat model of the control plane should be carefully considered.
Zerotier
Zerotier is a close competitor to Tailscale, and is similar to it in a lot of ways. So rather than duplicate all of the Tailscale information here, I’m mainly going to describe how it differs from Tailscale.
The primary difference between the two is that Zerotier emulates an Ethernet network via a Linux tap interface, while Tailscale emulates a TCP/IP network via a Linux tun interface.
However, Zerotier has a number of things that make it a somewhat imperfect Ethernet emulator. For one, it has a problem with broadcast amplification; the machine sending the broadcast sends it to all the other nodes that should receive it (up to a set maximum). I wouldn’t want to have a lot of programs broadcasting on a slow link. While in theory this could let you run Netware or DECNet across Zerotier, I’m not really convinced there’s much call for that these days, and Zerotier is clearly IP-focused as it allocates IP addresses and such anyhow. Zerotier provides special support for emulated ARP (IPv4) and NDP (IPv6). While you could theoretically run Zerotier as a bridge, this eliminates the zero trust principle, and Tailscale supports subnet routers, which provide much of the same feature set anyhow.
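To put a number on the broadcast amplification concern, here is a back-of-the-envelope sketch. It assumes the sender uploads one full copy of the frame per recipient, up to a cap; the cap of 32 is an assumption for illustration, so substitute whatever maximum your network is configured with.

```python
def broadcast_uplink_bytes(frame_bytes: int, recipients: int,
                           max_recipients: int = 32) -> int:
    """Bytes the sender must upload to deliver one broadcast frame."""
    fanout = min(recipients, max_recipients)  # copies actually sent
    return frame_bytes * fanout


def uplink_seconds(total_bytes: int, uplink_bits_per_second: float) -> float:
    """Time the uplink is busy sending `total_bytes`."""
    return total_bytes * 8 / uplink_bits_per_second


if __name__ == "__main__":
    cost = broadcast_uplink_bytes(frame_bytes=1500, recipients=100)
    print(cost, "bytes per broadcast")
    print(uplink_seconds(cost, 1_000_000), "seconds on a 1 Mbps uplink")
```

Under these assumptions a single full-size frame ties up a 1 Mbps uplink for a noticeable fraction of a second, which is why chatty broadcast protocols are a poor fit for slow links.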
A somewhat obscure feature, but possibly useful, is Zerotierβs built-in support for multipath WAN for the public interface. This actually lets you do a somewhat basic kind of channel bonding for WAN.
Zerotier: Security and Privacy
The picture here is similar to Tailscale, with the difference that you can create a Zerotier-local account rather than relying on cloud authentication. I was unable to find as much detail about Zerotier as I could about Tailscale – notably I couldn’t find anything about how “sticky” an IP address is. However, the configuration screen lets me delete a node and assign additional arbitrary IPs within a subnet to other nodes, so I think the assumption here is that if your Zerotier account (or the Zerotier control plane) is compromised, an attacker could remove a legit device, add a malicious one, and assign the previous IP of the legit device to the malicious one. I’m not sure how to mitigate against that risk, as firewalling specific IPs is ineffective if an attacker can simply take them over. Zerotier also lacks anything akin to Tailnet Lock.
For this reason, I didn’t proceed much further in my Zerotier evaluation.
Zerotier: Connectivity and NAT traversal
Like Tailscale, Zerotier has NAT traversal with STUN. However, it looks like it’s more limited than Tailscale’s, and in particular is incompatible with double NAT that is often seen these days. Zerotier operates brokers (“root servers”) that can do relaying, including TCP relaying. So you should be able to connect even from hostile networks, but you are less likely to form a P2P connection than with Tailscale.
Zerotier: Sharing with friends
I was unable to find any special features relating to this in the Zerotier documentation. Therefore, it would be at the same level as Yggdrasil: possible, maybe even not too difficult, but without any specific help.
Zerotier: DNS
Unlike Tailscale, Zerotier does not support automatically adding DNS entries for your hosts. Therefore, your options are approximately the same as Yggdrasil, though with the added option of pushing configuration pointing to your own non-Zerotier DNS servers to the client.
Zerotier: Source code, pricing, and portability
The client ZeroTier One is available on Github under a custom “business source license” which prevents you from using it in certain settings. This license would preclude it being included in Debian. Their library, libzt, is available under the same license. The pricing page mentions a community edition for self hosting, but the documentation is sparse and it was difficult to understand what its feature set really is.
The free plan lets you have 1 user with up to 25 devices. Paid plans are also available.
Zerotier conclusions
Frankly I don’t see much reason to use Zerotier. The “virtual Ethernet” model seems to be a weird hybrid that doesn’t bring much value. I’m concerned about the implications of a compromise of a user account or the control plane, and it lacks a lot of Tailscale features (MagicDNS and sharing). The only thing it may offer in particular is multipath WAN, but that’s esoteric enough – and also solvable at other layers – that it doesn’t seem all that compelling to me. Add to that the strange license and, to me anyhow, I don’t see much reason to bother with it.
Netmaker
Netmaker is one of the projects that is making noise these days. Netmaker is the only one here that is a wrapper around in-kernel Wireguard, which can make a performance difference when talking to peers on a 1Gbps or faster link. Also, unlike other tools, it has an ingress gateway feature that lets people that don’t have the Netmaker client, but do have Wireguard, participate in the VPN. I believe I also saw a reference somewhere to nodes as routers as with Yggdrasil, but I’m failing to dig it up now.
The project is in a bit of an early state; you can sign up for an “upcoming closed beta” with a SaaS host, but really you are generally pointed to self-hosting using the code in the github repo. There are community and enterprise editions, but it’s not clear how to actually choose. The server has a bunch of components: binary, CoreDNS, database, and web server. It also requires elevated privileges on the host, in addition to a container engine. Contrast that to the single binary that some others provide.
It looks like releases are frequent, but sometimes break things, and have a somewhat more laborious upgrade process than most.
I don’t want to spend a lot of time managing my mesh. So because of the heavy needs of the server, the upgrades being labor-intensive, and it taking over iptables and such on the server, I didn’t proceed with a more in-depth evaluation of Netmaker. It has a lot of promise, but for me, it doesn’t seem to be in a state that will meet my needs yet.
Nebula
Nebula is an interesting mesh project that originated within Slack, seems to still be primarily sponsored by Slack, but is also being developed by Defined Networking (though their product looks early right now). Unlike the other tools in this section, Nebula doesn’t have a web interface at all. Defined Networking looks likely to provide something of a SaaS service, but for now, you will need to run a broker (“lighthouse”) yourself; perhaps on a $5/mo VPS.
Due to the poor firewall traversal properties, I didn’t do a full evaluation of Nebula, but it still has a very interesting design.
Nebula: Security and Privacy
Since Nebula lacks a traditional control plane, the root of trust in Nebula is a CA (certificate authority). The documentation gives this example of setting it up:
./nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
./nebula-cert sign -name "laptop" -ip "192.168.100.2/24" -groups "laptop,home,ssh"
./nebula-cert sign -name "server1" -ip "192.168.100.9/24" -groups "servers"
./nebula-cert sign -name "host3" -ip "192.168.100.10/24"
So the cert contains your IP, hostname, and group allocation. Each host in the mesh gets your CA certificate, and the per-host cert and key generated from each of these steps.
This leads to a really nice security model. Your CA is the gatekeeper to what is trusted in your mesh. You can even have it airgapped or something to make it exceptionally difficult to breach the perimeter.
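Because the mesh IP is baked into each certificate, it is worth sanity-checking the address plan before signing anything. The following sketch is a hypothetical helper, not part of Nebula: it checks that the hosts from the example above share one network and have distinct addresses, then prints the signing commands using only the flags shown in that example.

```python
import ipaddress

# (name, mesh address in CIDR form, groups), copied from the example above.
HOSTS = [
    ("lighthouse1", "192.168.100.1/24", []),
    ("laptop", "192.168.100.2/24", ["laptop", "home", "ssh"]),
    ("server1", "192.168.100.9/24", ["servers"]),
    ("host3", "192.168.100.10/24", []),
]


def check_plan(hosts):
    """Return the shared mesh network, or raise ValueError if inconsistent."""
    interfaces = [ipaddress.ip_interface(cidr) for _, cidr, _ in hosts]
    networks = {iface.network for iface in interfaces}
    if len(networks) != 1:
        raise ValueError("hosts do not all share one mesh network")
    addresses = [iface.ip for iface in interfaces]
    if len(set(addresses)) != len(addresses):
        raise ValueError("two hosts share the same mesh IP")
    return networks.pop()


def sign_command(name, cidr, groups):
    """Render the nebula-cert invocation for one host."""
    command = f'./nebula-cert sign -name "{name}" -ip "{cidr}"'
    if groups:
        joined = ",".join(groups)
        command += f' -groups "{joined}"'
    return command


if __name__ == "__main__":
    print("mesh network:", check_plan(HOSTS))
    for host in HOSTS:
        print(sign_command(*host))
```

Keeping the plan in one reviewed file also fits the airgapped-CA idea: the list can be checked on any machine, and only the final commands need to run where the CA key lives.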
Nebula contains an integrated firewall. Because the ability to keep out unwanted nodes is so strong, I would say this may be the one mesh VPN you might consider using without bothering with an additional on-host firewall.
You can define static mappings from a Nebula mesh IP to a clearnet IP. I haven’t found information on this, but theoretically if NAT traversal isn’t required, these static mappings may allow Nebula nodes to reach each other even if Internet is down. I don’t know if this is truly the case, however.
Nebula: Connectivity and NAT traversal
This is a weak point of Nebula. Nebula sends all traffic over a single UDP port; there is no provision for using TCP. This is an issue at certain hotel and other public networks which open only TCP egress ports 80 and 443.
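The hotel-network problem can be illustrated with a tiny lookup. The transport table below is a simplification drawn only from what this article says about each tool; it ignores port numbers, relays, and every other practical detail, so treat it as a reading aid rather than a compatibility matrix.

```python
# Underlay protocols each tool can use, as described in this article.
TRANSPORTS = {
    "Yggdrasil": {"tcp"},
    "tinc": {"udp", "tcp"},
    "Tailscale": {"udp", "tcp"},
    "Zerotier": {"udp", "tcp"},
    "Nebula": {"udp"},
}


def usable_tools(allowed_protocols):
    """Tools with at least one transport the network lets out."""
    return sorted(
        name for name, transports in TRANSPORTS.items()
        if transports & set(allowed_protocols)
    )


if __name__ == "__main__":
    # A network that only permits outbound TCP (e.g. ports 80 and 443).
    print(usable_tools({"tcp"}))
```

On a TCP-only network every tool in the table except Nebula still has a way out.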
I couldn’t find a lot of detail on what Nebula’s NAT traversal is capable of, but according to a certain Github issue, this has been a sore spot for years and isn’t as capable as Tailscale.
You can designate nodes in Nebula as brokers (relays). The concept is the same as Yggdrasil, but it’s less versatile. You have to manually designate what relay to use. It’s unclear to me what happens if different nodes designate different relays. Keep in mind that this always happens over a UDP port.
Nebula: Sharing with friends
There is no particular support here.
Nebula: DNS
Nebula has experimental DNS support. In contrast with Tailscale, which has an internal DNS server on every node, Nebula only runs a DNS server on a lighthouse. This means that it can’t forward requests to a DNS server that’s upstream for your laptop’s particular current location. Actually, Nebula’s DNS server doesn’t forward at all. It also doesn’t resolve its own name.
The Nebula documentation makes reference to using multiple lighthouses, which you may want to do for DNS redundancy or performance, but it’s unclear to me if this would make each lighthouse form a complete picture of the network.
Nebula: Source code, pricing, and portability
Nebula is fully open source (MIT). It consists of a single Go binary and configuration. It is fairly portable.
Nebula conclusions
I am attracted to Nebula’s unique security model. I would probably be more seriously considering it if not for the lack of support for TCP and poor general NAT traversal properties. Its datacenter connectivity heritage does show through.
Roll your own and hybrid
Here is a grab bag of ideas:
Running Yggdrasil over Tailscale
One possibility would be to use Tailscale for its superior NAT traversal, then allow Yggdrasil to run over it. (You will need a firewall to prevent Tailscale from trying to run over Yggdrasil at the same time!) This creates a closed network with all the benefits of Yggdrasil, yet getting the NAT traversal from Tailscale.
Drawbacks might be the overhead of the double encryption and double encapsulation. A good Yggdrasil peer may wind up being faster than this anyhow.
Public VPN provider for NAT traversal
A public VPN provider such as Mullvad will often offer incoming port forwarding and nodes in many cities. This could be an attractive way to solve a bunch of NAT traversal problems: just use one of those services to get you an incoming port, and run whatever you like over that.
Be aware that a number of public VPN clients have a “kill switch” to prevent any traffic from egressing without using the VPN; see, for instance, Mullvad’s. You’ll need to disable this if you are running a mesh atop it.
Other
Combining with local firewalls
For most of these tools, I recommend using a local firewall in conjunction with them. I have been using firehol and find it to be quite nice. This means you don’t have to trust the mesh, the control plane, or whatever. The catch is that you do need your mesh VPN to provide strong association between IP address and node. Most, but not all, do.
Performance
I tested some of these for performance using iperf3 on a 2.5Gbps LAN. Here are the results. All speeds are in Mbps.
| Tool | iperf3 (default) | iperf3 -P 10 | iperf3 -R |
|---|---|---|---|
| Direct (no VPN) | 2406 | 2406 | 2764 |
| Wireguard (kernel) | 1515 | 1566 | 2027 |
| Yggdrasil | 892 | 1126 | 1105 |
| Tailscale | 950 | 1034 | 1085 |
| Tinc | 296 | 300 | 277 |
You can see that Wireguard was significantly faster than the other options. Tailscale and Yggdrasil were roughly comparable, and Tinc was terrible.
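To put those numbers in proportion, here is a small sketch that expresses each tool's throughput as a percentage of the direct (no VPN) run in the same column. The figures are copied from the results above; the `RESULTS` dictionary and the `relative_to_direct` helper are just illustrative names.

```python
# Throughput in Mbps, copied from the results above.
# Tuple order: (iperf3 default, iperf3 -P 10, iperf3 -R).
RESULTS = {
    "Direct (no VPN)": (2406, 2406, 2764),
    "Wireguard (kernel)": (1515, 1566, 2027),
    "Yggdrasil": (892, 1126, 1105),
    "Tailscale": (950, 1034, 1085),
    "Tinc": (296, 300, 277),
}


def relative_to_direct(results, baseline="Direct (no VPN)"):
    """Return each tool's speed as a rounded percentage of the baseline run."""
    base = results[baseline]
    return {
        tool: tuple(round(100 * speed / ref) for speed, ref in zip(speeds, base))
        for tool, speeds in results.items()
        if tool != baseline
    }


if __name__ == "__main__":
    for tool, percents in relative_to_direct(RESULTS).items():
        print(f"{tool:20s} " + "  ".join(f"{p:3d}%" for p in percents))
```

By this measure Wireguard lands at roughly two-thirds of the direct speed, Yggdrasil and Tailscale at around 40%, and Tinc at a little over 10%.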
IP collisions
When you are communicating over a network such as these, you need to trust that the IP address you are communicating with belongs to the system you think it does. This protects against two malicious actor scenarios:
Someone compromises one machine on your mesh and reconfigures it to impersonate a more important one
Someone connects an unauthorized system to the mesh, taking over a trusted IP, and uses the privileges of the trusted IP to access resources
To summarize the state of play as highlighted in the reviews above:
Yggdrasil derives IPv6 addresses from a public key
tinc allows any node to set any IP
Tailscale IPs aren’t user-assignable, but the assignment algorithm is unknown
Zerotier allows any IP to be allocated to any node at the control plane
I don’t know what Netmaker does
Nebula IPs are baked into the cert and signed by the CA, but I haven’t verified the enforcement algorithm
So this discussion really only applies to Yggdrasil and Tailscale. tinc and Zerotier lack detailed IP security, while Nebula expects IP allocations to be handled outside of the tool and baked into the certs (therefore enforcing rigidity at that level).
So the question for Yggdrasil and Tailscale is: how easy is it to commandeer a trusted IP?
Yggdrasil has a brief discussion of this. In short, Yggdrasil offers you both a dedicated IP and a rarely-used /64 prefix which you can delegate to other machines on your LAN. Obviously by taking the dedicated IP, a lot more bits are available for the hash of the node’s public key, making “collisions technically impractical, if not outright impossible.” However, if you use the /64 prefix, a collision may be more possible. Yggdrasil’s hashing algorithm includes some optimizations to make this more difficult. Yggdrasil includes a genkeys tool that uses more CPU cycles to generate keys that are maximally difficult to collide with.
Tailscale doesn’t document their IP assignment algorithm, but I think it is safe to say that the larger the subnet you use, the better. If you try to use a /24 for your mesh, it is certainly conceivable that an attacker could remove your trusted node, then just manually add the 240 or so machines it would take to get that IP reassigned. It might be a good idea to use a purely IPv6 mesh with Tailscale to minimize this problem as well.
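The subnet-size argument is easy to quantify. Here is a rough sketch using Python's standard `ipaddress` module to count how many addresses an attacker would have to burn through, in the worst case, before a given one came around again. It assumes nothing about how Tailscale actually assigns addresses (that is undocumented, as noted), and the three example ranges are purely illustrative: a /24 like the one discussed, 100.64.0.0/10 (the carrier-grade NAT range, which I believe is what Tailscale draws from by default, but treat that as an assumption), and an arbitrary IPv6 /64.

```python
import ipaddress


def usable_hosts(cidr):
    """Count assignable addresses in a network.

    For IPv4 networks larger than a /31, the network and broadcast
    addresses are excluded. For IPv6, every address is counted.
    """
    net = ipaddress.ip_network(cidr)
    if net.version == 4 and net.prefixlen < 31:
        return net.num_addresses - 2
    return net.num_addresses


if __name__ == "__main__":
    # Illustrative ranges only; see the assumptions in the text above.
    for cidr in ("192.168.100.0/24", "100.64.0.0/10", "fd00::/64"):
        print(f"{cidr:20s} {usable_hosts(cidr):,} addresses")
```

A /24 leaves only a couple hundred candidates, which is within reach of someone enrolling nodes by hand, while the larger ranges push the same attack into the millions or far beyond.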
So, I think the risk is low in the default configurations of both Yggdrasil and Tailscale (certainly lower than with tinc or Zerotier). You can drive the risk even lower with both.
Final thoughts
For my own purposes, I suspect I will remain with Yggdrasil in some fashion. Maybe I will just take the small performance hit that using a relay node implies. Or perhaps I will get clever and use an incoming VPN port forward or go over Tailscale.
Tailscale was the other option that seemed most interesting. However, living in a region with Internet that goes down more often than I’d like, I would like to just be able to send as much traffic over a mesh as possible, trusting that if the LAN is up, the mesh is up.
I have one thing that really benefits from performance in excess of Yggdrasil or Tailscale: NFS. That’s between two machines that never leave my LAN, so I will probably just set up a direct Wireguard link between them. Heck of a lot easier than trying to do Kerberos!
Finally, I wrote this intending to be useful. I dealt with a lot of complexity and under-documentation, so it’s possible I got something wrong somewhere. Please let me know if you find any errors.
This blog post is a copy of a page on my website. That page may be periodically updated.