“OK,” you’re probably thinking. “John, you talk a lot about things like Gopher and personal radios, and now you want to talk about building a reliable network out of… USB drives?”
Well, yes. In fact, I’ve already done it.
What is sneakernet?
Normally, “sneakernet” is a sort of tongue-in-cheek reference to using disconnected storage to transport data or messages. By “disconnected storage” I mean anything like CD-ROMs, hard drives, SD cards, USB drives, and so forth. There are times when loading up 12TB on a device and driving it across town is just faster and easier than using the Internet for the same. And, sometimes you need to get data to places that have no Internet at all.
Another reason for sneakernet is security. For instance, if your backup system is online, and your systems being backed up are online, then it could become possible for an attacker to destroy both your primary copy of data and your backups. Or, you might use a dedicated computer with no network connection to do GnuPG (GPG) signing.
What about “reliable” sneakernet, then?
TCP is often considered a “reliable” protocol. That means that the sending side is generally able to tell if its message was properly received. As with most reliable protocols, we have these components:
- After transmitting a piece of data, the sender retains it.
- After receiving a piece of data, the receiver sends an acknowledgment (ACK) back to the sender.
- Upon receiving the acknowledgment, the sender removes its buffered copy of the data.
- If no acknowledgment is received at the sender, it retransmits the data, in case the original was lost in transit.
- The receiver reorders any packets that arrive out of order, so that the recipient’s data stream is ordered correctly.
Now, a lot of the things I just mentioned for sneakernet are legendarily unreliable. USB drives fail, CD-ROMs get scratched, hard drives get banged up. Think about putting these things in a bicycle bag or airline luggage. Some of them are going to fail.
You might think, “well, I’ll just copy files to a USB drive instead of move them, and once I get them onto the destination machine, I’ll delete them from the source.” Congratulations! You are a human retransmit algorithm! We should be able to automate this!
And we can.
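To make the idea concrete before bringing in any real tooling, here is a toy sketch of that copy, acknowledge, then delete loop using nothing but plain files. This is not how NNCP works internally. The directory layout and the function names (send_all, receive_all, process_acks) are made up for illustration, and it assumes sha256sum from GNU coreutils is available.

```shell
# Toy model of a reliable sneakernet. All names here are hypothetical.

# send_all OUTBOX MEDIA: copy (never move) every queued file to the media.
# Files stay in OUTBOX until acknowledged, so re-running this IS the retransmit.
send_all () {
    local outbox="$1" media="$2" f
    mkdir -p "$media/data"
    for f in "$outbox"/*; do
        [ -f "$f" ] || continue
        cp "$f" "$media/data/"
    done
}

# receive_all MEDIA INBOX: take files off the media, and leave an ACK
# (an empty file named after the SHA-256 of the data) for the return trip.
receive_all () {
    local media="$1" inbox="$2" f sum
    mkdir -p "$inbox" "$media/acks"
    for f in "$media"/data/*; do
        [ -f "$f" ] || continue
        mv "$f" "$inbox/"
        sum=$(sha256sum "$inbox/$(basename "$f")" | cut -d' ' -f1)
        : > "$media/acks/$sum"
    done
}

# process_acks OUTBOX MEDIA: drop queued files whose hash was ACKed.
process_acks () {
    local outbox="$1" media="$2" f sum
    for f in "$outbox"/*; do
        [ -f "$f" ] || continue
        sum=$(sha256sum "$f" | cut -d' ' -f1)
        if [ -e "$media/acks/$sum" ]; then
            rm "$f" "$media/acks/$sum"
        fi
    done
}
```

If the drive dies on the way over, nothing is lost: the file is still in the outbox, and the next send_all puts it on whatever media you try next. Only a returning ACK removes it.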
Enter NNCP
NNCP is one of those things that almost defies explanation. It is a toolkit for building asynchronous networks. It can use as a carrier: a pipe, a TCP network connection, a mounted filesystem (specifically intended for cases like this), and much more. It also supports multi-hop asynchronous routing and asynchronous meshing, but these are beyond the scope of this particular article.
NNCP’s transports that involve live communication between two hops already had all the hallmarks of being reliable; there was a positive ACK and retransmit. As of version 8.7.0, NNCP’s ACKs themselves can also be asynchronous – meaning that every NNCP transport can now be reliable.
Yes, that’s right. Your ACKs can flow over tapes and USB drives if you want them to.
I use this for archiving and backups.
If you aren’t already familiar with NNCP, you might take a look at my NNCP page. I also have a lot of blog posts about NNCP.
Those pages describe the basics of NNCP: the “packet” (the unit of transmission in NNCP, which can be tiny or many TB), the end-to-end encryption, and so forth. The new command we will now be interested in is nncp-ack.
The Basic Idea
Here are the basic steps to processing this stuff with NNCP:
- First, we use nncp-xfer -rx to process incoming packets from the USB (or other media) device. This moves them into the NNCP inbound queue, deleting them from the media device, and verifies the packet integrity.
- We use nncp-ack -node $NODE to create ACK packets responding to the packets we just loaded into the rx queue. It writes a list of generated ACKs onto fd 4, which we save off for later use.
- We run nncp-toss -seen to process the incoming queue. The use of -seen causes NNCP to remember the hashes of packets seen before, so a duplicate of an already-seen packet will not be processed twice. This command also processes incoming ACKs for packets we’ve sent out previously; if they pass verification, the relevant packets are removed from the local machine’s tx queue.
- Now, we use nncp-xfer -keep -tx -mkdir -node $NODE to send outgoing packets to a given node by writing them to a given directory on the media device. The -keep option causes them to remain in the outgoing queue.
- Finally, we use the list of generated ACK packets saved off in step 2 above. That list is passed to nncp-rm -node $NODE -pkt < $FILE to remove those specific packets from the outbound queue. The reason is that there will never be an ACK of an ACK packet (that would create an infinite loop), so if we don’t delete them in this manner, they would hang around forever.
You can see these steps follow the same basic outline on upstream’s nncp-ack page.
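Stripped of error handling, the five steps for a single peer and a single media directory can be sketched roughly like this. The NNCP commands and flags are the ones listed above; the run helper, the DRY_RUN switch, and the sneakernet_round function are hypothetical names added for this sketch. It defaults to only printing the commands, so you can read through a round without touching a real queue.

```shell
# Condensed version of the five steps for one peer and one media directory.
# DRY_RUN, run, and sneakernet_round are hypothetical names for this sketch.
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run () {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sneakernet_round () {
    media="$1"; node="$2"
    acklist="$(mktemp)"

    # 1. Pull packets off the media into the inbound queue.
    run nncp-xfer -rx "$media"
    # 2. Generate ACKs; the list of ACK packets arrives on fd 4.
    run nncp-ack -node "$node" 4>> "$acklist"
    # 3. Process the inbound queue, skipping packets already seen.
    run nncp-toss -seen
    # 4. Copy outgoing packets to the media, keeping them queued
    #    until they are ACKed themselves.
    run nncp-xfer -keep -tx -mkdir -node "$node" "$media"
    # 5. ACKs are never ACKed, so remove them from the queue by hand.
    run nncp-rm -node "$node" -pkt < "$acklist"

    rm -f "$acklist"
}
```

Calling sneakernet_round /media/$USER/MYDRIVE/nncp node2 (both arguments are placeholders) prints the five commands in order. Setting DRY_RUN=0 first would run them for real, assuming the NNCP tools are on your PATH.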
One thing to keep in mind: if anything else is running nncp-toss, there is a chance of a race condition between steps 1 and 2 (if nncp-toss gets to a packet first, an ACK might never be generated for it). This would sort itself out eventually, presumably, as the sender would retransmit and it would be ACKed later.
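If you control how that other nncp-toss gets launched (from cron, say), one way to sidestep the race is to have every job that touches the queue take the same lock first. Here is a minimal sketch of that idea using mkdir, which is atomic, as the lock. This is not an NNCP feature; the lock path, the NNCP_LOCK_TRIES setting, and the with_nncp_lock function are all assumptions of mine for illustration.

```shell
# Hypothetical lock directory shared by every job that touches the NNCP spool.
NNCP_LOCK="${NNCP_LOCK:-/tmp/nncp-sneakernet.lock}"

# with_nncp_lock CMD [ARGS...]: run CMD only while holding the lock.
# mkdir is atomic, so only one caller at a time can create the directory.
with_nncp_lock () {
    tries=0
    until mkdir "$NNCP_LOCK" 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge "${NNCP_LOCK_TRIES:-60}" ]; then
            echo "could not get $NNCP_LOCK; giving up" >&2
            return 1
        fi
        sleep 1
    done
    # Run the command, remembering its exit status so the lock is
    # always released afterwards.
    status=0
    "$@" || status=$?
    rmdir "$NNCP_LOCK"
    return "$status"
}
```

The cron job would then run something like with_nncp_lock nncp-toss -seen, and the sneakernet script would be started the same way. Note that a job killed mid-run leaves a stale lock directory behind that has to be removed by hand, and a toss started by some other route will not know about the lock at all.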
Further ideas
NNCP guarantees the integrity of packets, but not ordering between packets; if you need that, you might look into my Filespooler program. It is designed to work with NNCP and can provide ordered processing.
An example script
Here is a script you might try for this sort of thing. It may have more logic than you need – really, you just need the steps above – but hopefully it is clear.
#!/bin/bash
set -eo pipefail
MEDIABASE="/media/$USER"
# The local node name
NODENAME="`hostname`"
# All nodes. NODENAME should be in this list.
ALLNODES="node1 node2 node3"
RUNNNCP=""
# If you need to sudo, use something like RUNNNCP="sudo -Hu nncp"
NNCPPATH="/usr/local/nncp/bin"
ACKPATH="`mktemp -d`"
# Process incoming packets.
#
# Parameters: $1 - the path to scan. Must contain a directory
# named "nncp".
procrxpath () {
    while [ -n "$1" ]; do
        BASEPATH="$1/nncp"
        shift
        if ! [ -d "$BASEPATH" ]; then
            echo "$BASEPATH doesn't exist; skipping"
            continue
        fi
        echo " *** Incoming: processing $BASEPATH"
        TMPDIR="`mktemp -d`"
        # This rsync and the one below can help with
        # certain permission issues from weird foreign
        # media. You could just eliminate it and
        # always use $BASEPATH instead of $TMPDIR below.
        rsync -rt "$BASEPATH/" "$TMPDIR/"
        # You may need these next two lines if using sudo as above.
        # chgrp -R nncp "$TMPDIR"
        # chmod -R g+rwX "$TMPDIR"
        echo " Running nncp-xfer -rx"
        $RUNNNCP $NNCPPATH/nncp-xfer -progress -rx "$TMPDIR"
        for NODE in $ALLNODES; do
            if [ "$NODE" != "$NODENAME" ]; then
                echo " Running nncp-ack for $NODE"
                # Now, we generate ACK packets for each node we will
                # process. nncp-ack writes a list of the created
                # ACK packets to fd 4. We'll use them later.
                # If using sudo, add -C 5 after $RUNNNCP.
                $RUNNNCP $NNCPPATH/nncp-ack -progress -node "$NODE" \
                    4>> "$ACKPATH/$NODE"
            fi
        done
        rsync --delete -rt "$TMPDIR/" "$BASEPATH/"
        rm -fr "$TMPDIR"
    done
}
proctxpath () {
    while [ -n "$1" ]; do
        BASEPATH="$1/nncp"
        shift
        if ! [ -d "$BASEPATH" ]; then
            echo "$BASEPATH doesn't exist; skipping"
            continue
        fi
        echo " *** Outgoing: processing $BASEPATH"
        TMPDIR="`mktemp -d`"
        rsync -rt "$BASEPATH/" "$TMPDIR/"
        # You may need these two lines if using sudo:
        # chgrp -R nncp "$TMPDIR"
        # chmod -R g+rwX "$TMPDIR"
        for DESTHOST in $ALLNODES; do
            if [ "$DESTHOST" = "$NODENAME" ]; then
                continue
            fi
            # Copy outgoing packets to this node, but keep them in the outgoing
            # queue with -keep.
            $RUNNNCP $NNCPPATH/nncp-xfer -keep -tx -mkdir -node "$DESTHOST" -progress "$TMPDIR"
            # Here is the key: that list of ACK packets we made above - now we delete them.
            # There will never be an ACK for an ACK, so they'd keep sending forever
            # if we didn't do this.
            if [ -f "$ACKPATH/$DESTHOST" ]; then
                echo "nncp-rm for node $DESTHOST"
                $RUNNNCP $NNCPPATH/nncp-rm -debug -node "$DESTHOST" -pkt < "$ACKPATH/$DESTHOST"
            fi
        done
        rsync --delete -rt "$TMPDIR/" "$BASEPATH/"
        rm -rf "$TMPDIR"
        # We only want to write stuff once.
        return 0
    done
}
procrxpath "$MEDIABASE"/*
echo " *** Initial tossing..."
# We make sure to use -seen to rule out duplicates.
$RUNNNCP $NNCPPATH/nncp-toss -progress -seen
proctxpath "$MEDIABASE"/*
echo "You can unmount devices now."
echo "Done."
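The script only touches mounted devices that have a directory named nncp at the top level, so a new drive needs that directory created once before its first trip. As a small sketch, these two helpers (prepare_media and list_media are hypothetical names, not part of NNCP or the script above) create the directory and show which devices the script would pick up:

```shell
# prepare_media MOUNTPOINT: create the "nncp" directory the script looks for.
# MOUNTPOINT is wherever the drive got mounted, e.g. /media/$USER/LABEL.
prepare_media () {
    if ! [ -d "$1" ]; then
        echo "$1 is not a mounted directory" >&2
        return 1
    fi
    mkdir -p "$1/nncp"
}

# list_media MEDIABASE: show which mounted devices the script would process.
list_media () {
    for d in "$1"/*/nncp; do
        [ -d "$d" ] && echo "$d"
    done
    return 0
}
```

For example, prepare_media "/media/$USER/MYDRIVE" followed by list_media "/media/$USER" should print the new nncp directory, where MYDRIVE stands in for whatever label your drive mounts under.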
This post is also available on my website, where it may be periodically updated.
@jgoerzen I’m really glad you wrote this because it was the sneakernet capabilities of NNCP that interest me. I updated the pkgsrc-wip package for NNCP to 8.7.2 about 2½ weeks ago, by the way.
@ND3JR Glad to help! And thanks for uploading the newer NNCP also.
And besides, why shouldn’t an ACK that arrives 2 months later be perfectly valid?
@jgoerzen You’re welcome. But “upload” is definitely not the term I’d use. pkgsrc is a source-based packaging system, akin to FreeBSD ports (in fact that’s where it came from) or Gentoo’s portage. Basically I had to update the URL to download the source tarball in the Makefile, and the distinfo (which contains the checksums for any files downloaded). I would have had to update the PLIST file if anything new had been added. But it was easier than I thought it would be.
Uses for #sneakernet 2/ When I travel, I take photos/videos and I want them to be backed up. If I’m in a hotel with decent wifi (never a guarantee!), I can just rsync or Syncthing it home. But what about visiting an island or other remote area? I could take along some micro SDs and copy backups to them. When I’m in town, mail it to myself for less than $1. When I get home, laptop can transmit over LAN and #NNCP would detect SD as dupes – or if my laptop failed, read it in.
Uses for #sneakernet 3/ There’s the obvious “I’ve got 20TB to get to my friend across town.” If your Internet connection is like mine, that would take 48 days to send. Might be able to drive it there in 30 minutes.
Uses for #sneakernet 4/ You can expand any of these ideas with “mail it to a friend” also. In the 1970s, long-distance phone calls were extremely expensive. So my relatives recorded “audio letters” on tape and mailed the tapes around. Sometimes a mailbox is more available than a fast Internet connection. You can always type up your emails and mail the (E2E encrypted, of course) SD to a friend. Friend loads, it relays over Internet to your box, is decrypted, and processed.
Uses for #sneakernet 5/ The #kiwix project @kiwix is designed to make #websites accessible #offline. If you have a need to see them offline, that implies a need to get the data to them somehow. Again the kiwix .zim files could be mailed to the recipients on SD cards.
Uses for #sneakernet 6/ I got started with this by desiring an #airgapped machine for sensitive things like tax records, #GnuPG signing, etc. If I was going to be using this often – say, daily or weekly – I didn’t want to manually have to worry about “did this data successfully get there” all the time. I know how often USB drives fail. So, reliable sneakernet FTW. It works beautifully and can even send backups to my backup server (which is also sneakernet-capable).
Uses for #sneakernet 7/ I think that #airgapped machines are desperately underused. We all understand their #security benefits (I hope), but probably the reason we don’t often use them is because they’re so HARD to use. Thanks to apt-offline, I even keep my airgapped machine updated via sneakernet!
Uses for #sneakernet 8/ I love “fusion” approaches also. With #NNCP, I can transfer data opportunistically: via LAN if that’s available, Internet (perhaps with #Yggdrasil) if not, and sneakernet otherwise. Likewise, if I copy packets to an SD card or something but a synchronous route later becomes available, NNCP will just remove what then becomes a duplicate on the SD card at ingest time. It’s really nice to be transport-agnostic to such a level.
Uses for #sneakernet 9/ #git works over sneakernet. My gitsync-nncp software (doesn’t require NNCP) will asynchronously sync git trees https://salsa.debian.org/jgoerzen/gitsync-nncp . I use it to sync my org-mode and org-roam notes. Sometimes I use a machine only rarely (say, once a month). If it has Internet when I power it up, it’ll download hundreds of missing updates in a second or two. Or if not, I can just copy the files queued to it over USB (and likewise with updates on it) and I’m good.
Uses for #sneakernet 10/ git-annex by @joeyh is designed to help with moving data across systems. There is an NNCP remote for git-annex here https://git.sr.ht/~ehmry/git-annex-remote-nncp . Combined with sneakernet, you can now manage very large collections of files that may be difficult to manage any other way.
Uses for #sneakernet 11/ I hate to type “end” because inevitably I will realize in 10 minutes that I forgot some, but hopefully this is a useful set of ideas. end/
@jgoerzen oh awesome, I did not know about that! (added to the big list of special remotes)
@jgoerzen I kicked around a project that would use reusable mailable USB envelopes. You can fill in the blanks..
@joeyh Interest piqued! I thought of SD cards because, even in protective holders, they would be really flat. I don’t know if quite flat enough to avoid the “thick mailpiece” surcharge, but I imagine a person could come up with various protective-enough and thin-enough things. Curious about how you envisioned that going! Always closed, USB ports on the side or something?
@jgoerzen I’ve mailed thin USB keys in envelopes, and they were fine with regular postage.. even international (10+ countries). I used cardboard inserts with a gap for the key. I have somewhere a folding envelope prototype, the idea was it unfolded to plug in the key, and there were two configurations that exposed either of two sets of addresses, to bounce it between 2 people. just add a stamp each time..
@joeyh Oh very nice! Would those be ones with sort of a “half connector” like this https://support.yubico.com/hc/article_attachments/360011813840/YubiKey-NEO-1000.png ? Or maybe there are really thin USB-C variants these days.
@jgoerzen yes, very much like that
I want to mail 20 TB across town to my friend, but before it gets there, my town is bombed. Data lost. Oh well.