Download A Piece of Internet History

Back in the early 1990s, before there was a World Wide Web, there was the Internet Gopher. It was a distributed information system in the same sense as the web, but didn’t use hypertext and was text-based. Gopher was popular back then, as it made it easy to hop from one server to the next in a way that FTP didn’t.

Gopher has hung on over the years, and is still clinging to life in a way. Back in 2007, I was disturbed at the number of old famous Gopher servers that had disappeared off the Internet without a trace. Some of these used to be known by most users of the Internet in the early 90s. To my knowledge, no archive of this data existed. Nobody like archive.org had ever attempted to save Gopherspace.

So I decided I would. I wrote Gopherbot, a spidering archiver for Gopherspace. I ran it in June 2007, and saved off all the documents and sites it could find. That saved 40GB of data, or about 780,000 documents. Since that time, more servers have died. To my knowledge, this is the only comprehensive archive there is of what Gopherspace was like. (Another person is working on a new 2010 archive run, which I’m guessing will find some new documents but turn up fewer overall than 2007 did.)
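For readers curious about what a spider like this has to handle, here is a minimal sketch of fetching and parsing a Gopher menu. It is not Gopherbot's actual code. It assumes only the basic menu format from RFC 1436: a one-character item type, then display string, selector, host, and port separated by tabs, with a lone "." ending the menu. The names `MenuItem`, `fetch`, and `parse_menu` are made up for illustration.

```python
import socket
from typing import List, NamedTuple, Optional


class MenuItem(NamedTuple):
    item_type: str   # e.g. "0" = text file, "1" = menu
    display: str
    selector: str
    host: str
    port: int


def fetch(host: str, selector: str = "", port: int = 70, timeout: float = 10.0) -> bytes:
    """Send a selector followed by CRLF and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("latin-1") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_menu_line(line: str) -> Optional[MenuItem]:
    """Parse one menu line; return None for blank or malformed lines."""
    line = line.rstrip("\r\n")
    if not line or line == ".":
        return None
    fields = line[1:].split("\t")
    if len(fields) < 4:
        return None
    display, selector, host, port = fields[:4]
    try:
        port_number = int(port)
    except ValueError:
        return None
    return MenuItem(line[0], display, selector, host, port_number)


def parse_menu(text: str) -> List[MenuItem]:
    """Parse a whole menu, stopping at the terminating '.' line."""
    items = []
    for line in text.splitlines():
        if line == ".":
            break
        item = parse_menu_line(line)
        if item is not None:
            items.append(item)
    return items
```

A spider would call `fetch` on a starting host, run `parse_menu` over the decoded result, save anything that is a document, and queue every item of type "1" for another round. A real archiver also needs loop detection, politeness delays, and error handling, none of which are shown here.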

When this was done, I compressed the archive with tar and bzip2 and split it out to 4 DVDs and mailed copies to a few people in the Gopher community.
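The same compress-and-split idea can be sketched in Python for anyone who wants to repackage the archive into pieces that fit on a DVD or under a filesystem's file-size limit. This is an illustration under assumptions, not the commands originally used: the function names, the `.partNNN` naming scheme, and the buffer size are all hypothetical.

```python
import tarfile
from pathlib import Path
from typing import List

BUFFER_SIZE = 1024 * 1024  # copy in 1 MiB pieces so large files never sit in memory


def make_archive(src_dir: str, archive_path: str) -> None:
    """Pack a directory into a bzip2-compressed tar file."""
    with tarfile.open(archive_path, "w:bz2") as tar:
        tar.add(src_dir, arcname=Path(src_dir).name)


def split_file(path: str, chunk_size: int) -> List[str]:
    """Split a file into numbered parts of at most chunk_size bytes each."""
    parts = []
    index = 0
    with open(path, "rb") as src:
        while True:
            data = src.read(min(BUFFER_SIZE, chunk_size))
            if not data:
                break
            part_path = f"{path}.part{index:03d}"
            written = 0
            with open(part_path, "wb") as dst:
                while data:
                    dst.write(data)
                    written += len(data)
                    remaining = chunk_size - written
                    if remaining == 0:
                        break
                    data = src.read(min(BUFFER_SIZE, remaining))
            parts.append(part_path)
            index += 1
    return parts


def join_files(parts: List[str], out_path: str) -> None:
    """Concatenate the parts, in order, back into a single file."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                while True:
                    data = src.read(BUFFER_SIZE)
                    if not data:
                        break
                    dst.write(data)
```

For DVDs, a `chunk_size` a little under the disc capacity would be the natural choice; for FAT32, anything below 4 GiB works. Joining the parts in order and then extracting the tar file recovers the original tree.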

Recently, we’ve noted that hard disk failures have hobbled a few actually maintained Gopher sites, so I read this archive back in and posted it on BitTorrent. If you’d like to own a piece of Internet history, download the torrent file and go to town (and please stick around to seed if you can). This is 15GB compressed, and also includes a rare video interview with two of the founders of Gopher.

There are some plans to potentially host this archive publicly in the manner of archive.org; we’ll have to wait and see if anything comes of it.

Finally, I have tried to find a place willing to be a permanent host of this data, and to date have struck out. If anybody knows of such a place, please get in touch. I regret that so many Gopher sites disappeared before 2007, but life is what it is. This is the best snapshot of the old Gopherspace that I’m aware of, and I would like to make sure that this piece of history is preserved.

Update: The torrents are now permaseeded at ibiblio.org. See the 2007 archive and the 2006 mirror collection.

Update: The ibiblio mirror is now down, but you can find them on archive.org. See the 2007 archive and the 2006 mirror collection.

49 thoughts on “Download A Piece of Internet History”

  1. Try the University of Kent at Canterbury – mirrorservice.org – they have the UK’s largest mirror of content

  2. One big reason for gopher’s sudden demise is that UMN decided they owned it, and sent out vaguely threatening letters concerning licensing. Shortly after, NCSA chimed in and promised to do no such thing with the newfangled WWW. Gopher’s fate was instantly sealed and the rest is history.

    1. That is, of course, part of the reason for it. UMN actually GPL’d their code about 10 years ago now, but of course far too late for any kind of resurgence.

  3. Contact the folks at http://www.ibiblio.org/:
    “Home to one of the largest “collections of collections” on the Internet, ibiblio.org is a conservancy of freely available information, including software, music, literature, art, history, science, politics, and cultural studies. ibiblio.org is a collaboration of the School of Information and Library Science and the School of Journalism and Mass Communication at The University of North Carolina – Chapel Hill.” I bet they’ll be glad to host this!

  4. John,

    I have an almost-empty dedicated server and would consider hosting the whole thing in its entirety. Contact me if you’re still looking for a home for it.

    Mike

  5. Why not try the Library of Congress? They just added Twitter, Gopher seems like a worthy addition in context.

    Regards,

    Armistral

  6. First of all, thanks for doing this! Our digital history is constantly in peril of falling over, degrading or even just being carelessly deleted.

    Couple of questions–
    i) Any chance you could distribute (physical) DVDs? Maybe just charge for the media and postage costs. I for one can’t download that much data over my connection (or, I could, but it would take months and cost a great deal).

    ii) What would the copyright status of this archive be? Presumably you can’t claim ownership, but also presumably the original files’ copyright is with their original authors/owners. Just something to think about especially in these IP-craziness-filled times.

    1. I distributed DVDs to interested parties back in 2007, and sent out maybe just 2 or 3 sets. It takes a lot of time to burn, label, pack, mail, etc. and I’m not really able to do that again. But maybe somebody else can.

      As far as copyright issues are concerned, I suspect this would be pretty much the same as what archive.org or Google Cache has to deal with. Except in this case, most of it was essentially abandoned content that people forgot even existed anyhow. Though certainly not all; there is a small active gopher community still.

  7. The British Library might be interested as they have also expanded their remit to include digital works – bl.ac.uk – and 40GB is hardly massive compared to their existing collections!

    Andy

  8. I would fall off my chair in surprise if the archive.org people were not interested in this. If you need a contact, let me know.

  9. I can host the entire thing in proper fashion, at a datacenter with a 10 meg connection. If you can provide a Hyper-V disk image (preferably a CentOS guest, but I hear Suse and Redhat are supported as well) and let me know what the hardware requirements are, I can get it hosted free, indefinitely.

  10. Hi,

    I can offer some of my hosting space for this, I will download the file over the weekend and upload a preliminary version. Please contact me by email, then we can arrange more details as they come up.

    Cheers,
    Guillermo

  11. Hi, am downloading as I type, I should be able to host this for free permanently but will have to have a look at it first to see

    Will post again when I’ve had a look

    Steve

  12. Downloading it on my seedbox which has a pretty nice upload connection. Will keep it up on there for the next two weeks so the torrent should run pretty quickly.

    One thing on my ‘nerd bucketlist’: run a publicly-accessible gopher server someday.

  13. I would be willing to host the site if you like. What are your needs for the site? I have plenty of servers lying around doing nothing and would be willing to donate their cycles to seeing this project move forward. Please let me know if you are interested.

  14. RE: the torrent, is there any way to split the main files into several smaller ones? FAT32 cannot support files larger than 4GB, so I can’t download the main file without switching to NTFS.

  15. I’m working on making a web-based Gophernet archive & proxy, with both the 2007 & 2010 archives, as well as a continuously updated archive, similar to the wayback machine. Send me an email for more info. I’m still writing the server software, but plan to have a beta version up in a couple months.
