Search for Backup Tools

November 7th, 2008

Since the last time I went looking for backup software, I’ve still been using rdiff-backup.

It’s nice, except for one thing: it always keeps an uncompressed copy of your current state on the disk. This is becoming increasingly annoying.

I did some tests with dar and BackupPC, and both saved considerable disk space over rdiff-backup. The problem with dar, or compressed full/incrementals with tar, is that eventually you have to make a new full backup. You have to do that, *then* delete all your old fulls and incrementals, so there will be times when you have to store a full backup twice.
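To put numbers on that doubling, here is a small worked example in Python. The sizes are made up for illustration (a 100 GB full and six 5 GB incrementals), and `peak_storage` is a hypothetical helper, not part of dar or tar.

```python
def peak_storage(full_size, incremental_sizes):
    """Return (steady, peak) disk usage for one full/incremental cycle.

    steady: one full backup plus all of its incrementals.
    peak:   the moment the next full has been written but the old
            chain has not been deleted yet.
    """
    steady = full_size + sum(incremental_sizes)
    peak = steady + full_size
    return steady, peak


# Hypothetical numbers: a 100 GB full and six 5 GB incrementals.
steady, peak = peak_storage(100, [5] * 6)
print(f"steady state: {steady} GB, peak during rotation: {peak} GB")
```

The backup disk has to be sized for the peak, not for the steady state.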

The hardlinking approach sounds good. It’s got a few problems, too. One is that it can lose metadata about, ironically enough, hard links. Another is that few of the hard linking programs offer a compressed on-disk format. Here’s what I’ve been looking at:

BackupPC

Nice on the surface. I’m a bit annoyed that it’s web-driven rather than commandline-driven, but I can look past that. I can also look past that it won’t let me clamp down on ssh access as much as I’d like.

BackupPC writes metadata to disk alongside files, so it can restore hard links, symlinks, device entries, and the like. It also has the nice feature of being able to hard link identical files across machines, so if you’re backing up /usr on a bunch of machines and have the same files installed, you save space. Nice.

BackupPC also can compress the files on your disk. It uses pre-compression md5sums for identifying files to hard link, which is nice.
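The pooling idea can be sketched in a few lines of Python. This is only an illustration of the approach (hash the uncompressed content, store one compressed copy, hard link it into each machine's backup tree); it is not BackupPC's code. The function name `store_in_pool`, the directory layout, and the file names are assumptions made up for the example, and the sketch ignores hash collisions and reads whole files into memory.

```python
import hashlib
import os
import tempfile
import zlib


def store_in_pool(src_path, pool_dir, dest_path):
    """Store src_path compressed in pool_dir and hard link it at dest_path.

    The pool key is the MD5 of the *uncompressed* content, so identical
    files from different machines map to the same pool entry.
    """
    with open(src_path, "rb") as f:
        data = f.read()  # fine for a sketch; a real tool would stream

    digest = hashlib.md5(data).hexdigest()
    os.makedirs(pool_dir, exist_ok=True)
    pool_path = os.path.join(pool_dir, digest)

    if not os.path.exists(pool_path):
        # First time this content is seen: write the compressed copy once.
        tmp_path = pool_path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(zlib.compress(data))
        os.replace(tmp_path, pool_path)

    os.makedirs(os.path.dirname(dest_path), exist_ok=True)
    os.link(pool_path, dest_path)  # a new name, no new data blocks
    return digest


# Demo: the same file "installed" on two hypothetical hosts.
root = tempfile.mkdtemp()
pool = os.path.join(root, "pool")
content = b"identical library contents\n" * 100
for host in ("host1", "host2"):
    src = os.path.join(root, "src-" + host)
    with open(src, "wb") as f:
        f.write(content)
    store_in_pool(
        src, pool, os.path.join(root, "backups", host, "usr", "lib", "libfoo.so")
    )

a = os.path.join(root, "backups", "host1", "usr", "lib", "libfoo.so")
b = os.path.join(root, "backups", "host2", "usr", "lib", "libfoo.so")
print("same inode:", os.stat(a).st_ino == os.stat(b).st_ino)
print("link count:", os.stat(a).st_nlink)
```

Hashing before compression is what makes the key stable: the same file always lands on the same pool entry regardless of compression level or library version.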

Here’s where I get nervous.

BackupPC doesn’t just use regular compression, from say gzip or bzip2. It uses its own low-level algorithm centered around the Perl deflate library. And it does it in a nonstandard way owing to a supposed memory issue with zlib. Why they don’t just pipe it through gzip or equivalent is beyond me.

This means that, first off, it’s using a nonstandard compression format, which makes me nervous to begin with. If that weren’t annoying enough, you have to install Perl plus a bunch of modules to extract the thing. This makes me nervous too.
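How much of a problem this is depends on what is actually on disk. The Python sketch below assumes, without having verified it against BackupPC's real on-disk format, that the data is an ordinary zlib-wrapped deflate stream. The test data is generated by the example itself and is not a BackupPC file. Under that assumption a gzip-format reader refuses the data, but anything with a zlib binding can decode it.

```python
import gzip
import zlib

original = b"backup data " * 1000

# A zlib-wrapped deflate stream: the same deflate compression gzip uses,
# but with a 2-byte zlib header instead of the gzip header and trailer.
stream = zlib.compress(original)

# A gzip-format reader rejects it because the gzip magic bytes are missing.
try:
    gzip.decompress(stream)
    gzip_could_read = True
except OSError:  # gzip.BadGzipFile is a subclass of OSError
    gzip_could_read = False

# Plain zlib reads it directly.
restored = zlib.decompress(stream)

print("gzip reader accepted it:", gzip_could_read)
print("zlib round trip ok:", restored == original)
```

If the assumption holds, extraction would not strictly require Perl, only some language with zlib bindings. If the real format differs, this sketch does not apply.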

Dirvish

Doesn’t support compression.

faubackup

Doesn’t support compression.

rdup

Supports compression and encryption. Does not preserve ownership of things unless the destination filesystem does (meaning you must run as root to store your backups).

Killer lack of feature: it does not preserve knowledge about what was hardlinked on the source system, so when you restore your backup, all hardlinks are lost. Epic fail.
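Preserving that knowledge mostly comes down to recording, at backup time, which paths share an inode. Here is a minimal Python sketch of that bookkeeping. It is not rdup's code or anyone else's; the function name `find_hard_link_groups` and the demo file names are made up for illustration. It can only see links whose names all live under the directory being scanned.

```python
import os
import stat
import tempfile
from collections import defaultdict


def find_hard_link_groups(root):
    """Map (device, inode) -> sorted list of paths under root sharing it.

    Only regular files with a link count above one are considered, and
    only groups with at least two names under root are returned.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}


# Demo: two names for one file, plus one unrelated file.
root = tempfile.mkdtemp()
first = os.path.join(root, "a.txt")
with open(first, "w") as f:
    f.write("shared\n")
os.link(first, os.path.join(root, "b.txt"))  # second name, same inode
with open(os.path.join(root, "c.txt"), "w") as f:
    f.write("alone\n")

groups = find_hard_link_groups(root)
for paths in groups.values():
    print("hard-linked together:", [os.path.relpath(p, root) for p in paths])
```

A restore step could then write the first path in each group as a normal file and recreate the remaining names with `os.link`.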

rsnapshot

Doesn’t support compression.

StoreBackup

Does support compression, appears to restore metadata in a sane way. Supports backing up to a different machine on the LAN, but only if you set up NFS. Looks inappropriate for doing backups over VPN. Comprehensive, though confusing, manual. Looks like an oddball design with an oddball manual.

So, any suggestions?

Categories: Software

17 Comments

  1. Lisandro Damián Nicanor Pérez Meyer

    Write a script that decompresses the backup, run rdiff-backup, and compress again.

    I have done similar things and it works perfectly (and let’s hope Murphy doesn’t get involved if I need the backup :-) )

  2. stlman

    There is a perl script called flexbackup. It is very good and… flexible :-) It supports many archivers and compressors and adding your own isn’t too hard either.
    It hasn’t been developed for a few years, but there isn’t anything it lacked for me.

    I converted to rsync(1) incremental backups with hard links because it was quite annoying to search for files in, e.g., a 20 GiB bzipped tar (3 MB/s).

    http://freshmeat.net/projects/flexbackup/

  3. toupeira

    At my workplace we’re also using rdiff-backup to manage almost 3TB now, and rsync to replicate the backups to other, identical systems. But currently we’re considering moving everything to a central OpenSolaris storage system and just doing backups using ZFS snapshots, which are basically instantaneous.

    stlman Reply:

    Tell me Mr. Toupeira, what good is a backup when you’re unable to restore it?

    Snapshots are not backups; they’re just checkpoints that prevent overwriting blocks “created” before the checkpoint. If some data in such a block is to be updated, copy-on-write is performed, hence the instantaneousness. Snapshots are very useful for backups because you get a consistent filesystem state, even at the application level. But they are NOT in any way safer than the data you work with. They are on the same hardware.

    thedward Reply:

    One of the neat things about ZFS is zfs send / zfs receive: you can serialize and deserialize snapshots or diffs between snapshots. It makes it easy to synchronize your local snapshots with a remote system.

    toupeira Reply:

    I’m aware of how ZFS snapshots work, and they *can* be restored. We’re still planning to have 3 separate systems and replicate the data, using zfs send/receive as thedward mentioned.

  4. alex

    rdiff-backup on a compressed FS?

  5. Michael Alan Dorman

    John,

    I think your nervousness over BackupPC may be unwarranted.

    Looking at the code, what it’s producing is simply headerless gzip compression—anything that provides an interface to zlib should be able to chew on it.

    The allowances it’s making for memory usage issues would appear to have no more impact than perhaps lowering the compression level by flushing excessively.

    Oh, and it appears it appends a record of rsync meta-information that is presumably there to allow it to avoid unnecessary transfers without having to decompress unchanged files.

    Honestly, I might look at using it for myself.

  6. chris burkhardt

    duplicity is by the same author as rdiff-backup, but instead of keeping a mirror of the current state, everything is tar’d and gpg’d:

    http://duplicity.nongnu.org/

    chris burkhardt Reply:

    Oh, I just read the other entry you referenced and see you’ve already considered and rejected duplicity.

  7. ramune

    There’s also Bacula. We currently use it at work and it supports restoring hard links, manages pools of media, can auto-label tapes, back up to files (ala VTL), and so on.

    The main gripe I have is the user interface. It’s pretty much unusable without running under rlfe. Still, I found its features and reliability good enough to stick with, despite the klunky interface.

    It is lacking a bit in configuration options and tweaks compared to commercial products, but I found it more reliable than any other open-source backup software out there.

  8. solrize

    I just use tar and/or rsync, but have been wanting to look at veracity:

    http://taobackup.org

  9. Miek Gieben

    Hi,

    I’m the author of rdup, and rdup DOES support hardlinks as of version 0.6.0. This of course only works when the hardlinked files are all contained in your backup.

  10. Kai Hendry

    I stole some ideas from Stuart Langridge.

    Here are the backup scripts I use at work.

    OK, it doesn’t use compression for storage, but disk space is cheap and I’d rather have fast, non-CPU-intensive backup runs.

  11. Alec Berryman

    I’ve been looking around for an alternative to rsnapshot. I think that gibak is promising – http://eigenclass.org/hiki/gibak-backup-system-introduction.

    It’s based on git, appears to get all the metadata issues right, and stores no uncompressed copies (just use a bare repository). It doesn’t do encryption, but you can use encfs or something similar for that.

    The downside is that it’s rough and inflexible. I think it will take a nontrivial amount of work to have it do daily/weekly/monthly schemes, expire content, back up more than just the home directory, and handle other important features.

  12. Theo Band

    Dump and restore is what I have used for several years without any problem. All data resides on an ext3 filesystem. Dump creates full compressed backups and can also make incremental backups. Before I dump the filesystem, I first make a snapshot using LVM to have a consistent state of the filesystem during the backup. I am now playing with rdiff-backup. The advantage is that data can be quickly found and retrieved, which is more cumbersome with the compressed dumps.

  13. Backing up every few minutes with simplesnap | The Changelog

    […] my last search for backup tools, I’d been using BackupPC for my personal systems. But since I switched them to ZFS on Linux, […]

http://changelog.complete.org / Search for Backup Tools