Category Archives: Software

Backing up

Just about everybody hates backing up computers. But it’s important. With more of our information being stored digitally — even photos — it’s critical to back them up.

At home, I’ve been using rdiff-backup for years now. Very slick. It stores backups on the filesystem — they look as if you had used rsync. It also stores metadata (owner, mode, etc.) in separate files, so you don’t have to back up using root. But the neat thing is how it handles incrementals. Incremental backups will update the backup image files to the current state, and store binary diffs to the past state. So you can access the latest backup instantly, and re-generate the previous state if needed. Very nice.
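To make the incremental idea concrete, here is a toy sketch of a "latest mirror plus reverse increments" store. It is not rdiff-backup's real on-disk format: the class and method names are made up for illustration, files are modeled as a dict of path to bytes, and each increment keeps the whole previous content of a changed file where rdiff-backup would keep a binary diff.

```python
class ReverseIncrementalStore:
    """Toy model: keep the newest state in full, plus deltas pointing backwards."""

    def __init__(self):
        self.mirror = {}      # path -> bytes, always the most recent backup
        self.increments = []  # oldest first; each maps path -> old bytes (None = did not exist)

    def backup(self, current):
        """Record a new backup; 'current' maps path -> bytes."""
        reverse = {}
        for path in set(self.mirror) | set(current):
            old = self.mirror.get(path)
            new = current.get(path)
            if old != new:
                # Remember what the file looked like before this backup.
                reverse[path] = old
        self.increments.append(reverse)
        self.mirror = dict(current)

    def restore(self, steps_back=0):
        """Return the state as of 'steps_back' backups ago (0 = latest)."""
        if not 0 <= steps_back <= len(self.increments):
            raise ValueError("steps_back is out of range")
        state = dict(self.mirror)
        newest_first = reversed(self.increments[len(self.increments) - steps_back:])
        for reverse in newest_first:
            for path, old in reverse.items():
                if old is None:
                    state.pop(path, None)   # file did not exist back then
                else:
                    state[path] = old
        return state


store = ReverseIncrementalStore()
store.backup({"a.txt": b"v1"})
store.backup({"a.txt": b"v2", "b.txt": b"new"})
latest = store.restore()      # no increments applied
previous = store.restore(1)   # one reverse increment applied
```

The latest state costs nothing to read because it is stored whole. Older states cost more the further back you go, which matches how backups are usually used.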

I had just been backing up to a regular IDE drive. But this week, I ordered two Seagate ST3400601CB-RK external drives. The drives support both USB2 and FireWire. We will get a safe deposit box at a bank. At any given moment, one drive will be at home, and one will be safely at the bank. They’ll be rotated periodically.

At work, we’ve been using Amanda for years. It does its job well. (Except on AIX, where both dump and tar are broken in obscure, hideous ways, but that’s not Amanda’s fault.) Recently, I discovered Bacula. This looks very slick. It seems to be the direction Amanda would evolve, if it would ever evolve. We’re going to test it out soon.

And besides, who wouldn’t love a program whose slogan is “Bacula: It comes in the night and sucks the essence from your computers”?

Today’s Pet Peeve: Stupid GTK File-Open Dialogs

Have you noticed the incredibly annoying dialogs appearing in new Gnome/GTK apps in sid? They no longer allow you to use the keyboard to enter a filename. Not only that, but they are *incredibly* slow when working with large directories. You’d better go get some caffeine if you need to open something under /usr/share/doc.

Here’s an example from Firefox:

Other apps, such as Gimp, also have this problem.

I have one thing to say to these people: WHAT WERE YOU THINKING?

The keyboard is still a useful part of a computer, and I have absolutely no inclination to wait 45 seconds for some annoyingly slow dialog to populate because you preferred to remove the ability for me to enter a filename in a dialog box myself.

Haskell #1 in the Shootout

Wow. Some Haskell hackers have started paying a small bit of attention to the Great Computer Language Shootout site, and the results are impressive.

Haskell now takes first place in the lines of code competition. In the CPU time competition, Haskell is also doing quite respectably: it beats out OCaml by a small margin, and defeats Java, C++, Python, Perl, Erlang, Ruby, Mono, Tcl, etc. by significant margins.

These links are all using the Shootout default weightings for individual tests.

The only downside to the Shootout is that the programs, for all languages, are not really idiomatic and don’t show off a language’s natural beauty. Sounds like it’s time to gather up some Haskell hackers to rally around the PLEAC effort as well.

Firebird was almost interesting…

I was looking at the Firebird database recently. Free Software, very feature complete, and one neat feature was that it could run either client/server (like PostgreSQL) or as a standalone .so (like Sqlite). I was starting to look into using it.

Then I discovered it only supports i386 on Linux, and no progress has been made in 3 years on that.

So I will not be trying Firebird.

I thought we had all learned by now that portable code is a good thing. Guess not.

I will be sticking with PostgreSQL as my preferred RDBMS for a while.

Hello, ext3. Goodbye, reiser4.

So I’ve been trying out various filesystems over the past few months, by converting a few machines to them and using them on a daily basis.

I’ve found that reiser3, JFS, and XFS are all risky and actually corrupt data on crashes. JFS also has a few weird bugs that make the kernel oops, and sometimes cause filesystem corruption. All of the above also have starvation issues, where one IO-intensive process can dramatically slow down everything on the system (by a factor of 100 or more).

Reiser4 has proven better — only one small issue that I can recall. But it’s got a huge problem: no ability to resize a Reiser4 partition. That is rather ridiculous these days, and really reduces the utility of LVM. (Hans says he’ll make it resizable when someone pays.)

So I’ve tried out ext3 again, for the first time in a few years. I’m using data=ordered,commit=300 (or 600 on some machines), which still makes it safer than the other journaled filesystems.

And I must say that it is impressive. The old bottlenecks that I was used to were gone. The thing is reliable and fast, and scales well. I’m going to move everything back to ext3.

So why do Hans’s benchmarks show reiser4 being better? For one thing, most benchmarks measure throughput, not response time, so things like starvation don’t cause black marks in them. Most of them don’t even use multiple processes to simulate real-world activity anyway. Plus, ext3’s default mount options (commit=5, for instance) are much more conservative than other filesystems’. To get a fair test, one should increase that commit= number on ext3.
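A quick toy calculation shows how a throughput number can hide starvation. The latencies below are invented for illustration, not measurements from any filesystem, and the model is a single queue served one request at a time.

```python
def summarize(latencies_ms):
    """Return (requests per second, worst single-request latency in ms)."""
    total_seconds = sum(latencies_ms) / 1000
    return len(latencies_ms) / total_seconds, max(latencies_ms)


# 100 requests that each take 100 ms: 10 seconds in total.
even = [100] * 100

# 99 fast requests plus one that is starved for 9 seconds: also 10 seconds in total.
starved = [10] * 99 + [9010]

even_throughput, even_worst = summarize(even)
starved_throughput, starved_worst = summarize(starved)
```

Both runs finish 100 requests in 10 seconds, so a throughput-only benchmark scores them the same. Only the worst-case latency shows that one request sat waiting for 9 seconds.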

Here’s another discussion about ext3.

Some nice code: libarchive

Yesterday, while looking for information on the format of tar files, I discovered libarchive, which is part of FreeBSD. libarchive can read about 5 different tar formats, 4 different cpio formats, zip, and ISO images, and supports gzip and bzip2. It can also write 2 different tar formats plus cpio and shar. Very nice.

Oh, and its tar.5 is the best reference on the tar format I’ve seen.
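For a taste of what the format looks like, here is a small sketch that pulls three fields out of the first 512-byte header block of a ustar archive. It uses Python's standard tarfile module rather than libarchive. The helper names are made up, and only the name, size and magic fields are decoded, using the standard ustar offsets.

```python
import io
import tarfile


def make_ustar_archive(name, payload):
    """Build a one-file ustar archive in memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def parse_header(block):
    """Decode name, size and magic from one 512-byte ustar header block."""
    if len(block) < 512:
        raise ValueError("a tar header block is 512 bytes")
    name = block[0:100].split(b"\0", 1)[0].decode("ascii")        # NUL-padded file name
    size = int(block[124:136].rstrip(b"\0 ").decode("ascii"), 8)  # size as octal text
    magic = block[257:262]                                        # b"ustar" for ustar archives
    return name, size, magic


archive = make_ustar_archive("hello.txt", b"hello")
header_fields = parse_header(archive[:512])
```

Numeric fields are stored as octal text, not binary integers, which is one of the quirks a good format reference spells out.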

I’ve packaged up libarchive and bsdtar (the default tar on FreeBSD, which is built using libarchive) for Debian.

Perl, Powered By Haskell

Autrijus Tang is well-known for developing the first working Perl 6 interpreter, Pugs. Pugs is written in Haskell, my new favorite language. Perl.com has an interview with Autrijus, and page 2 of that interview gets particularly interesting. Here are some quotes from Autrijus:

Haskell . . . is faster than C++, more concise than Perl, more regular than Python, more flexible than Ruby, more typeful than C#, more robust than Java, and has absolutely nothing in common with PHP.

(If it has nothing in common with PHP, it must be great, right?)

Haskell is a pure functional language optimised for conciseness and clarity. It handles infinite data structures natively, and offers rich types and function abstractions that give Haskell programs a strong declarative flavor–the entire Pugs compiler and runtime is under 3000 lines of code.

Most languages require you to pay a “language tax”: code that does nothing with the main algorithm, placed there only to make the computer happy. [Java, anyone? — jgoerzen]

On the other end of spectrum, we often shy away from abstracting huge legacy code because we are afraid of breaking the complex interplay of flow control and global and mutable variables. Besides, the paths leading to common targets of refactoring–those Design Patterns–are often non-obvious.

Because Haskell makes all side effects explicit, code can be refactored in a safe and automatic way. Indeed, you can ask a bot on to turn programs to its most abstracted form for you.

Go check out the interview (page 2) for more, including a demo program that Autrijus wrote to show off Haskell.

Thanks to metaperl for the link.

Digikam

Back when I first got my digital camera (a Canon Digital Rebel), I knew I had to find some sort of program to keep track of my photos. I looked at many different programs on Linux, but none of them really did what I wanted. I’ve used some iView software on the Mac sometimes. While it can do what I want, its database is proprietary, which annoys me. It means, among other things, I can’t write my own programs to pull data from that database.

Lately I’ve been checking out the Linux photo management scene again, and I’ve got to say that digiKam is quite the impressive piece of work.

It has a versatile database, nice interface, and loads of features. Its database uses sqlite, so writing my own programs to work with it will be a snap. I’ve been using version 0.7.x, and it looks like the 0.8.x beta will address all of my few remaining complaints.
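As a rough idea of what "writing my own programs" could look like, here is a minimal sketch that opens an SQLite file and reports how many rows each table holds. It assumes nothing about digiKam's schema, since it discovers the table names itself. The function name is made up, and it assumes the file is in the SQLite 3 format that Python's sqlite3 module reads.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table name: row count} for every table in an SQLite 3 file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names still work.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                "SELECT COUNT(*) FROM " + quoted
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Point it at a copy of the database file rather than the live one. From there, ordinary SQL queries against whatever tables turn up are the next step. A file written by an older SQLite major version will not open with this module.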

I’m moving everything over to digiKam.

Kudos to the digiKam developers.