Category Archives: Technology

OSCon Wednesday

Nat Torkington, program chair, started off the day. He commented that one of the most interesting trends these days is the expansion of the Open Source ideals beyond software.

Tim O’Reilly commented on the FSF’s four freedoms and asked how we maintain them. We have to think about preserving freedoms: for example, what happens when Free Software relies on proprietary services, data, or business processes? It’s important to pay attention to freedom, not just to the success of businesses. But businesses matter, have enormous power, and will always be part of the picture.

Tim really pushed expanding the boundaries of Open Source and thinking ahead: Wikipedia, OpenID, etc. He also asked: does Congress need a version control system?

He suggested there are four open source success factors: frictionless software distribution, collaborative development, freedom to build/adapt/extend, freedom to fork.

Hadoop is an interesting FLOSS project to build the sort of infrastructure Google has. Apparently Yahoo is very interested.

Back to Nat… hardware is cheap and everyone keeps buying more of it.

James Reinders from Intel talked about multi-core parallelism, saying that parallelism is going to be more and more important. Intel released Threading Building Blocks, a set of C++ templates, as GPL’d software at the conference this week. I’m not all that excited about a C++ project, though, since I think languages like Haskell have more promise here anyway.

The other Intel speaker mentioned Intel’s open source involvement: intellinuxgraphics.org, intellinuxwireless.org, linuxpowertop.org, kernel.org, moblin.org. He claimed Linux laptops have the longest battery runtimes of any laptops.

“It’s amazing how many people you can make paranoid by showing up with a tie and a suit to do a keynote at OSCON.” — James Reinders

Simon Peyton-Jones is up now, and Nat says he will “stretch your brain until only tiny bits are left.”

State of the art in parallelism is really 30 years old with locks and condition variables — like building a skyscraper out of bananas.

Locks are difficult to do right and have “diabolical error recovery”.

Let’s do transactions against memory instead of against a database. Implementation can even be similar to databases. The idea is transactional memory, and it sounds very, very slick.

Mark Shuttleworth and Tim for an interview…

Mark was fine, but I wish Tim had more interesting questions for him.

I went up to the front a few minutes after the event to talk to Simon PJ. He was talking to someone, who saw my nametag and said, “Hi John, nice to meet you.” He looked familiar but I couldn’t quite place him, so I asked who he was. “Mark Shuttleworth.” Yep: I was sitting just far enough back from the stage that I wasn’t behind one of the large TV screens, so I couldn’t make out faces very well, and I didn’t recognize him. Erg…

Sierra Wireless 595U / Sprint on Linux

Here’s how you use a Sierra Wireless 595U USB modem to connect to wireless Internet service with Sprint:

Insert the modem into the USB slot. lsusb should show:

Bus 001 Device 005: ID 1199:0120 Sierra Wireless, Inc.

If the usbserial driver is already loaded, remove it so it can be reloaded with the right device IDs:

rmmod usbserial

Then:

modprobe usbserial vendor=0x1199 product=0x0120

You should see /dev/ttyUSB0, ttyUSB1, and ttyUSB2 appear. See also instructions for automating this with a similar card (modify vendor and product to the settings above).
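
If you want the driver loaded with these parameters automatically at every boot, one Debian-flavored possibility (an assumption about your setup, not a requirement of the card) is a line in /etc/modules:

    # /etc/modules: load usbserial with the Sierra IDs at boot
    usbserial vendor=0x1199 product=0x0120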

Now you will need to configure PPP for this. On Debian, run pppconfig. Your settings will be:

Phone number: #777
Username: 1234567890@sprintpcs.com (replace 1234567890 with your data card’s “phone number”, no dashes)
Password: your sprint password
Speed (BPS): 921600 (use lower numbers such as 115200 if you have trouble with this)
Port: /dev/ttyUSB0
Init string: ATZ
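
pppconfig writes these answers to files under /etc/ppp. Roughly, and this is a sketch rather than pppconfig’s exact output (the provider name “sprint” is my assumption), you end up with something like:

    # /etc/ppp/peers/sprint (sketch of what pppconfig generates)
    /dev/ttyUSB0
    921600
    connect "/usr/sbin/chat -v -f /etc/chatscripts/sprint"
    user 1234567890@sprintpcs.com
    noauth
    defaultroute
    usepeerdns

    # /etc/chatscripts/sprint (sketch): reset the modem, then dial
    '' ATZ
    OK ATDT#777
    CONNECT ''

Then pon sprint brings the link up, and poff takes it down.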

Here are some other helpful pages:

Verizon EVDO
Sprint and Linux
Cingular AT&T UMTS
Sierra’s Linux page

Mail Readers Still Stink

Five years ago, I started work on OfflineIMAP. I couldn’t find any mail reader that offered good IMAP support and a good feature set. Rather than write my own mail reader, I simply wrote OfflineIMAP and used it with mutt. OfflineIMAP does a bi-directional sync between an IMAP server and local mailboxes. This lets you work offline, and also speeds up reading since each new message doesn’t have to be downloaded from the network on the spot.
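
For the curious, a minimal ~/.offlineimaprc for that kind of setup looks roughly like this; the account name, host, and paths are placeholders:

    [general]
    accounts = Personal

    [Account Personal]
    localrepository = Local
    remoterepository = Remote

    [Repository Local]
    type = Maildir
    localfolders = ~/Mail

    [Repository Remote]
    type = IMAP
    remotehost = imap.example.org
    remoteuser = jdoe

Running offlineimap then syncs both directions, and mutt can be pointed at ~/Mail as a set of ordinary local Maildirs.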

I kept hoping that OfflineIMAP would become obsolete soon, as mail readers got better. Back in 2004, two years after writing OfflineIMAP, I looked at mail readers. In 2005, after some more frustrations with mail readers, I wrote a comparison. I wound up sticking with mutt and OfflineIMAP each time.

I’ve gone out looking at mail readers again. Here’s what I’ve found.

KMail
Overall a nice reader, KMail has almost every feature and setting I want. It has “disconnected IMAP” folders, which download every new message in those folders to the local disk as part of the routine mail check. It then caches local changes and syncs them to the remote on the next mail check. This boosts interactive performance and permits offline operation — very similar to OfflineIMAP. KMail has keyboard shortcuts for most things, and keyboard shortcuts can be added or changed for most other things as well.

KMail also integrates with the KDE calendar and addressbook, which I already use. That’s nice, too.

I have two big gripes about it though.

Back in 2004, I noticed that KMail crashes a lot. By 2005, it was worse. Sadly, KMail still has a tendency to crash. I’ve seen an average of 1-2 crashes per day, due to SIGFPE, SIGSEGV, and I think also SIGILL. This doesn’t make me happy at all, especially seeing that it’s no better after three years. Just don’t try emptying the trash while your mailbox is being synced, for one thing…

My second gripe is that there is absolutely no way to select a different alternative in a multipart message without using the mouse; the keyboard simply won’t do it. It’s also cumbersome, though possible, to view attachments with the keyboard: you have to press Enter to open the message in its own window, then Tab to the attachment.

KMail also sometimes feels a bit sluggish — for instance, when you delete a message, you first see it struck through, and then it disappears. It doesn’t feel very “snappy”.

Evolution
Evolution has a decent core. It is easy to set up and has an extensive set of keyboard shortcuts. It does IMAP downloading and syncing like KMail, and it does it by default. It doesn’t offer all that much flexibility in configuration, but probably enough.

Here’s my gripe. There is no way for it to show a total message count next to each folder in the folder list. It will show an unread message count, but not a total count; you have to click on each folder individually to see how many messages it holds. I can’t figure out why this is missing from Evolution. A folder list with message counts is one of the main benefits of switching away from mutt, so I didn’t bother looking at Evolution any further.

Thunderbird / Icedove
By default, it isn’t all that capable a mail reader. There aren’t that many configuration options, and the keyboard shortcuts — while existing for most things — are cumbersome.

The Nostalgy extension helps with the keyboard shortcuts significantly. You still can’t change some of them (Ctrl-L for forward, anyone?) — at least not without an extremely cumbersome process involving editing text files.

Thunderbird can do automatic IMAP downloading and syncing like KMail and Evolution, but for some inexplicable reason, only for your INBOX. In fact, Thunderbird won’t even check for mail in folders other than INBOX unless you set an undocumented configuration option. It seems to assume that nobody does server-side mail filtering.

If you want IMAP downloading for offline use or performance, you have to manually invoke a download operation. There is a Sync on Arrival extension, but it isn’t compatible with Thunderbird 2.0. From reading comments online, there are a lot of people frustrated about that.

So Thunderbird strikes out as well.

mutt + OfflineIMAP
The good thing about this combination is performance. mutt is extremely fast, and OfflineIMAP downloads IMAP mail faster than anything else. mutt is also far more configurable than anything else.

There are some annoyances about mutt.

#1 on that list is the lack of a folder list. There is just no way to see a list of folders along with new or total message counts. You can press c, Enter to go to the next folder with unread mail, which is something, but not enough. There have been numerous abortive projects over the years to address this, but for whatever reason, mutt itself doesn’t have it yet. Probably the most promising current project is this one.

#2 is HTML mail. I don’t mind the lack of default support for HTML mail; that’s to be expected. But some things do bug me about viewing HTML mail. First, people sometimes attach graphics to messages that also have an HTML component. Viewing those graphics doesn’t represent a security risk, but mutt doesn’t make them available to a browser for viewing; you have to save them manually if you want to see them. Also, you normally don’t want to load graphics from the Internet for HTML mail. The only way to accomplish that with mutt is to set your HTML viewer to lynx or something; just using Firefox to view an HTML component will load all of it.
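
For what it’s worth, the lynx workaround amounts to two lines of configuration; treat this as a sketch:

    # ~/.mailcap: render HTML parts as text, so remote images are never fetched
    text/html; lynx -dump %s; copiousoutput

    # ~/.muttrc: display text/html inline via the mailcap entry above
    auto_view text/html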

#3 is handling of embedded URLs. xterms can pass mouse clicks, and it would be nice if mutt made URLs clickable like other mail readers do.

#4 is the IMAP support. No support for caching, fragile, etc. That’s why I use OfflineIMAP. That works, but it’s a hassle.

#5 is printing. Printouts from mutt just spew the text of the message at the printer. No page numbers, formatting, nothing. muttprint makes that situation a bit better, but the integration is flaky and weird.

Conclusions
I’m not sure what I’ll do. None of these are really where I want them to be, though mutt and KMail are probably the closest.

Debian Developers 7 Years Ago

Today while looking for something else, I stumbled across a DVD with the “last archive” of my old personal website. On it were a number of photos from the 2000 Annual Linux Conference in Atlanta, and the Debian developers that were there. These were posted in public for several years.

I’ve now posted all of them on flickr, preserving the original captions.

Here’s the obligatory sample:

20001018-01-06.jpg

That’s Joey Hess, using what I think was his Vaio. Most acrobatic keyboardist ever. Probably the only person that could write Perl with one hand comfortably.

What else can you see? The best of show award that Debian won, now in my basement due to a complicated series of events, the Debian machines that were being shown off at the show, Sean Perry and Manoj, the photo with the long-corrupted caption, and of course, numerous shots of Branden.

I know the size stinks; these were scanned at what passed for web resolution in 2000. I do still have the negatives somewhere and will post the rest, in higher resolution, when I find them.

Click here to view the full set.

Conferences Suggestions

At work, we use Linux (and Debian in particular) for a lot of different things: everything from our phone system (running on Asterisk) to file serving and running some proprietary applications. I’m one of the people who finds, sets up, and maintains these systems, and I write code for our in-house use as well. I like to learn from others, and get to know people who may have things in common with me and with our environment. So going to conferences is a useful thing to do.

I’m hoping that some reader out there will have a good suggestion for a conference I ought to attend. Here are some that I’ve looked into and my thoughts on them:

  • Usenix Annual Tech Conference: I went last year. There were some very good talks, but the audience size was not all that large. I got to meet some peers there, but I didn’t get to talk in-person to anybody I’d worked with online before. (LISA being in fall/winter means it’s too early to consider it just yet)
  • LinuxWorld Conference & Expo: My general impression of LWCE in the past has been that its technical talks aren’t very technical; that is, they either cover things I already know or don’t care about. They are starting to publish their program this year, though, and I see a few interesting things. There have traditionally been a lot of Debian folks at the .org pavilion.
  • Debconf: It seems to be focused almost exclusively on developing the Debian operating system, rather than on using it. While I am a Debian developer and have been for quite a while, I would find new uses of Debian more interesting than new ways to hack on Debian. Plus, the insanely early registration requirement means it’s too late to go this year anyhow. (And my brother is getting married right in the middle of it.)
  • OSCon: There look to be some interesting talks in the database area, and some about Xen and virtualization, and Simon PJ (one of the ghc hackers) will be there. So this would be interesting, though somewhat light on the admin side of things.
  • OLS: Seems very focused on the kernel, and not much else. That is of interest, of course, but is one piece of many. Though there was a talk about Linux deployment at Nortel that sounded interesting.

My leading candidates are probably Usenix and OSCon. I’m interested to hear what people think, especially those that have attended some of these conferences.

And we’re off!

Yesterday afternoon, we started our information meetings with employees about our Linux on the desktop project. We’re underway on our migration.

But before I talk about that, I need to back up and describe what the project is.

We are converting approximately 80% of our 150 or so PC users to Linux desktops: Debian etch (4.0) running Gnome, Firefox (Iceweasel), and Evolution, with NFSv4 and SystemImager behind the scenes. Over the coming days and weeks, I’ll be writing about why we’re doing this, how we’re making it happen, things we’ve run into along the way, and the technology behind it.

Today I’d like to start with a high-level overview of the reasons we started investigating this option.

It became apparent that Vista was going to be a problem for us. Most of our desktop PCs are not very old, but Vista meant a significant degradation in performance from the Windows XP Pro that most people were running. The dip was so large, in fact, that it would have had a real negative impact on employee productivity.

We tend to buy PCs with Windows licenses from the vendor (Windows preinstalled). As such, we knew it wouldn’t be long before XP-based machines would be hard to find. If we stuck with Windows, we’d be running a mixed-OS network — which we knew from experience we did NOT want to do. The other option would be to replace all those old PCs. The direct costs of doing that, with the associated Vista and Office licenses, would have been more than $200,000.

So we started to look at other options — changing the way we license Windows, sticking with XP for awhile, or switching away from Windows. This last option sounded the most promising.

I took a spare desktop-class machine, representative of the hardware most end users would have, and installed etch (then testing) on it. I spent a bit of time tweaking the desktop settings, making things as transparent to the user as possible. We liked what we saw and started pursuing it a bit more. We knew we had some Windows apps we couldn’t discard, so we tested running them off a Windows terminal server with the Linux rdesktop client. That worked well — and the appropriate Server 2003 licenses plus CALs would still be far cheaper than a mass migration to Vista.

To make a long story short, we are getting quite a few benefits out of all this. One of the most important is a single unified system image. Excepting a few files like /etc/fstab, every system gets a bit-for-bit identical installation from the server, updated using rsync. /home is mounted from the network using NFS (v4). So our users can sit down at any PC, log in, and have all their programs, settings, email, etc. available. A side benefit is that hardware problems become minor annoyances rather than major inconveniences; if your hard disk dies, we can just bring up a different PC. We had tried numerous times to make roaming profiles work in Windows, but never really achieved a reliable setup there — perhaps because it seemed virtually impossible to assure that each Windows PC had the exact same set of software, in the exact same versions, installed.
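
To give a flavor of how the imaging works, here’s a sketch; the host names, the rsync module, and the exact option set are illustrative assumptions, not our production configuration:

    # Refresh a workstation from the master image, sparing machine-local
    # files and the network-mounted home directories:
    rsync -aHx --delete --exclude=/etc/fstab --exclude=/proc --exclude=/sys \
        --exclude=/home imageserver::etch-image/ /

    # /etc/fstab: every user's home comes from the file server over NFSv4
    fileserver:/home  /home  nfs4  rw,hard,intr  0  0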

More to come.

Some more git, mercurial, and darcs

Ted Ts’o had an interesting post about git recently. He has a lot of good thoughts on the subject. He comments that he wound up using git because it’s so Unixy (with its small commands to do things), that he sees the git community developing innovations faster than Mercurial, and that they are working to improve the documentation and user interface problems.

Being so Unixy is a double-edged sword. On the one hand, it makes it easy to write shell scripts to extend git. That itself can be a double-edged sword (think filename quoting and the like), but one doesn’t have to use the shell. The downside is that being Unixy makes it hard to run on platforms that aren’t, such as Windows. If one is working on Unix-only software (X, the kernel, e2fsprogs, etc.), there’s no need to care about it. But if you’re a person like me, who has Windows users using my software, or a large organization like Mozilla, it may be a showstopper. Of course, workarounds exist (cygwin, git-cvsserver), but none of them are particularly nice.
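
As a tiny illustration of that extensibility: any executable named git-something on your $PATH can be invoked as a git subcommand. This particular script is hypothetical, not a real git command:

    #!/bin/sh
    # git-recent (hypothetical): show the last N commits, one per line.
    # Installed as "git-recent" somewhere on $PATH, run it as "git recent".
    git log --pretty=oneline -n "${1:-10}"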

I think that both Git and Mercurial are working to address their shortcomings. I’ve chosen hg for now because it does what I need now, and because there are very nice tools to convert hg to git and vice versa. So if Ted’s right, and a year from now git is easier to use, better documented, more featureful, and runs well on Windows, it won’t be that hard to switch over and preserve history. Ted’s the sort of person who usually is right, so maybe I should start looking at hg2git right now.

So following up on my bzr post, here are the things that Mercurial is great at right now:

  1. Performance. Approximately even with git, occasionally faster. Nobody else can compete with these two right now.
  2. Simplicity. It’s almost as easy to get started as with darcs, and with recent patches, will be even closer in the future.
  3. Lots of ways to interact. You can send hg bundles, which preserve all metadata (parents, hashes, authors, etc.), or you can send git-format email patches, or you can push and pull between repos; there’s a sketch of these after this list. The email tools will shortly be able to automatically detect which patches to send. Your choice. git doesn’t seem to support lossless emailing of bundles like this, and bzr doesn’t make emailing of anything easy by default.
  4. Merging. hg seems to be able to automatically resolve more merge conflicts than anything else, and when it can’t automatically resolve them, has a nicely configurable system to let you use your choice of tool to manually resolve them.
  5. Community. The Mercurial community is open and inviting, and open to new/different ideas. It seems similar to Darcs in that respect, and somewhat dissimilar to git.
  6. Rebase does not trash history like it does (barring undocumented manual intervention) in git.
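
Here’s the sketch promised in item 3. The repo URL and address are invented, and hg email requires the bundled patchbomb extension to be enabled:

    # Bundle every changeset the remote repository lacks:
    hg bundle changes.hg http://hg.example.org/project
    # The recipient applies it with all metadata intact:
    hg unbundle changes.hg
    # Or mail git-format patches, one message per changeset:
    hg email -r tip --to someone@example.org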

I’ve written before about Darcs, so I won’t duplicate that here.

bzr, again

I’ve talked a lot lately about different VCSs.

I got some interesting comments in reply to my most recent post. One person took issue with my complaint that nobody really understood how to specify a revision to git format-patch, and proceeded to issue an incorrect suggestion. And a couple of people complained about my comments about bzr, which generally came down to the released version of bzr didn’t have anything compelling and also didn’t support tags.

So I went into the bzr IRC channel and asked what bzr has that git, Mercurial, and darcs don’t. I gave bzr the benefit of the doubt that 0.15 will be out soon and will be stable. What I got back were these general items:

  1. Renaming of directories (not in hg, git)
  2. 2-way sync with Subversion (not in hg, darcs)
  3. Checkouts (not in any others by default)
  4. No server-side push requirement

Let’s look at these in more detail.

1. Renaming of directories

All of them can rename files and (excepting git) completely accurately track back the file’s history. But consider this: if person A commits a change to branch A that adds a file, and person B then renames the directory that the file is in on his branch, will a merge cause person A’s file to appear in the new directory name? In darcs and bzr, yes. In Mercurial and git, no.
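
Here’s that scenario as a concrete sketch, with invented file names and the bzr commands assumed from its documentation:

    # Person A, on branch A: add a file inside src/
    echo 'int x;' > src/new.c
    bzr add src/new.c
    bzr commit -m 'add new.c'

    # Person B, on branch B: rename the directory
    bzr mv src lib
    bzr commit -m 'rename src to lib'

    # After B merges A's branch, darcs and bzr place the file at lib/new.c;
    # git leaves it at src/new.c (see the update below about Mercurial).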

So yes, this is a nice thing. But I have never actually had this situation crop up in practice, and even if it did, could be trivially remedied. I would say that for me, I don’t really care.

[Update: Current stable releases of Mercurial can do this too. I’m not quite sure how, but it does work. So git is the only one that can’t do that.]

2. 2-way sync with Subversion

This is a really nice feature and is present in both git and bzr. I haven’t tested it either place, but if it works as advertised — and properly supports tracking multiple related svn branches and merges — would be slick. That was enough to make me consider using git, but in retrospect, I so rarely interact with people using svn that it is not that big a deal to me.

Still, for those that have to work with svn users, this feature in bzr and git could be a big one.

Better yet would be to get all those svn holdouts over to DVCS.

3. Checkouts

A bzr checkout is basically a way to have local commits pushed to the remote repo immediately, as with svn. This is of no utility to me, though I can see some may have a use for it. But it can be done with hg hooks (a sketch follows) and probably approximated with scripting in the others.
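
For instance, a hook like this, which I haven’t tested, would push each commit to your repo’s default push target immediately, much like a checkout:

    # .hg/hgrc in the working repository
    [hooks]
    # After every commit, push to the default remote, svn-style:
    commit = hg push -q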

4. No server-side process necessary for pushing repos

bzr has built-in support to push to a server that has sftp only, and doesn’t require a copy of itself on the server. While I believe that none of the other three have that, it is possible to rsync (and probably ftp) darcs and Mercurial repos to a server in a safe fashion by moving repo files in a defined order. Probably also possible with git. All four can pull repos using nothing but regular HTTP.
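
A sketch of that for Mercurial, assuming nothing touches the repository while you copy it (the host and paths are invented):

    # Publish a repository to a plain web server; no hg required server-side:
    rsync -az --delete project/.hg/ example.org:/var/www/hg/project/.hg/

    # Readers can clone from the dumb HTTP server via the static-http scheme:
    hg clone static-http://example.org/hg/project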

What bzr still doesn’t have

Integrated patch emailing. The big thing is that it has no built-in support for emailing patches. darcs is extremely strong in this area, followed by hg; git is probably third. “darcs send” is all it takes to have darcs look at the remote repo, figure out what you have that they don’t, and e-mail a bundle of changesets to them. I posted an extension and later a patchset that does all this for Mercurial except for automatically figuring out what default email address to use (that’ll come in a few days, I think). One feature Mercurial has had for a while that Darcs hasn’t is sending multiple textual diffs as a thread, with one message per changeset. bzr doesn’t have any support for emailing patches yet, which is disappointing. Because of the strong support for this in darcs and Mercurial, people running those systems feel less need to publish their repos.
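
To make that concrete, the entire darcs exchange is one command (the address and repository URL here are invented):

    # Compare against the remote repo and e-mail it the changesets it lacks:
    darcs send --to maintainer@example.org http://example.org/project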

[Update: There is a plugin for bzr that seems to address some of this. I haven’t tested it, and it’s not in bzr core (so it doesn’t quite meet my friendly-for-a-newbie requirement), but it does exist, though apparently it’s not as advanced as Mercurial’s.]

Performance. 0.15 is supposed to be better here, but even if bzr achieves the claimed doubling of performance, most benchmarks I have seen would rate it as still significantly behind git and Mercurial, though it may overtake darcs in some tests.

Extensive documentation. I would say that bzr’s docs are better in some ways than git’s (its tutorials especially), but lack depth. If you want to know some detail about how the repository works on-disk, it’s not really documented. Darcs still has David’s excellent manual, and Mercurial has the hg book which is still great as well.

Merging not as advanced. darcs is pretty obviously way on top here, but of the others, Mercurial does a pretty good job with its automatic handling of renames and automatic resolving of different branches that commit the same change (even if that same change is a rename, or an add of the same content). bzr can’t resolve as much automatically.

Summary

Well, I’ll say that bzr still doesn’t look compelling enough for my use cases, and the lack of an easy-for-a-newbie-to-use automated email submission feature is a pretty big disappointment. Though I did appreciate the time the folks on the bzr channel spent with me, and if I needed to sync with svn users frequently, I’d probably choose bzr over git.

For now, I’m happy with sticking with darcs for my code and hg for my Debian work.

But all four communities are aggressively working on their weaknesses, and this landscape may look very different in a year.

More on Git, Mercurial, and Bzr

I’ve been writing a lot about this lately, I know, but it’s an interesting landscape.

I had previously discarded git, but in light of git-cvsserver (which provides a plausible way for Windows people to participate), I gave it a try.

The first thing I noticed is that git documentation, in general, is really poor. Some tutorials that claim to cover git actually cover Cogito. Still others use commands that are much more complex than those in the current git — and these are just the ones linked to from the git homepage.

git’s manpages aren’t much better. There are quite a few git commands (such as log) that take arguments that other git commands accept. Sometimes this fact is documented with a pointer to those other commands, but often it isn’t; a person is left guessing what the full range of accepted arguments is.

My complaint that git is overly complex still stands. They’ve made progress, but still have a serious issue here. Part of it is the documentation, and part is the interface. I wanted to export to diffs all patches on the current branch in a repo. I asked on the git IRC channel, and someone suggested using the revision specifier ..HEAD. Nope, didn’t work. A few other git experts chimed in, and none could come up with the correct recipe. I finally used -500, which worked but is hackish.
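
For reference, the two recipes side by side:

    git format-patch ..HEAD   # the channel's suggestion; didn't work here
    git format-patch -500     # my workaround: export the last 500 commits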

git’s lack of even offering support for a human to indicate renames also bothers me, though trustworthy people have assured me that it doesn’t generally cause a problem in practice.

git does have nicer intra-repo branching than Mercurial does, for the moment. But the Mercurial folks are working on that anyway, and branching to new directories still works fine for me.

But in general, git’s philosophy is to make things easy for the upstream maintainer, and doesn’t spend much effort making things easy for contributors (except to make it mildly easier to contribute to a large project like Linux). Most of my software doesn’t have a large developer community, and I want to make it as easy as possible for new developers to join in and participate. git still utterly fails on that.

I tried bzr again. It seems that every time I try it, after just a few minutes, I am repulsed. This time, I stopped when I realized that bzr doesn’t support tags and has no support for emailing changesets whatsoever. As someone that has really liked darcs send (and even used tags way back with CVS!), this is alarming. The tutorial on the bzr website referenced a command “bzr help topics”, which does not work.

So I’ll stick with my mercurial and darcs combination for now.

I announced the first version of an hg send extension yesterday as well. I think Mercurial is very close to having a working equivalent to darcs send.