Monthly Archives: March 2007

Some more git, mercurial, and darcs

Ted Ts’o had an interesting post about git recently. He has a lot of good thoughts on the subject. He comments that he wound up using git because it’s so Unixy (with its small commands to do things), that he sees the git community developing innovations faster than Mercurial, and that they are working to improve the documentation and user interface problems.

Being so Unixy is a double-edged sword. On the one hand, it can make it easy to write shell scripts to extend Git. That itself can be a double-edged sword (think filename quoting and the like). But one doesn’t have to use the shell. The other downside is that being Unixy makes it hard to run on platforms that aren’t, such as Windows. So if one is working on Unix-only software (X, the kernel, e2fsprogs, etc.), there’s no need to care about it. But if you’re a person like me, who has Windows users using my software, or a large organization like Mozilla, it may be a showstopper. Of course, workarounds exist (cygwin, git-cvsserver), but none of them are particularly nice.

I think that both Git and Mercurial are working to address their shortcomings. I’ve chosen hg for now because it does what I need now, and because there are very nice tools to convert hg to git, and vice-versa. So if Ted’s right, and a year from now git is easier to use, better documented, more featureful, and runs well on Windows, it won’t be that hard to switch over and preserve history. Ted’s the sort of person that usually is right, so maybe I should start looking at hg2git right now.

So following up on my bzr post, here are the things that Mercurial is great at right now:

  1. Performance. Approximately even with git, occasionally faster. Nobody else can compete with these two right now.
  2. Simplicity. It’s almost as easy to get started as with darcs, and with recent patches, will be even closer in the future.
  3. Lots of ways to interact. You can send hg bundles, which preserve all metadata (parents, hash, authors, etc), or you can send git-format email patches, or you can push and pull between repos. The email tools will shortly be able to automatically detect what patches to send. Your choice. git doesn’t seem to support lossless emailing of bundles like this, and bzr doesn’t make emailing of anything easy by default.
  4. Merging. hg seems to be able to automatically resolve more merge conflicts than anything else, and when it can’t automatically resolve them, has a nicely configurable system to let you use your choice of tool to manually resolve them.
  5. Community. The Mercurial community is open and inviting, and open to new/different ideas. It seems similar to Darcs in that respect, and somewhat dissimilar to git.
  6. Rebase does not trash history like it does (barring undocumented manual intervention) in git.

I’ve written before about Darcs, so I won’t duplicate that here.

bzr, again

I’ve talked a lot lately about different VCSs.

I got some interesting comments in reply to my most recent post. One person took issue with my complaint that nobody really understood how to specify a revision to git format-patch, and proceeded to issue an incorrect suggestion. And a couple of people complained about my comments about bzr, which generally came down to this: the released version of bzr didn’t have anything compelling and also didn’t support tags.

So I went into , asked them what bzr has that git, Mercurial, and darcs don’t. And gave bzr the benefit of the doubt that 0.15 will be out soon and will be stable. What I got back were these general items:

  1. Renaming of directories (not in hg, git)
  2. 2-way sync with Subversion (not in hg, darcs)
  3. Checkouts (not in any others by default)
  4. No server-side push requirement

Let’s look at these in more detail.

1. Renaming of directories

All of them can rename files and (excepting git) completely accurately track back the file’s history. But consider this: if person A commits a change to branch A that adds a file, and person B then renames the directory that the file is in on his branch, will a merge cause person A’s file to appear in the new directory name? In darcs and bzr, yes. In Mercurial and git, no.
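To make the scenario concrete, here is a toy model in Haskell. It is only a sketch of the two possible outcomes, not a description of how any of these tools work internally; the Change type and all function names are made up for illustration.

```haskell
import Data.List (isPrefixOf)

-- A change made on one branch, in a deliberately tiny (hypothetical) model.
data Change
  = AddFile FilePath             -- a new file at this path
  | RenameDir FilePath FilePath  -- a directory renamed from old to new

-- Rewrite a path according to a directory rename, if it lies under the old directory.
applyRename :: FilePath -> FilePath -> FilePath -> FilePath
applyRename old new path
  | (old ++ "/") `isPrefixOf` path = new ++ drop (length old) path
  | otherwise = path

-- A merge that tracks directory renames: files added on one branch
-- follow the renames made on the other.
mergeRenameAware :: [Change] -> [Change] -> [FilePath]
mergeRenameAware ours theirs =
  [ foldl (\p (o, n) -> applyRename o n p) path renames | AddFile path <- ours ]
  where
    renames = [ (o, n) | RenameDir o n <- theirs ]

-- A merge that only knows about paths: added files stay where they were created.
mergePathOnly :: [Change] -> [Change] -> [FilePath]
mergePathOnly ours _ = [ path | AddFile path <- ours ]
```

With person A’s change as [AddFile "src/new.c"] and person B’s as [RenameDir "src" "lib"], mergeRenameAware puts the file at lib/new.c, while mergePathOnly leaves it at src/new.c.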

So yes, this is a nice thing. But I have never actually had this situation crop up in practice, and even if it did, it could be trivially remedied. I would say that for me, I don’t really care.

[Update: Current stable releases of Mercurial can do this too. I’m not quite sure how, but it does work. So git is the only one that can’t do that.]

2. 2-way sync with Subversion

This is a really nice feature and is present in both git and bzr. I haven’t tested it in either place, but if it works as advertised, and properly supports tracking multiple related svn branches and merges, it would be slick. That was enough to make me consider using git, but in retrospect, I so rarely interact with people using svn that it is not that big a deal to me.

Still, for those that have to work with svn users, this feature in bzr and git could be a big one.

Better yet would be to get all those svn holdouts over to DVCS.

3. Checkouts

A bzr checkout is basically a way to make local commits be pushed to the remote repo immediately, as with svn. This is of no utility to me, though I can see some may have a use for it. But it can be done with hg hooks and probably approximated with scripting in others.

4. No server-side process necessary for pushing repos

bzr has built-in support to push to a server that has sftp only, and doesn’t require a copy of itself on the server. While I believe that none of the other three have that, it is possible to rsync (and probably ftp) darcs and Mercurial repos to a server in a safe fashion by moving repo files in a defined order. Probably also possible with git. All four can pull repos using nothing but regular HTTP.

What bzr still doesn’t have

Integrated patch emailing. The big thing is that it has no built-in support for emailing patches. darcs is extremely strong in this area, followed by hg, and git is probably third. “darcs send” is all it takes to have darcs look at the remote repo, figure out what you have that they don’t, and e-mail a bundle of changesets to them. I posted an extension and later a patchset that does all this for Mercurial except for automatically figuring out what default email address to use (that’ll come in a few days, I think). One feature Mercurial has had for a while that Darcs hasn’t is sending multiple textual diffs as a thread, with one message per changeset. bzr doesn’t have any support for emailing patches yet, which is disappointing. Because of the strong support for this in darcs and Mercurial, people running those systems feel less of a need to publish their repos.

[Update: There is a plugin for bzr that seems to address some of this. I haven’t tested it, and it’s not in bzr core (so it doesn’t quite meet my “very friendly for a newbie” requirement), but it does exist, though it is apparently not as advanced as Mercurial’s.]

Performance. 0.15 is supposed to be better on this, but even if bzr achieves the claimed doubling of performance, most benchmarks I have seen would rate it as still being significantly behind git and Mercurial, though it may overtake darcs in some tests.

Extensive documentation. I would say that bzr’s docs are better in some ways than git’s (its tutorials especially), but lack depth. If you want to know some detail about how the repository works on-disk, it’s not really documented. Darcs still has David’s excellent manual, and Mercurial has the hg book which is still great as well.

Merging not as advanced. darcs is pretty obviously way on top here, but of the others, Mercurial does a pretty good job with its automatic handling of renames and automatic resolving of different branches that commit the same change (even if that same change is a rename, or an add of the same content). bzr can’t resolve as much automatically.

Summary

Well, I’ll say that bzr still doesn’t look compelling enough for my use cases to use, and the lack of an easy-for-a-newbie-to-use automated email submission feature is a pretty big disappointment. Though I did appreciate the time those on spent with me, and if I needed to sync with svn users frequently, I’d probably choose bzr over git.

For now, I’m happy with sticking with darcs for my code and hg for my Debian work.

But all four communities are aggressively working on their weaknesses, and this landscape may look very different in a year.

More on Git, Mercurial, and Bzr

I’ve been writing a lot about this lately, I know, but it’s an interesting landscape.

I had previously discarded git, but in light of git-cvsserver (which provides a plausible way for Windows people to participate), I gave it a try.

The first thing I noticed is that git documentation, in general, is really poor. Some tutorials that claim to cover git actually cover cogito. Still others use commands that are much more complex than those in the current git, and these are just the ones linked to from the git homepage.

git’s manpages aren’t much better. There are quite a few git commands (such as log) that take arguments that other git commands accept. Sometimes this fact is documented with a pointer to these other commands, but often not; a person is left guessing what the full range of accepted arguments are.

My complaint that git is overly complex still exists. They’ve made progress, but still have a serious issue here. Part is because of the documentation, and part is because of the interface. I wanted to export to diffs all patches on the current branch in a repo. I asked on , and someone suggested using the revision specifier ..HEAD. Nope, didn’t work. A few other git experts chimed in, and none could come up with the correct recipe. I finally used -500, which worked but is hackish.

git’s lack of even offering support for a human to indicate renames also bothers me, though trustworthy people have assured me that it doesn’t generally cause a problem in practice.

git does have nicer intra-repo branching than Mercurial does, for the moment. But the Mercurial folks are working on that anyway, and branching to new directories still works fine for me.

But in general, git’s philosophy is to make things easy for the upstream maintainer, and doesn’t spend much effort making things easy for contributors (except to make it mildly easier to contribute to a large project like Linux). Most of my software doesn’t have a large developer community, and I want to make it as easy as possible for new developers to join in and participate. git still utterly fails on that.

I tried bzr again. It seems that every time I try it, after just a few minutes, I am repulsed. This time, I stopped when I realized that bzr doesn’t support tags and has no support for emailing changesets whatsoever. As someone that has really liked darcs send (and even used tags way back with CVS!), this is alarming. The tutorial on the bzr website referenced a command “bzr help topics”, which does not work.

So I’ll stick with my mercurial and darcs combination for now.

I announced the first version of a hg send extension yesterday as well. I think Mercurial is very close to having a working equivalent to darcs send.

Mercurial & Git

About two weeks ago, I wrote about my thoughts on Mercurial and how I was switching to it from Darcs.

At the time, I had skipped Git because of its lack of Windows support. I have some contributors to pieces of Free Software that I write that use Windows, and that seemed a pretty big flaw.

But I recently discovered git-svn and git-svnimport, both of which look like great tools for working with our friends using svn that haven’t yet gotten ahold of the DVCS light. Then I noticed that Git has a CVS server emulation tool, which means that Windows users can use TortoiseCVS to interact with it. Nice.

I spent some time today learning Git. This was a lot easier having already learned Mercurial. Git and Mercurial have very similar philosophies to a number of things, but the Mercurial documentation explains all this far better than the Git documentation does.

I’m going to have to try both of them out more and see what I think. But git-svn (which is bi-directional) certainly looks like a very nice thing.

Neither of them has something as nice as darcs send, though.

Farm Living Update

Well, we’ve been back in the country for about 2 months now. I figure it’s about time to write about what’s been going on lately around here.

The big controversy is about the county jail. Apparently the county is sharply divided about this. People are angry. Profanity has been uttered at county commission meetings. Some people want to build a new, larger county jail because the current jail has been overcrowded for years. Some don’t see any problem with the current situation. Others want to close our county jail entirely and pay other counties to house our prisoners, saying that we usually have less than 6 prisoners total.

Yes, in all seriousness, the county is all abuzz about our jail where a population of 6 means overcrowding.

The weather has been getting warmer and that means an increase in traffic. Today I met two cars on the roads near our house — one in the morning and one in the afternoon. That’s a new single-day record. Usually I don’t meet that many vehicles in a week.

And our local high school boys’ basketball team made it to the state tournament for the first time since the late 80s. That was quite something. It’s probably been years since our school had one of their games broadcast live on the radio. And probably about that long since any local business bothered to advertise on the radio. It even got mentioned in a sermon at church. (They took 4th in the state — congrats!)

I also have prepared this helpful chart for you explaining a few differences about living out here.

Item | City | Country
Check this before leaving home | Traffic report, so you can avoid the big 5-car pileup on the Interstate | Weather report, so you can avoid the roads that are impassible if it rained last night
You might comment on this when you get home in the evening | Three of the cars in the daily 5-car pileup were on fire and there was gas on the roadway and helicopters everywhere and you drove right past it | Someone drove down our road at night
Always yield to… | Trains, school buses | Escaped cows and those trying to catch them
Neighbors will be mad if… | You are blaring loud music at 3AM Monday night | You notice the gate to their pasture is open and you don’t tell them
Neighbors will not notice if… | A car drives by at night | You are blaring loud music, anytime
Minor everyday dangers | Maniac drivers, drug dealers, Taco Bell | Cow pies, electric fences, thistle infestations
Seasonal events that prolong commute time | Indianapolis 500 | Harvest
Bank tellers ask you… | What your account number is, and could you give them a photo ID with that | How your remodel has been coming
Bank presidents… | Never speak to you | Ask about your brothers
Distance from house to mailbox… | 50 feet or less | 1 mile or less
Your car is sporty if… | It can do 0-60 in a respectable time | It can do 0-60, then slow back to a stop, before leaving your driveway
Power flickers during | Hurricanes and tornados | Wind
Water meters read by | Computer or city employees | Yourself; you write the reading on your payment stub each month, if you are lucky enough to qualify for a water service
Free meals attainable by… | Using a 2-for-1 coupon | Attending the annual business meeting for your electric company
A good time for fundraising is… | End of year so people can get a tax deduction on that year’s taxes | Just after harvest
Fundraising benchmarks include… | We have less than the price of a new house to raise! | We have less than the price of a new combine to raise!

Want to try living in vim

I’ve been an Emacs user for many years, though of course I know some vi and vim commands out of necessity.

I want to try taking the plunge by spending a month using vim only, no Emacs.

Sadly the vim documentation isn’t very helpful for me in a number of areas. I’m hoping someone can point me to some resources or recipes that will help with:

  • Turning off that stupid “hide most of the Debian changelog” thing. I have no idea why it does that or how to make it stop.
  • Turn on or off autoindent, syntax highlighting, etc. in various languages (really, I want to set global defaults for all of them)
  • Be able to edit another file without closing or saving the first (:e doesn’t seem to do what I want)
  • Integrate it with Mercurial and Darcs

Re-Examining Darcs & Mercurial

I recently wrote an article or two about distributed version control systems.

I’ve been using Darcs since 2005. I switched to Darcs, in fact, 10 days after the simultaneous founding announcements of git and Mercurial.

Overall, I have been happy. I continue to believe that it is the most distributed of the distributed VCSs, which is a Good Thing.

However, I have lately started having trouble with Darcs hanging while working on my Debian packages. My post to the Darcs user list drew out a few other people with this problem, which is a design flaw of Darcs.

So I revisited the VCS landscape. I re-examined git, Mercurial, and bzr. I eventually decided to give Mercurial a try. I avoided git because I write some code that is portable to Windows, and git isn’t (or isn’t very well). Also, git is complex to pick up for me, and I certainly don’t want to force something complex onto my contributors. bzr seemed to still have some strange behaviors that it’s had for a while, and I couldn’t find even one advantage of it over Mercurial. So off I went with Mercurial.

I quickly learned a bit of a philosophical difference from Darcs to Mercurial.

Darcs avoids conflicts at all costs. Mercurial makes handling conflict easy and, in many cases, automatic.

It is exactly this Darcs behavior that both permits its excellent “darcs send” feature (still unmatched in any other VCS) and causes its hang problems.

I found Mercurial quite pleasant to work with, and *fast*. It seems to be edging out git in speed tests sometimes these days.

It is easy to get started with Mercurial. The mq system — similar to quilt or other patch-management programs — is really quite an amazing hybrid between patch management and version control. I frankly don’t see any need for other patch-management tools anymore.

Mercurial has a “patchbomb” feature where you can select a range of changesets to send off, and it will generate nice emails with one changeset per email, and send them to your selected destination, optionally with an introductory message. The normal way of interacting with other Mercurial users is via the hg export/import commands, which send around simple unified diffs plus some additional header information, optionally in the git extended diff format.

I am happy with Mercurial and am in the process of converting my Debian repositories from Darcs to Mercurial. I’m going to keep my personal code in Darcs for the moment because “darcs send” is still easier than “hg email”, but that may change before long, depending on how my experience goes.

I’d encourage others to give Mercurial a try. The community is also very nice and helpful.

I have contributed patches to Tailor to make it make exact copies of Darcs repos into Mercurial, which are now in its Darcs repo. There is also a thread on the Mercurial list with some of my initial questions/concerns coming from a Darcs perspective.

A better environment for shell scripting

Shell scripts are good for a lot of things. It’s quick and easy to design shell scripts that take input from one program, pass it to another program, munge it for filenames, etc.

But there are a few drawbacks to shell scripts.

The biggest drawback, in my opinion, is that it is extremely difficult to get quoting and escaping right. I often see things like an unquoted $@ in shell scripts (which breaks if a parameter has a space in it). I also see people failing to check for errors properly (set -e helps with that). It’s also difficult to do a more modern style of exception handling (do a sequence of actions in a temporary directory, and always remove that directory, even if there’s an error, but stop processing and propagate the error). Command-line parsing is esoteric and odd, even with getopt. That’s not to say that it’s impossible to make a secure shell script that handles filenames with spaces in them properly. It’s just that it’s difficult, and it makes using common operators like backticks difficult.
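As a point of comparison, the temporary-directory pattern described above can be sketched in Haskell with bracket from Control.Exception. This is a minimal sketch, assuming a POSIX system with a false command on the PATH; withScratchDir and cleanupDemo are hypothetical names, and the caller is assumed to supply a directory name that does not already exist.

```haskell
import Control.Exception (IOException, bracket, try)
import System.Directory
  ( createDirectory, doesDirectoryExist
  , getTemporaryDirectory, removeDirectoryRecursive )
import System.FilePath ((</>))
import System.Process (callProcess)

-- Run an action inside a scratch directory that is removed afterwards,
-- whether the action succeeds or throws. Any exception still propagates.
-- Assumption: the given name is unique; a real script would want a
-- randomly named directory instead.
withScratchDir :: String -> (FilePath -> IO a) -> IO a
withScratchDir name action = do
  tmp <- getTemporaryDirectory
  let dir = tmp </> name
  bracket (createDirectory dir >> pure dir) removeDirectoryRecursive action

-- Demonstration: the action fails (callProcess throws on a non-zero exit),
-- the error is caught here only so the outcome can be inspected, and the
-- scratch directory should be gone afterwards.
cleanupDemo :: IO Bool
cleanupDemo = do
  tmp <- getTemporaryDirectory
  let name = "scratch-dir-sketch"
  result <- try (withScratchDir name (\_ -> callProcess "false" []))
              :: IO (Either IOException ())
  stillThere <- doesDirectoryExist (tmp </> name)
  pure (either (const True) (const False) result && not stillThere)
```

cleanupDemo is expected to return True when the error was propagated and the directory was removed anyway.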

Awhile back, I toyed with the idea of making Haskell a shell scripting language. This week, I spent some time to make this a reality. I released HSH, a shell scripting environment for Haskell.

HSH makes it easy to run shell commands, set up pipelines, etc. straight from Haskell. You can either use simple strings to invoke commands (they’ll be passed to sh -c), or you can specify arguments as a list (like exec…() takes), which eliminates the strange filename problems.
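The difference between the two invocation styles can be sketched without HSH itself, using only System.Process from the process package. The function names here are hypothetical, and the sketch assumes a POSIX system with printf and sh available.

```haskell
import System.Process (readCreateProcess, readProcess, shell)

-- Pass arguments as a list: each element reaches the program as exactly
-- one argument, with no shell in between. printf prints one per line.
argsViaList :: [String] -> IO [String]
argsViaList args = lines <$> readProcess "printf" ("%s\\n" : args) ""

-- Build a single command string and hand it to the shell. This is
-- deliberately naive: the arguments are pasted in unquoted, so the
-- shell splits them on whitespace.
argsViaShell :: [String] -> IO [String]
argsViaShell args =
  lines <$> readCreateProcess (shell ("printf '%s\\n' " ++ unwords args)) ""
```

Given the single argument "my file.txt", argsViaList reports one argument, while argsViaShell reports two ("my" and "file.txt"), which is exactly the sort of filename problem the list form avoids.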

But the really cool thing is that HSH doesn’t just let you pipe from one external program to another. It also lets you pipe to/from pure Haskell functions. Yes, you can pipe the output of ls -l straight into a Haskell version of grep. I’ve found it to be very nice, especially for more complex processing tasks.

I put these simple examples on the HSH homepage:

run $ "echo /etc/pass*" :: IO String
 -> "/etc/passwd /etc/passwd-"

runIO $ "ls -l" -|- "wc -l"
 -> 12

runIO $ "ls -l" -|- wcL
 -> 12

In this example, wcL is a pure-Haskell line-counting function.
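For a rough idea of what such a function can look like, here is a sketch. It is not HSH’s actual definition of wcL; countLines and pipeThrough are hypothetical names, the [String] -> [String] shape is an assumption, and the plumbing uses readProcess from System.Process rather than HSH’s -|- operator.

```haskell
import System.Process (readProcess)

-- A pure line-counting function: takes the input lines and returns a
-- single output line holding the count.
countLines :: [String] -> [String]
countLines input = [show (length input)]

-- Feed the standard output of an external command through a pure,
-- line-based function and return the resulting text.
pipeThrough :: FilePath -> [String] -> ([String] -> [String]) -> IO String
pipeThrough cmd args f = unlines . f . lines <$> readProcess cmd args ""
```

For example, pipeThrough "ls" ["-l"] countLines plays the role of ls -l piped into wc -l, with the counting done in Haskell.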

The results were surprising. According to SLOCCount, porting hg-buildpackage from a shell script to an HSH script achieved a 20% reduction in source lines of code. At the same time, it gained better error handling, better safety of filenames, better type safety (compile-time type checking), etc. Yet it does exactly the same thing in almost exactly the same way.

Even greater savings are possible. I decided to reimplement a small part of sed just for fun, and that code is still in my tree. If I removed that and replaced it with a call to sed as in the shell version, that would probably buy another 5% savings.

I didn’t really expect to achieve a reduction in lines of code. I thought that I’d be lucky to come close to breaking even. After all, who’d expect something other than the shell to be better at shell scripting?

I don’t know if these results are generalizable, but I’m really excited about it.