All posts by John Goerzen

Mercurial & Git

About two weeks ago, I wrote about my thoughts on Mercurial and how I was switching to it from Darcs.

At the time, I had skipped Git because of its lack of Windows support. Some contributors to the Free Software I write use Windows, and that seemed a pretty big flaw.

But I recently discovered git-svn and git-svnimport, both of which look like great tools for working with our friends using svn that haven’t yet gotten ahold of the DVCS light. Then I noticed that Git has a CVS server emulation tool, which means that Windows users can use TortoiseCVS to interact with it. Nice.

I spent some time today learning Git. This was a lot easier having already learned Mercurial. Git and Mercurial take very similar approaches to a number of things, but the Mercurial documentation explains all this far better than the Git documentation does.

I’m going to have to try both of them out more and see what I think. But git-svn (which is bi-directional) certainly looks like a very nice thing.

Neither of them has something as nice as darcs send, though.

Farm Living Update

Well, we’ve been back in the country for about 2 months now. I figure it’s about time to write about what’s been going on lately around here.

The big controversy is about the county jail. Apparently the county is sharply divided about this. People are angry. Profanity has been uttered at county commission meetings. Some people want to build a new, larger county jail because the current jail has been overcrowded for years. Some don’t see any problem with the current situation. Others want to close our county jail entirely and pay other counties to house our prisoners, saying that we usually have fewer than 6 prisoners total.

Yes, in all seriousness, the county is all abuzz about our jail where a population of 6 means overcrowding.

The weather has been getting warmer and that means an increase in traffic. Today I met two cars on the roads near our house — one in the morning and one in the afternoon. That’s a new single-day record. Usually I don’t meet that many vehicles in a week.

And our local high school boys’ basketball team made it to the state tournament for the first time since the late 80s. That was quite something. It’s probably been years since our school had one of their games broadcast live on the radio. And probably about that long since any local business bothered to advertise on the radio. It even got mentioned in a sermon at church. (They took 4th in the state — congrats!)

I also have prepared this helpful chart for you explaining a few differences about living out here.

Item | City | Country
Check this before leaving home | Traffic report, so you can avoid the big 5-car pileup on the Interstate | Weather report, so you can avoid the roads that are impassable if it rained last night
You might comment on this when you get home in the evening | Three of the cars in the daily 5-car pileup were on fire and there was gas on the roadway and helicopters everywhere and you drove right past it | Someone drove down our road at night
Always yield to… | Trains, school buses | Escaped cows and those trying to catch them
Neighbors will be mad if… | You are blaring loud music at 3AM Monday night | You notice the gate to their pasture is open and you don’t tell them
Neighbors will not notice if… | A car drives by at night | You are blaring loud music, anytime
Minor everyday dangers | Maniac drivers, drug dealers, Taco Bell | Cow pies, electric fences, thistle infestations
Seasonal events that prolong commute time | Indianapolis 500 | Harvest
Bank tellers ask you… | What your account number is, and could you give them a photo ID with that | How your remodel has been coming
Bank presidents… | Never speak to you | Ask about your brothers
Distance from house to mailbox… | 50 feet or less | 1 mile or less
Your car is sporty if… | It can do 0-60 in a respectable time | It can do 0-60, then slow back to a stop, before leaving your driveway
Power flickers during | Hurricanes and tornadoes | Wind
Water meters read by | Computer or city employees | Yourself; you write the reading on your payment stub each month, if you are lucky enough to qualify for a water service
Free meals attainable by… | Using a 2-for-1 coupon | Attending the annual business meeting for your electric company
A good time for fundraising is… | End of year so people can get a tax deduction on that year’s taxes | Just after harvest
Fundraising benchmarks include… | We have less than the price of a new house to raise! | We have less than the price of a new combine to raise!

Want to try living in vim

I’ve been an Emacs user for many years, though of course I know some vi and vim commands out of necessity.

I want to try taking the plunge by spending a month using vim only, no Emacs.

Sadly the vim documentation isn’t very helpful for me in a number of areas. I’m hoping someone can point me to some resources or recipes that will help with:

  • Turning off that stupid “hide most of the Debian changelog” thing. I have no idea why it does that or how to make it stop.
  • Turning autoindent, syntax highlighting, etc. on or off in various languages (really, I want to set global defaults for all of them)
  • Editing another file without closing or saving the first (:e doesn’t seem to do what I want)
  • Integrating it with Mercurial and Darcs

Re-Examining Darcs & Mercurial

I recently wrote an article or two about distributed version control systems.

I’ve been using Darcs since 2005. I switched to Darcs, in fact, 10 days after the simultaneous founding announcements of git and Mercurial.

Overall, I have been happy. I continue to believe that it is the most distributed of the distributed VCSs, which is a Good Thing.

However, I have lately started having trouble with Darcs hanging while working on my Debian packages. My post to the Darcs user list drew out a few other people with this problem, which is a design flaw of Darcs.

So I revisited the VCS landscape. I re-examined git, Mercurial, and bzr. I eventually decided to give Mercurial a try. I avoided git because I write some code that is portable to Windows, and git isn’t (or isn’t very well). Also, git is complex for me to pick up, and I certainly don’t want to force something complex onto my contributors. bzr seemed to still have some strange behaviors that it’s had for a while, and I couldn’t find even one advantage of it over Mercurial. So off I went with Mercurial.

I quickly learned of a philosophical difference between Darcs and Mercurial.

Darcs avoids conflicts at all costs. Mercurial makes handling conflict easy and, in many cases, automatic.

It is exactly this Darcs behavior that both permits its excellent “darcs send” feature (still unmatched in any other VCS) and causes its hang problems.

I found Mercurial quite pleasant to work with, and *fast*. It seems to be edging out git in speed tests sometimes these days.

It is easy to get started with Mercurial. The mq system — similar to quilt or other patch-management programs — is really quite an amazing hybrid between patch management and version control. I frankly don’t see any need for other patch-management tools anymore.

Mercurial has a “patchbomb” feature where you can select a range of changesets to send off, and it will generate nice emails with one changeset per email, and send them to your selected destination, optionally with an introductory message. The normal way of interacting with other Mercurial users is via the hg export/import commands, which send around simple unified diffs plus some additional header information, optionally in the git extended diff format.

I am happy with Mercurial and am in the process of converting my Debian repositories from Darcs to Mercurial. I’m going to keep my personal code in Darcs for the moment because “darcs send” is still easier than “hg email”, but that may change before long, depending on how my experience goes.

I’d encourage others to give Mercurial a try. The community is also very nice and helpful.

I have contributed patches to Tailor so that it makes exact copies of Darcs repos in Mercurial; those patches are now in Tailor’s Darcs repo. There is also a thread on the Mercurial list with some of my initial questions/concerns coming from a Darcs perspective.

A better environment for shell scripting

Shell scripts are good for a lot of things. It’s quick and easy to design shell scripts that take input from one program, pass it to another program, munge it for filenames, etc.

But there are a few drawbacks to shell scripts.

The biggest drawback, in my opinion, is that it is extremely difficult to get quoting and escaping right. I often see things like an unquoted $@ in shell scripts (which breaks if a parameter has a space in it). I also see people failing to check for errors properly (set -e helps with that). It’s also difficult to do a more modern style of exception handling (do a sequence of actions in a temporary directory, and always remove that directory, even if there’s an error, but stop processing and propagate the error). Command-line parsing is esoteric and odd, even with getopt. That’s not to say that it’s impossible to make a secure shell script that handles filenames with spaces in them properly. It’s just difficult, and it makes common operators like backticks hard to use safely.
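The temporary-directory pattern described above is short in a language with exceptions. Here is a minimal sketch in Haskell using only Control.Exception, System.Directory, and System.FilePath. The helper name withScratchDir and the directory name "scratch-demo" are made up for this illustration, and a real script would want a unique directory name.

```haskell
import Control.Exception (ErrorCall (..), bracket, throwIO, try)
import System.Directory
  ( createDirectory
  , doesDirectoryExist
  , getTemporaryDirectory
  , removeDirectoryRecursive
  )
import System.FilePath ((</>))

-- Hypothetical helper: create a scratch directory under the system temp
-- directory, run the action with its path, and always remove it afterwards.
-- If the action throws, the directory is removed and the exception is
-- re-thrown to the caller.
withScratchDir :: String -> (FilePath -> IO a) -> IO a
withScratchDir name action = do
  tmp <- getTemporaryDirectory
  let dir = tmp </> name
  bracket (createDirectory dir >> return dir) removeDirectoryRecursive action

main :: IO ()
main = do
  -- The second step fails on purpose to show that cleanup still happens.
  result <- try (withScratchDir "scratch-demo" $ \dir -> do
              writeFile (dir </> "work.txt") "intermediate data\n"
              throwIO (ErrorCall "step 2 failed")) :: IO (Either ErrorCall ())
  print result
  -- Check that the scratch directory is gone even though the action failed.
  tmp <- getTemporaryDirectory
  doesDirectoryExist (tmp </> "scratch-demo") >>= print
```

The error still reaches the caller (here it is caught by try only so the program can report it), and the directory is removed either way.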

A while back, I toyed with the idea of making Haskell a shell scripting language. This week, I spent some time to make this a reality. I released HSH, a shell scripting environment for Haskell.

HSH makes it easy to run shell commands, set up pipelines, etc. straight from Haskell. You can either use simple strings to invoke commands (they’ll be passed to sh -c), or you can specify arguments as a list (like exec…() takes), which eliminates the strange filename problems.
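To see why the list form avoids filename trouble, here is a small sketch. It does not use HSH; it uses the standard System.Process module instead, and it assumes a Unix-like system with sh and printf available. The function names viaShell and viaList are made up for this example.

```haskell
import System.Process (readCreateProcess, readProcess, shell)

-- String form: the command text goes through the shell, so an unquoted
-- name containing a space is split into separate arguments.
viaShell :: String -> IO [String]
viaShell name =
  lines <$> readCreateProcess (shell ("printf '%s\\n' " ++ name)) ""

-- List form: each list element reaches the program as exactly one
-- argument, whatever characters it contains.
viaList :: String -> IO [String]
viaList name = lines <$> readProcess "printf" ["%s\\n", name] ""

main :: IO ()
main = do
  let name = "my file.txt" -- a filename containing a space
  viaShell name >>= print
  viaList name >>= print
```

On a typical Unix system the first call reports two arguments ("my" and "file.txt") and the second reports one ("my file.txt").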

But the really cool thing is that HSH doesn’t just let you pipe from one external program to another. It also lets you pipe to/from pure Haskell functions. Yes, you can pipe the output of ls -l straight into a Haskell version of grep. I’ve found it to be very nice, especially for more complex processing tasks.

I put these simple examples on the HSH homepage:

run $ "echo /etc/pass*" :: IO String
 -> "/etc/passwd /etc/passwd-"

runIO $ "ls -l" -|- "wc -l"
 -> 12

runIO $ "ls -l" -|- wcL
 -> 12

In this example, wcL is a pure-Haskell line-counting function.
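For a sense of how small such a function can be, here is a sketch of a pure line counter. It assumes the function receives its input as a list of lines and returns its output as a list of lines; the name countLines is a stand-in, and the definition of wcL that ships with HSH may differ.

```haskell
-- Hypothetical stand-in for a line counter such as wcL: input arrives
-- as a list of lines, output is returned as a list of lines.
countLines :: [String] -> [String]
countLines input = [show (length input)]

-- Wire the pure function to stdin/stdout so it can sit in an ordinary
-- shell pipeline, e.g. `ls -l | runghc CountLines.hs`.
main :: IO ()
main = interact (unlines . countLines . lines)
```

The function itself does no I/O, so it can be tested like any other pure function.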

The results were surprising. According to SLOCCount, porting hg-buildpackage from a shell script to an HSH script achieved a 20% reduction in source lines of code. At the same time, it gained better error handling, safer handling of filenames, better type safety (compile-time type checking), etc. Yet it does exactly the same thing in almost exactly the same way.

Even greater savings are possible. I decided to reimplement a small part of sed just for fun, and that code is still in my tree. If I removed that and replaced it with a call to sed as in the shell version, that would probably buy another 5% savings.

I didn’t really expect to achieve a reduction in lines of code. I thought that I’d be lucky to come close to breaking even. After all, who’d expect something other than the shell to be better at shell scripting?

I don’t know if these results are generalizable, but I’m really excited about it.

Rebase Considered Harmful

Today I was musing about different version control systems and merge algorithms. I’ve been thinking specifically about how I maintain Debian packages in Darcs. I tend to import upstream tarballs into one branch, and maintain the Debian packages in another, simply merging when a new upstream is released.

Now, there seem to be two prevailing philosophies on how to handle merges in this case. I’m thinking here about merges back to upstream. Say I want to contribute my Debian patches to them.

  1. Commit “clean” patches upstream. Don’t have a bunch of history — the fixing typos commits, the fixing bugs commits, or the merging to track new upstream releases. Just something like a series of diffs against the current head.
  2. Bring across the full history, warts and all, and keep it around permanently.

git encourages option 1, with its rebase option. Darcs encourages option 2 (though some use its amend-record option to work more like option 1).

As I got to thinking about it, it occurred to me that git-rebase would be very nice if you are going to use philosophy 1. In short, rebase will remove your local patches from a repo, update it to the latest upstream, then re-apply your local changesets, aborting to have you fix any conflicts. This is as opposed to a more traditional merge, where you add the upstream changesets to your local branch and then commit new changesets to resolve conflicts. (So a rebase would be totally useless in situation 2.)

I got to thinking about this, and started wondering what would happen to people that I’m working with that in turn work off my branches. And sure enough, the git-rebase manpage says, “When you rebase a branch, you are changing its history in a way that will cause problems for anyone who already has a copy of the branch in their repository and tries to pull updates from you.”

I maintain, therefore, that git-rebase is evil and should be avoided. It only works for a situation where someone maintains a private branch of a project, never shared in any way except to submit patches to an upstream. Forget it if you have a team maintaining that branch, or want to post that branch online for others to help with (as I do with my Debian darcs package). Even if you keep it private now, do you really want to adopt a work process that forces you to keep it private forever, or else completely change how you work?

And this brings me back to the original question of patch philosophy. Personally, I dislike philosophy 1. I’d much rather have the full history of a change, warts and all. Look at the Linux kernel example: changesets that introduced bugs that made it into the official tree have their fixes documented, but changesets that introduced bugs that were fixed before being merged into the official tree could be lost to the public due to rebasing by submitters. Is that really what we want? I don’t think so.

With Darcs, tagging is very cheap and it is quite trivial to write an “apply a changeset bundle” script that makes a before tag, applies a series of patches, and makes an after tag. One could then run a darcs diff between the two tags to see the net effect on the repository, or could still look at the individual patches. (Or, you can avoid tagging and manually specify the “from” and “to” patches.) I find that a much better model: you can have it both ways. I’d think that most modern VCSs ought to support some variant on that, too.

And I think that git-rebase should be removed on the grounds that it encourages poor version tracking practices.

Haskell Time Travel

There is something very cool about a language in which the easiest, most direct way to explain how it solves a problem is to say, “When we pass the output of [this function] into the input for the oracle we are actually sending the data backwards in time. So when [the code] queries the oracle we get a result from the future.”

Sweet.

The story goes on to say, however, “Time travel is a very dangerous business. One false move and you can create a temporal paradox that will destroy the universe (which in this case means that the computation will diverge). When programming with values from the future, it is important never, never, to do anything with the values that might change the future. This is the temporal prime directive.”
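The quoted article’s own code is not reproduced here. As a separate and much smaller illustration of the same lazy-evaluation trick, here is a sketch in which a value that is only known at the end of a traversal is used during that traversal. The function name replaceWithMax is made up for this example.

```haskell
-- Replace every element with the list's maximum in a single traversal.
replaceWithMax :: [Int] -> [Int]
replaceWithMax xs = result
  where
    -- 'm' is only known once the traversal finishes, yet 'go' already
    -- places it in the output while the traversal is running.
    (result, m) = go xs

    go :: [Int] -> ([Int], Int)
    go []       = ([], minBound)
    go (y : ys) =
      let (rest, best) = go ys -- lazy pattern: nothing is forced yet
      in  (m : rest, max y best)

main :: IO ()
main = print (replaceWithMax [3, 1, 4, 1, 5])
```

This works because go only stores m in the output and never inspects it. If go branched on the value of m while computing the maximum, the definition would depend on itself and the computation would diverge, which is the paradox the quote warns about.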

Dial Tone

Yesterday I went to activate phone service out at the farm. It got me to thinking a bit about how things change, and how they stay the same, too.

Before I go on, I’ll have to say that every one of my history books is in storage, so if I get some details wrong, it’s because I’m not remembering correctly.

Anyhow, phone service came to our community via an unusual route about 100 years ago. It wasn’t Bell/AT&T or some other company that brought it there, as it was most places. I’m sure they figured that a small, scattered rural community would cost too much to support. So the community organized, built, and supported the phone system themselves.

Even today, roads around here can be impassable after a good rain. I’m sure that, in the early 1900s, before heavy road-maintaining machinery, things were worse, and of course transportation was a lot slower then anyway. There were real problems: getting the word out about funerals, being able to summon a doctor when necessary, or letting people know that church was cancelled because of too much snow.

People in the community saw a phone system as a real need. So did the churches, which have left a legacy that is still reflected in phone company territories today.

Once phone service arrived, it was used for all the things that people expected, of course. But it also proved to be an important part of the social fabric of the community. Since party lines were the norm, it was possible to announce things to every listening subscriber pretty quickly. Older people remember announcements of fresh fruit arriving at the grocery store, funerals, or other news of the day.

To place a call, you would pick up your phone and turn your crank. That caused a bell to ring at the telephone office, which everyone called “Central.” The operator would connect to your line and ask whom you wanted to talk to. The operator would then send the distinctive ring for your party down their party line, and patch — manually — your call through to them. And, if he was busy, the operator wouldn’t listen in on your conversation — but others on the party line very well might.

Central’s hours were published. If you were making a call in the middle of the night, you were going to wake up someone at Central to do it — plus everyone on the entire party line. So calls after hours were rare.

Fortunately, while some of the old Central operators were still around, some people in the community wrote down some of their stories.

There were some people in the community that were notorious for eavesdropping on other people’s conversations. Two brothers one time figured that they knew somebody was listening to their conversations, so they devised a code. One called the other, and said, “I’ll be going to McPherson in the morning for band practice.” That meant something along the lines of going to town to buy groceries.

A few days later, their prime suspect came up to one of them and said, “What on earth are you going to band practice for? I didn’t know you knew how to play an instrument!” Apparently she realized she’d been had when he burst out laughing uncontrollably.

The Central operators learned to know the habits of telephone users. Sometimes they would connect calls without even bothering to ask who people wanted to talk to — and seemed to always get it right.

The phone system supported itself for about 50 years. But as the rest of the world moved on to provide direct dialing, this proved a controversial subject in the community. People liked having their operators. The people that worked at Central were everybody’s friend. They were people that were there, 24 hours a day, to assist with any emergency. They would gather volunteer firefighters to help fight a fire, or be able to spread community news quickly. This wouldn’t be available with the newer phone systems. How would the community be informed of events quickly now? Who would just happen to know whose house the doctor was at when he was urgently needed?

The change was resisted for some years, but eventually the finances of the telephone cooperative turned out to be in deep trouble. Operators grew to be much more expensive than automation, and in the late 1960s, the telephone cooperative was no more — sold to a phone company from a small town more than twice our size, and a for-profit company at that! Central no longer existed. I remember reading about this event — it seems people were sad about that for quite some time. They felt that they had really lost an important part of the community when Central went away. Some machine locked in a cabinet doesn’t care for people the way Central did. Even today, the older people in the community sound a little sad when they remember telephone modernization, and get the wistful look of somebody that has just remembered something that they miss.

The phone company that bought the system wasn’t an AT&T, though. It was a small, independent phone company. To this day, that phone company serves only the two communities. And it was this company that I called yesterday to establish service out at our house.

They had already upgraded our lines — over a mile of new copper, benefiting only us, at no charge to us — last fall. The box was already on the outside of the house. Just need to get it activated.

So I called the phone company. They said I needed to drop by their office and sign some papers. Uh-oh, I think — this is a bad sign. Sounds like a bunch of phone company bureaucracy.

But not so much. I went to the office and signed up. They asked the usual questions: name, address. Plus a few that bigger companies wouldn’t ask: who used to have service at that address? Of course, most people would know that answer in our community. I couldn’t have told you in Wichita, Dallas, or Indianapolis.

Then they asked when I’d like service to be activated. “As soon as possible,” I say, figuring that this would be a couple of weeks like it is with AT&T or Sprint. “Well, we probably can’t get out there for a couple of hours. Would it be OK if it’s on at about 3?” Yes, that would be fine!

Now, how about DSL? “Well, we’re a little backed up on that right now.” Uh-oh. Sprint took several weeks when they *weren’t* more backed up than usual. “So it’ll probably be Monday or Tuesday before we can get out there. Should I just have the installer call you and arrange a time when it gets closer?” Yes, that would be fine, too!

Now, how about finding a phone number.

Out comes a large paper book. Yep, paper. They paged through it, and told me that my grandpa’s old number would be available if I wanted it. I said yes — after all, we’ve got his old address, so might as well keep the same phone number. OK, no problem. She whips out some white-out, whites out grandpa’s name, and writes ours in. Done.

Now, do I want any optional services? Caller ID, call waiting, voicemail? How much is caller ID, I ask. $5 a month. We’ll try it for now. “OK”. A box was checked on the form and that was that. No high-pressure sales pitch on taking “the works” for some poorly-disclosed price, providing a ton of services I’ll never use and don’t want. No confusing “discounts” for having The Works and DSL at the same time.

Then I ask about an unlisted number, or at least an unlisted address. I figured that anybody that really wanted to be able to reach us will figure out how without using a phone book, and these things get in so many databases these days. Sprint charged almost $10/mo for a fully unlisted number, but only a few dollars a month to just keep our address off the directories.

Our new company charged 50 cents a month for a fully unlisted number. Done.

Now it’s time to pay for the first month’s fees and the setup. Oops, I’ve forgotten my checkbook in the car. No problem, the secretary says, I’ll watch your baby while you go get it! Jacob was with me, but had fallen asleep, so I had brought him inside in his car seat. I went to get the checkbook, just out the door and close by. I was back a few seconds later, and the secretary was already on the other side of her desk talking and playing with Jacob. “My baby’s 12 now,” she said, and for a second, looked like a person that was remembering Central.