Today I was musing about different version control systems and merge algorithms. I've been thinking specifically about how I maintain Debian packages in Darcs. I tend to import upstream tarballs into one branch, and maintain the Debian packages in another, simply merging when a new upstream is released.
Now, there seem to be two prevailing philosophies on how to handle merges in this case. I'm thinking here about merges back to upstream. Say I want to contribute my Debian patches to them.
- Commit "clean" patches upstream. Don't carry a bunch of history: the typo-fix commits, the bug-fix commits, or the merges to track new upstream releases. Just something like a series of diffs against the current head.
- Bring across the full history, warts and all, and keep it around permanently.
git encourages option #1, with its rebase command. Darcs encourages option #2 (though some use its amend-record command to work more like #1).
As I got to thinking about it, it occurred to me that git-rebase would be very nice if you are going to use philosophy #1. In short, rebase removes your local patches from a repo, updates it to the latest upstream, then re-applies your local changesets, stopping to have you fix any conflicts. This is as opposed to a more traditional merge, where you add the upstream changesets to your local branch and then commit new changesets to resolve conflicts. (So a rebase would be totally useless in situation #2.)
I got to thinking about this, and started wondering what would happen to the people I work with who in turn work off my branches. And sure enough, the git-rebase manpage says,
"When you rebase a branch, you are changing its history in a way that will cause problems for anyone who already has a copy of the branch in their repository and tries to pull updates from you."
I maintain, therefore, that git-rebase is evil and should be avoided. It only works for a situation where someone maintains a private branch of a project, never shared in any way except to submit patches to an upstream. Forget it if you have a team maintaining that branch, or want to post that branch online for others to help with (as I do with my Debian darcs package). Even if you keep it private now, do you really want to adopt a work process that forces you to keep it private forever, or else completely change how you work?
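A toy model makes the manpage's warning a bit more concrete. This is only a sketch in Python, not how git is actually implemented: the `Repo` class, `commit_id`, and the commit messages are all made up for illustration. The one assumption borrowed from git is that a commit's identity depends on its parents as well as its change. Under that assumption, a merge keeps my old commits in the history, while a rebase replaces them with new commits that merely carry the same changes.

```python
import hashlib


def commit_id(parents, change):
    """Toy identity: a hash of the parent ids plus the change (an assumption
    loosely modelled on git, not its real object format)."""
    payload = "|".join(parents) + "::" + change
    return hashlib.sha1(payload.encode()).hexdigest()[:8]


class Repo:
    """Hypothetical in-memory commit graph, for illustration only."""

    def __init__(self):
        self.parents = {}  # commit id -> tuple of parent ids
        self.changes = {}  # commit id -> description of the change

    def commit(self, parents, change):
        cid = commit_id(parents, change)
        self.parents[cid] = tuple(parents)
        self.changes[cid] = change
        return cid

    def ancestors(self, cid):
        """Return every commit reachable from cid, including cid itself."""
        seen, stack = set(), [cid]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return seen


def merge(repo, local_head, upstream_head):
    # A merge adds one new commit with two parents; old commits are untouched.
    return repo.commit([local_head, upstream_head], "merge upstream")


def rebase(repo, local_commits, upstream_head):
    # A rebase replays each local change on top of upstream, creating
    # brand-new commits because the parents differ.
    head = upstream_head
    for cid in local_commits:
        head = repo.commit([head], repo.changes[cid])
    return head


repo = Repo()
base = repo.commit([], "upstream 1.0")
upstream = repo.commit([base], "upstream 1.1")
l1 = repo.commit([base], "add debian/ directory")
l2 = repo.commit([l1], "fix typo in debian/control")

merged = merge(repo, l2, upstream)
rebased = rebase(repo, [l1, l2], upstream)

# A collaborator who already pulled l2 can still find it after a merge...
print(l2 in repo.ancestors(merged))   # True
# ...but after a rebase, their copy of l2 is no longer part of my history.
print(l2 in repo.ancestors(rebased))  # False
```

Anyone who based work on `l2` now holds commits that my rebased branch has never heard of, which is the situation the manpage is warning about.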
And this brings me back to the original question of patch philosophy. Personally, I dislike philosophy #1. I'd much rather have the full history of a change, warts and all. Look at the Linux kernel example: changesets that introduced bugs that made it into the official tree have their fixes documented, but changesets that introduced bugs that were fixed before being merged into the official tree could be lost to the public due to rebasing by submitters. Is that really what we want? I don't think so.
With Darcs, tagging is very cheap and it is quite trivial to write an "apply a changeset bundle" script that makes a before tag, applies a series of patches, and makes an after tag. One could then run a darcs diff between the two tags to see the net effect on the repository, or could still look at the individual patches. (Or, you can avoid tagging and manually specify the "from" and "to" patches.) I find that a much better model: you can have it both ways. I'd think that most modern VCSs ought to support some variant on that, too.
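Here is roughly what I have in mind for such a script, as a minimal Python sketch rather than a finished tool. The function names, the `before-`/`after-` tag naming scheme, and the bundle file names are my own inventions; the only darcs commands it relies on are `darcs tag`, `darcs apply`, and `darcs diff` with `--from-tag` and `--to-tag`. Note that darcs treats tag arguments as patterns, so a label containing regex characters may match more than you intend.

```python
import subprocess


def bundle_commands(bundles, label):
    """Build the darcs invocations for a tagged bundle application.

    The before-/after- tag names are a hypothetical convention, not
    anything darcs itself prescribes.
    """
    before, after = f"before-{label}", f"after-{label}"
    commands = [["darcs", "tag", before]]
    # Apply each patch bundle in the order given.
    commands += [["darcs", "apply", path] for path in bundles]
    commands.append(["darcs", "tag", after])
    # Show the net effect of the whole series between the two tags.
    commands.append(
        ["darcs", "diff", f"--from-tag={before}", f"--to-tag={after}"]
    )
    return commands


def apply_bundles(bundles, label, repo_dir=".", run=subprocess.run):
    """Run the commands inside repo_dir, stopping at the first failure."""
    for command in bundle_commands(bundles, label):
        run(command, cwd=repo_dir, check=True)


# Dry run: print what would be executed for two made-up bundle files.
for command in bundle_commands(["fix-a.dpatch", "fix-b.dpatch"], "upstream-sync"):
    print(" ".join(command))
```

Calling `apply_bundles` with the same arguments would actually run the commands in a darcs repository. The individual patches stay in the history, and the final diff gives the "clean" net view that philosophy #1 is after.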
And I think that git-rebase should be removed on the grounds that it encourages poor version tracking practices.