Git looks really nice, until….

February 24th, 2008

So I have been learning about Git this weekend. It has some really nice-looking features for sure — some things Mercurial doesn’t have.

I was getting interested in switching, until I found what I consider a big problem.

Many projects that use git require you to submit changes with git-format-patch rather than letting them pull from you. They don’t want your merge history.

git-format-patch, though, doesn’t preserve SHA1s, nor does it preserve merges.

Now, say we started from a common base where line 10 of file X said “hi”, I locally changed it to “foo”, upstream changed it to “bar”, and at merge time I decide that we were both wrong and change it to “baz”. I don’t want to lose the fact that I once had it at “foo”, in case it turns out later that really was the right decision.

When we track upstream changes, and submit with git format-patch, the canonical way to merge upstream appears to be:

git fetch; git rebase origin/master

Now, the problem with that is that it loses your original pre-conflict code in a case like this.
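
To make the scenario concrete, here is a minimal sandbox sketch. It assumes a one-line file X rather than line 10 of a longer file, and the repository names (upstream, mine), the commit messages, and the demo identity are all made up for illustration.

```shell
#!/bin/sh
# Sandbox sketch: all names and paths below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# "Upstream" repository with the common base: X says "hi".
git init -q upstream
cd upstream
git config user.email "demo@example.com"
git config user.name "Demo"
echo hi > X
git add X
git commit -q -m "base: hi"
trunk=$(git symbolic-ref --short HEAD)   # master or main, depending on local config

# My clone: I change X to "foo".
cd "$work"
git clone -q upstream mine
cd mine
git config user.email "demo@example.com"
git config user.name "Demo"
echo foo > X
git commit -q -am "local: foo"
foo_commit=$(git rev-parse HEAD)

# Meanwhile upstream changes X to "bar".
cd "$work/upstream"
echo bar > X
git commit -q -am "upstream: bar"

# Track upstream the format-patch way: fetch, then rebase.
cd "$work/mine"
git fetch -q origin
git rebase "origin/$trunk" >/dev/null 2>&1 || true   # stops on the conflict in X

# Resolve the conflict by deciding we were both wrong.
echo baz > X
git add X
GIT_EDITOR=true git rebase --continue >/dev/null

# Contents of X at every commit now in my branch: baz, bar, hi. No "foo".
for c in $(git rev-list HEAD); do git show "$c:X"; done

# The old commit object still exists, but it is no longer part of the branch.
git show "$foo_commit:X"
```

After the rebase, the branch history holds only hi, bar, and baz. The foo commit survives only as an object reachable through the reflog, until that entry expires and the object is garbage-collected.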

There appears to be no clean way around that whatsoever. I tried a separate “submission” branch that rebases a local development-with-merge branch, but it requires a ton of git rebase --skip during the rebase process.

Thoughts?

Categories: Programming

10 Comments

  1. Yaroslav Halchenko

    development-with-merge

    Pardon my ignorance. Maybe I didn’t comprehend the problem in its entirety, since I do not usually submit lots of patches from a git repository.

    Why not keep a branch upstream, from which you branch off your upstream-devel, from which you generate and submit patches that may be modified or not accepted upstream? You fetch and rebase upstream (or, as I conventionally do, merge origin/upstream, which should result in a simple rebase since I don’t modify upstream directly). Then simply merge origin/upstream into your upstream-devel as well. If there was a change to your patch, there would be a conflict in that place, which you can explicitly ‘resolve’ once and forever, and continue your hacking.
    Or, if you hacked on top of it before upstream absorbed it, then branch off a temporary branch at the point where you submitted your patch (let that commit be aaaaaaaa and the new branch name upstream-devel-temp), merge origin/upstream into upstream-devel-temp, and then rebase your changes on top of the merged upstream? (Then remove the previous upstream-devel and rename upstream-devel-temp to upstream-devel.)

    or am I too confused and thus confusing others? ;-)


  2. Anonymous

    I have never run into a project that won’t accept a git pull rather than a mailed patch. In any case, that sounds like an issue of human policy, not git mechanism. Git *can* create a self-contained construct that includes history; see “git bundle”. However, if people refuse anything except the output of “git format-patch”, then I don’t see what Git could do about it. Furthermore, some people prefer a linear history, and use “rebase” rather than “merge” unless the merge has some significance to them.
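
    As a hedged sketch of what a bundle carrying merge history could look like (the repository names src and copy, the side branch, the file names, and the demo identity are all hypothetical):

```shell
#!/bin/sh
# Sandbox sketch: all names and paths below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

git init -q src
cd src
git config user.email "demo@example.com"
git config user.name "Demo"
echo hi > X
git add X
git commit -q -m "base"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on local config

# Diverge on a side branch, then create a real merge commit.
git checkout -q -b side
echo side > Y
git add Y
git commit -q -m "side: add Y"
git checkout -q "$branch"
echo trunk > Z
git add Z
git commit -q -m "trunk: add Z"
git merge -q --no-edit side

# Pack the branch, merges included, into a single file.
git bundle create "$work/history.bundle" "$branch" 2>/dev/null
git bundle verify "$work/history.bundle" >/dev/null 2>&1

# The recipient can clone straight from that file.
cd "$work"
git clone -q -b "$branch" history.bundle copy
merges=$(git -C copy rev-list --count --merges HEAD)
echo "merge commits preserved in the clone: $merges"
```

    Unlike a series of mailed patches, the bundle keeps the original commit IDs and the merge commit, so whoever receives it sees the history as it actually happened.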


  3. Jan Hudec

    It’s obviously not a Git problem, but a policy problem. It is just that Git makes that policy easy, because it’s Linus’ policy (which does not mean the pull/push policy wouldn’t be easy — it still is).

    I recall seeing a long mail from Linus (I can’t think of good terms to find it right now, though) where he argues that recording the work in progress in history is worthless and that only the final changes, split into logical chunks, should be recorded.

    Linus compared it to a math assignment: you have a lot of papers with various calculations, but you only hand in the result and the direct calculation that leads to it. The rest is not interesting. And patches should be the same: you do the work in random order and change your mind a few times in the process, but you should only submit the final change, split into logical chunks. That makes it easier to review the changes. It eventually even makes the history more useful, because it is simpler.

    Besides the history being easier to read, one practical upshot is that you can then effectively bisect the history when looking for the point where a bug was introduced. If you included the work in progress, the bisect could wind up somewhere in the middle of it, at a state that is hard to understand if it worked at all, because the guilty commit might be a work-in-progress state where something else is broken that prevents you from testing what you want.
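
    A rough sketch of that kind of bisect, assuming a toy history of eight commits in which a hypothetical status file flips from pass to fail at the fifth one (the file names, commit messages, and demo identity are made up):

```shell
#!/bin/sh
# Sandbox sketch: all names and paths below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.email "demo@example.com"
git config user.name "Demo"

# Eight commits; the "bug" appears in commit 5 and stays.
i=1
while [ "$i" -le 8 ]; do
  if [ "$i" -ge 5 ]; then echo fail > status; else echo pass > status; fi
  echo "change $i" >> log.txt
  git add status log.txt
  git commit -q -m "change $i"
  if [ "$i" -eq 1 ]; then first=$(git rev-parse HEAD); fi
  if [ "$i" -eq 5 ]; then culprit=$(git rev-parse HEAD); fi
  i=$((i + 1))
done

# HEAD is known bad, the first commit is known good.
git bisect start HEAD "$first" >/dev/null
# The test command exits 0 on good commits and non-zero on bad ones.
git bisect run grep -q pass status >/dev/null
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

git log -1 --format=%s "$found"
```

    This only works because every commit in the toy history can be tested on its own. A half-finished commit in the range would make the test command fail for unrelated reasons and point the bisect at the wrong place.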


    Pierre Thierry Reply:

    It’s very strange that Linus thinks that WIP history is useless. In my experience, it’s invaluable when it comes to understanding code, especially when it’s broken or unusual.

    Epistemology also somewhat contradicts that. There’s far more to science than its mere results. And often the path to the results is very complicated, sometimes quite convoluted. Having people think that the path of ideas that led to the results is the one exhibited in publications is detrimental to science.


    John Goerzen Reply:

    Yes, I quite agree with this. If I implemented something in a way that I discover is broken 2 years later, knowing my path to that particular implementation is often quite interesting. And useful.


    Anonymous Reply:

    Major tradeoff, though: you lose the ability to grab a random version in the middle, compile it, and run it.

    John Goerzen Reply:

    I don’t think so. That assumes a “don’t commit until it compiles” policy. Or at least a “don’t keep commits that don’t compile” policy.

    As far as I can see, such a policy is a huge loss, because you lose all that development history. Which, let’s face it, is what a DVCS is supposed to help with.

    This is more suited for a stable vs. development branch sort of thing.

    Karl Hasseström Reply:

    The kind of history rewriting that Linus et al advocate is to rewrite
    near-term history so that it’s pretty, but not touch long-term
    history.

    The idea being that the mistakes you make and correct in a single
    afternoon (or maybe a week) are going to look more like noise than
    information in the long run, and you’re better off just recording
    history that’s designed to be easy to read. Whereas longer term, you
    can’t keep the history in your head anymore, so you use the tool to
    record the history that actually happened.

    The end result is that when you look at the history, it’ll be
    carefully composed in the small, and historically accurate in the
    large.

    It’s all rather like writing a diary. You edit and revise all you want
    as you write each entry, but don’t touch the older ones.

  4. Rémi Vanicat

    Well, if you only want to save your history, you can do something like:

    git tag before-rebase-$(date +"%Y%m%d")

    then rebase. Git will remember the old branch, but use the new one.
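
    A small sandbox sketch of that, assuming a hypothetical topic branch and made-up file names and identity:

```shell
#!/bin/sh
# Sandbox sketch: all names and paths below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.email "demo@example.com"
git config user.name "Demo"
echo hi > X
git add X
git commit -q -m "base"
trunk=$(git symbolic-ref --short HEAD)   # master or main, depending on local config

# Local work on a topic branch.
git checkout -q -b topic
echo foo > X
git commit -q -am "topic: foo"

# The trunk moves on in the meantime.
git checkout -q "$trunk"
echo other > Y
git add Y
git commit -q -m "trunk: add Y"

# Tag the pre-rebase state, then rebase.
git checkout -q topic
tag="before-rebase-$(date +"%Y%m%d")"
git tag "$tag"
old=$(git rev-parse HEAD)
git rebase -q "$trunk"
new=$(git rev-parse HEAD)

echo "tag $tag still points at $old"
echo "topic now points at    $new"
```

    The tag keeps the old commits reachable, so they are safe from garbage collection. Note that a second rebase on the same day would need a different tag name, since git tag refuses to overwrite an existing tag unless forced.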

    You could also use Stacked Git (stg). With stg you maintain a series of patches on top of upstream, rebasing them when needed. Stg will remember, separately, the history of each patch.


  5. Jonathan Swarthin

    > Many projects that use git require you
    > to submit things using git-format-patch
    > instead of pushing/pulling from you.
    > They don’t want your merge history.

    In typical blogger fashion you assume too much.



http://changelog.complete.org / Git looks really nice, until….