Whose Distributed VCS Is The Most Distributed?

August 10, 2006 | Software | darcs, version control | John Goerzen

Lately I have been trying out a number of distributed version control systems (VCS or SCM).

One of my tests was a real problem: I wanted to track the Linux 2.6.16.x kernel tree, apply the Xen patches to it, and pull only specific patches (for the qla2xxx driver) from 2.6.17.x into this local branch. I wanted also to be able to upgrade to 2.6.17.x later (once Xen supports it) and have the version control system properly track which patches I already have.

But before going on, let’s establish what it means to be an ideal distributed VCS:

1. The fundamental method of collaboration must be a branch. A checkout should mean creating a local branch on which a person can commit and work without having to involve the server. An update from some central server should take the form of a merge from that branch to the local branch.
2. Branching should be cheap. It should be easy to create a local branch, the operation to do so should be fast, and it shouldn’t take an inordinate amount of space. It should also be as easy as possible to branch from a remote repository.
3. Merging between branches is intelligent. It should be easy to merge another branch with your own. The VCS should know which changesets from the other branch are already on yours, and should not attempt to merge changesets that you have already merged previously.
4. Individual changesets should be mergeable without bringing across the whole history. You should be able to bring across the minimum number of changesets necessary to effect a specific change. This corresponds to my test case above. Future merges from the whole branch should, of course, recognize that these changesets are present already.
5. Branching preserves full history. A branch should be a first-class copy of a repository, even if the repository is remote. It should contain the full history of the branch it was made from, including diffs for each individual changeset and full commit logs, unless otherwise requested by the user.
6. Merging preserves full history. A merge from one branch to another should also preserve full history. Changesets merged to the local branch should retain the individual, distinct diffs and commit logs for each changeset.

There are also some things that we would generally want:

7. It is possible to commit, branch, merge, and work with history offline.
8. The program is fast enough for general-purpose use.

Evaluation

Let’s look at some common VCSs against these criteria. I’ll talk about Arch (tla, baz, etc), bzr (bazaar-ng), Darcs, Git, Mercurial (hg), and Subversion (svn) for reference.

1. The fundamental method of collaboration must be a branch

All of the tools pass this test except for svn.

2. Branching should be cheap

Everyone except svn generally does this reasonably well.

The tla front end for Arch had a pretty terrible interface for this, so it took a while simply due to all the typing involved. That’s better these days.

Darcs supports hardlinking of history to other local repositories and will do this automatically by default. Git also supports hardlinking, but does not do it by default; alternatively, you can store a path where it should look for changesets that aren’t in the current repo. I believe Mercurial also can use hardlinks, though I didn’t personally verify that. bzr appears to have some features in this area, but not hardlinks, and the features were too complex (or poorly documented) to learn about quickly.
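As a rough illustration of why hardlinking makes a local branch cheap, here is a minimal Python sketch. It does not show how Darcs or Git actually lay out their repositories; the directory names, the file name, and the `hardlink_branch` helper are all made up for the example. It only demonstrates that a hardlinked file in a second directory shares its storage with the original.

```python
import os
import tempfile

def hardlink_branch(src_dir, dst_dir):
    """Create dst_dir whose files are hardlinks to the files in src_dir.

    Hypothetical helper: a stand-in for what a VCS might do with its
    immutable history files when branching locally.
    """
    os.makedirs(dst_dir)
    for name in os.listdir(src_dir):
        os.link(os.path.join(src_dir, name), os.path.join(dst_dir, name))

def demo():
    with tempfile.TemporaryDirectory() as root:
        origin = os.path.join(root, "origin_patches")
        branch = os.path.join(root, "branch_patches")
        os.makedirs(origin)
        with open(os.path.join(origin, "patch-0001"), "w") as f:
            f.write("some immutable patch data\n")

        hardlink_branch(origin, branch)

        a = os.stat(os.path.join(origin, "patch-0001"))
        b = os.stat(os.path.join(branch, "patch-0001"))
        # Same inode means both names point at one copy of the data on disk.
        return a.st_ino == b.st_ino, a.st_nlink

if __name__ == "__main__":
    print(demo())
```

This only works because history files are never modified in place; a file that gets rewritten would have to be copied first, or the change would show up in both branches.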

svn does not support branching across repositories, so doesn’t really pass this test. Branches within a repository are not directly supported either, but are conventionally simulated by doing a low-cost copy into a specially-named area in the repository.

3. Merging between branches is intelligent

Arch was one of the early ones to work on this problem. It works reasonably well in most situations, but breaks in spectacular and unintelligible ways in some other situations.

When asked to merge one branch to another, Darcs will simply merge in any patches from the source branch onto the destination which the destination doesn’t already have. This goes further than any of the other systems, which generally store a “head” pointer for each branch that records how far you’ve merged. (Arch is closer to darcs here, though ironically bzr is more like the other systems.)
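The difference between the two approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Darcs’s actual algorithm: patches are reduced to bare names, ordering and commutation are ignored, and both function names are invented for the illustration.

```python
def merge_by_patch_set(dest, source):
    """Patch-set model: a branch is a collection of named patches.

    Pull every patch the destination lacks, keeping source order.
    """
    have = set(dest)
    return dest + [p for p in source if p not in have]

def merge_by_head_pointer(dest, source, last_merged):
    """Head-pointer model: remember how far into source we have merged.

    Everything after index last_merged is pulled, whether or not an
    equivalent change was already brought across some other way.
    Returns the merged branch and the new head position.
    """
    return dest + source[last_merged:], len(source)

upstream = ["A", "B", "C", "D"]
local = ["A", "C"]  # "C" was picked individually earlier

print(merge_by_patch_set(local, upstream))
# ['A', 'C', 'B', 'D']

merged, head = merge_by_head_pointer(local, upstream, last_merged=1)
print(merged)
# ['A', 'C', 'B', 'C', 'D']
```

In the head-pointer model the individually picked change shows up twice, which is where the duplicate changesets and later conflicts come from.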

Merging between branches in svn is really poor, and has no support for recognizing changesets that have been applied in both places, resulting in conflicts in many development models.

4. Individual changesets should be mergeable without bringing across the whole history

Darcs is really the only one that can do this right. I was really surprised that nobody else could, since it is such a useful and vital feature for me.

Both bzr and git have a cherry-pick mode that simulates this, but really these commands just get a diff from the specific changeset requested, then apply the diff as with patch. So you really get a different changeset committed, which can really complicate your history later, AND lead to potential conflicts in future merges. bzr works around some of the conflict problems because on a merge, it will silently ignore patches that attempt to perform an operation that has already occurred. But that leads to even more confusing results, as the merge of the patch is recorded for a commit that didn’t actually merge it. (That could even be a commit that doesn’t modify the source.) Sounds like a nightmare for later.
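To make the “different changeset” point concrete, here is a small Python sketch. It assumes a toy identity scheme in which a commit’s ID is a hash of its parent’s ID plus the diff; real systems hash more than that, and the parent names and diff text are invented. The point is only that the same diff applied on a different parent gets a new identity, while an identity based on the change alone stays the same.

```python
import hashlib

def commit_id(parent_id, diff_text):
    """Toy commit identity: hash of the parent's id plus the diff.

    Assumption for illustration only; real systems hash more than this.
    """
    return hashlib.sha1((parent_id + "\n" + diff_text).encode()).hexdigest()

def patch_id(diff_text):
    """Toy patch identity that depends only on the change itself."""
    return hashlib.sha1(diff_text.encode()).hexdigest()

diff = "-old qla2xxx line\n+new qla2xxx line\n"

original = commit_id("tip-of-2.6.17", diff)
picked = commit_id("tip-of-2.6.16-xen", diff)  # same diff, different parent

print(original == picked)                 # False
print(patch_id(diff) == patch_id(diff))   # True
```

A tool that only compares commit identities has no way to tell that `original` and `picked` carry the same change, so a later full merge may try to apply it again.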

Arch has some support for it, but in my experience, actually using this support tends to get it really confused when you do merges later.

Neither Mercurial nor svn has any support for this at all.

5. Branching preserves full history

git, darcs, and Mercurial get this right. Making a branch from one of these repos will give you full history, including individual diffs and commit logs for each changeset.

Arch and bzr preserve commit logs but not the individual changesets on a new branch. I was particularly surprised at this shortcoming with bzr, but sure enough, a standard bzr merge from a remote branch collapsed three original changesets into one commit and did not preserve the individual history on that commit.

svn doesn’t support cross-repo branching at all.

6. Merging preserves full history

Again, darcs, git, and Mercurial get this right (I haven’t tested this in Mercurial, so I’m not 100% sure).

Arch and bzr have the same problem of preserving commit logs, but not individual changesets. A merge from one branch to another in Arch or bzr simply commits one big changeset on the target that represents all the changesets pulled in from the source. So you lose the distinctness of each individual changeset. This can result in the uncomfortable situation of being unable to re-create full history without access to dozens of repositories on the ‘net.

Subversion has no support for merging across repositories, and its support for merging across simulated local branches isn’t all that great, either.

7. It is possible to commit, branch, merge, and work with history offline

Everyone except Subversion does a good job of this.

8. The program is fast enough for general-purpose use

All tools here are probably fast enough for most people’s projects. Subversion can be annoying at times because many more svn commands hit the network than those from others.

In my experience, Arch was the slowest. Though it was still fine for most work, it really bogged down with the Linux kernel. bzr was next, somewhere between arch and darcs. bzr commands “felt” sluggish, but I haven’t used it enough to really see how it scales.

Darcs is next. It used to be pretty slow, but has been improving rapidly since 1.0.0 was released. It now scales up to a kernel-sized project very well, and is quite usable and reasonably responsive for such a thing. The two main things that slow it down are very large files (10MB or above) and conflicts during a merge.

Mercurial and git appear to be fastest and pretty similar in performance.

All of these tools perform best with periodic intervention, whether manual or through scheduled cron jobs, somewhere between once a month and once a year depending on your project’s size. Arch users have typically created a new repo each year. Darcs users periodically tag things (if things are tagged as part of normal work, no extra work is needed here) and can create checkpoints to speed checkouts over the net. git and Mercurial also use a form of checkpoints. (I’m not sure about bzr.)

Subversion works so differently from the others that it’s hard to compare. (For one, a checkout doesn’t bring down any history.)

Conclusions

I was surprised by a few things.

First, that only one system actually got #4 (merging individual changesets) right. Second, that if you had to pick losers among VCSs, it seems to be Arch and bzr — the lack of history in branching and merging is a really big issue, and they don’t seem to have any compelling features that git, darcs, or Mercurial lack. #4 was a unique feature to Darcs a few years ago, but I figured it surely would have been cloned by all the other new VCS projects that have popped up since. It seems that people have realized it is important, and have added token workaround support for it, but not real working support.

On the other hand, it was interesting to see how VCS projects have copied from each other. Everyone (except tla) seems to use a command-line syntax similar to CVS. The influence of tla Arch is, of course, plainly visible in baz and bzr, but you can also see pieces of it in all the other projects. I was also interested to see the Darcs notion of patch dependencies was visible (albeit in a more limited fashion) in bzr, git, and Mercurial.

So, I will be staying with Darcs. It seems to really take the idea of distributed VCS and run with it. Nobody else seems to have quite gotten the merging thing right yet — and if you are going to make it difficult to do anything but merge everything up to point x from someone’s branch, I just don’t see how your tool is as useful as Darcs. But I am glad to see ideas from different projects percolating across and getting reused — this is certainly good for the community.

Updates / Corrections

I got an e-mail explaining how to get the individual patch diffs out of bzr. This will work only for “regular”, non-cherry-picked merges, and requires some manual effort.

You’ll need to run bzr log, and find the patch IDs (these are the long hex numbers on the “merged:” line) of the changeset you’re interested in, plus the changeset immediately before it on the same branch (which may not be on the same patch and may not be obvious at all on busy projects.) Then, run bzr diff -r revid:old-revid-string..new-revid-string.

I think this procedure really stinks, though, since it requires people to manually find previous commits from the same branch in the log.

31 thoughts on “Whose Distributed VCS Is The Most Distributed?”

James says:

August 10, 2006 at 10:28 pm

Have you looked at monotone?

1. John Goerzen says:
  
  August 11, 2006 at 7:10 am
  
  Unfortunately, I didn’t have time to look at Monotone and SVK, though I’m happy to see the SVK comments below.
  
Daniel Westermann-Clark says:

August 10, 2006 at 10:30 pm

SVK (http://svk.elixus.org/) brings a number of distributed VCS features to Subversion (in addition to other neat things like VCP). It isn’t as “pure” as Darcs, but I find it immensely useful for working with many existing Subversion repositories.

Feature by feature:

1. SVK uses a depot to hold a full mirror of each Subversion repository you want to use. You initialize and then check out from the mirror. Once a repository is mirrored, you can create a local branch, which retains full history and allows you to commit without network access to or commit permissions on the server. So branching is not “fundamental” in that it requires extra steps, but it is quite simple.

2. Branching is implemented, as in Subversion, as a copy operation. Subversion copies are copy-on-write, so they are cheap and fast.

3. SVK implements merge tracking using a simple push and pull model. It also implements star-merge (based on tla’s algorithm, IIRC).

4. SVK allows you to cherry pick changesets, but it is not very well abstracted at the moment.

5. SVK retains full history on remote branches (i.e., those that happen on the server) and on local branches (i.e., those that happen in your local depot).

6. Merging from one branch to another preserves full history in SVK. SVK calls this an “incremental” merge.

I don’t expect to convert anyone currently using Darcs to SVK, but any DVCS review should at least mention SVK. :-)

1. glandium says:
  
  August 10, 2006 at 11:33 pm
  
  For 1) I’d add that it’s actually very easy to branch from a remote repository.
  you just svk cp . It will ask you where you want to put the mirror and the local branch in your svk repo.
  
  1. Daniel Westermann-Clark says:
    
    August 11, 2006 at 9:59 am
    
    Nice, I didn’t know about that shortcut.
    
    Looks like I also forgot the last two points:
    
    7. You can create a local branch offline and commit to there. SVK also lets you work with the full history of the remote repository (the part you have mirrored). Once you’re connected again, you can easily pull down new changes from the remote repository and push yours back.
    
    8. SVK is quite fast for general use, but I don’t know how it compares to others. The initial mirroring of a repository can be slow if the repository has many revisions.
    
David Caldwell says:

August 10, 2006 at 11:50 pm

In #4 you say that Mercurial does not have any support for cherrypicking patches across branches. This is incorrect–I do it all the time. It’s not a completely seamless process, but it’s not bad. First, export the changeset you want to a temp file, then import it onto the new branch (it will import to wherever your working directory is pointing).

It would certainly be better to be able to do it in one step, but two isn’t bad.

-David

1. John Goerzen says:
  
  August 11, 2006 at 7:12 am
  
  From what I can tell, this sounds just like a diff followed by a patch — that is, it doesn’t pull the *same* changeset over, but rather commits a new changeset with the same changes as the original. Is that wrong? If so, could you list the exact commands you’re using? (I’m assuming you’re using hg export)
  
2. pachi says:
  
  May 21, 2007 at 7:50 am
  
  Mercurial does support cherrypicking using the transplant extension.
  
  There are many very useful Mercurial extensions that are worth a try.
  
  1. Mike says:
    
    June 5, 2008 at 4:03 am
    
    Mercurial lets you transplant patches, but it doesn’t handle merging the branches well afterwards, and that’s what really matters: http://www.selenic.com/mercurial/wiki/index.cgi/TransplantExtension
    
    Without being able to merge easily after cherrypicking, a DVCS really can’t be said to support cherrypicking.
    
Martin says:

August 11, 2006 at 12:51 am

A couple of corrections about Bazaar:

#2 – Bazaar has a mechanism called ‘repositories’ to do cheap local or remote branching. It works without requiring hardlinks, which is good on OS X and Windows. It helps in some cases where hardlinking cannot, such as bringing down someone else’s branch related to something you already have. It actually works very well, but I acknowledge the documentation is confusing and we should fix that.

#3 – Merging between branches is pretty good. I wouldn’t say that Arch’s merging is much like darcs, but a full comparison is too long for this comment. In particular Arch won’t do darcs’s very cool merging of discontiguous changes without considerable user effort.

#4 – No, bzr is a bit more intelligent than just getting a patch and applying it, though we still want to do more to improve cherry picking. I’m not sure where you got this idea that we will cause nightmares by ignoring partially merged patches; we certainly don’t. There is some design work for improving the tracking of this.

#5 – Bazaar brings across the full history when you branch or merge. We *present* the data as being rolled up, but it’s all still there and you can look at it e.g. with the GUI, and it’s used by merging. The distinctness of the changes is not lost; this is something we specifically fixed compared to Arch.

We also have support for svn-like checkouts where the history is stored remotely and accessed only when needed.

#8 – We’re specifically concentrating on performance as we come up to 1.0. 0.9 will be substantially faster than the previous release.

Bazaar really does not lose history. I wish you would correct that in the article. If you read it somewhere or if there’s something we should make more clear please let me know.

1. John Goerzen says:
  
  August 11, 2006 at 7:18 am
  
  Regarding #4, the problem is that you commit the merge at the wrong time. You don’t commit the fact that the merge happened when it actually occurs; rather the cherrypicked merge gets committed whenever the full history up through (and beyond) that particular patch is merged.
  
  So someone trying to track the history of merges will have a terrible time, since bzr log will show a “merged:” line of the source repo at the wrong place. It’s really storing an incorrect history.
  
  Regarding the history preservation, someone sent me information on how to get that out of bzr, so I will update the article. It is completely undocumented in the bzr manpage and bzr diff, and only works for non-cherrypicking, but is nevertheless a valid point.
  
Peter Van Eynde says:

August 11, 2006 at 1:35 am

As I documented in PackagingWithDarcsAndTailor (http://wiki.debian.org/PackagingWithDarcsAndTailor), I consider darcs to be bad for Debian package maintenance because we are guaranteed to introduce “doppleganger patches” (http://darcs.net/DarcsWiki/FrequentlyAskedQuestions#head-76fb029ff6e9c20468eacf3ff00d791e2cf03ecb) when a fix of ours flows into upstream. Enough of these will make the Debian branch unmergeable with the upstream one.

Personally I’m investigating bzr at the moment.

1. John Goerzen says:
  
  August 11, 2006 at 8:21 am
  
  I personally use Darcs quite heavily for Debian package maintenance, and in fact, the majority of my packages use Darcs (either because I wrote them, or because I maintain them in Darcs). See http://darcs.complete.org/debian for my Debian darcs repositories.
  
  I have found this to not be an issue with modern darcs. kdiff3 is especially nice when merging with upstreams that have applied Debian patches already.
  
  I know that the issue exists, but I have only seen it bother me once in actual use. As a practical matter, it isn’t a consideration for me.
  
  1. Peter Van Eynde says:
    
    August 11, 2006 at 4:37 pm
    
    The bad part is that it was working perfectly, until at a certain moment it just stopped merging. I don’t like working with such a threat looming above me, even if darcs is very nice indeed.
    
the debian user says:

August 11, 2006 at 2:04 am

Debian Developer John Goerzen just did a tremendous job, useful not only for Debian users but for all developers of free and open source software. He compared the different available version control systems which support distributed collaborative work….

Nathaniel says:

August 12, 2006 at 2:58 am

Since monotone came up, I’ll quickly summarize, leaving #1 for last…

2) In monotone, branching is essentially free — committing to a new branch costs exactly the same as committing to an existing branch.

3) Monotone’s merger is pretty smart — it has a few enhancements that would be nice to add at some point, but it’s smarter than 3-way merge, provably correct (for some value of provably), and cleanly handles things like renames (both files and directories), arbitrary attributes on files, etc.

4) Merging individual changesets is trivial — you can ‘mtn pluck’ them out of history and into your workspace, using the smart merger to wiggle them into place. However, future merges are blind to this. The reason this feature is unique to darcs is that, well, no-one knows how to make it actually work — even darcs (cf. the last year+ of traffic on the darcs-traffic list). We’re all watching to see if the darcs devels solve the problem, but until they do, I don’t think anyone else wants to get sucked into the situation where there are users waiting for you to invent new math that may or may not exist.

5 & 6) Of course we preserve history.

7) Actually, in some sense it’s only possible to work with history offline :-). Monotone’s only network operation is “sync my local mirror”; everything else works against local disk. (This might change at some point to support conveniences like history-less checkout, but we’ll see.)

8) “Fast enough for general purpose use” is a bit subjective, but I’d say so. The one potential issue is that initial pulls are somewhat slow, which large projects have worked around by putting up compressed repos for HTTP download. We’re working on fixing this now, as well.

Right, about #1, then — you describe the particular branching model that is popular for DVCS’s these days — I think it originated with BK. It’s not the only possibility. Unfortunately, we don’t have a good potted explanation I can point you to about how monotone does it. But briefly, a project generally has a whole shared namespace of branches. You normally synchronize the whole namespace at a time (though don’t have to). I.e., this namespace is not tied to any particular location; people can commit to “the same” branch while disconnected. Branching is orthogonal to physical location — you only create a new branch when you have some divergence you/your project wants to keep track of. By default, branches are replicated, globally visible, and shared.

There are some rough edges on all this still — fixing those is priority #2 after improving pull speed — but that’s the basic idea.

Cheers,

anon says:

August 12, 2006 at 11:30 pm

git doesn’t do *changesets*. git tracks file identities at each step. The *changeset* is a derived quantity.
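As a minimal sketch of what “derived quantity” means here, the following Python models a snapshot as a plain dict of path to content and computes the changeset between two of them on demand. This is an illustration of the idea only, not git’s storage format; the function name and the sample files are invented.

```python
def derive_changeset(before, after):
    """Compute what changed between two snapshots.

    Each snapshot is a dict mapping path -> file content. Nothing about
    the change is stored; it is recomputed from the two snapshots.
    """
    added = {p: after[p] for p in after if p not in before}
    removed = sorted(p for p in before if p not in after)
    modified = {
        p: (before[p], after[p])
        for p in before
        if p in after and before[p] != after[p]
    }
    return {"added": added, "removed": removed, "modified": modified}

v1 = {"Makefile": "all:\n", "main.c": "int main(){}\n"}
v2 = {
    "Makefile": "all:\n",
    "main.c": "int main(){return 0;}\n",
    "README": "hi\n",
}

print(derive_changeset(v1, v2))
```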

Thus, your #4 works even though you think it won’t… You can cherry-pick changes, merge later, and have everything work perfectly. Seriously. I do this often. The *changeset* is *not* replayed, re-applied, or anything like that. If the default merge strategy doesn’t work for you, there are others available. Plus, git-rerere will handle some really bizarre cross-merge and rebasing changes for you by recording what you did to resolve those changes last time.

Plus, git-cherry-pick by default will record the tree from which you picked the changes. The gitk viewer uses that as a link, so you can jump to it, etc.

Two holes in git’s toolset right now are cloning with limited history and handling subprojects. Those are better handled by other tools, alas.

Sven Mueller says:

August 15, 2006 at 10:25 am

Hi.

There is one question I don’t see covered on any comparison of DVCSs. With team collaboration on a project, it is often desirable to maintain a central “authoritative” repository to which several people can commit. So:
How good are the different DVCSs at offering multi-user commits to a remote repository, preferably without requiring shell access on the machine where that repository is kept?
I know SVK is able to offer that, cause the central repository can be a simple SVN server.
Given the rest of the comparison here, darcs might be a viable solution if I/we ever need a distributed VCS.

regards,
Sven

1. John Goerzen says:
  
  August 15, 2006 at 10:51 am
  
  I didn’t test this with a bunch of different ones, but I can comment about Darcs.
  
  Normally, you use “darcs push” to push to a central server. It uses ssh and runs the darcs binary on that server to apply the patches. So all you need is ssh and a darcs binary there, and standard Unix permissions can control the setup.
  
  An alternative is to use signed patches sent over email. The Darcs author, David Roundy, does this. You can simply run “darcs send”, and it will send any patches you have that the central repo doesn’t. The central repo describes where to send the patches to. You can use a simple script on the central server that will check signatures, and if they are valid, pass the patches to darcs apply.
  
  So the requirements are probably a little less than SVN.
  
  If you don’t need to handle multiple users committing simultaneously (and thus don’t need the darcs binary to handle locking and conflicts), you can just rsync your personal repo up to the server.
  
2. Nathaniel says:
  
  August 15, 2006 at 4:27 pm
  
  For monotone, having a central repository is the most common working mode — keeping track of everyone’s repositories is too much of a hassle to deal with :-).
  
  The central repo isn’t particularly “authoritative”, though; monotone replication is more like mirroring than like branching, so we can all have equally authoritative copies of our mainline branches.
  
  Possibly useful link about this topic: http://venge.net/monotone/wiki/MasterRepository
  
3. Curt Sampson says:
  
  November 30, 2009 at 12:46 am
  
  Now, in 2009, gitosis is fantastic for keeping “master” git repositories. The repositories are managed under a single account (though you could use multiple accounts, if you’re a nervous sort). The configuration is a set of ssh public keys, each with an arbitrary name associated with it, and a configuration file describing who has what access (read or read/write) to which repositories. gitosis puts everybody’s keys in the account’s .ssh/authorized_keys file with appropriate restrictions so that they can run only the server program, which uses the configuration file to serve requests within the access limits.
  
  It also supports using git-daemon for public access, as well, again controlled through the configuration file.
  
  Gitosis was one of the big features that convinced me to switch to git.
  
Anonymous says:

September 20, 2006 at 11:43 pm

I think git does #4, actually. I’m not completely sure what you mean by your description of the feature. In particular, the phrase “merging a changeset” does not have an obvious translation into git-ese. Perhaps you could elaborate? Git doesn’t have an explicit concept of changesets, as it’s fundamentally snapshot based. And merging applies to heads of development (branches) and nothing else.

What I’m thinking of is what git terms “rebasing a branch”. If you’ve been hacking a neat feature on top of baseline version 1.1, and in the mean time baseline version 2.0 has been released, merging your changes in could be messy. You can either merge with 2.0 and fix the conflicts before releasing your changes to the world, or you can “rewrite history” and apply the changes you’ve made to version 2.0, resulting in a branch that looks like you started hacking on version 2.0. This can produce a less tangled branch history, so is appreciated in large projects.

Git’s cherry-pick is BASICALLY diff-and-patch, but it does one diff and patch per change along the branch, so you keep the full development history, AND it checks each patch to see if it has already been applied to the destination branch. If it has, it skips it. (The “git-cherry” helper does that.)
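The “skip it if it has already been applied” check can be sketched roughly as follows. This is a simplified Python model, not git’s actual code: diffs are plain strings, the normalisation is far cruder than what a real tool does, and the function names are hypothetical.

```python
import hashlib

def patch_fingerprint(diff_text):
    """Hash a diff after dropping blank lines and surrounding whitespace,
    so the same change is recognised whichever commit carries it.

    Hypothetical normalisation, much simpler than a real tool's.
    """
    lines = [ln.strip() for ln in diff_text.splitlines() if ln.strip()]
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()

def changes_to_apply(source_diffs, dest_diffs):
    """Return the source diffs whose fingerprint is not already on dest."""
    seen = {patch_fingerprint(d) for d in dest_diffs}
    return [d for d in source_diffs if patch_fingerprint(d) not in seen]

source = ["-a\n+b\n", "-x\n+y\n", "-foo\n+bar\n"]
dest = ["-x\n+y\n\n"]  # picked earlier; differs only by a trailing blank line

print(changes_to_apply(source, dest))
# ['-a\n+b\n', '-foo\n+bar\n']
```

Because the fingerprint is computed from the content of the change rather than from any stored ID, it only catches changes that still look the same after normalisation; a pick that needed manual edits would not match.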

There’s also git-rerere, which remembers merge conflicts that have to be manually fixed and recycles the previous manual resolution if the same conflict has to be fixed again in a later merge or rebase operation.

Neither of these assigns a permanent ID number to a particular change (git assigns permanent ID numbers to snapshots, not the deltas between them), but in practice they reduce the incidence of needing to manually resolve the same conflict twice to negligible levels.
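The idea of recording a manual resolution and reusing it can be modelled in a few lines of Python. This is only a sketch of the concept, assuming a conflict is fully described by its two sides; the class, its methods, and the sample values are invented and do not reflect how git-rerere stores its data.

```python
import hashlib

class ResolutionCache:
    """Remember how a conflict was resolved and replay it next time.

    Toy model of the idea; names and structure are hypothetical.
    """

    def __init__(self):
        self._resolved = {}

    @staticmethod
    def _key(ours, theirs):
        # Identify a conflict by the content of its two sides.
        return hashlib.sha1((ours + "\0" + theirs).encode()).hexdigest()

    def record(self, ours, theirs, resolution):
        self._resolved[self._key(ours, theirs)] = resolution

    def replay(self, ours, theirs):
        """Return the stored resolution, or None if this conflict is new."""
        return self._resolved.get(self._key(ours, theirs))

cache = ResolutionCache()
cache.record("timeout = 30\n", "timeout = 60\n", "timeout = 45\n")

print(repr(cache.replay("timeout = 30\n", "timeout = 60\n")))  # 'timeout = 45\n'
print(repr(cache.replay("timeout = 30\n", "timeout = 90\n")))  # None
```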

era says:

September 24, 2006 at 1:47 pm

Just a quick note about “reasonably fast”.

I have this puny little 133MHz laptop which has been working just fine for me over several years. With 32Mb of RAM and Debian Woody, I was able to run X, Emacs, and CVS and basically not worry. Except that CVS is not, you know, distributed.

So I got a RAM upgrade (64Mb is all this puppy can take) and upgraded to Ubuntu Dapper, and installed SVK, because we recently switched to SVN at work and this seemed like the most “natural” way to go distributed. Well, it couldn’t even download 68 revisions of a single text file — it took forever, and failed because of some network error, and had to be unjammed before I could try again, and then it corrupted its repo, and I had to start over from scratch, and eventually, the “oom-killer” thing (I guess “oom” stands for “out of memory”) killed not only svk, but also various and sundry vital system daemons.

Next, since I’m on Ubuntu now, I thought I’d try bzr. Big deal. Basically the same thing: it kept on going and going and going and eventually fell on its face after having consumed all memory. (And it’s not like I have a small swap space on this machine. I set up like a gigabyte, but I suspect oom-killer will never even allow me to access all of it.)

Next up, darcs. You know what? It just works ™. I’m still too chicken to try to run X again, but Emacs + darcs is a very reasonable combination indeed, even on this modest hardware.

PS. Sorry if the wrapping is screwy, I had to save this to a text file while finding a machine that would be able to look at your CAPTCHA /-:

bloggo ergo sum says:

September 29, 2006 at 10:27 am

Whose Distributed VCS Is The Most Distributed? – The Changelog
A good read on the more recent tools available. Included is:

SVN
GIT
Darcs
Arch
baz
Mercurial
bazaar-ng

…

Jakub Narebski says:

November 6, 2006 at 5:31 pm

I won’t repeat the argument that git is *snapshot* based, not changeset/patchset based, and what that means for its ability to cherry-pick patches, but perhaps Stacked GIT (http://www.procode.org/stgit/), also known as StGIT, would give what you want in #4.

The Changelog says:

March 6, 2007 at 6:02 pm

I recently wrote an article or two about distributed version control systems.

I’ve been using Darcs since 2005. I switched to Darcs, in fact, 10 days after the simultaneous founding announcements of git and Mercurial.

Overall, I have been happy. I

beza1e1 says:

August 14, 2007 at 3:48 am

I try to do some research on this in a study thesis. Maybe you’d like to comment on my ideas [url=
http://computerroriginaliascience.blogspot.com/2007/08/how-to-evaluate-dvcs.html%5Dhow to evaluate DVCSs?[/url]

1. beza1e1 says:
  
  August 14, 2007 at 3:50 am
  
  No BBCode? Maybe this works:
  
  http://computerroriginaliascience.blogspot.com/2007/08/how-to-evaluate-dvcs.html
  
Curt Sampson says:

November 30, 2009 at 1:18 am

This post seems to have gone a bit wrong in certain parts of its underlying thinking, which makes the conclusions slightly unfair, in my mind.

The basic problem here is that Subversion is not a distributed VCS, in the common sense of the word that means “everybody has their own, separate and different copy of the repository.” Subversion works on the single-central-repo model, and shouldn’t be included in this list at all, if being able to have disconnected operation or not have a shared “master” repository is a requirement.
If you want to compare working within Subversion’s central-repo model to working within the distributed model of another VCS, that can be a fair enough comparison. But you need to work within the strengths of each VCS. A DVCS would lose on most counts to Subversion if you used Subversion’s working model within a DVCS.

In that light, how does Subversion compare? I’ll be making some specific references to Git in this comparison since, out of this list, Subversion and Git are the two VCSs I’m most familiar with.

1. I’m not clear what “the fundamental method of collaboration must be a branch” really means, myself. If you need to be able to create commits that do not go to the master repo, Subversion of course fails completely here; it’s not a DVCS! But as far as updates from the central server being a merge into a local branch of work, yes, that’s how Subversion works. Think of your working copy as a (very short) local branch, and that’s exactly how `svn update` works. You can see this as a one-step version of doing a `git stash`, `pull` and `stash pop`, with the proviso that once you’ve done the `svn update`, it’s more work to try to back out of it (though it can be done).

2. Branching is quite cheap in Subversion; it’s just an `svn cp` with two repository URLs. It’s pretty near as cheap as it gets, barring the necessary network I/O overhead. That’s not to say that merging is going to be cheap, just creating a branch is.

3. Merging between branches is indeed poor in Subversion, as of 2006. I understand it’s improved a lot in the past couple of years. I’ve not done enough of this to do a good comparison.

4. No individual merging of changesets. (I’m not even clear on how this might work in any VCS, if there are conflicts, even if those conflicts are trivially resolved.)

5. Branching does preserve full history in Subversion, since the branch is part of the same repository, and you can easily trace back through the beginning of the branch back into the history that led to the branch point. No, Subversion of course does not support cross-repo branching, because there is only one repo. The question of whether it does or not doesn’t make sense in the Subversion world.

6. Merging does not, as of 2006, preserve full history in Subversion, since the merge is basically just a huge patch. (I generally note in the commit message the branch and revision from which I’m merging when I merge across branches.) I understand in newer versions of Subversion this information is preserved automatically.

7. Subversion does basically nothing off-line except let you compare your local changes to the working copy with the checked-out version, and revert them.

8. Subversion is indeed fast enough for general purpose use.

If you’re working in an environment where you have a centralized, “master” repository which everybody updates frequently and updates from frequently, svn seems to compare reasonably well with the DVCSs except for merging, where as of 2006 it is a ways behind. (Someone else will have to explain what the latest versions of Subversion do to improve this.) It does have the advantage that the system and workflow, being fairly constrained, is in the circumstances generally a bit simpler to use than the DVCSs.


The Changelog

Comments on family, technology, and society