Category Archives: Software

Experimenting with Git

I’ve been writing about Git a bit lately.

I’ve decided to switch some of my Debian work over to it to start with, as well as some of my other projects.

Although I was thoroughly frustrated with Git a year ago, now I am quite pleased with it. What’s different? The documentation is a LOT better. So far I have only found one manpage (git-show) that omits lots of its options. The system is friendlier, keystroke-happier, and powerful.

Compared to Mercurial, I’ve found some nice things:

In-directory branching. I didn’t expect to care about this, since both git and hg permit lightweight clones. But it turns out to be so easy to use that it is great, especially since I don’t have to set up multiple branch repos on the server. I really like this. Note that “hg branch” is not the same as a git branch, and see the discussion on the hg lists about renaming that before 1.0.0 for why.
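For anyone who has not tried in-directory branching, here is a minimal sketch in a throwaway repository. The directory, the branch name (experiment) and the file contents are all made up for illustration.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Remember whatever the default branch is called (master, main, ...).
base=$(git rev-parse --abbrev-ref HEAD)

# Create a branch and switch to it, all inside the same working directory.
git checkout -q -b experiment
echo "risky idea" >> notes.txt
git commit -q -am "Try something"

# Switch back: the working tree is rewritten in place, no second clone needed.
git checkout -q "$base"
lines_on_base=$(wc -l < notes.txt | tr -d ' ')
git branch
```

Both branches live in the one repository, so publishing them needs only one repo on the server.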

Flexibility in getting things around. Plain HTTP works fine (no static-http:// hack). ssh. git daemon. rsync. Very slick.

Performance. Surprisingly, git actually feels faster than Mercurial, especially when pushing or pulling. I didn’t expect that.

Tags. They seem smarter in git. No more merging of .hgtags all the time. Also I like that I can attach a message to a tag and sign it.
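A small sketch of what attaching a message to a tag looks like, again in a throwaway repository with made-up names. Git keeps tags as refs rather than in a versioned file, which is why there is nothing like .hgtags to merge.

```shell
# Throwaway repository; names and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "release candidate" > README
git add README
git commit -q -m "Initial commit"

# An annotated tag is its own object carrying a message, tagger and date.
git tag -a v0.1 -m "First tagged snapshot"

# With a GPG key configured, -s would create a signed tag instead:
#   git tag -s v0.1 -m "First tagged snapshot"

tag_type=$(git cat-file -t v0.1)   # object type behind the tag name
git tag -n1                        # list tags with the first line of each message
```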

All that power. There is a *lot* that Git can do. I should have been taking notes about it all.

My main complaint is still that Git doesn’t have something as nice as “darcs send”. Mercurial doesn’t either, but it’s a bit closer. Git has moved closer, but still has room to improve on that.

So I have set up git.complete.org and am starting to publish my Debian stuff on Debian’s alioth server as well.

Also, hg-fast-export in the fast-export project is *awesome*. Branch-aware and everything. It made a perfect Git version of my Mercurial work.

Git looks really nice, until….

So I have been learning about Git this weekend. It has some really nice-looking features for sure — some things Mercurial doesn’t have.

I was getting interested in switching, until I found what I consider a big problem.

Many projects that use git require you to submit things using git-format-patch instead of pushing/pulling from you. They don’t want your merge history.

git-format-patch, though, doesn’t preserve SHA1s, nor does it preserve merges.

Now, say we started from a common base where line 10 of file X said “hi”, I locally changed it to “foo”, upstream changed it to “bar”, and at merge time I decide that we were both wrong and change it to “baz”. I don’t want to lose the fact that I once had it at “foo”, in case it turns out later that really was the right decision.

When we track upstream changes, and submit with git format-patch, the canonical way to merge upstream appears to be:

git fetch; git rebase origin/master

Now, the problem with that is that it loses your original pre-conflict code in a case like this.

There appears to be no clean way around that whatsoever. I tried a separate “submission” branch, that rebases a local development-with-merge branch, but it requires a ton of git rebase --skip during the rebase process.
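The hi/foo/bar/baz scenario can be reproduced with a short script. This is only a sketch: the two repositories, the identities and the commit messages are invented, and the default branch name is read from git rather than assumed to be master.

```shell
work=$(mktemp -d)
cd "$work"

# "Upstream" repository: X starts out saying "hi".
git init -q upstream
cd upstream
git config user.name "Upstream"
git config user.email "upstream@example.com"
echo "hi" > X
git add X
git commit -q -m "base: hi"
branch=$(git rev-parse --abbrev-ref HEAD)   # master, main, ...
cd ..

# My clone: I change the line to "foo".
git clone -q upstream mine
cd mine
git config user.name "Me"
git config user.email "me@example.com"
echo "foo" > X
git commit -q -am "local: foo"
before=$(git rev-parse HEAD)

# Meanwhile upstream changes the same line to "bar".
cd ../upstream
echo "bar" > X
git commit -q -am "upstream: bar"

# Back in my clone: fetch and rebase, which stops on the conflict.
cd ../mine
git fetch -q
git rebase "origin/$branch" >/dev/null 2>&1 || true

# Decide we were both wrong and resolve the conflict as "baz".
echo "baz" > X
git add X
GIT_EDITOR=true git rebase --continue >/dev/null

after=$(git rev-parse HEAD)
history=$(git log -p --format=%s)
printf '%s\n' "$history"
```

After the rebase, the commit still titled "local: foo" records a change from "bar" to "baz". No commit on the branch sets the line to "foo" any more, and the original commit is only reachable through the reflog.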

Thoughts?

Revisiting Git and Mercurial

Exactly one year ago today, I wrote about Git, Mercurial, and Bzr. I have long been interested in VCS, and looked at the three main DVCS systems back then.

A Quick Review

Mercurial was, and for the moment remains, my main VCS. Bzr remains really uninteresting; I don’t see it offering anything compelling that Mercurial or Git can’t do. My Git gripes mainly revolved around its interface and documentation. Also, I do have Windows people using my software, and need a plausible solution for them, even though I personally do no development on that platform.

Ted Tso wrote his own article in reply to mine, noting that the Git community had identified many of the same things I had and was working on them.

I followed up to Ted with:

… So if Ted’s right, and a year from now git is easier to use, better documented, more featureful, and runs well on Windows, it won’t be that hard to switch over and preserve history. Ted’s the sort of person that usually is right, so maybe I should start looking at hg2git right now.

So I guess that means it’s time to start looking at Git again.

This is rather rambly, I know. It’s late and I want to get these thoughts down before going to sleep…

Looking at Git

I started at the Git wikipedia page for an overview of the software. It linked to two Google Tech Talks about Git: one by Linus Torvalds and another by Randal Schwartz. Of the two, I found Linus’ more entertaining and Randal’s more informative. Linus’ point that CVS is fundamentally broken, and that SVN trying to be “a better CVS” (an early goal of svn, at least) means it too is fundamentally broken, strikes me as quite sound.

One other interesting tidbit I picked up is that git can show you where functions have moved from one file to another, thanks to its rename-detection heuristic. That sounds really sweet, and is the best reason I’ve yet heard for Git’s stubborn refusal to track renames.

The Landscape

I’ve been following Mercurial and Darcs somewhat, and not paying much attention to Git. Mercurial has been adding small features, and is nearing version 1.0. Darcs has completed a major overhaul both of its repository format and internal algorithms and is nearing version 2.0, and appears to have finally killed the doppleganger (aka conflict spinlock) bug for good.

Git, meanwhile, seems to have made strides in usability and documentation in its 1.5.x versions.

One thing particularly interesting to me is: what projects are using the different VCSs. High-profile projects now using Mercurial include OpenSolaris, OpenJDK (Java 7), and Mozilla’s projects. Git has, of course, the Linux kernel. It also has just about everything associated with freedesktop.org, including X. Also a ton of Unixy stuff.

Both Mercurial and Git communities are working on TortoiseHg/TortoiseGit types of GUIs for Windows users. Git appears to have a sane Windows port now as well, putting it on pretty much even footing with Mercurial and Darcs there. However, I didn’t spot anything with obvious Windows ties in the Git “what projects use git” pages.

The greater speed of Mercurial and Git — even for pushing and pulling small patches — likely will keep me away from Darcs for the moment.

Onwards…

As time allows (I do have other things keeping me busy), I plan to install git and work through some tutorials and try to use it in practice as much as possible, to get a good feel for it.

Future

It is beneficial to be using a VCS that is popular, though that is certainly not a major criterion for me. I refuse to use SVN because its lack of distributed functionality makes it too unproductive to be useful. But it looks like Git is gaining a lot of traction these days, especially in Debian circles, which also makes it more interesting.

I notice that Ted did convert e2fsprogs over to git as he said he might, incidentally.

Two new bashisms

I learned about two bash features I hadn’t known about today.

From a colleague, GLOBIGNORE. A colon-separated list of glob patterns; filenames matching any of them are left out when bash expands globs. Helpful when set to “*~” and used with grep.
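A quick sketch of the effect, using a scratch directory with one file and its editor backup (both names invented):

```shell
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/notes.txt~"

# Each case runs in its own bash process, so the current shell is untouched
# and the example also works when the surrounding shell is not bash.
without=$(cd "$demo" && bash -c 'echo *')
with=$(cd "$demo" && bash -c 'GLOBIGNORE="*~"; echo *')

echo "GLOBIGNORE unset: $without"
echo "GLOBIGNORE='*~':  $with"
```

With the variable set, a grep over * in that directory would never be handed the backup file.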

From the Git FAQ, in a section explaining that it breaks the Git build process, CDPATH. A colon-separated search path to use when you type cd. Possibly useful to refer to subdirs of ~ or other common areas. Seems like it’s prone to break a ton of scripts if exported though.
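Here is a sketch of why an exported CDPATH can break scripts. When cd finds its target through a CDPATH entry, it prints the resulting directory on standard output, so any script that captures the output of a cd gets an extra line. The directory names are hypothetical.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/projects/blog"

# Run in a subshell so the caller's directory and CDPATH are left alone.
# Start somewhere that has no "blog" subdirectory of its own.
out=$(
    cd "$demo"
    CDPATH="$demo/projects"
    # A common script idiom for resolving a directory to an absolute path:
    cd blog && pwd
)

printf '%s\n' "$out"
lines=$(printf '%s\n' "$out" | wc -l | tr -d ' ')
echo "captured $lines line(s) where the script expected 1"
```

Keeping CDPATH as a plain, unexported variable set only for interactive shells avoids leaking it into scripts and build systems.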

A Cloud Filesystem

A Slashdot question today about putting to use all the unused disk space on corporate desktops got me to thinking. Now, before I start, comments there raised valid points about performance, reliability, etc.

But let’s say that we have a “cloud filesystem”. This filesystem would, at its core, have one configurable parameter: how many copies of each block of data must exist in the cloud. Now, we add servers with disk space to the cloud. As we add servers, the amount of available space on the cloud increases, subject to having enough space for replication according to our parameters.

Then, say we want a minimum of 3 copies of each block replicated. Each write to the filesystem will then cause a write to at least 3 different servers. Now, what if one server goes down? If the cloud filesystem is short on space, we may be down to only 2 copies of some blocks until that server comes back up. Otherwise, space permitting, it can rebuild that third copy on other servers.
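The space arithmetic can be sketched in a few lines. All the numbers are invented, and the model is deliberately crude: it divides raw space by the copy count and ignores the requirement that the copies land on different servers.

```shell
servers=6          # hypothetical number of servers in the cloud
disk_gb=500        # hypothetical free space contributed by each server
copies=3           # the one configurable parameter: copies of each block

raw_gb=$((servers * disk_gb))
usable_gb=$((raw_gb / copies))

# One server goes down: how much data can still be kept at full redundancy?
raw_after=$(( (servers - 1) * disk_gb ))
usable_after=$((raw_after / copies))

stored_gb=900      # hypothetical amount of data already stored
if [ "$stored_gb" -le "$usable_after" ]; then
    status="room to rebuild the third copy elsewhere"
else
    status="some blocks stay at 2 copies until the server returns"
fi
echo "usable: ${usable_gb} GB, after one failure: ${usable_after} GB ($status)"
```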

Now, has this been done before? As far as I can tell, no. Wouldn’t it be sweet?

But there are some projects that are close. Most notably, GlusterFS. GlusterFS does all of the above, except the automated bits. You can have this 3 copy redundancy, but you have to manually tell it where each copy goes, manually reconfigure if a server goes offline, etc. Other options such as NBD, OpenAFS, GFS, DRBD, Lustre, etc. aren’t really well-suited for this scenario for various reasons.

So, what does everyone think? Can this work? Has it been done outside of Google?

LinuxCertified Laptop LC2100S

As you might know from reading my blog, at my workplace, we have largely standardized on Linux on the desktop and laptop.

We use systemimager to maintain a standard desktop image and a separate standard laptop image. These images differ because there are different assumptions. The desktop machines mount /home over NFS, authenticate to LDAP, etc. This doesn’t work on laptops. Moreover, desktops don’t use network-manager or wifi, but laptops do.

Our desktop image uses Debian’s hardware autodetection — plus a little hacking in /etc/init.d/gdm — to automatically adjust to a wide range of hardware. So far this has worked well.

Laptops are much more picky. Our standard laptop model had been the HP nc4400 — a small and light 12″ model that people here loved. HP discontinued that model. Their replacement was the 2510p. We ordered one in here for evaluation. Try as we might, we couldn’t get it to suspend and resume properly in Linux.

So I went out scouring the field of Linux laptops. Companies such as Emperor Linux buy retail laptops from people like Lenovo, test them for Linux, and sell them — at a premium. These were too expensive to justify at the quantities we need them.

Then I stumbled across Linux Certified. I’d never heard of them before. I called them up and asked a few questions. They don’t buy retail laptops, but instead have OEMs in Taiwan build laptops to their spec. They happen to use the same OEM that Fujitsu does, I believe. (No big company builds laptops in the USA these days.) I asked them about wifi chipsets, video chipsets, and whether they use stock kernels. I got clueful answers to all of these.

So we ordered one of their LC2100s models. They didn’t offer Debian preinstalled, but did offer Ubuntu, so I selected that. The laptop arrived a couple of days (!!) later, configured with the particular CPU, etc. that I selected.

I was surprised at the thrill I felt at taking a brand new laptop out of its box, turning it on, and watching Grub appear before my eyes. Ubuntu proceeded to boot. I then of course installed our regular Debian image on the thing to check it out.

It needed a kernel and xserver-xorg-video-intel from lenny, as well as the ipw3945 driver for wifi, but otherwise worked with the exact same software as our HP nc4400 image. (In fact, it wasn’t hard to support both laptops with that image, since both use a lot of Intel hardware.) The one trick was making hibernate call /etc/init.d/ipw3945d stop so that the ipw3945 module could be unloaded before suspend. (Why this particular chipset needs a daemon is beyond me, but oh well.)

The hardware is great. As far as I know, the ipw3945 was the only component that wasn’t directly and automatically supported by DFSG-free software in lenny main. The screen is sharp and high-contrast (it’s glossy, which I personally don’t like, but I bet our users will). The device itself feels sturdy. It’s small and dense. I haven’t opened it up, but it looks like all you need is a screwdriver to do so.

The only downside is that they don’t sell docking stations for it. Their standard answer on that is to buy a USB docking station. That’s a partial answer, but can’t handle power or video like a standard docking station will.

Also, the LC2100s is much cheaper than the HP laptop, even when configured with nicer specs in every way. That is no doubt partially due to the lack of the Windows tax.

I’m sending off an order for 4 more today, I believe.

Viper

Well, now this is quite the experience.

I’ve been trying Viper for the past few days. Viper, for those that don’t know, is usually described as a set of Vi bindings for Emacs.

After reading the nearly 100 pages of documentation and trying it a bit, I have realized that this is not really an accurate description. Viper is a port of vi to Elisp.

But that doesn’t really do it justice. Viper seems to have pretty much everything going for it that Vim does, and then some. It is extensible with Elisp, and works with all the Emacs major modes (indentation and so forth). Yet it also is a very authentic Vi implementation, yet more customizable than Vim. And, in my opinion, more capable than Vim too.

On the one hand, this is a really neat combination: the power of the vi editing commands with the power of Emacs and Elisp for indentation, customization, etc.

On the other hand, it makes my head hurt. While Viper and Vim both are supersets of the vi command set, they don’t always implement extensions (such as multiple windows) the same way or with the same keys. Of course, you could remap them in both, but it’s a bit jarring to run Viper in expert mode, press C-w to start creating a new window, and have it run the Emacs cut command. (You can run Viper in a more limited mode where it does not recognize any regular Emacs keys if you don’t want that.)

It’s just weird. It mostly looks like Emacs. It is modal like Vim, and responds to all Vi and most Vim commands. It has an additional mode: the Emacs mode. Also if configured to run in expert configuration, Emacs commands are accepted most places. Yes, you can move with h, j, k, l and C-n, C-p, C-f, C-b all at the same time.

The main drawback I can see is that Viper mode doesn’t work well with Info mode, which has other bindings for keyboard shortcuts… so all of a sudden, hjkl don’t work in info mode.

I don’t know yet if I’ll use viper much, but it is a slick program.

A little more on Vim and Emacs file handling

Yesterday’s post about switching back to Emacs saw quite a few comments from people (most of them useful, even). I learned a few things.

My biggest gripe about Vim was that for the file types I worked with most, its indentation and syntax highlighting were inferior to those of Emacs. I’d like to illustrate that with an example.

Let’s consider one of those file types: XML containing DocBook markup.

Vim has a DocBook mode. It doesn’t autodetect DocBook files, so I have this at the top of each one:

<!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->

Now, why should Vim need a separate DocBook mode? DocBook is just XML or SGML, and these things have a well-formed nature. Well, part of the reason is that /usr/share/vim/vim71/syntax/docbk.vim has a ton of lines like this:

syn keyword docbkKeyword chapter citation citerefentry citetitle city contained

Yes, they are hard-coding all the DocBook element names into the editing mode. It’s probably used for completion, highlighting, maybe indentation. I’m not sure, really. I remember that editing these files without the DocBook mode was much more painful anyway, but that was 8 months ago and I can’t quite remember why.

Now, what about Emacs? I don’t know if Emacs even has a DocBook mode, mainly because I don’t have to care. The Emacs psgml mode actually parses the DTD for your XML or SGML files. It knows exactly what the valid tags are from doing so. This means it has full functionality not just for DocBook, but for any XML or SGML file with a DTD.

Not only that, but it knows more about the files than Vim does. For instance, both Emacs and Vim can do completion of various things. In Vim, typing </ and then C-x C-o (ooo, sounds like Emacs!) can complete my closing tags. But it can’t autocomplete my opening tags, and it certainly isn’t aware of which tags are valid at a given point in the document.

Not only can Emacs autocomplete opening and closing tags, but it knows exactly what tags are valid at a given place in the document (thanks to the DTD) and will only consider those tags for completion. Moreover, depending on how you have configured it, it could also insert spots for you to add any required attributes. So, for instance, if you’re editing XHTML and autocompletion gives you an <img> tag, it would add src="" in it for you, as a reminder that src is required.

There are a host of other smart things that Emacs can do with XML or SGML documents. For instance, you can get a list of all tags valid at the current point with C-c C-t or Shift-RightClick — useful if you’ve forgotten the name of a tag for a moment.

The difference isn’t as great with everything. But it sure is noticeable as I work with XML and Haskell files.

So long, Vim. I’m returning to Emacs

I’d been using Emacs for quite a while, and about 8 months ago I decided I would try using Vim. I’d only used vi for system emergency work, but knew a number of people that swore by it for regular work. So I decided I would learn Vim and use it for my regular work. I figure that with things like this, I don’t get a real feel for how well they work unless I use them for all my work. So I haven’t really opened Emacs at all in the past 8 months.

Yesterday I finally decided that Vim was not living up to my expectations and I’m in the process of switching back to Emacs. I thought I ought to write down why I’m doing that, for my own future reference… and since nobody has ever written about Emacs vs. Vim, I might as well post it where everyone can see it.

So here we are.

Original Reasons for Using Vim

It would lead to more comfortable typing. Lots of Vim users mention that you don’t have to hold down keys while hitting other keys as much in Vim as in Emacs, and that the movement keys are all on the home row. That’s true, but I didn’t find it to be that big of an improvement, since Esc is a farther reach than anything in Emacs, and let me tell you, you’re hitting Esc all the time in Vim. I found that removing the armrests from my chair made my hands happier than Vim ever did, and swapping Ctrl and CapsLock in Emacs will probably help there too.

It starts faster. I’m not sure if that really was true even when I switched, but it certainly isn’t true on any of my machines today. Both Vim and Emacs have had major version upgrades (v7 and v22, respectively) since I started using Vim. People seem to say that Emacs 22 feels faster, though I don’t know if that’s true. The startup times of the two, if they’re different, are imperceptible.

Vim would use less RAM. Frankly, these days, both Emacs and Vim are way down on the list of things that use up RAM. Heck, kmail has 141MB resident, and each of its two IMAP processes is using more than 30MB. Emacs in X right after start has 16MB resident, 10MB of which is shared, and 25MB VSS. gvim right after start has 8MB resident, 5MB of which is shared, and a 43MB VSS. Emacs tends to use fewer processes for things than Vim. So they’re not all that different, and Emacs could come out smaller in certain situations. But the difference is irrelevant on today’s machines, and modern Gnome and KDE apps are many times larger than both of them.

It will make me more comfortable in rescue environments where I have only traditional vi available. Actually, the vi on AIX is so different from modern Vim that this didn’t really help.

It would make me more productive. There are some editing commands that did, but as you’ll see below, it was more than balanced out by other problems.

Things I Liked about Vim

The commands dt, dT, df, dF. Wonderful little things those. Emacs now has M-z (Zap), which is similar to df but can actually go to other lines (a nice addition). And there are easy ways to bind keys to the others as well, though that doesn’t make it a pervasive convention like it is in Vim.

Antialiased fonts. It’s crazy that Emacs doesn’t have this yet. But not a showstopper; I still like good ole 10×20 just fine.

Regexp search-and-replace. Emacs actually has this now, and maybe it had it back then too. M-C-%. Apparently in Emacs22 the replacement expression can also have lisp code in it, which sounds really slick but I can’t see myself using it regularly.

Annoying Things in Vim

Syntax highlighting. The syntax highlighting for most languages in Vim felt like it was about as smart as it was in Emacs about 10 years ago. Strings like "Hello!\"" (in languages where \" inserts a literal ") often confused it. Sometimes quotes within comments confused it. Sometimes it would be confused permanently. Other times, just until I scrolled around in the file or reloaded it.

Indentation. This is much more annoying than the syntax highlighting, really. In many languages — and especially the two modes I’ve used most recently, XML and Haskell — it really, really stinks. The indentation there isn’t aware of syntax, or not very much. Sometimes it is smart enough to know that if an XML line starts with </ that it moves left and if it starts with an opening tag, that the next line moves right. But it’s not smart enough to do this reliably. Not only that, but indentation is not handled with consistent configuration between languages. And even though Vim ships with a ton of language modes, the central docs only cover indentation for C.

I’ve asked Vim experts about this, have tried all sorts of tweaks, have read through Vim indentation mode source files, etc. There is just no way to get it anywhere near the intelligence of Emacs for most languages, short of writing my own mode, it appears. This is even worse because when using the backspace key in insert mode, for a while it deletes individual spaces, and then all of a sudden deletes a big chunk of whitespace back to the beginning of the line. (And no, the insertion of Tab characters is disabled.) Indentation is my main complaint about Vim, and something that shows no progress towards being fixed any time soon.

And forget about anything like Emacs M-x indent-region. This is a syntax-aware indenter. You can write out an entire source file with no indentation whatsoever, and it will indent the entire thing according to the indentation rules you’ve defined and the syntax of the language you’re using. The best I’ve seen in Vim are commands that add or remove space at the beginning of every line in a region.

In short, Emacs seems to “understand” the file format on a much deeper level than Vim, and can automate things to a much better extent because of it.

Too many things disrupt the paste buffer. I can use Y or y to yank some text in Vim, and it’s really, really easy to overwrite that buffer with other things. Yes, I know that I can yank it into a named buffer, but that’s inconvenient and I don’t usually know in advance that I’ll have that need. In Emacs, only C-k and other “large area” commands disrupt it.

Vim doesn’t like you having lots of files open at once. It’s surprisingly convoluted to do this. If you use the basic documented command to edit another file, :e, it closes the file you’re working on. The normal way to open multiple files at once is to use split windows. Well, I don’t like split windows all that well, and often just want to make a quick change in one file — in full screen — and then go back to another. Even though I use set hidden in my ~/.vimrc, it still is annoying and more convoluted than it should be.

Vim can’t create new top-level X windows. In Emacs, I can press C-x 5 2, and poof, I have a second Emacs window in X, and it’s tied to the same editing session and Emacs process. Not a new process, with a different set of files, its own buffers, etc. The same process, same set of files. Just like a split window, but with a new top-level X window instead. gvim simply has no way to do that. This is also a large annoyance.

gqap stinks. This has burned me more than once. I’ll be editing an XML document, and insert some text in the middle of a paragraph. Now I have a really wide line. So I type gqap to reformat the paragraph. My cursor is near the bottom of the screen, so I don’t really see much past the current line. I then save the document and exit. Later I discover that Vim considered the entire rest of the document part of the single paragraph, and removed all the different indentation levels at </para> and the like, so it’s completely messed up. Emacs is smart enough to know what is a paragraph in XML mode, and M-q does the right thing. Oh, and Emacs indent-region can fix the Vim gqap-induced mess.

Desktop Linux: Gnome

I had been intending to write an entire series of posts about our corporate switch to Linux on the Desktop. To date, I wrote only one introducing the project and our reasons for switching from Windows. That was back in April.

Today I’d like to start talking about it all some more.

We have standardized on Gnome for our desktops. Given the Windows background of our user base, it was pretty much a given that we would have to use either Gnome or KDE. Something like fvwm or a non-integrated environment just wouldn’t be a good option.

We evaluated both Gnome and KDE. The very “clean” appearance of Gnome was a nice thing for us. KDE seemed to be too “chatty”, talked about entering in audiocd:/ when it shouldn’t have needed to, and generally violated KISS and the principle of least surprise too often. That said, I continue to run KDE for my personal desktop because Gnome just doesn’t have the flexibility that KDE does. It is too bad that Gnome has gone on this remove-functionality kick, and KDE hasn’t gotten the KISS religion yet.

Anyway, Gnome worked well for the most part. We have set some defaults in gconf for things like panel icons. We also set a few mandatory defaults. I fixed a couple of bugs in the vfs system related to nfs4 support, which manifested themselves as icons for files newly saved to the desktop never showing up.

We wanted to present a customized menu to people based on what their job function is. That is, we are using a single system image, so all apps will be installed on all machines. But we didn’t want people to have to see a ton of software that they don’t use. That was easily enough accomplished for custom apps by creating desktop files with mode 0640 and setting the group to the set of people that should see the program on their menu. We removed a few stock programs (such as the terminal) from the menu as well, using dpkg-statoverride. That was also quite easily done. However, I will say that the entire Gnome XDG menu thing is woefully under-documented.
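The permissions trick can be sketched as follows. The menu entry and its contents are hypothetical, and the current user's primary group stands in for the real group so that the sketch runs without root.

```shell
# Hypothetical menu entry for an in-house application.
entry=$(mktemp)
cat > "$entry" <<'EOF'
[Desktop Entry]
Type=Application
Name=Payroll Entry
Exec=/usr/local/bin/payroll
EOF

# In production the group would be the set of people who should see the entry;
# the current user's primary group stands in here so the sketch runs unprivileged.
group=$(id -gn)
chgrp "$group" "$entry"
chmod 0640 "$entry"      # owner read/write, group read, everyone else nothing

perms=$(ls -l "$entry" | cut -c1-10)
echo "$perms $entry"
```

Anyone outside the group cannot read the file, so the menu has nothing to show for them. For files owned by a package, dpkg-statoverride --update --add with an owner, group, mode and path records the same kind of override so that upgrades preserve it.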

We use Firefox for the standard web browser. It is integrated well enough with Gnome and we have no problems there, aside from sites that are IE-only. We solve that with a Windows terminal server, which I’ll discuss later.

Our network printing was already based on Cups. The individual machines are set up as Cups clients only, which works fine. We did find, however, that gnome-cups-manager automatically installs a tray monitor for cups. This monitor puts little printer icons on the tray when printers are in use. Unfortunately, it figures out which printers are in use by polling the server, and it is turned on by default out of the box, with no good way to disable it short of dpkg-statoverriding it to 0000. You can imagine that hundreds of users times dozens of printers times numerous polls per minute created quite the load on the server. This was a really braindead design and the people that wrote it should have known better. It is also quite useless to have icons coming on for all the printers on the network, which on some networks could be thousands, and not even on the same continent as the user.
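To put rough numbers on that load: the figures below are invented stand-ins for "hundreds", "dozens" and "numerous", and the sketch assumes one status request per printer per poll.

```shell
users=300            # hypothetical: "hundreds of users"
printers=24          # hypothetical: "dozens of printers"
polls_per_minute=4   # hypothetical: "numerous polls per minute"

per_minute=$((users * printers * polls_per_minute))
per_second=$((per_minute / 60))
echo "$per_minute status requests per minute (about $per_second per second)"
```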

Printing is generally a bit iffy in Gnome. They seem to be transitioning between about 3 different printing toolkits, all of which have different print dialog boxes with different supported features and different ways of selecting printers. One chief annoyance is that the print box in evince (the document/PDF viewer) does not let people access printer-specific features such as hole punching and stapling. So we installed gtklp and xpdf for people. The people that print heavy PDFs are huge fans of gtklp these days; it’s a nicer solution than we had in Windows. Nobody really likes evince. We also have had some trouble with evince generating PostScript output that some printers can’t grok. It sounds like all this should be much better in newer versions of Gnome, which if true, would be welcome news.

The Gnome screenshot tool makes it easy to save off a screenshot to a file, or to drag it into an email, but it is difficult to print it (you have to save it first). That was a common complaint around here, so I wrote a little wrapper around xwd and gtklp for printing screenshots. People really like that because gtklp gives them lots of options about orientation and size of the image if they want it, or a simple “Print” button to click if they don’t care. We set a gconf default to bind this to Ctrl-PrintScr and it works well. KDE’s screenshot tool is much more capable, and if we were using KDE, we wouldn’t have had any problem with screenshots.

The bottom line on Gnome is that we, and our users, are happy with it after we’ve made these customizations. But we have had to do more customization than we should have. I still think that Gnome has been better for our users than KDE, but I do wonder how long we’ll be able to survive with our “no KDE libraries” policy, as people want ksnapshot, kolour, etc.