So, a little while ago, I wrote about why I like HP. This week, I’m starting to be annoyed at them.
My employer just bought nearly $100,000 worth of HP hardware. We get a new MSA1500cs Fibre Channel SAN (with redundant controllers, FC switches, disks, etc), a new blade enclosure system, three blades to start with (all of them, at minimum, dual dual-core Opterons with 4GB RAM, and some considerably more), a rack to put all this in, etc.
So we’re starting to set all this stuff up. I’ve got Debian installed on an NFS root for testing the blades and how they interact with the SAN.
The blades have an integrated dual-port QLogic QLA2312 Fibre Channel adapter. The Linux kernel has a built-in driver for this (qla2xxx), which detects it and, so far at least, works fine. We want to run kernel 2.6.17 because it’s the first version where XFS has decent semantics for write ordering to prevent corruption after a power failure. Plus we want at least a 2.6.16.x kernel because we want to run the latest Xen 3.0 on these blades. (Live migration of virtual servers from blade to blade — this will be great.)
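To make the version constraint concrete, here is a minimal Python sketch of the check. The function names and the requirements table are made up for illustration; the minimum versions are the ones mentioned above, and the parsing assumes plain dotted release strings such as the ones `uname -r` prints.

```python
import platform
import re

# Minimum kernel versions taken from the post; the labels are illustrative.
REQUIREMENTS = {
    "XFS write ordering": (2, 6, 17),
    "Xen 3.0": (2, 6, 16),
}


def parse_kernel_version(release):
    """Turn '2.6.17.1' or '2.6.16-xen' into a tuple of leading integers."""
    parts = []
    for piece in release.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
        if match.end() != len(piece):
            # A suffix such as '-xen' ends the numeric part of the version.
            break
    return tuple(parts)


def unmet_requirements(release):
    """Return the names of the requirements that this kernel is too old for."""
    version = parse_kernel_version(release)
    return [name for name, minimum in REQUIREMENTS.items() if version < minimum]


if __name__ == "__main__":
    running = platform.release()
    print(running, "is missing:", unmet_requirements(running) or "nothing")
```

Tuples compare element by element, so a point release like 2.6.17.1 still counts as meeting the 2.6.17 minimum.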
But we learn that HP does not support the kernel qla2xxx driver. HP does not say WHY they don’t support it, just that their own driver is the only one that they support.
After plowing through several annoying scripts to get to their driver, I realize why it fails to install: it is OLD. At BEST, 2.6.14 is the most recent kernel it would even compile against (release date: October 2005), and I think the most recent version it actually supports is more like 2.6.8 (almost TWO YEARS OLD now). The driver references a whole bunch of kernel symbols and macros that were removed somewhere between 2.6.8 and 2.6.17.
I sent a ticket to HP support. Their first request was to run their system information gathering tool and send them the results. Fine, that’s reasonable. I did so. Next they say, gee, you’re running Debian, and we don’t support that.
Argh… If they tried to compile it against 2.6.17.1 on RedHat or SuSE, they’d get the exact same problem. I told them which symbols they were erroneously using, and a simple grep would have shown them that.
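For what it's worth, the kind of check I mean can be sketched in a few lines of Python. This is only an illustration: the symbol names in the example are placeholders I invented, not the actual ones from the HP driver, and the source directory is whatever you pass on the command line.

```python
import re
import sys
from pathlib import Path


def find_symbol_uses(source_dir, symbols):
    """Scan .c and .h files under source_dir for whole-word uses of each symbol.

    Returns a dict mapping symbol -> list of (path, line_number, line_text).
    """
    patterns = {s: re.compile(r"\b" + re.escape(s) + r"\b") for s in symbols}
    hits = {s: [] for s in symbols}
    for path in sorted(Path(source_dir).rglob("*")):
        if not path.is_file() or path.suffix not in {".c", ".h"}:
            continue
        text = path.read_text(errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for symbol, pattern in patterns.items():
                if pattern.search(line):
                    hits[symbol].append((str(path), line_number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical placeholders; substitute the symbols your build errors name.
    removed = ["EXAMPLE_REMOVED_MACRO", "example_removed_function"]
    for symbol, uses in find_symbol_uses(sys.argv[1], removed).items():
        for path, line_number, line in uses:
            print(f"{symbol}: {path}:{line_number}: {line}")
```

Matching on word boundaries keeps a short symbol from matching inside a longer identifier, which a naive substring search would do.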
Besides, how many customers are going to be pleased with no upgrade path available for 2 years? I wouldn’t want our kernel version to be held hostage to HP’s slow driver development process.
Sigh.
I should add, in case there is any confusion, that it is quite possible that we would go 2 years without upgrading a kernel on a machine like this. But doing that out of choice — because the system is meeting our expectations as-is and we don’t want to introduce change into a production system — is quite different from being forced to do so.
Not only that, but the release versions of RedHat or SuSE wouldn’t have the features we want, so we’d be building our own kernels there anyway.
(We find Debian to be better suited to the enterprise than RHEL or SuSE, BTW)
If they’re not going to support you anyway for using Debian, then why worry that they’re not going to support the qla2xxx driver?
Yes, that is a fair point. The problem is that, right now, we don’t know why HP cares whether or not we use their driver. We’re going to do some testing and make sure that all the failover & dynamic resizing features work with the kernel.org driver. If it all seems to be reliable, we probably will go that route.
We’ve got a pile of HP hardware here, and their behavior is quite similar. While the hardware is very reliable and its support is responsive, software support is a complete mess. Install anything that is not “supported by HP” and you lose support. I wonder if it’s possible to run anything other than a web server for static pages.
I also wonder about HP’s internal policy on Debian. I don’t understand why HP is not selling Debian systems after giving us so many resources.
These kinds of issues are interesting. You don’t want to run an unsupported driver, because if the sh*t hits the fan, you want to get support and be able to blame your vendor and not get fired. But on the other hand, you know up-front that the support you WILL receive is crappy as hell. Nice deadlock situation :)
Yes, sort of.
Basically, there are a few things at work here.
First, we are better equipped to deal with problems in Debian than anything else. We’re better equipped to prevent problems using Debian than anything else, too. (And I believe that’s because Debian is much easier to set up and maintain in a secure, stable fashion)
We don’t generally buy software support on our OSs — just hardware support from the vendor.
In our case, we’re going to be running Debian virtual machines under Xen 3.0. In the worst-case scenario, we can always install RH or SuSE in Xen dom0 and still use the Debian userland for everything. We’re still evaluating options there.
The good news, though, is that HP is good about supporting their hardware even if we’re using Debian or whatever on the software side. We’ve been running in this setup for years and have had nothing but good experiences. So this may wind up being not a problem at all — we just wanted to try to do things their way first.
Disclaimer: I work for HP, but have no affiliation with the storage/proliant organizations, so note that this is just my own speculation and should not be construed as anything more – I am not speaking for HP here.
It is standard practice for hardware companies to only support what has been tested/qualified. Qualifying a storage driver is not a trivial/cheap task – they have to draw the line somewhere, and it’s typically with vendors’ stable kernels since that is what most customers run. Sure, it might be easy to make sure a driver builds against the latest source, but it’s not unheard of for the core kernel behavior to change from underneath a driver, resulting in bugs of various severities. I wouldn’t think that just because an HP supported driver builds against your kernel that HP would automatically support that configuration. Imagine if HP guaranteed your data integrity for a version of a driver, regardless of the core kernel, and a core kernel bug caused your fs to become corrupt – that sounds like legal liability hell :)
On the other hand, you should certainly be able to do what you want with your hardware, including moving to a < 1 week old kernel. The drivers are upstream (and HP has been fairly good about choosing FC hardware with free drivers, and I don't think that's a coincidence). I'd hope that HP would be good about dealing with any hardware problems you have in such a situation. And I'm sure HP Services would be happy to work with you to qualify and stand behind your custom install (for a fee, of course) but I personally don't see any other way of providing a useful guarantee.
Yes, you have made good points there.
The one thing I would quibble about is that the drivers are not upstream. The HP driver source is apparently not the same as, or even very closely related to, what’s in the Linux kernel tree. HP seems to use its own multipath instead of multipath-tools as well. HP drivers seem to be a bit modified from what is available at qlogic.com, but again that is not the same as what’s in kernel.org.
But the kernel.org stuff, and multipath-tools, seems nicer anyway.
After thinking about this some more… perhaps I just shouldn’t worry about it. If the Linux stack works better than the HP stack, then we should just use it.
We’ve been running Debian and kernel.org kernels on our other HP hardware for years with excellent results, and you are right — they do support the hardware well even if they know we have Debian or whatever.
(Which is more than can be said for Dell)
I’m sorry, but I don’t understand your position at all. At no time that I can see or recall has HP ever stated or suggested that it will support any Linux distribution a customer may decide to run. They seem to be quite clear about what distributions they support and which they don’t. Was this a problem of someone giving you incorrect information?
Being on or near the bleeding edge of driver development is not something that many enterprise houses want. If you are saying that you are willing to pay HP for support for whatever Linux distribution and add-ons you choose, then tell them that, and be willing to put up the money for the SOFTWARE SUPPORT that will require. They have to run a business just as your employer does, right?
You’re not paying attention. This problem has nothing to do with my distribution. The distribution is entirely irrelevant to the question.
This problem has to do with whether or not HP supports modern kernels.
I got the gist of your comments just fine. As I understand it, HP supports the kernels _that ship with the distributions_ and any updates thereto. That has been the word I got from them in my dealings with them. So it apparently DOES have something to do with the distribution. HP can’t reasonably be expected to support every possible combination of kernel and distribution available, and as I said before, most enterprise customers likely don’t want to run the latest and greatest software, since they believe it hasn’t gone through enough vetting and won’t be sufficiently stable. Yours is a special case, and you could probably speak to HP about a custom support contract (as the HP employee mentioned in an earlier post) to suit your needs.
I don’t see why there is any need for irritation or annoyance with them about this. They are the good guys from my standpoint, since they are at least giving us other options, and I think they would listen if people said they wanted expanded supported configurations and were willing to pay for it.
No, here’s the thing. If they were really doing the right thing by the community, they would get their drivers into the kernel.org source tree and support that. They would also support multipath-tools instead of their proprietary tools.
These drivers in the k.o tree, of course, would filter into their supported distributions.
It is interesting to note that RH and SuSE do not support HP’s drivers, but only the k.o ones with multipath-tools.
I think there are plenty of reasons that somebody would want to use their own kernel instead of the one that comes with RHEL or SuSE. While it probably wouldn’t be as extreme as using one that came out last week (and we won’t be using one that “came out last week” by the time this is put into production), it could certainly be more recent than 2 years old.
I understand they can’t do rigorous qualification procedures of each point release and each distribution, but they can at least work with the community to get the drivers into the kernel tree and maintain them there, rather than as a separate package.
Back in the days when there was no unified multipath framework, they had a *little* bit more excuse (they should still have got this in the kernel tree!) Now that there is a unified multipath framework that works well, they should support it from top to bottom and work to fix it wherever it is broken, instead of requiring their own.
What they’re doing is akin to ATI saying that you can’t use X.Org with our cards; you must use ATI ProprietaryX instead. (ATI hasn’t done this; it’s just an example.)
HP/Qlogic aren’t alone in this; the other multipath folks that have been around for a while are also slow to warm to multipath-tools, but the Linux vendors have adopted it already.
Sooner or later, they will have to adapt, though. I wish it was sooner.
I guess what I’m saying is that they should do what all the other kernel driver developers do:
1) Make it work in the general case
2) Test in the special case (for the HW/SW combinations they have available or claim to support)
3) Interact with the community to receive bug reports and patches
John, you mentioned redhat. Never trust large groups of large ladies wearing red hats.
What kind of throughput have you been able to get on the MSA1500cs? The online docs give an upper limit of 200MB/sec. I’ve only been able to get up to a little over 100MB/sec. We’ve got eight MSA20 enclosures, fully populated with SATA disks.
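For comparing numbers like these, a rough sequential-read measurement can be sketched in Python as below. Treat it as a sketch under assumptions rather than a proper benchmark: the function name and parameters are made up for illustration, it counts 1 MB as 10^6 bytes, and the page cache will inflate the result for data that was read recently, so point it at a file larger than RAM or at a freshly mounted volume.

```python
import sys
import time


def measure_read_throughput(path, chunk_size=1024 * 1024, max_bytes=None):
    """Read path sequentially and return (bytes_read, seconds, mb_per_sec).

    Reading stops at end of file, or once at least max_bytes have been read.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as handle:
        while max_bytes is None or total < max_bytes:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_sec = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_sec


if __name__ == "__main__":
    # Usage: python throughput.py /path/to/large/file
    read, seconds, rate = measure_read_throughput(sys.argv[1])
    print(f"read {read} bytes in {seconds:.2f}s: {rate:.1f} MB/sec")
```

A single sequential reader like this may not saturate the array; several readers in parallel, or a dedicated benchmarking tool, would give a fuller picture.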