Thoughts on cfengine, bcfg2, and puppet

July 26, 2006 | Software | cfengine, configuration management | John Goerzen

Yesterday I posted about my first steps with cfengine. By the end of the day today, I had things far enough along that I can:

cdebootstrap a directory
Run a special cfengine script to get the base files like /etc/fstab and /etc/hosts set up
Bring it up in Xen, apt-get install cfengine2, and use cfagent to bring up the rest of the system and install the necessary base packages (like xfsprogs)

Very nice.

I’ve had a few annoyances with the cfengine packages support, which doesn’t quite seem to work as documented all the time.

I also took a look at bcfg2 thanks to a comment yesterday. It looks very interesting, but I have a few gripes about it. I find cfengine files easier to read. I can look at a file, having never used cfengine before, and have a reasonable idea of what is trying to be done and how it will be accomplished. I can’t say the same for bcfg2, plus bcfg2 uses XML config files (ick) and a bunch of small other files. While the architecture as the authors have described it certainly sounds appealing, I’m not sure that bcfg2 is as simple as cfengine. I am a strong believer in the KISS (Keep It Simple, Stupid) principle. But THANKS to the person that left the comment, and I hope that bcfg2 continues to evolve and provide an alternative to cfengine.

I also looked at Puppet. This thing looks very slick. Seems to be cfengine with a nicer syntax. On the other hand, it’s not really clear that anybody is using it. That makes me nervous — this is the kind of thing that can seriously harm machines if it does something unexpected.

34 thoughts on “Thoughts on cfengine, bcfg2, and puppet”

Narayan Desai says:

July 26, 2006 at 8:03 pm

This is certainly a valid critique of bcfg2; complexity is an issue.

At the same time, while it is easier to get up and running with cfengine, cfengine does suffer from some long-term issues associated with its architecture. I touched on this a little bit in a later comment associated with your previous post. Cfengine provides an imperative model to system configuration; cfengine scripts describe the process through which clients get configured. This works properly for simple operations, but can get complicated as the goal state changes over time.

In contrast, the specification bcfg2 consumes has a declarative character. It describes the goal configuration state for clients. The bcfg2 client code is responsible for determining the set and order of operations that need to be executed to bring the client into conformance with the specification.
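To make the declarative idea concrete, here is a minimal Python sketch of a client that is handed only a goal state and works out the operations itself. It is not bcfg2's actual code: the `Entry` type, the `plan` function, the entry names and the fixed package, file, service ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed ordering for this sketch: packages, then files, then services.
ORDER = {"Package": 0, "ConfigFile": 1, "Service": 2}


@dataclass(frozen=True)
class Entry:
    kind: str   # "Package", "ConfigFile" or "Service"
    name: str
    state: str  # desired or observed state (hypothetical values below)


def plan(goal, current):
    """Return the ordered entries that must change to reach `goal`."""
    observed = {(e.kind, e.name): e.state for e in current}
    pending = [e for e in goal if observed.get((e.kind, e.name)) != e.state]
    # Sort so that a service is only touched after its package and
    # configuration file have been handled.
    return sorted(pending, key=lambda e: (ORDER[e.kind], e.name))


# The specification lists goal state in no particular order.
goal = [
    Entry("Service", "ntp", "on"),
    Entry("ConfigFile", "/etc/ntp.conf", "checksum-A"),
    Entry("Package", "ntp", "installed"),
]
# The client already has the package, but nothing else.
current = [Entry("Package", "ntp", "installed")]

for op in plan(goal, current):
    print(op.kind, op.name, "->", op.state)
```

The point of the sketch is that the specification never says "copy this file, then restart that daemon"; the order falls out of the client logic, and an already-conforming client yields an empty plan.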

In order for a system like this to function properly, it needs to encode a configuration model that is useful to administrators. This is another difference when compared with cfengine’s imperative constructs. Bcfg2’s model encodes a basic set of familiar configuration entities (ConfigFiles, Packages, Services, etc) that can be used to describe target goal configurations.

This model is a little more complicated, but you do get a lot more out of it. Basically, you cannot reliably invert imperative constructs. This means that you cannot necessarily reliably validate client compliance with a cfengine specification without executing it. In short, since the user is writing cfengine code, the results are defined by its execution and resulting state.

It is my belief that a declarative model is a better long-term solution because it enables construction of a different brand of tools. We have used this infrastructure to build some really interesting and unique tools that our administrators regularly use to improve their understanding of our network.

For example, we have built a configuration monitoring system that uses the bcfg2 configuration specifications to locate configuration problems or local modifications. The bcfg2 model provides both a familiar conceptual framework for administrators and a data reduction mechanism; administrators only need to look at anomalous configuration information.

This may not sound too interesting, but it has allowed us to implement a really interesting management model for production machines. Basically, important production servers only run the bcfg2 client in dry-run mode. This means that the client code will download the client configuration and validate the current client state against it. Any state errors are tagged and uploaded to the server. As security updates and other reconfigurations are added to the generic specification, dry-run servers accumulate incorrect configuration entries. This is displayed in configuration reports. So, even if bcfg2 never makes any configuration changes to clients, it can tell you which updates have been applied to each client. Moreover, if local configuration changes have been manually performed, they also show up as inconsistencies. This allows a much gentler goals-vs-reality reconciliation process than is possible with other tools.
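The dry-run workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the bcfg2 client is implemented: the `dry_run` function, the dictionary layout, and the package versions and checksums are all hypothetical placeholders.

```python
def dry_run(spec, observed):
    """Compare a goal specification with observed state; change nothing.

    Both arguments map (kind, name) to a state string. The result lists
    every entry whose observed state differs from the specification.
    """
    incorrect = {}
    for key, wanted in spec.items():
        found = observed.get(key)
        if found != wanted:
            incorrect[key] = {"wanted": wanted, "found": found}
    return incorrect


# Hypothetical specification after a security update was added centrally.
spec = {
    ("Package", "openssl"): "version-2",
    ("ConfigFile", "/etc/motd"): "checksum-A",
}
# Hypothetical production server: update not applied, motd edited by hand.
observed = {
    ("Package", "openssl"): "version-1",
    ("ConfigFile", "/etc/motd"): "checksum-B",
}

for (kind, name), diff in sorted(dry_run(spec, observed).items()):
    print(kind, name, "wanted", diff["wanted"], "found", diff["found"])
```

Both the pending update and the manual edit show up in the same report, which is what makes the reconciliation a reporting problem rather than a deployment one.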

I would say, overall, that KISS is a really good design principle to follow. Unfortunately, configuration management is not necessarily a task that will go along with it, particularly as the client count, configuration complexity and administrative group size increase. The good news is that folks are working hard on simplifying these processes as much as possible.

We are working hard to make specification management a lot simpler. Our goal is to minimize user interactions with XML files; at this point we can generate most of them. This is really getting better quickly. We currently use scripts like this to automatically deploy security updates in a controlled fashion.

Can you characterize the intuition gap between cfengine and bcfg2 a little bit more? It is clear that we need to write a “bcfg2 for cfengine users” guide; however, the grouping mechanisms used to dispatch configuration aspects are somewhat similar in both tools.

Another major difference between cfengine and bcfg2 is that cfengine is intended to be used to manage a single file or more, while bcfg2 is intended to manage complete system configuration, since that is required for its validation model. This can be daunting for new users, but would it still be an issue if we could autogenerate an initial specification from a current system?

I apologize for the long post; this is my pet project.

1. John Goerzen says:
  
  July 26, 2006 at 8:48 pm
  
  No need to apologize — I’m happy to see this discussion.
  
  I completely agree that you have the right long-term approach. Perhaps the problem is only one of documentation, or my own thick skull.
  
  My theory — and feel free to ignore this because I’ve used cfengine for all of 12 hours now and spent a few minutes reading bcfg2 docs — is that the problem isn’t the bcfg2 approach but perhaps just the execution.
  
  As an example, in cfengine, I can plonk down some files on a server, and have a rule that says take the file and put it here on all, say, time servers.
  
  I’m a big fan of non-imperative technologies (Haskell, for instance). One of cfengine’s flaws — and a point where it violates KISS — is that it seems really half-imperative to me.
  
  I would have found it very helpful to see a full example included in the bcfg2 manual. In this example, I can’t immediately see somebody doing the exact same thing that I want to do and just clone their approach. And things like actually editing existing text files (one of the very nicest features of cfengine, in my opinion) I just don’t see how to do at all, right off the bat.
  
  I forget whose philosophy it was — Guido’s maybe — but someone wrote “simple things should be simple and complex things should be possible.” That seems to fit cfengine well. If all I care about is maintaining a few files out there, I don’t want to have to go through a bunch of overhead. It’s the sort of philosophy that Java fails — reading a line of input from the keyboard takes two lines of code, and that’s not even counting all the boilerplate to put it in a class.
  
  It’s a little rambly, but I guess what I’m trying to say is that I think you’re on to something good here, and I’d like to use it once it’s had a little more time to evolve.
  
  1. Narayan Desai says:
    
    July 26, 2006 at 9:48 pm
    
    Documentation is certainly an issue. I think that convenience is as well. The initial startup cost for bcfg2 is still higher than it should be. However, once you have the initial setup done (for the most part just mapping out your configuration patterns) adding a configuration file to any disjoint subset of clients is a matter of a one line addition to a file and dropping a copy in place on the server.
    
    Building better examples and documentation are definitely on our TODO list. There is a simple example based on NTP in chapter 3 of the manual.
    
    Editing existing config files (in place or not) is a really brittle process. I would really hesitate to use it in a serious environment. We have support for server-side file patch functionality, but it is a long-term headache. We are working at making it easier to pull current versions of config files from clients directly into the repository, but it will still be a little bit before that works…
    
    Overall, modulo initial setup, simple things remain pretty simple and really complex things are possible. In addition, we make it easier to manage complex configuration patterns in a few ways. The main difference is that the initial ramp up costs more with bcfg2, but you can go farther with it. I shudder to think how you would completely control a large network with cfengine; it would be a lot like writing openoffice in assembly.
    
    Incidentally, I really think that installing configurations isn’t really the difficult part of the puzzle. Coping with complexity is. In the same way that compilers have gotten more powerful and sophisticated as our programming languages have gotten better, our system management tools have to actively _help_ us cope with the large and usually completely crazy environments we are saddled with.
    
    I am really interested in your perspective on this. After the last 4 years of development, I have a little too much of an insider’s perspective. The first month of a new user (or anyone with outside eyes) is one of the most valuable commodities right now; they are the only ones who can point things like this out. Once you have wrapped your head around it, you can’t make observations like this anymore.
    
  2. Narayan Desai says:
    
    July 26, 2006 at 11:09 pm
    
    I dug up an interesting walk-through of how to get bcfg2 set up. This mail message from the archives describes how to get things going and how the parts all work together. Does this sort of info help things at all?
    
    http://www-unix.mcs.anl.gov/web-mail-archive/lists/bcfg-dev/2006/06/msg00058.html
    
    1. John Goerzen says:
      
      July 27, 2006 at 8:31 am
      
      Yes, indeed that does help. I’ll have to play around with this a bit over the weekend.
      
      Our environment is probably a bit different than yours. We have about 30 total Debian installs, most of which exist as Xen domains. We have two *nix people and some help with day-to-day tasks from our two operations people.
      
      So we’re a much smaller place, both in terms of staff and in terms of machines. Also, we are using this only for servers that are generally up 24/7, so the feature of finding out where things aren’t applied shouldn’t really be an issue — unless cron died or something, things will be applied everywhere. If they aren’t, we are probably already aware of a larger issue.
      
      We are just recently starting to cross that threshold where something like cfengine or bcfg2 is needed. Since we’re heavily Debian-based, we got along quite well so far by rolling our own Debian packages for things. We’d package up custom code (and even configs, in some cases) in internal-only Debian packages and install them where appropriate. The packages can also carry dependencies on stock Debian packages and bring in whatever code we need. The ease of building Debian packages really has been a nice win.
      
      So I think we have scaled higher without a configuration management tool than we otherwise would have. And at the same time, we probably need less out of such a tool. I expect that weeks may go by without us making any config changes at all, and that the tool will be mainly a convenience to apply the right changes to the right machines when the time comes.
      
      That doesn’t necessarily rule out bcfg2 for us at all. But it changes the weight we might put on some of the thousands-of-machines management features. I don’t think we’ll ever be in that camp.
      
      1. Sami Haahtinen says:
        
        July 27, 2006 at 2:21 pm
        
        I’m running bcfg2 with pretty much the same configuration that you intend to run it on (on a smaller scale though). Mostly Debian hosts running inside the Xen hypervisor. What I have learned to love in bcfg2 is the fact that it is constructed from bundles. Bundles are basically packages and their configuration files and services.
        
        For example, I could have a Bundle ntp which consists of package openntpd, the /etc/openntpd/ntpd.conf file and the openntpd service. This bundle is bound to a group xen-dom0. What this means is that all hosts that belong to the xen-dom0 group will get openntpd installed.
        
        What makes this nice is that bcfg2 will handle all of the nastiness for me. I don’t have to worry about old ntp instances when I install a new ntpd.conf; bcfg2 will restart it for me if something needs to be changed inside the bundle. No scripting here.
        
        What I remember from cfengine, when I used to use it for handling this kind of stuff, is that I had to write the logic to restart the services. If I forgot to do that, the services would run with the old configurations.
        
        It’s been quite some time since I used cfengine, but I still have these nightmare-like flashbacks from writing the configuration…
        
        Don’t get me wrong, I appreciate cfengine and I believe it will make you coffee and fix your fridge if you tell it to, but I just want to manage configuration.
Matt Palmer says:

July 26, 2006 at 8:49 pm

Puppet is getting used in a bunch of places; it’s managing several clusters of production machines, as well as developer workstations, at the client I’m currently at, and my primary employer will be using it to manage a couple of hundred machines scattered across the country in the next month or so.

It doesn’t have the wide traction of cfengine, but it’s new, which means it does have its foibles. Luckily, Luke Kanies is incredibly responsive to bug reports and things typically get fixed pretty quickly.

Puppet already Sucks Less than CFEngine to the point that I’m dreading having to go back to wrangling CFEngine’s nits at legacy client sites. Puppet is just soooo much smoother.

I’ve not used bcfg2, but it appears to be a little more research-oriented (which has irritated me more than a bit with CFEngine) than Puppet (which is entirely built on solving real-world problems) and it’s full of XML, which I’m not keen on as a data description language. I doubt it can suck more than CFEngine, though, and its general principle of operation seems to be quite similar to Puppet (objects, states, transitions, etc).

1. John Goerzen says:
  
  July 26, 2006 at 9:04 pm
  
  I could entirely see us using Puppet in a year or two. I’m usually an early adopter (heck, we’ve had 64-bit Debian amd64 systems in production for ages now — with no problems whatsoever — so I was amused to see Slashdot announce the Debian port today). But something like this is both critical and error-prone. Mess up one character in the wrong file and a person can have a network full of mission-critical machines with grave issues.
  
  So in a sense, cfengine is the “safe” choice. It has its faults, but they are well-understood.
  
  Having an upstream author that is responsive to bug reports is great. But frankly, use on a few clusters is a lot different from a large heterogeneous environment. Not that we have that, being a mostly Debian place, but we certainly aren’t as homogeneous as a cluster.
  
2. Narayan Desai says:
  
  July 26, 2006 at 10:08 pm
  
  I just felt the need to add a little bit of history about bcfg2, to address the research issue. I work at a research lab, and am the primary author of bcfg2. (I was also the primary author of bcfg1, but I digress). I started working on bcfg(1) at LISA02, after thinking to myself “how hard can it be to write a good tool”. It turns out to be a little harder than I thought ;)
  
  We are an unusual research lab in that we not only have a systems team that is unusually competent and large, but they are also allowed to do work on system management projects. Bcfg2 is the product of that team; while I wrote most of the code, it was mainly based on their comments, concerns and complaints.
  
  This environment has given me a great opportunity to think pretty hard about the issues and our configuration problems and come up with a solution that has taken a lot of time to implement (4 years and counting). It has also allowed me to proceed in a fairly methodical fashion.
  
  I think this takes us to the place where Luke and I disagree most strongly. He has made several design decisions in the puppet language design that I think are shortsighted for what he has cited as pragmatic reasons. Don’t get me wrong; Luke is a smart guy and a good admin, I just don’t think that he has a large enough sample set of users yet to have a good handle on the big picture. (I don’t even have a complete handle on this yet, and bcfg2 is used by 20ish admins across a few thousand systems)
  
  I will admit that bcfg2 has taken a little too long to get out of the sandbox and out to users. This has largely been an issue of learning what it takes to make a real open source project out of some code that works for me ™. I am confident that we are now past this.
  
  Dismissing bcfg2 as a research tool is somewhat amusing to me. There are a large number of hard problems facing system administrators that we don’t have a good understanding of yet. People aren’t good at dealing with large amounts of complexity. This is biting us all over. In the long term, tools need to help. Similarly, as tools get more sophisticated, we need to understand how they are operating. These sorts of problems aren’t going to be solved by hacking up a quick solution. It will be a combination of organized experiments, failed designs, and deep thought that will get us there.
  
  1. Jeff Croft says:
    
    October 1, 2006 at 4:16 am
    
    I’ve recently worked at a place that used cfengine, and I really liked it, but I’m looking for something better, and that has led me to evaluate bcfg2 and puppet, which both look really interesting. I was wondering if you could elaborate more on what you consider to be short-sighted about puppet?
    
    Thanks.
    
John Goerzen says:

July 26, 2006 at 8:59 pm

I have one other thing to say about bcfg2.

Let’s compare both cfengine and bcfg2 to make. I think the analogy is apt — all three tools have rulesets for accomplishing things.

Now, make (well, at least GNU make) comes with a set of implicit rules. It may very well be able to divine how to turn a simple .c file into a .o file without you ever telling it a thing about a C compiler.

But will it do it the way I want to, with my optimization and debugging preferences? How can I be sure?

I think this question occurs to most people, and very few use these implicit make rules.

It sounds to me like bcfg2 has abstracted up a lot of the logic of things like “Ok, on Debian I stick this file in /usr/local/bin, and on AIX it goes in /etc/libexec”. Which is a fine thing. But on the other hand, it’s fairly easy to abstract up that logic in cfengine myself and still not have to worry about it in the end. It looks easier to me to abstract that logic myself in cfengine than to figure out exactly what bcfg2 is doing and tweak it if I don’t like it. (I’m not talking add-on modules here, just class-based variables and decisions in cfengine.)
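As a rough illustration of what "class-based variables and decisions" amount to, here is the same idea in Python rather than cfengine syntax. The class names, the `install_dir` helper and the lookup table are assumptions for this sketch; the two paths are simply the ones from the example above.

```python
# Hypothetical mapping from a host class to the directory used on it.
INSTALL_DIR_BY_CLASS = {
    "debian": "/usr/local/bin",
    "aix": "/etc/libexec",
}


def install_dir(host_classes):
    """Pick the install directory for a host from the classes it belongs to."""
    for cls, path in INSTALL_DIR_BY_CLASS.items():
        if cls in host_classes:
            return path
    raise LookupError("no install directory defined for: %s" % sorted(host_classes))


# A host usually belongs to several classes at once (OS, role, location...).
print(install_dir({"debian", "timeservers"}))
print(install_dir({"aix"}))
```

Supporting a new platform is then one more row in the table, which is the "just another class to work with" property, at the cost of maintaining that table by hand.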

Even something like package management — which cfengine has abstracted already for us — could be done pretty easily with a couple of droppable scripts, I think.

I like being able to think of cfengine as a specialized version of make. It is familiar and that’s a good thing.

I wonder what bcfg2 would do with, say, AIX 5.1L which few in the Free Software community seem to have seen before (or handle properly)? In cfengine, it would be just another class to work with. Would the same hold in bcfg2?

1. Narayan Desai says:
  
  July 26, 2006 at 10:51 pm
  
  That seems like an apt analogy to me. Bcfg2 allows you to encode differences between architectures, so they work much like the explicit rules in your analogy.
  
  At its core, the bcfg2 server provides a boolean logic engine that allows you to describe your system in terms of overlapping and discrete groups. These group memberships are used to build client specific configurations, much as you would with cfengine. In the end, the result for all of these tools is just bits on disks, so that end is the same. The real difference that bcfg2 provides is a set of qualities about the deployment process. Administrators can observe the process and examine what the results would be if the client made specified changes. They can watch a patch being rolled out across a whole network (and similarly find hosts that have not applied it yet). We have written a script that builds a graphviz representation of the system metadata; this provides a map of the entire network’s system configurations.
  
  In some sense, dd(1) is the first configuration management tool; we are all just providing tools with a more refined user interface. ;)
  
  The characteristics I am describing above vary greatly in importance from site to site, and depend greatly on the administrative environment. For example, a lone administrator is unlikely to have collaboration issues. In our environment, we are at the other extreme; we need several administrators to be universally cross-trained across each other’s systems and able to debug problems wherever they come up. We have found that a tool with the right use properties provides a substantial benefit on this front.
  
  Getting back on to my personal hobby horse for a moment, I think that the use issues surrounding these tools are both an open research issue and a really interesting problem.
  
  Let me suggest an alternate analogy that is IMO more compelling in the long run. Consider a configuration management tool like a compiler. We have started with assembly code. It is quite concerned with moving bits around, in and out of registers. In order to make programming safer, we needed to move to higher level languages, where the toolchain could provide more safety checks and catch the most egregious errors. The assembly model is quite similar to the cfengine one. The operations are low-level, and you can write anything that you want using it.
  
  In the long term, in order to construct more complex and resilient networks, we need to have a better way to describe things. Consider the benefits of C over assembly code. C code is easier to write, because it is higher level. You can write the same programs with it as you can with assembly, but you will probably end up with a more maintainable program more quickly in C. As time goes on, C compilers have gotten better. In some sense, optimizing compilers have allowed architectures like ia64 to exist; if you have a good enough compiler, very few people need to understand how to optimize for a very complicated architecture.
  
  I see configuration management tools much like compilers in 1975. We have a few high-level languages, but we haven’t yet completely figured out what is good and bad about them. We are really bad at getting our tools to help administrators do complex things in a reliable way, but I think this will change over time.
  
  On the topic of client-side architecture support, the bcfg2 client is structured as a set of modules that implement support for system tools. We have a module that implements all of the posix stuff and is usable on any platform. So in the case of AIX (we actually have someone interested in it) we get all of the posix stuff for free, but need to add explicit support for their package manager and crazy inittab method for starting services. Similarly, if we wanted to support hpux, we could do posix things pretty easily, but couldn’t control packages and services until we added client support. A goal for our next release is to make it easier to mix and match arbitrary package and service management schemes on different architectures, so that you can use encap across several architectures, or apt on redhat, or apt + smf on nexenta.
  
Stephen Quinney says:

July 27, 2006 at 3:21 am

You might also want to look at LCFG http://www.lcfg.org/. The initial learning curve is a bit steep but the eventual gains are huge if you have a lot of machines to manage. We are using it in the Edinburgh University (UK) School of Informatics to manage about 1000 machines and it would easily scale further with not much more effort.

Steve Kemp says:

July 27, 2006 at 4:13 am

What are you using to install the Debian packages if I could ask?

Right now I’m using the setup I described here:

http://www.debian-administration.org/articles/398

It works, but it always feels a little brittle. (There is updated code linked to in the comments for doing purges and version checks).

1. John Goerzen says:
  
  July 27, 2006 at 8:39 am
  
  Steve,
  
  cfengine2 has a built-in packages section.
  
  We have:
  
  [code]control:
    debian::
      DefaultPkgMgr = ( dpkg )
      pkgmgr = ( dpkg )
      DPKGInstallCommand = ( "/usr/bin/apt-get -y install %s" )[/code]
  
  And then things like this:
  
  [code]packages:
    debian::
      less action=install
      module-init-tools action=install
      ssh action=install
      xfsprogs action=install
  
  …
  
    debian.timeclients::
      ntp-simple action=install
      ntpdate action=install[/code]
  
  I also distribute a debconf.conf to each machine, along with a central database of default answers. So we really do get by with unattended installs on all this. Very slick and simple.
  
  cfengine2 has built-in support for querying the dpkg/apt database and will only try to install things that aren’t already installed. It also can be told to do more advanced things, such as look for particular versions of packages or just report on missing/old packages instead of upgrading them.
  
  1. Steve Kemp says:
    
    July 29, 2006 at 1:00 pm
    
    Thanks for the tip. The last time I experimented with this using CFengine, dpkg referred to Sun’s packaging format rather than Debian’s, and I couldn’t get it to work out neatly.
    
    I’ll reinvestigate this.
    
John Goerzen says:

July 27, 2006 at 4:09 pm

OK, you’ve convinced me. I found your ;login: article and will try out bcfg2 also!

1. Narayan Desai says:
  
  July 27, 2006 at 6:53 pm
  
  Excellent! We are in the process of writing up more documentation along the lines that you have described. Like I said earlier, the first few weeks of a new user’s experiences with bcfg2 help us to refine the tool more than anything else. Please do report anything that seems counter-intuitive, or harder than it should be. We are eager to improve the usability and utility of bcfg2. Thanks for giving us a chance.
  
  1. Pete Kazmier says:
    
    July 30, 2006 at 11:25 am
    
    I’ve recently been thinking about a configuration management tool to help manage various *nix servers (Solaris, HP, Linux) in a very large service provider network. After following this thread, bcfg2 sounds very appealing as I’m much more interested in a declarative tool that I can use to also validate the “actual” vs “expected” configurations on a server. Recently, someone made a provisioning change to a SS7 soft switch which cost the company several hundred thousand dollars in margin. I was going to build my own tools for this one specific service, but it seems that bcfg2 is just what I am looking for. Thanks for the interesting discussion!
    
Luke Kanies says:

August 8, 2006 at 10:46 am

I can’t seem to figure out trackbacks in my blog, so a simple reference will have to do:

http://madstop.com/articles/2006/08/08/puppet-vs-bcfg2

I think you dismiss Puppet a bit too easily. It’s in production use in a number of places, has its own language as an interface akin to cfengine but much more powerful (see http://reductivelabs.com/projects/puppet/documentation/notcfengine.html), and can easily model just about any element you want to manage.

At the least, give it a try; see how easy it is to get running vs. BCFG2 or even cfengine.

Clearly, tho, I need to do a better job of documenting who’s using Puppet.

1. Rick Robino says:
  
  June 11, 2007 at 4:45 pm
  
  KISS has been tossed about here rather a lot, wrt all three tools. Like many admins, I take a peek at the outside tool landscape every few years and emerge a bit disappointed that my own scripts and perhaps a few extra bodies remain the safest choice for critical production. So, I read this thread and was struck by the following flaws/violations of KISS and thought I should share:
  – cfengine, no evolution driven by community && inevitable dissatisfaction driving other projects like puppet and bcfg. KISS Violation: complicated to learn one system and even more so to have to hack around or completely invent a new one.
  – bcfg – XML. Generally unpopular compared to the other two? Just reading reviews that seems to be the case but otherwise seems to be the best tool. But XML? I can’t see how that is keeping things simple for a person usually in the context of vi and not an XML validator. XML, compared to less structured configuration storage idioms, is complex.
  – puppet. Sounds great if you are willing to install yet another entire interpreter, plus any number of modules, on your system. Oh yes, and learn a new programming language. That is not keeping it simple – writing a good system in ast-ksh, that along with the very passable grammars of all three tools (encoding aside) is keeping it simple. I can’t think of any “good” admins who are willing to deploy such a package (fast-moving development), especially without being able to hack the code themselves.
  
  Just a random $0.02. That’s USD so it’s worth less every second ;-)
  
  I’m off to evaluate bcfg or lcfg for a couple of hundred hpux and solaris machines, with my biggest problem to solve being the management of oracle-backed applications around the backup schedule. Somehow I’ll need to tie into HP OVO and whatever else. Sure seems like there is some money to be made here (good luck puppet!). And after that I half-expect to just go see if I’ll have to make a new wheel a bit less round, using the Software Tools Approach… in the name of KISS. It strikes me that there is another very obvious truism about this subject – simplicity is in the eye of the beholder, and is perhaps influenced indirectly by age (amount of languages and formats already using brain storage)…
  
  (p.s. This is in no way personal criticism toward any of the authors or communities or languages or other technologies. To the contrary, the amount of work, especially on the theoretical side, is something I esteem very highly. My intention is to underline previous declarations of how difficult the practical implementation of the KISS maxim truly is; like Liberty, there are various excuses to sacrifice a bit of the ideal and then lots of pretending not to notice that both the ideal and purpose have been utterly compromised.)
  
  Reply
Anonymous says:

December 16, 2007 at 11:24 am

I’m looking at tools of this ilk for a 100-200 machine outfit. That Puppet is written in Ruby and avoids XML are two overwhelmingly *positive* factors in my wanting to look at it first.

XML is a waste of time, space, and energy. QED.

Ruby is a superb language and an excellent sysadmin scripting tool. Writing anything halfway serious in shell would be, in my opinion, a grossly misguided undertaking.

Reply
1. Rommel says:
  
  December 19, 2007 at 1:32 am
  
  Hey man, are you sure XML is a waste of time, space, and energy?
  
  XML is so powerful and simple, u can see it in a lot of websites.
  
  Reply
2. anon says:
  
  January 12, 2009 at 8:57 am
  
  Why is installing Ruby on a couple of thousand boxes “a positive factor”?
  
  We already have Python and good old Perl installed, we don’t need another.
  
  Reply
John says:

March 25, 2008 at 5:22 am

I started using bcfg2 a bit over a month ago. This is a pretty interesting discussion, and I thought I’d share my experience.

Note that I started with bcfg2 after failing to get cfengine working; somehow, bcfg2 made instant sense to me. I have not tried puppet.

Bcfg2 installed easily, and the tutorials on the bcfg2 trac site were almost adequate to get me going. I think it took me two days from when I discovered bcfg2 to when I had my first machine under (more or less) configuration management (including kickstarting/bcfg2ing from scratch and coming to life fully configured). The second system took me another day, since I had to find the seams between the configuration of the first, and split the configurations there. A few more took a half day each.

I didn’t have a problem with some of the folks’ complaints here. For example, XML doesn’t seem difficult to me, even though it’s rather verbose (yes, I do everything with emacs).

I did have other problems, and some still bother me. I have never gotten the bcfg2 agent working properly so that configs can be pushed out to systems; the agent would always crash. As a result, I find myself doing a kind of round robin, making a change to a machine, then getting the machine synched with bcfg2’s idea of where it should be, then moving onto a different change on a different machine, and doing the same. I don’t blame bcfg2 for this, though, and at some point it will become enough of a priority for me to fix it!

Anyway, I am quite happy with bcfg2, and it’s super-cool to be able to reinstall every machine on the network at any time and have it come back online in a configured state, services working, in just a few minutes.

One last thing: Narayan and others on the IRC channel have been extremely helpful in helping me and other newcomers. This is quite refreshing, especially after spending too much time in places like the Shorewall list. ;)

Reply
1. sri says:
  
  June 3, 2008 at 1:35 pm
  
  John
  
  I am seriously considering implementing bcfg2. However, you mentioned an agent crash when pushing config files. Can you please elaborate?
  
  Thanks in advance
  sri
  
  Reply
2. hydrostarr says:
  
  August 14, 2011 at 1:59 pm
  
  John writes: “Narayan and others on the IRC channel…”
  
  Which IRC channel is this?
  
  Reply
James says:

April 1, 2008 at 10:18 pm

I agree with the anonymous commenter above who prefers easy-to-use tools written in Ruby to overly verbose XML configuration files. However, I prefer AutomateIt (http://automateit.org/) to Puppet.

Reply
Adam Keck says:

September 8, 2008 at 1:02 pm

To those concerned with the verbosity of XML, I have found that clarity grows with verbosity, for the most part. That is, if I put up with more typing up front, I have a much better chance of understanding my code, configuration files, etc., months later. Also, XML can often be validated with standard tools, whereas one would have to write new tools from scratch to validate a new language.

My $.02,

-Adam

Reply
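
As a rough illustration of Adam's point about checking XML with standard tools, here is a minimal sketch using only Python's standard library. The element names in the two sample snippets are hypothetical and are not taken from any real bcfg2 schema. Note that `xml.etree.ElementTree` only checks that a document is well-formed; validating against a schema would need an additional tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# Hypothetical config snippets, invented for this example.
GOOD = '<Bundle name="ntp"><Package name="ntp"/><Service name="ntpd"/></Bundle>'
BAD = '<Bundle name="ntp"><Package name="ntp"></Bundle>'  # Package is never closed

if __name__ == "__main__":
    for label, text in (("good", GOOD), ("bad", BAD)):
        ok, message = is_well_formed(text)
        print(label, "well-formed" if ok else "rejected: " + message)
```

A check like this could run before a config file is committed, which is roughly the kind of safety net a custom configuration language would have to build for itself.
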
John Arundel says:

February 8, 2010 at 7:20 am

I wrote up a comparison of Puppet and Chef which turned into a very interesting discussion in comments. The major design differences between them (notably the choice of a Ruby DSL, or a custom language, and the ability to guarantee deterministic ordering in the application of resources) are hashed out between some of the main coders and designers.

http://bitfieldconsulting.com/puppet-vs-chef

Reply
1. Robert Jones says:
  
  February 14, 2010 at 12:15 pm
  
  This “comparison” of yours is just another Puppet Propaganda Page.
  You must think that users are incredibly stupid to believe the things you are feeding them.
  
  The points you try to make are actually arguments for choosing Cfengine; install base, developer base, documentation, flexibility, platform support.
  
  The only thing missing is efficiency and scalability, as Cfengine uses a couple of megabytes of RAM, Puppet (and Chef) consume gigabytes: http://bit.ly/a3EhU3
  Additionally, Cfengine is around 20 times faster in execution time: http://www.usenix.org/publications/login/2010-02/index.html (Usenix login required).
  
  In a virtualised environment or where system resources matter, the choice should be simple.
  
  Reply
  1. John Arundel says:
    
    February 16, 2010 at 8:26 am
    
    Well, cfengine wasn’t one of the tools being compared (the article is called ‘Puppet vs Chef’ after all).
    
    I’d describe the article as ‘advocacy’ rather than ‘propaganda’ (I’m not affiliated with the makers of Puppet, nor do I get any financial reward from them for writing about Puppet).
    
    You make a fair point that cfengine uses less memory. My answer would be that you can write programs in C which run faster and use less memory than equivalent programs in Ruby or Perl. The reason most people prefer the higher-level languages is because they’re easier and more fun to code in, the code is easier to read and more maintainable, and they abstract away the details of the hardware so that you can concentrate on solving your problem at a more conceptual level.
    
    I think these reasons also factor in to the choice of Puppet or Chef over something like cfengine.
    
    Reply
    1. Tuomas Noraef says:
      
      July 15, 2010 at 8:31 am
      
      High-level languages are cool… but come on : I extensively tested Puppet, and its memory consumption is a blatant catastrophe. At worst times, it managed to somehow consume hundreds and hundreds of MB in RAM ! And it was continuously growing when I stopped the disaster ! Seriously : you expect me to install such behemoths on servers, some of which, very tiny, but as much very important, only consume 32MB without Puppet ? Really ?!?! Haven’t tested Chef yet, but I can tell you I have since had a really big grudge against Ruby daemons. Occasionally launched ruby apps are one thing – resident ruby daemons with a shitload of extensions, consuming such amounts of resources, is another, and is nowhere near even barely acceptable.
      
      Not to speak about the client/server versioning upgrade path (which disadvantage seems also shared by Chef) : not only is it not supposed to work whatever the respective versions of the server and the client parts, but very worse ! The server part must be upgraded BEFORE the client part… I sure have ideas about words simple enough to explain how stupid I feel it is, but I’d wish to stay polite. Not only such system should be working whatever the respective versions of the clients and the server are, but in the already grotesque case it wasn’t, I cannot understand how anyone could think it could be a somewhat good design to upgrade the machine holding the most important pieces of information in my systems (which anyone even vaguely reasonable would place very high in the list of machines to be kept the most stable possible), while holding the upgrades of all the others ! Such design goes fundamentally backwards ! And I am supposed to trust the work of people responsible for such a flawed reasoning to help me MANAGE my systems ? Seriously ? Ok, then : but I’ll first let them come up with a proper versioning upgrade path, ie at least (or worse) let recent client versions take their configurations from older servers (and ideally let this all work, whatever versions of Puppet I am running anywhere)…
      
      Now, I also got grudges against Cfengine : I had big expectations about the third major version, like being able to pull values from databases to complete configuration skeletons on the fly before applying them. Oh, sure, it is now available natively : through the proprietary Nova extension. At this point, I do not even care about the syntax being less obscure than the atrocious previous one, or anything. This alone pisses me so much I decided not to even look further into it : I deem such strategy as nothing else but pure and simple crookery (and I’m being polite). If I would certainly not touch Puppet anymore, even with a bargepole, I am not coming any closer to Cfengine, would it be version 3. No way.
      
      I haven’t tested bcfg2 yet : maybe I’ll like it more. Don’t know yet. For now, I use a custom pile of thingies, mainly because I haven’t been able to be satisfied with whatever “finished” product I tried, in the matter of configuration management softwares.
      
      What I want is :
      
      – first, a server, on which skeletons of configuration reside. Before making anything available, the server should build personalized configuration files for its client, using database pulls, the database being the one holding (and allowing me to edit easily) every and each value differentiating one server from another.
      
      – once it’s done, the server should make those already customized files available to the clients – but beware : each client should only have access to recipes relevant to him. I agree that the same basic recipe (or recipe skeleton) should describe a NTP server, whether it uses openntp or ntpdate or anything, whether it is Debian, CentOS, or anything-based… but that ends here ! No way that my mail servers should know even the slightest thing about the NTP server or anything else but their own configuration. I really couldn’t care less about push, or pull, or this kind of onanist theories (I actually believe both should be made possible, so that people wanting one, the other, or even both should be able to use what they want) – as long as it works, and as long as the ACL, protecting each configuration, so that it only is readable (let alone applied) by the machine it is legitimately aimed to, are strong, tight, and easy to use.
      
      I believe in classes being very important in configuration management, but that they should only let the machine know whether the NTP server it has the recipe for is already installed, so it can install it if not, whether it is running, so it can take measures if not, and such thing. Certainly NOT to let him choose what he wants to be in the hundred pages cookbook I let him access to (no my dear servers, I will not let you dream about being a raped princess, a chaos knight, a necromancer, or an electric sheep : I am your God, and that is the only thing you need to know ! For the rest, obey me, and all will be fine !)… I think the way, far too much seen in a lot of configuration management software examples, that all recipes are available for every and all machines, letting them decide about everything and anything, based on classes, is such a wrong, wrong, wrong ! approach to this concept…
      
      – as for the clients, they should only see what they are deemed to see by the admins. But that is not the only mechanism I expect from them. I ABSOLUTELY need a way to make them communicate with me, ideally, through mail, to present me agreement requests, I can sign or not, before the management software accepts to apply some modifications. In the case of system upgrades, I totally refuse to let it happen all automatically : I want the software presenting me a list of what there is to upgrade, ideally sending me an encrypted mail, me decrypting the mail using a private key, signing it (or not) with a private key, and encrypting the answer with the server’s public key, sending it back to the machine, which would then decrypt it, verify the signature, and upgrade the packages if the list of things to upgrade has not been changed in the meantime, or warn me back through mail otherwise. Of course, the system should be able to send the encrypted mail to several admins (each mail encrypted differently, so the different admins could decrypt the mail with their very own keys), and be able to check the multiple signatures of the people authorized to agree for an upgrade on this particular machine.
      
      Just another word about package management : if most package management apps are abusively simple (for instance, yum or such things not even being able to uninstall unused dependencies without using third party tools), there are much better ones. Aptitude on Debian allows me to give a list of apps to be installed : if others are, I want the machine to tag those as “automatically installed”, which either make them dependencies (automatically uninstalled in the event nothing more still depends on them), or automatically uninstall them if they were originally installed manually. This at the same time is so, so, so much more simple AND efficient than the “I am a big-clunky-dummy and if I think of the next step at the same time as the current one, I will fall, so I only ask one thing at a time”-way the integrated package recipes are so often encountered in those kind of softwares (Puppet and Cfengine, for instance). Now, you don’t like it ? No problem, but instead of telling me how I should manage my packages, let me have a native way of accessing databases (this at least is most of the time standardized, contrarily to package management !).
      
      As I said, don’t know if I’ll like bcfg2 with the requirements I have. Do not know yet. But what I know is that Puppet and Cfengine are very, very, very horrible, and useless in the real world, pieces of junk. And what I suspect is that their designers probably haven’t ever administered anything more real than their bank accounts, or their weeners…
      
      Reply
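
The upgrade-approval workflow Tuomas describes (an admin signs a specific list of pending upgrades, and the machine only proceeds if the signatures check out and the list has not changed in the meantime) can be sketched roughly as below. This is only an illustration under loud assumptions: it uses shared-secret HMAC from Python's standard library as a stand-in for the public-key signatures he asks for, it leaves out the mail transport and encryption entirely, and the function names, admin names, keys, and package strings are all hypothetical.

```python
import hashlib
import hmac


def digest_of(packages):
    """Return a SHA-256 digest of a package list, independent of its order."""
    canonical = "\n".join(sorted(packages)).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def sign_approval(admin_key, packages):
    """Admin side: approve exactly this list (HMAC stands in for a signature)."""
    message = digest_of(packages).encode("ascii")
    return hmac.new(admin_key, message, hashlib.sha256).hexdigest()


def may_upgrade(current_packages, approvals, admin_keys, required=1):
    """Machine side: count approvals that match the list as it stands now."""
    message = digest_of(current_packages).encode("ascii")
    valid = 0
    for admin, signature in approvals.items():
        key = admin_keys.get(admin)
        if key is None:
            continue  # not an admin authorized for this machine
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected, signature):
            valid += 1
    return valid >= required


# Hypothetical admins and pending upgrades, invented for this example.
ADMIN_KEYS = {"alice": b"alice-secret", "bob": b"bob-secret"}
PENDING = ["bash 4.1-3", "openssl 0.9.8o-1"]
APPROVALS = {name: sign_approval(key, PENDING) for name, key in ADMIN_KEYS.items()}
```

Because each approval covers a digest of the exact list, an upgrade that appears after the request was sent invalidates the approvals, and the machine would have to ask again rather than proceed.
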
Robert Jones says:

June 7, 2010 at 7:47 am

You’re the first one to mention Chef in this thread, obviously just trying to make the google rank of your “advocacy” page higher.
Coincidentally, that’s the exact same strategy used by the puppet guys to publish propaganda and lies about competitors throughout the years.

Sure, you’ll find developers thinking Ruby is more fun to code in than C (and the other way around), but the arguments you make are in any case just for the convenience of developers, not users.
The users just get a lot more complexity to cope with when they have to make sure all Ruby packages and dependencies (huge!) are updated and free from security holes.

Furthermore, since Ruby abstracts away so many details, how come Cfengine supports so many more platforms (windows, HPUX, AIX, etc. – see http://cfengine.com/pages/nova_supported_os)?
Are Cfengine developers just so much better?

Reply
