On Dec 8, 2011, at 3:43 PM, Chris wrote:
My company is pretty late to the Chef party, only getting things started about 6 months ago (after a year of asking for it), but now that we have things up and running we’ve run into a bit of a problem. The client consumes a fairly large amount of memory, between 175 and 250MB per server. This has caused a lot of concern from the Operations team, since that amount * N VMs can get quite expensive. I’ve been doing some research into this and noticed that the amount of resident memory can depend on how many recipes are loaded on a node, and the Opscode docs seem to confirm this. Right now our cookbooks are loaded into a single base role that is added to each node for ease of use. They’re all OS-level recipes to manage hostfiles, resolv.conf, etc. There are 20 in total. We also have application roles that can add another 3 or 4 recipes.
We’ve been using Chef since August, running 0.10.4 on CentOS 5.6. We currently have 43 cookbooks and 37 roles across all of our machines, but I use roles very heavily (I’ll test a new cookbook as a new role on a new machine and then, when I’m happy, I might include that role as part of another larger role). We just spun up a “staging” environment today, which added twelve new nodes, taking us up to 33 total being managed by Chef. On one of our most complex nodes, the run_list has five main roles loaded, while the expanded run_list contains sixteen roles comprising thirty recipes.
I checked, and when chef-client is active, it hits a virtual size (VSS) of about 195MB, but a resident set size (RSS, the working set) of 60-70MB. Even a dry run involves multiple invocations of Python, Perl, and various other programs and languages, many of which have a VSS and RSS almost as big as chef-client’s, even though they might only persist for a few seconds during the run.
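For anyone who wants to reproduce this kind of VSS-versus-RSS check, here is a minimal sketch in Python. It assumes a Linux host, where /proc/<pid>/status reports VmSize (virtual size) and VmRSS (resident set size) in kB. The function names parse_status and memory_of are my own, purely for illustration, and this is not necessarily how the figures above were collected.

```python
def parse_status(text):
    """Extract VmSize (VSS) and VmRSS (RSS), in kB, from /proc/<pid>/status text."""
    sizes = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # The value looks like "  199680 kB"; keep only the number.
            sizes[key] = int(value.split()[0])
    return sizes


def memory_of(pid):
    """Read VSS and RSS for a running process (Linux only, assumes /proc is mounted)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_status(fh.read())
```

Calling memory_of() with the PID of a running chef-client should return something like a dict with VmSize and VmRSS keys. Divide by 1024 to get MB.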
In comparison, the RevealCloud agent that we run on every machine has a VSS of ~160MB, although the RSS is just over 2MB. This machine is brand-new and is virtually idle, but each httpd process has a VSS of ~150MB and an RSS of just under 5MB, and we spin up a total of seventeen of them.
This is on a Rackspace “flavor 3” VM, which is allocated 1GB of RAM, 40GB of hard disk space (~35GB usable), etc. Rackspace makes only two smaller VM images available: a “flavor 2” with 512MB of RAM, and a “flavor 1” with 256MB of RAM.
Compared to all the other things that this VM is doing, the overhead of chef-client seems pretty reasonable to me – not really any more than another httpd process, or the overhead from the RevealCloud monitoring system. Not something that I would consider totally negligible, but also not that significant.
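To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It assumes the upper RSS figure of 70MB, treats the 1GB VM as 1024MB, and pretends all 33 nodes look like this one, which they almost certainly don't. The function name overhead is just something I made up for the example.

```python
def overhead(rss_mb, vm_ram_mb, nodes):
    """Return (fraction of one VM's RAM used, total MB across all nodes)."""
    fraction = rss_mb / vm_ram_mb
    fleet_total_mb = rss_mb * nodes
    return fraction, fleet_total_mb


# Assumed inputs: 70MB RSS, a 1GB (1024MB) VM, and 33 managed nodes.
fraction, fleet_total_mb = overhead(70, 1024, 33)
print(f"{fraction:.1%} of one VM, {fleet_total_mb}MB across the fleet")
```

With those assumptions, chef-client takes a bit under 7% of the RAM on one of these VMs.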
Speaking only for myself, I believe that if you’ve got systems where you really are this tightly constrained for memory, then you’ve got much bigger problems than whether or not you can afford to run chef-client.
Brad Knowles email@example.com
SAGE Level IV, Chef Level 0.0.1