Server load

hi chef community,

i just installed chef and chef-server via apt on ubuntu lucid - the setup is super easy and simply works.

but i noticed an increased load average of 1.5 - 2.0 just from running chef on the server.

looking at the top output (before/after) below, you can see that the merb workers are permanently running and causing the load.

there is no real activity (no connected nodes, etc.) on the server; i just ran "apt-get install chef chef-server".

if this is the normal server load - what minimal hardware requirements do you suggest to manage a 50 node cluster in production?

test environment:

  • ubuntu lucid ec2 ami (32-bit ami-cf4d67bb)
  • ec2 c1.medium (2 x 2.33 GHz, 1.8 GB RAM)
  • chef version 0.9.4

BEFORE INSTALLING CHEF

top - 11:56:37 up 5 min, 1 user, load average: 0.01, 0.01, 0.00
Tasks: 72 total, 1 running, 71 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.9%us, 0.3%sy, 0.0%ni, 98.7%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.8%us, 0.1%sy, 0.4%ni, 98.4%id, 0.1%wa, 0.0%hi, 0.0%si, 0.1%st
Mem: 1781976k total, 155492k used, 1626484k free, 3392k buffers
Swap: 917496k total, 0k used, 917496k free, 90472k cached

PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
1   root 20  0 2680 1612 1204 S    0  0.1 0:00.15 /sbin/init
2   root 20  0    0    0    0 S    0  0.0 0:00.00 [kthreadd]
3   root RT  0    0    0    0 S    0  0.0 0:00.00 [migration/0]
4   root 20  0    0    0    0 S    0  0.0 0:00.00 [ksoftirqd/0]
5   root RT  0    0    0    0 S    0  0.0 0:00.00 [watchdog/0]
6   root 20  0    0    0    0 S    0  0.0 0:00.00 [events/0]

AFTER INSTALLING CHEF

top - 12:07:19 up 15 min, 1 user, load average: 2.02, 1.46, 0.72
Tasks: 88 total, 1 running, 87 sleeping, 0 stopped, 0 zombie
Cpu0 : 8.0%us, 0.0%sy, 0.0%ni, 92.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1781976k total, 921812k used, 860164k free, 34052k buffers
Swap: 917496k total, 0k used, 917496k free, 635652k cached

PID  USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ TIME COMMAND
6282 chef     20  0 41504  27m 2408 S    5  1.6 0:18.97 0:18 merb : worker (port 4040)
5536 chef     20  0 35256  21m 3020 S    3  1.2 0:12.20 0:12 merb : worker (port 4000)
5347 root     20  0 32164  20m 2348 S    0  1.2 0:01.87 0:01 merb : merb : master
5184 chef     20  0  408m  46m  11m S    0  2.7 0:01.26 0:01 java -Xmx256M -Xms256M -Dsolr.data.dir=/var/cache/chef/solr/data -Dsolr.solr.home=/var/lib/chef/solr -DSTART=/var/lib/chef/solr/solr-jetty/etc/start.config -j
4535 rabbitmq 20  0 49408  10m 2156 S    0  0.6 0:00.47 0:00 /usr/lib/erlang/erts-5.7.4/bin/beam.smp -W w -K true -A30 -- -root /usr/lib/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbi
5528 root     20  0 32228  19m 2372 S    0  1.1 0:00.38 0:00 merb : merb : master
6284 root     20  0  2544 1212  948 R    0  0.1 0:00.37 0:00 top
4196 couchdb  20  0 66296 9.8m 3604 S    0  0.6 0:00.29 0:00 /usr/lib/erlang/erts-5.7.4/bin/beam.smp -Bd -K true -- -root /usr/lib/erlang -progname erl -- -home /var/lib/couchdb -- -noshell -noinput -smp auto -sasl errl
4449 root     20  0 29304  16m 1652 S    0  1.0 0:00.22 0:00 /usr/bin/ruby1.8 /usr/bin/chef-client -L /var/log/chef/client.log -d -c /etc/chef/client.rb -i 1800 -s 20
1    root     20  0  2680 1612 1204 S    0  0.1 0:00.15 0:00 /sbin/init

regards
jan zimmek

other people have reported the same issue, and there seems to be a bug
ticket at Ubuntu dealing with this: Bug #574910 "High load averages on
Lucid while idling" (linux-ec2 package). I don't believe this has
anything to do with Chef per se.

-Cary P

On Fri, Jul 2, 2010 at 5:14 AM, Jan Zimmek jan.zimmek@web.de wrote:

[snip]

On Fri, Jul 2, 2010 at 9:51 AM, Cary Penniman cary@rightscale.com wrote:

other people have reported the same issue, and there seems to be a bug
ticket at Ubuntu dealing with this: Bug #574910 "High load averages on
Lucid while idling" (linux-ec2 package). I don't believe this has
anything to do with Chef per se.

I've seen this as well. You might look at some output from vmstat,
for example. One hint that something odd is going on: your top output
shows the higher load average, yet both CPUs are almost entirely
idle.
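
To make that hint concrete, here is a minimal Python sketch that reads a
top header and flags the "high load, idle CPUs" pattern. The function
names and the thresholds (load of 1.0, 90% idle) are illustrative
assumptions, not anything defined by top or Chef; the sample text is the
top header posted earlier in this thread.

```python
import re


def parse_top_header(text):
    """Return (1-minute load average, list of per-CPU idle percentages)."""
    load = re.search(r"load average:\s*([\d.]+)", text)
    idle = re.findall(r"([\d.]+)%id", text)
    if load is None or not idle:
        raise ValueError("no load average or CPU lines found")
    return float(load.group(1)), [float(value) for value in idle]


def load_without_cpu(text, min_load=1.0, min_idle=90.0):
    """True when load is elevated although every CPU is mostly idle.

    min_load and min_idle are arbitrary illustrative thresholds.
    """
    load1, idle = parse_top_header(text)
    return load1 >= min_load and min(idle) >= min_idle


# Header lines copied from the "before" and "after" top output in the thread.
BEFORE = """\
top - 11:56:37 up 5 min, 1 user, load average: 0.01, 0.01, 0.00
Cpu0 : 0.9%us, 0.3%sy, 0.0%ni, 98.7%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.8%us, 0.1%sy, 0.4%ni, 98.4%id, 0.1%wa, 0.0%hi, 0.0%si, 0.1%st
"""

AFTER = """\
top - 12:07:19 up 15 min, 1 user, load average: 2.02, 1.46, 0.72
Cpu0 : 8.0%us, 0.0%sy, 0.0%ni, 92.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
"""

if __name__ == "__main__":
    print("before:", load_without_cpu(BEFORE))
    print("after: ", load_without_cpu(AFTER))
```

A True result only says the load average is not explained by CPU use; it
does not say where the load comes from, which is why a look at vmstat is
still worthwhile.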

  • seth

--
Seth Falcon | @sfalcon | http://userprimary.net/