We’ve been running for about a month with all of our VMs registered on
a single Chef server, which adds up to about 395 clients and nodes.
I’ve recently begun running into issues where the server does not
respond for long periods of time. If I view the “prod” environment
and look at the client status pane in the web UI, it can sometimes take
2 or 3 minutes to display.
We have all of our clients configured to check in 5 or 6 times during
the night and not at all during the day, so I’m confused about what the
server is “doing” during this time, since it isn’t serving clients.
Additionally, when testing during the day, clients will often get 2 or
3 failed connections to the server before downloading the run list and
I’m running in vSphere; the Chef VM has a single core equivalent to a
2 GHz Xeon and is using half of its 4 GB of RAM. During the times I am
waiting for the page to load, I’ve remoted into the server and checked
active processes… nothing is sucking up CPU, it is running close to
I’m running Chef server 0.10.6 on CentOS 6.1.
Any pointers on what I could/should check would be great!