Architecture or Hardware Specs for a Chef Server with around 15k nodes


#1

So far I have mostly worked on the chef-client side, so please share some tips to get my research started. I am looking to set up a Chef server for around 15k nodes.
Are there different setups or architectures for a Chef server?
What is the recommended hardware spec for this kind of server? I think the frequency of chef-client runs on each node will also play a major role in the server’s hardware spec.


#2

Off the cuff, ensure you have at least 4 GB of RAM. You may need more for that many nodes. I haven’t run at that scale, but definitely don’t skimp on the RAM side of the equation.

Nathan Clemons

DevOps Engineer

Moxie Cloud Services (MCS)

O +1.425.467.5075

M +1.360.861.6291

E nclemons@gomoxie.com

W http://www.gomoxie.com/


#3

We actually have a pretty good doc on scaling the chef server. (1)

As you mentioned, the scale concern isn’t so much the nodes but the
frequency of check ins. In addition to the frequency, use of search,
storing additional attributes in the node object, or storing fewer
attributes all have an effect on the size you’ll need.
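
To make the check-in point concrete, here is a rough back-of-the-envelope sketch. It assumes check-ins are spread evenly over the interval (for example via splay), and the function name is just for illustration. It counts chef-client run starts, not API requests; each run makes several requests, so the real request rate is higher.

```python
# Rough estimate of average chef-client run starts per second.
# Assumption: runs are spread evenly across the interval.

def average_checkins_per_second(node_count, interval_seconds):
    """Average number of chef-client runs starting each second."""
    return node_count / interval_seconds

nodes = 15_000
for interval in (900, 1800, 3600):  # 15, 30 and 60 minute intervals
    rate = average_checkins_per_second(nodes, interval)
    print(f"{interval // 60:>3} min interval: {rate:.1f} runs/sec on average")
```

Doubling the interval halves the average load, which is why the interval matters as much as the node count.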

One of our engineers, Irving, has a great blog post on scaling the chef
server (2), where he goes into some of the details.

The tl;dr is a chef server at 15k nodes needs to be pretty beefy and you
might want to consider setting up high availability and/or replication to
break that up.

  1. https://docs.chef.io/server_components.html
  2. http://irvingpop.github.io/blog/2015/04/20/tuning-the-chef-server-for-scale/

–Mobile Galen


#4

What is the node object referenced below?

The default maximum allowable size for a node object is 1MB, although it
is rare for nodes to exceed 150KB. Though compressed, this data is
replicated twice, once in Apache Solr, and once in PostgreSQL. In
practice, allowing a conservative 2MB of storage on the disk partition
per node should be sufficient.
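
Applied to 15k nodes, the 2MB-per-node figure works out roughly as follows. This is only a sketch: the function name is made up for illustration, and it covers node object data only, not cookbooks or other server data.

```python
# Disk estimate for node data, using the quoted 2MB-per-node allowance.
# Assumption: this covers node objects only, not cookbooks or other data.

MB = 1024 * 1024

def node_storage_bytes(node_count, per_node_bytes=2 * MB):
    """Total disk allowance for node objects, in bytes."""
    return node_count * per_node_bytes

total = node_storage_bytes(15_000)
print(f"{total / 1024**3:.1f} GiB for node data")
```

So node data alone should fit in about 30 GB at this scale.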


#5

It’s the Ruby object that you reference as node in your cookbooks. The bulk of its data comes from Ohai and your default/normal/override attributes. It gets sent to the server in JSON form, which I would guess is what those measurements are based on.
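
If you want to see how close your own nodes are to those numbers, you can measure the serialized size of a node's JSON. A minimal sketch, assuming the size that matters is the UTF-8 encoded JSON; the sample data below is hypothetical and far smaller than a real node, and the server may serialize slightly differently.

```python
import json

def node_json_size_bytes(node_data):
    """Size in bytes of the node data when serialized as UTF-8 JSON."""
    return len(json.dumps(node_data).encode("utf-8"))

# Hypothetical, heavily trimmed node data for illustration only.
sample_node = {
    "name": "web01.example.com",
    "automatic": {"platform": "ubuntu", "memory": {"total": "8174668kB"}},
    "normal": {"app": {"port": 8080}},
    "run_list": ["recipe[base]", "role[web]"],
}

size = node_json_size_bytes(sample_node)
print(f"{size} bytes ({size / 1024:.2f} KB)")
```

For a real measurement, load a node's JSON exported from the server (for example with knife) using json.load and pass it to the same function.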


#6

This is not going to be on AWS but inside a data center. And I think HA comes out of the box for AWS.

I am looking at the tiered setup. With the tiered setup, how do I install multiple backend servers? Will the backend servers replicate one another?


#7

I run 2 frontends and 3 backends in my setup. I have both frontends
pointed to 1 backend. Each backend runs GlusterFS and mounts it to the
data directory (/var/opt/opscode/) for chef. Fair warning though, the chef
docs (https://docs.chef.io/chef_system_requirements.html) say “The Chef
server MUST NOT use a network file system of any type”, however, I have not
had any issues with several thousand nodes.

-Grant