Performance and scalability of a Chef server

Hi,

I am currently evaluating Chef against Puppet for my company, and I am having
a bit of trouble finding specific information regarding the performance and
scalability of a Chef server.

Does anyone have metrics on the load that a single Chef server can take, in
terms of node count, for a given context (hardware and otherwise)?

Also, the docs and wikis keep boasting about the horizontal scalability of Chef,
but I cannot find any details on it.

  • How is it achieved/architected?
  • What does it take to expand a Chef server cluster?
  • What are the implications in HA terms?

Any pointers to this information would be greatly appreciated.

best regards,
david.

On Mon, Aug 1, 2011 at 7:39 AM, david brpr david.eauee@gmail.com wrote:

I am currently evaluating Chef against Puppet for my company, and I am having
a bit of trouble finding specific information regarding the performance and
scalability of a Chef server.

It depends on your environment. Chef scales by adding API endpoints
(chef-server-api, the webui), which are typically concurrency-bound, first
by memory and eventually by CPU. On the back end, CouchDB and Solr both
have fairly well-published scalability documentation - the gist is,
scale them vertically for a while, and when you need to, you can talk
about various sharding strategies.

Does anyone have metrics on the load that a single Chef server can take, in
terms of node count, for a given context (hardware and otherwise)?

This depends on how big the chef server is. Most likely, you are going
to be bound by concurrency - RAM/CPU.
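
To make the concurrency point concrete, here is a rough sizing sketch in Python. It is not from the thread or from Chef itself: the function name, its parameters, and every number in the usage line are hypothetical, and it assumes node check-ins are spread evenly over the run interval and that each API worker handles one request at a time. It applies Little's law (mean in-flight requests = arrival rate x mean service time) to guess how many workers, and how much RAM, a given node count might need.

```python
import math


def estimate_api_workers(node_count, run_interval_s, requests_per_run,
                         avg_request_s, worker_rss_mb):
    """Back-of-envelope sizing for API workers (all inputs are assumptions)."""
    # Average arrival rate, assuming check-ins are evenly spread (no thundering herd).
    requests_per_s = node_count * requests_per_run / run_interval_s
    # Little's law: mean requests in flight = arrival rate * mean service time.
    in_flight = requests_per_s * avg_request_s
    # One single-threaded worker per in-flight request, and at least one worker.
    workers = max(1, math.ceil(in_flight))
    return {
        "requests_per_s": requests_per_s,
        "workers": workers,
        "ram_mb": workers * worker_rss_mb,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 1000 nodes, 30-minute interval, 20 requests per run,
    # 0.25 s per request, 150 MB resident memory per worker.
    print(estimate_api_workers(1000, 1800, 20, 0.25, 150))
```

This only gives an average; real deployments would want headroom for bursts, and measured values for request time and per-worker memory rather than guesses.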

Also, the docs and wikis keep boasting about the horizontal scalability of Chef,
but I cannot find any details on it.

It scales the same way a web application scales.

How is it achieved/architected?

Use load balancers/reverse proxies in front of the API and Web UI
endpoints, and use traditional HA techniques for CouchDB and Solr.

What does it take to expand a Chef server cluster?

Add more API endpoints behind a LB/proxy.
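
As an illustration of what "more endpoints behind a LB/proxy" means, here is a toy round-robin pool in Python. It is a sketch under stated assumptions, not how any particular load balancer or Chef component is implemented: the `ApiPool` class, the `healthy` callback, and the hostnames and port in the usage lines are all hypothetical. In practice this role would be played by a real load balancer or reverse proxy.

```python
class ApiPool:
    """Toy round-robin pool of API endpoints (illustrative only)."""

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self._next = 0  # rotation cursor

    def add(self, endpoint):
        # "Expanding the cluster" here is just registering another endpoint.
        self.endpoints.append(endpoint)

    def pick(self, healthy=lambda endpoint: True):
        # Try each endpoint at most once, starting from the rotation cursor,
        # and skip any that the (hypothetical) health check rejects.
        for _ in range(len(self.endpoints)):
            endpoint = self.endpoints[self._next % len(self.endpoints)]
            self._next += 1
            if healthy(endpoint):
                return endpoint
        raise RuntimeError("no healthy API endpoint available")


if __name__ == "__main__":
    # Hypothetical hostnames and port.
    pool = ApiPool(["chef-api-1.example.com:4000", "chef-api-2.example.com:4000"])
    pool.add("chef-api-3.example.com:4000")
    for _ in range(4):
        print(pool.pick())
```

The point of the sketch is that the API tier holds no per-client state in the pool itself, so capacity grows by adding entries; the shared state lives in the back end (CouchDB and Solr), which is why that tier needs its own HA treatment.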

What are the implications in HA terms?

That it's a pretty typical scenario, if you have more than a passing
familiarity with scaling web applications.

Best,
Adam

--
Opscode, Inc.
Adam Jacob, Chief Product Officer
T: (206) 619-7151 E: adam@opscode.com

Hi all,

On Mon, Aug 1, 2011 at 5:15 PM, Adam Jacob adam@opscode.com wrote:

On Mon, Aug 1, 2011 at 7:39 AM, david brpr david.eauee@gmail.com wrote:

....

What are the implications in HA terms?

That it's a pretty typical scenario, if you have more than a passing
familiarity with scaling web applications.

There was also a thread back in July about this, "[chef] High availability
for chef":

http://lists.opscode.com/sympa/arc/chef/2011-07/msg00010.html

There's real value in these threads for the ops guys, IMO. It would be
great if someone collected that kind of info and put it on a wiki page
for future reference.

I can do it if you find it appropriate, at least to reference this and
previous threads.

That would be great, Sergio!

Adam

Thanks, Sergio.

Given the type of high-scale use cases where Chef would really shine, it
is surprisingly hard to find information on that particular facet of Chef's
capabilities.

Yes, it would be very useful to have this formally documented.
It would nicely reinforce Chef's story for high-scale / cloud scenarios.

Thanks again,
david.
