Managing redundant nodes with Chef

Hi - I’m trying to determine whether Chef is the right tool to manage the configuration of redundant nodes. Most of our systems have a few dozen nodes but some have a few hundred. Some node types are n+1 redundant and others are 2n redundant. Our current process upgrades nodes one at a time to avoid system downtime. Is there a design pattern or standard approach with Chef to upgrade nodes one at a time or to roll out configuration changes in some other way that avoids downtime?



Yes, Chef can definitely do this.

Chef has both “roles” and “environments”. If you have a redundant copy of every server (lucky you), then you can put half of them in one environment (e.g. prod) and the other half in another environment (e.g. stage).

Each environment then has identical settings and cookbook version pins. If you want to change an attribute or a cookbook version, make the change in the stage environment only. Once you have verified that it works on those nodes, make the same change in the prod environment.
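
To make that concrete, here is a rough sketch of what the stage environment file could look like in Chef's Ruby DSL. The cookbook name "myapp", the version numbers and the "worker_count" attribute are made up for illustration. The file is meant to be loaded by Chef (e.g. with knife environment from file), not run as a plain Ruby script.

```ruby
# environments/stage.rb
# Hypothetical example: "myapp", its versions and "worker_count" are placeholders.
name "stage"
description "Half of each redundant pair; changes land here first"

# Pin the cookbook version under test. The prod environment file would keep
# the previous pin (say "= 1.2.0") until the change is verified on stage.
cookbook "myapp", "= 1.3.0"

# Environment-level attributes, otherwise kept identical to prod.
default_attributes(
  "myapp" => { "worker_count" => 8 }
)
```

Promoting the change then amounts to copying the new pin (and any changed attributes) into the prod environment file and uploading it.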

As for pushing your deploys, you can either use the chef-client cookbook to run Chef on a 30-minute schedule, or use an orchestration tool to run Chef on one node at a time:

Good orchestration tools for this:
Chef Push Jobs
knife ssh
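
Whichever tool you pick, the one-node-at-a-time idea boils down to a loop that stops at the first failure, so the remaining redundant nodes keep serving on the old configuration. A minimal Ruby sketch of that loop follows. The names rolling_run and SSH_RUNNER are my own, and the runner assumes plain ssh access with sudo rights to run chef-client, which may not match your setup.

```ruby
# Run a Chef converge on each node in turn and stop at the first failure.
# `runner` is any callable that takes a node name and returns true on success.
def rolling_run(nodes, runner)
  done = []
  nodes.each do |node|
    # Do not touch the remaining nodes if this one failed to converge.
    return { done: done, failed: node } unless runner.call(node)
    done << node
  end
  { done: done, failed: nil }
end

# Assumed runner: plain ssh plus sudo chef-client on the remote host.
# Kernel#system returns true only when the command exits with status 0.
SSH_RUNNER = lambda do |node|
  system("ssh", node, "sudo chef-client") == true
end

# Example call (host names are placeholders):
#   rolling_run(%w[web1.example.com web2.example.com], SSH_RUNNER)
```

Passing the runner in as an argument keeps the loop independent of the transport, so the same loop could sit in front of knife ssh or a push job instead of raw ssh. A health check between nodes could be added to the runner before it returns true.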

Thanks for the tips and references, Spencer.