I know how to run Chef cookbooks on multiple nodes at once. What I need to do now is run cookbooks on multiple nodes, but in blocks. Running a mass RHEL yum update on all Chef nodes at once causes errors with my internal RHEL database. Is there a way to limit the number of nodes getting an update and, when a group of nodes is complete, move on to the next group of nodes and so on?
Better yet, I just need to run cookbooks on all my nodes, but only 5 at a time. I don't want to run a single cookbook on all nodes at the same time because it causes major issues.
Thank you for your question - I totally understand why you want to do this, but in practice orchestrating runs in fixed batches ends up causing some pretty gnarly headaches within your environment.
My advice would be to use the chef-client cookbook to manage the client configuration on your nodes, then investigate setting the
node['chef_client']['splay'] attribute (https://github.com/chef-cookbooks/chef-client#attributes).
Splay is described as:
A random number between zero and splay that is added to interval. Use splay to help balance the load on the Chef server by ensuring that many chef-client runs are not occurring at the same interval. When the chef-client is run at intervals, --splay and --interval values are applied before the chef-client run.
For example, let's say you have a client run interval of 60 minutes. You should be able to safely set a
splay of 30-45 minutes to spread out the chef-client converges in your environment. Both values are given in seconds, so that is an interval of 3600 and a splay of 1800-2700.
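As a rough sketch of that example, the two attributes could be set in a wrapper cookbook's attributes file. The file path and the choice of a wrapper cookbook are my assumptions, and the attribute names are the ones I recall from the cookbook README (`chef_client`, with an underscore), so check them against the version you have installed.

```ruby
# attributes/default.rb in a hypothetical wrapper cookbook.
# Both values are in seconds.
default['chef_client']['interval'] = 3600 # run roughly every 60 minutes
default['chef_client']['splay']    = 1800 # plus a random delay of up to 30 minutes
```

The same two keys could equally go in a role or environment; the point is only that every node gets the same interval and splay, and each one picks its own random delay.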
This would hopefully smooth the load on your upstream artifact repositories that are struggling during converge.
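To get a feel for how much a given splay spreads things out, here is a small stand-alone Ruby simulation. It is not Chef code: `splay_histogram` is a hypothetical helper that just gives each simulated node a random delay below the splay and counts how many nodes start in each 5-minute window. Note that splay only spreads runs out statistically; it does not enforce a hard cap such as "5 at a time".

```ruby
# Hypothetical helper (not part of Chef): model the random splay delay
# each node picks and count how many nodes start in each time bucket.
def splay_histogram(node_count:, splay_seconds:, bucket_seconds: 300, rng: Random.new)
  counts = Hash.new(0)
  node_count.times do
    delay = rng.rand(splay_seconds)      # integer in 0...splay_seconds
    counts[delay / bucket_seconds] += 1  # index of the bucket the run falls in
  end
  counts.sort.to_h
end

bucket_minutes = 5
histogram = splay_histogram(
  node_count: 100,
  splay_seconds: 1800,
  bucket_seconds: bucket_minutes * 60,
  rng: Random.new(42) # fixed seed so repeated runs print the same table
)

histogram.each do |bucket, count|
  from = bucket * bucket_minutes
  puts format('%2d-%2d min: %d nodes', from, from + bucket_minutes, count)
end
```

If the busiest window still has more nodes than your repository can handle, a larger splay (or a longer interval) is the knob to turn.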