Chef loop to run on all my nodes

I manage over 800 nodes with Chef. Whenever I want to do a massive update, I need to avoid running it on all of them at once. My idea is a loop that processes only 10 nodes at a time and, when those finish, continues with the next 10 until all nodes are done. The point is to prevent a DoS. Does anyone have an idea how to write this loop?

Short answer: you don't!

The client already has an option for it, called splay. As an example, in client.rb:

interval 1800
splay 300

means "Run Chef every 1800 seconds (30 mins), plus a random delay of up to 300 seconds", so the nodes don't all check in at the same moment.
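To make the effect of splay concrete, here is a rough Ruby sketch of how the two settings could combine into a per-run delay. It assumes splay adds a random number of seconds between zero and the splay value on top of the interval. next_run_delay is a hypothetical helper written for illustration, not something Chef exposes.

```ruby
# Hypothetical helper (not part of Chef): seconds a node waits before its next run.
# Assumes splay contributes a random offset in 0...splay on top of interval.
def next_run_delay(interval, splay, rng = Random.new)
  delay = 0
  delay += rng.rand(splay) if splay && splay > 0  # random offset, 0 up to splay - 1
  delay += interval if interval                   # fixed part; nil means no interval
  delay
end

# Simulate 800 nodes using the settings from the client.rb example above.
rng = Random.new(42)
delays = Array.new(800) { next_run_delay(1800, 300, rng) }
puts "earliest: #{delays.min}s, latest: #{delays.max}s"
```

With 800 nodes the delays land spread across a 300 second window instead of all hitting the server in the same second.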

The splay option works well for me. I'm also interested in not creating a DoS. In my environment I run once per hour, with a 1 hr splay, so all my nodes are spread across the hour.

I only want to run the update once on all 800 nodes, not at recurring intervals, and in increments that space out the processing. I don't really care how long it takes as long as it doesn't affect our network. When I add -i 3600 and -s 3600, it just keeps patching over and over again every hour. That isn't the goal.

To solve your immediate problem, you can run knife ssh QUERY COMMAND --concurrency NUM, which runs the command on at most NUM nodes at once. That said, Chef is designed to be idempotent, so running once an hour should not be a problem if the recipe is designed properly. You might need a not_if or only_if guard on your patching resource.
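If you still want to see what the "10 at a time" loop from the question looks like, here is a minimal Ruby sketch. It is only an illustration of the batching idea, not how knife ssh implements --concurrency. run_in_batches and the node names are hypothetical, and the block is a stub where a real script would run the command on the node.

```ruby
# Hypothetical batch runner: handles nodes in groups of batch_size and waits
# for every node in a group to finish before starting the next group.
def run_in_batches(nodes, batch_size)
  nodes.each_slice(batch_size).map do |batch|
    # One thread per node in this batch; value blocks until the thread is done.
    batch.map { |node| Thread.new { yield node } }.map(&:value)
  end
end

# Made-up node names; the block stands in for the real per-node work.
nodes = (1..25).map { |i| "node#{i}" }
batches = run_in_batches(nodes, 10) { |node| "#{node}: ok" }
puts batches.map(&:size).inspect  # prints [10, 10, 5]
```

In practice the --concurrency flag saves you from maintaining a script like this.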

Worth noting that splay can work without interval: knife ssh <some query> 'chef-client --splay 600' will run once, and each node will wait a random time between 0 and 600 secs before starting.

Like ccrebolder pointed out tho, it's not really the intended model.