If I need to sequence recipes across nodes (not necessarily a cluster), what tools are available for the job?
By sequence, I mean I need to run a recipe on one node, then a different recipe on a different node, and then something else on a third node.
I know I can set some environment attributes and sleep in recipes to achieve this, but I am looking for a more elegant way.
Currently our infrastructure is not in the cloud, but it will be in the future.
Does the order in which the recipes run matter? If not, just make a ‘role’ for each server type.
If you are trying to orchestrate commands across multiple machines, this is difficult to do, because Chef firmly believes that infrastructure should be “fully convergent” (meaning every VM takes care of itself and trusts other VMs to do the same). There is no way to do cross-VM orchestration in the middle of a Chef run.
Your only option will be to not run Chef on a schedule, and to use some sort of orchestration such as bash/PowerShell to run the recipes on the nodes in the order that you want.
Here is an example using knife winrm for Windows machines.
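To make the "run the recipes in the order you want" idea concrete, here is a minimal sketch of an external driver, written in Python rather than bash/PowerShell. It is not the script from the post above. It assumes knife is already configured on the workstation, and the node names and recipe names are hypothetical. It uses `knife ssh` with `chef-client -o` (override run list); for Windows nodes, `knife winrm` from the knife-windows plugin takes the same query-then-command form, with authentication options omitted here.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# Hypothetical plan: (node name, override run list), in the order they must run.
PLAN: List[Tuple[str, str]] = [
    ("db01", "recipe[suite::restart_db]"),
    ("app01", "recipe[suite::restart_app]"),
    ("web01", "recipe[suite::restart_web]"),
]


def build_command(node: str, run_list: str) -> List[str]:
    """Build a `knife ssh` call that runs chef-client on exactly one node."""
    return ["knife", "ssh", f"name:{node}", f"sudo chef-client -o '{run_list}'"]


def run_plan(
    plan: Sequence[Tuple[str, str]],
    runner: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Run each step in order and stop at the first non-zero exit code.

    Returns the names of the nodes that completed successfully.
    """
    completed: List[str] = []
    for node, run_list in plan:
        if runner(build_command(node, run_list)) != 0:
            break  # later nodes depend on this one, so do not continue
        completed.append(node)
    return completed
```

Calling `run_plan(PLAN)` would walk the nodes one at a time. The `runner` parameter exists only so the ordering logic can be exercised without touching real machines.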
The order does matter. Take restarting a suite as an example: your DBs have to start first, and only when they are fully up do you start the application servers, and so on.
In this scenario, I can set an environment attribute once the DBs are up; my application servers will look for it and then reboot themselves.
Chef client will run as a service and start on reboot, and set the appropriate environment variable, etc.
It would be better if there were a tool I could use to do such things, so I don’t have to reinvent the wheel…
For ordinary Chef nodes to do this, each one has to publish the fact of its completion somewhere that is visible to the Chef server. When Chef completes, it could set and record a node attribute that other nodes can read from the Chef server. Alternatively, each node in an ordered list could be configured to run a web or SSH query against its predecessor in the list to “audit” it before proceeding with its own Chef run, then perform the update and report its own completion for the next node to audit.
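The "audit the predecessor before proceeding" step boils down to polling with a timeout. A rough sketch in Python, assuming the actual check (a node attribute lookup, a web request, or an SSH query) is supplied by the caller as `is_done`; that name and the default timings are made up for illustration:

```python
import time
from typing import Callable


def wait_for_predecessor(
    is_done: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_done() until it returns True or the timeout expires.

    Returns True if the predecessor reported completion, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False  # give up; the caller decides how to fail
        sleep(interval_s)
```

A node would call this at the start of its run and skip or abort its own update when it returns False, rather than sleeping blindly for a fixed time.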
Any attribute you can report with a “knife search” command should be sufficient to allow a Chef tool to trigger the update on that node. What it can’t do, trivially, is prevent multiple nodes with identical configurations from firing off at the same time. It sounds like you need to generate an ordered list for this operation, and to decide how it should fail if any one node refuses to accept an update.
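One way to picture the ordered list and the failure decision, sketched in Python. The sets of finished and failed nodes are assumed to come from something like a knife search on a completion attribute you define yourself; the policy shown (stop everything at the first failed node) is just one possible choice, and the node names are hypothetical.

```python
from typing import Optional, Sequence, Set


def next_node(ordered: Sequence[str], done: Set[str], failed: Set[str]) -> Optional[str]:
    """Return the single node that may update next, or None.

    Walking the list in order means only one node is ever eligible,
    so identically configured nodes cannot fire at the same time.
    """
    for node in ordered:
        if node in failed:
            return None  # stop-on-failure policy: nothing after a failed node runs
        if node not in done:
            return node
    return None  # every node has completed


# Hypothetical ordering: database first, then the two application servers.
ORDER = ["db01", "app01", "app02"]
```

With `done={"db01"}` and no failures, only `app01` is eligible, even though `app02` has the same configuration.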
In my experience in small environments, the cost of building up the tool suite to audit, to do different things based on the audit, and to make it reliable and robust often outweighs the cost of doing it by hand and then giving all Chef clients a new role or configuration that automatically performs the update on any hosts that were missed because they were down at the time.