For ordinary Chef nodes to do this, each one has to publish the fact of its completion somewhere that is visible to the Chef server. When a Chef run completes, it could set and save a node attribute that other nodes can read from the Chef server. Alternatively, each node in an ordered list could be configured to run a web or SSH query against its predecessor in the list to “audit” it before proceeding with its own Chef run, then perform the update and report its own completion for the next node to audit.
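To make the predecessor “audit” idea concrete, here is a rough sketch in Python. It is not Chef code: `is_complete` is a hypothetical callback standing in for whatever check you choose (reading a node attribute from the Chef server, or a web or SSH query), and the function names, timeout and polling interval are all assumptions made for illustration.

```python
import time


def predecessor_of(node, ordered_nodes):
    """Return the node that precedes `node` in the list, or None for the first."""
    index = ordered_nodes.index(node)
    return ordered_nodes[index - 1] if index > 0 else None


def wait_for_predecessor(node, ordered_nodes, is_complete,
                         timeout=600.0, poll_interval=15.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the predecessor reports completion.

    `is_complete(name)` is a hypothetical check supplied by the caller.
    Returns True when it is safe to proceed, False if the timeout expires.
    """
    previous = predecessor_of(node, ordered_nodes)
    if previous is None:
        # The first node in the list has nothing to wait for.
        return True
    deadline = clock() + timeout
    while True:
        if is_complete(previous):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

A node would call `wait_for_predecessor` before its own run and only report its own completion afterwards, so the next node in the list can audit it in turn. The timeout matters: without one, a single dead host stalls every node behind it indefinitely.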
Any attribute you can report with a “knife search” command should be sufficient to allow a Chef tool to trigger the update on that node. What it can’t trivially do is prevent multiple nodes with identical configurations from firing off at the same time. It sounds like you need to generate an ordered list for this operation, and to decide how it should fail if any one node refuses to accept an update.
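The ordered list and the failure decision could look something like the sketch below. Again this is plain Python rather than anything Chef provides: `update_node` is a hypothetical callable that triggers the update on one node and returns a truthy value on success, and the two policies, "stop" and "continue", are just the two simplest choices I can think of. The list itself might be built from the output of a “knife search”.

```python
def rolling_update(ordered_nodes, update_node, on_failure="stop"):
    """Update nodes one at a time, in order, so no two run at once.

    `update_node(name)` is a hypothetical callable; a falsy return value
    or a raised exception counts as a refused update.
    `on_failure` is "stop" (skip all remaining nodes) or "continue".
    Returns a dict mapping each node name to "updated", "failed" or "skipped".
    """
    if on_failure not in ("stop", "continue"):
        raise ValueError("on_failure must be 'stop' or 'continue'")
    results = {}
    for position, node in enumerate(ordered_nodes):
        try:
            ok = bool(update_node(node))
        except Exception:
            ok = False
        if ok:
            results[node] = "updated"
            continue
        results[node] = "failed"
        if on_failure == "stop":
            # Leave the rest of the list untouched for a human to look at.
            for remaining in ordered_nodes[position + 1:]:
                results[remaining] = "skipped"
            break
    return results
```

Stopping on the first failure is the cautious default: if the update itself is broken, you find out after one node rather than after all of them.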
In my experience, in small environments the cost of building the tool suite to audit, to act differently based on the audit result, and to make all of it reliable and robust often exceeds the cost of doing the update by hand and then giving all Chef clients a new role or configuration that automatically performs the update on any hosts that were down at the time and missed it.