Chef-client (solo) run lock


#1

For those of you who schedule chef-client to run on an interval and rely on the client locking so that only one run happens at a time: is the behavior where the client sits and waits to acquire the lock something folks depend on?

At first we assumed that the client would just shut down if it can’t acquire the lock. We now recognize that this is not the case, and there are instances where client runs stack up behind each other to the point where Chef is basically running the whole time.

It’s a manageable problem, but it got me thinking: is there a need for, or interest in, making the behavior configurable? Maybe a command-line option or Chef::Config parameter like:

--run-lock wait or --run-lock terminate

Or in the config:

run_lock "wait" or run_lock "terminate"

The default would be wait, as it is today. With terminate, a client that can’t acquire the lock would just log a message and shut down.

Is this useful to people, or am I missing a reason why we wouldn’t want this behavior?

Was hoping to generate some conversation and see what people think…

Kevin


#2

On Wednesday, October 9, 2013 at 11:26 AM, Moser, Kevin wrote:

It was written this way so that if you run Chef as a daemon and then ssh into a bunch of nodes to run chef-client by hand, the second (manual) run stacks up behind the daemonized run. That is the only way to be sure the desired updates are applied.

If your system has the flock utility, you can implement the behavior you want right now with something like: http://stackoverflow.com/questions/7057234/bash-flock-exit-if-cant-acquire-lock/7057385#7057385
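
For anyone who would rather not depend on the flock utility, the same "exit if the lock is taken" idea can be sketched in Python with a non-blocking `fcntl.flock` call. This is only an illustration under some assumptions: `run_if_unlocked` and the lock file path are hypothetical names, and the lock file belongs to the wrapper, not to chef-client's own run lock. It therefore only keeps wrapper-launched runs from stacking; a run started outside the wrapper is not affected by it.

```python
import fcntl
import subprocess
import sys


def run_if_unlocked(lock_path, command):
    """Run `command` only if an exclusive lock on `lock_path` is free.

    Returns the command's exit code, or None if another process
    already holds the lock (in which case nothing is run).
    """
    # Open in append mode so the file is created if missing
    # and never truncated.
    with open(lock_path, "a") as lock_file:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            print("lock is held, skipping this run", file=sys.stderr)
            return None
        # The lock is held until lock_file is closed at the end
        # of the with block, i.e. for the whole child run.
        return subprocess.run(command).returncode
```

A scheduled job could then call something like `run_if_unlocked("/tmp/chef-wrapper.lock", ["chef-client"])` (the path is an assumption) and treat a `None` result as "a previous run is still going".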


Daniel DeLeo


#3

This would be useful to us. We run the client on an interval, and I have seen the clients stack up.

On Oct 9, 2013, at 11:26 AM, “Moser, Kevin” Kevin.Moser@nordstrom.com wrote:
