Handling "common" errors in a chef run


#1

I’m trying to think of a way to handle “common” errors that my chef
runs generate. Usually it is something related to AWS or opscode
platform and simply re-running chef-client will resolve the issue. I
was thinking that maybe I could create an error handler, check for
errors that can be resolved by running chef-client again, and spawn
another chef-client process prior to exiting. Thoughts?


#2

On Thursday, February 10, 2011 at 7:34 AM, Michael Hale wrote:
> I’m trying to think of a way to handle “common” errors that my chef
> runs generate. Usually it is something related to AWS or opscode
> platform and simply re-running chef-client will resolve the issue. I
> was thinking that maybe I could create an error handler, check for
> errors that can be resolved by running chef-client again, and spawn
> another chef-client process prior to exiting. Thoughts?

There’s an old, unloved ticket for adding retry logic to individual resources, if you’re interested in giving that a go. The original motivation was installing gems from RubyForge back when RubyForge was notoriously unstable, but I think it could help here, too.

If you just want to write a simple exception handler and get on with your life, you could probably just check if the error is in the class of transient errors you described and then use exec to replace the chef process with a different chef process. Chef no longer mangles ARGV, so the original arguments you pass to chef will be there.
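
A rough sketch of that exception-handler approach might look like the following. Treat it as a starting point, not a tested recipe: the list of "transient" error classes, the `CHEF_RERUN_ATTEMPT` environment variable, and the cap of two re-runs are all assumptions made up for illustration (the cap is only there so a persistent failure can't loop forever). It also assumes `$PROGRAM_NAME` points at an executable chef-client script.

```ruby
require "timeout" # Timeout::Error
require "socket"  # SocketError

# Hypothetical list of errors worth a re-run; adjust to what you actually see.
TRANSIENT_ERRORS = [
  Timeout::Error,
  Errno::ECONNRESET,
  Errno::ECONNREFUSED,
  SocketError
].freeze

RETRY_ENV  = "CHEF_RERUN_ATTEMPT" # hypothetical counter passed between runs
MAX_RERUNS = 2                    # assumed cap, not anything Chef defines

# True if the exception is an instance of one of the classes listed above.
def transient_error?(exception)
  TRANSIENT_ERRORS.any? { |klass| exception.is_a?(klass) }
end

# True if the error is transient and the re-run cap has not been reached.
def rerun_allowed?(exception, env = ENV)
  transient_error?(exception) && env.fetch(RETRY_ENV, "0").to_i < MAX_RERUNS
end

# Replace the current process with a fresh one, reusing the original ARGV.
# The [path, argv0] form keeps exec from going through a shell.
def rerun!(env = ENV)
  attempt = env.fetch(RETRY_ENV, "0").to_i + 1
  exec({ RETRY_ENV => attempt.to_s }, [$PROGRAM_NAME, $PROGRAM_NAME], *ARGV)
end

# Only define the handler when Chef is loaded, so this file also loads
# on its own (e.g. in a unit test).
if defined?(Chef::Handler)
  class RerunOnTransientError < Chef::Handler
    def report
      return unless run_status.failed?
      rerun! if rerun_allowed?(run_status.exception)
    end
  end
end
```

You would then register it in your client config with something along the lines of `exception_handlers << RerunOnTransientError.new`. Since `exec` never returns, any handlers that run after this one won't get a chance to report on the failed run, so it probably belongs last in the list.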


Dan DeLeo