Maybe I just need to take some more time to better understand the chef-client cookbook.
We’re currently adding the following recipes to our baseos:
We set the ['chef-client']['interval'] attribute to 86400 (1 day), on the assumption that this makes chef-client converge the node once a day.
Our tests have been on ubuntu-12.04 boxes, which use the init_service by default.
I’m curious to know whether we’re misunderstanding the usage, though it seems fairly straightforward. I’ll continue troubleshooting to see what the issue is.
On Apr 3, 2014, at 4:29 PM, Daniel DeLeo <email@example.com> wrote:
On Tuesday, April 1, 2014 at 1:41 PM, Stewart, Curtis wrote:
We’re currently using the chef-client cookbook to set up a service (init by default) so that chef-client runs at a specified interval. We’re also using the default value of 1800 (seconds) for the interval.
Apparently some of our runs take more than 1800 seconds, which is OK. However, the service then stops running and logs a FATAL error: "FATAL: Chef is already running pid 21697".
I’m going to raise the interval to once a day, but I’m wondering whether there is a way to skip a converge if one is already running. I’d like to avoid future scenarios where the service does not start back up. At this point, we have to restart the service manually whenever it fails.
If you run a single daemonized instance of chef-client, then the length of your run should not matter, because the interval is the time that chef-client sleeps between runs, not how often it tries to start a run. This behavior sounds like you have chef-client running via cron and init at the same time. Did you ever check whether pid 21697 was alive and was a running instance of chef-client?
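
To make the two points above concrete, here is a rough sketch in Python. It is only an illustration and not code from chef-client or the cookbook: `pid_is_alive` and `daemon_loop` are hypothetical names, the check assumes a Unix-like system, and it only tells you that some process holds that pid, not that the process is chef-client.

```python
import errno
import os
import time


def pid_is_alive(pid):
    """Return True if a process with this pid currently exists (Unix only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is delivered
    except OSError as exc:
        # EPERM: the process exists but belongs to another user.
        # ESRCH (or anything else): treat it as not running.
        return exc.errno == errno.EPERM
    return True


def daemon_loop(converge, interval, iterations, sleep=time.sleep):
    """Toy model of a single daemonized client: run, then sleep for `interval`."""
    for _ in range(iterations):
        converge()       # may take longer than `interval`
        sleep(interval)  # the wait starts only after the run has finished
```

Because the sleep begins only after `converge()` returns, a single loop like this can never start a second run on top of the first, however long the run takes. An "already running" error therefore points to a second process being launched some other way. Calling `pid_is_alive(21697)` on the affected node would show whether the pid from the log is still in use.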