For the record, I have created a ticket and offered a fix (http://tickets.opscode.com/browse/CHEF-4010).
From: Grégoire Seux
Sent: Thursday, March 14, 2013 09:29
Subject: RE: [chef] Re: Re: chef client locked
Thanks for both replies.
Indeed, I have reproduced this only when the Chef server is not accessible.
It seems to happen quite often, but I don’t know whether that is due to high latency between the nodes and the server (~250 ms), an oversaturated connection, or Chef server 11.
I’ll wait for the fix then.
From: Paul Mooring [mailto:firstname.lastname@example.org]
Sent: Wednesday, March 13, 2013 18:09
Subject: [chef] Re: Re: chef client locked
This should be the result of loading the node from the server somehow failing. I believe Sascha is working on a proper fix, but in the meantime this shouldn’t happen if you have a connection to the server.
Systems Engineer and Customer Advocate
From: Sascha Bates <firstname.lastname@example.org>
Reply-To: "firstname.lastname@example.org" <firstname.lastname@example.org>
Date: Wednesday, March 13, 2013 10:02 AM
To: "firstname.lastname@example.org" <firstname.lastname@example.org>
Subject: [chef] Re: chef client locked
I can confirm this. I was debugging it earlier this week and have been looking for the time to write the code and submit a pull request instead of just filing a bug report.
Using Chef 11 (11.4.0), I have noticed strange behavior when a run fails: the next run won’t start because of the locking introduced by http://tickets.opscode.com/browse/CHEF-867.
The client log is:
ERROR: Errno::ETIMEDOUT: Error connecting to https://chef03-am5 /nodes/mem02-ty5 - Connection timed out - connect(2)
[2013-03-13T11:40:03+01:00] FATAL: Stacktrace dumped to /var/cache/chef/chef-stacktrace.out
[2013-03-13T11:40:03+01:00] ERROR: Sleeping for 1800 seconds before trying again
[2013-03-13T12:10:04+01:00] INFO: Chef client is running, will wait for it to finish and then run.
I guess this is not the intended effect of the lock. Is this a bug?
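
To illustrate the kind of failure that could produce the log above, here is a minimal Python sketch. It is not Chef's code, and it assumes (without confirmation from the tickets) that the lock is taken before the node is loaded and is released only on the success path. Every name in it (`RunLock`, `load_node`, `run_leaky`, `run_safe`, `attempt`) is hypothetical. It shows a run that fails while holding a lock file, the next run then finding the lock still held, and a variant that releases the lock on every exit path.

```python
import os
import tempfile


class RunLock:
    """Exclusive lock backed by a lock file (hypothetical helper, not Chef's API)."""

    def __init__(self, path):
        self.path = path

    def acquire(self):
        # O_EXCL makes creation fail if the lock file already exists.
        try:
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        os.close(fd)
        return True

    def release(self):
        # Removing the file frees the lock; a missing file is not an error.
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass


def load_node(server_reachable):
    # Stand-in for fetching the node from the server (assumption, not Chef's code).
    if not server_reachable:
        raise TimeoutError("Connection timed out")


def run_leaky(lock, server_reachable):
    """Releases the lock only on success, so a failed run leaves it held."""
    if not lock.acquire():
        return "locked"
    load_node(server_reachable)
    lock.release()
    return "ok"


def run_safe(lock, server_reachable):
    """Releases the lock on every exit path, including exceptions."""
    if not lock.acquire():
        return "locked"
    try:
        load_node(server_reachable)
        return "ok"
    finally:
        lock.release()


def attempt(run, lock, server_reachable):
    # Mimics a daemon loop: note the failure and move on to the next interval.
    try:
        return run(lock, server_reachable)
    except TimeoutError:
        return "failed"


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    for run in (run_leaky, run_safe):
        lock = RunLock(os.path.join(workdir, run.__name__ + ".lock"))
        first = attempt(run, lock, server_reachable=False)
        second = attempt(run, lock, server_reachable=True)
        print(run.__name__, first, second)
```

With the leaky variant, the second attempt finds the lock held even though the server is reachable again, which resembles the "Chef client is running, will wait for it to finish" message. Putting the release in a `finally` block is one common way to avoid that; whether the fix attached to CHEF-4010 takes this form is not stated in this thread.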