Chef client locked


#1

Hello,

Using Chef 11 (11.4.0), I have noticed strange behavior when a run fails: the next run won’t start because of the locking introduced by http://tickets.opscode.com/browse/CHEF-867.

The client log is:


ERROR: Errno::ETIMEDOUT: Error connecting to https://chef03-am5/nodes/mem02-ty5 - Connection timed out - connect(2)
[2013-03-13T11:40:03+01:00] FATAL: Stacktrace dumped to /var/cache/chef/chef-stacktrace.out
[2013-03-13T11:40:03+01:00] ERROR: Sleeping for 1800 seconds before trying again
[2013-03-13T12:10:04+01:00] INFO: Chef client is running, will wait for it to finish and then run.

I guess this is not the intended effect of the lock. Is this a bug?

Cheers,


Grégoire


#2

I can confirm this. I was debugging it earlier this week and have been
looking for the time to write the code to submit a pull request instead of
just submitting a bug report :confused:



#3

This should be the result of loading the node from the server somehow failing. I believe Sascha is working on a proper fix, but in the meantime this shouldn’t happen as long as you have a connection to the server.
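
To illustrate the failure mode described above, here is a minimal sketch. It is not Chef's actual locking code. The idea: if a long-lived daemon process takes a file lock at the start of a run and the run raises before the lock is released, the next run in the same process finds the lock still held. `RunLock`, `guarded_run`, and the lock path are hypothetical names made up for this example, and it assumes a platform such as Linux where `flock` locks belong to the open file handle.

```ruby
require "tmpdir"

# Hypothetical lock wrapper for illustration only (not Chef's implementation).
class RunLock
  def initialize(path)
    @path = path
    @file = nil
  end

  # Try to take an exclusive lock without blocking.
  # Returns true when acquired, false when another handle holds it.
  def acquire
    file = File.open(@path, File::RDWR | File::CREAT, 0o644)
    if file.flock(File::LOCK_EX | File::LOCK_NB)
      @file = file
      true
    else
      file.close
      false
    end
  end

  # Release the lock and close the handle; safe to call more than once.
  def release
    return unless @file
    @file.flock(File::LOCK_UN)
    @file.close
    @file = nil
  end
end

# Run the block under the lock and release it even when the block raises.
def guarded_run(lock)
  raise "a run is already in progress" unless lock.acquire
  begin
    yield
  ensure
    lock.release
  end
end

lock_path = File.join(Dir.tmpdir, "example-run-#{Process.pid}.lock")

# Leaky variant: the run fails after acquiring and nothing releases the lock.
leaky = RunLock.new(lock_path)
leaky.acquire
begin
  raise Errno::ETIMEDOUT, "simulated server timeout"
rescue Errno::ETIMEDOUT
  # The process keeps running (it would sleep here) with the lock still held.
end
blocked_after_leak = !RunLock.new(lock_path).acquire
leaky.release

# Guarded variant: the same failure, but ensure releases the lock.
begin
  guarded_run(RunLock.new(lock_path)) { raise Errno::ETIMEDOUT, "simulated server timeout" }
rescue Errno::ETIMEDOUT
  # The failure is still reported to the caller; only the lock is cleaned up.
end
next_run = RunLock.new(lock_path)
acquired_after_guard = next_run.acquire
next_run.release
File.delete(lock_path)
```

In the leaky variant the second acquisition attempt is refused even though no run is in progress, which matches the "Chef client is running, will wait for it to finish" message in the log. Releasing in an `ensure` block is one way to avoid that; whether the eventual fix takes this form is not something this thread states.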

Paul Mooring
Systems Engineer and Customer Advocate



#4

Thanks for both replies.
Indeed, I have only reproduced this when the Chef server is not reachable.
It seems to happen quite often, but I don’t know whether that is due to high latency between the nodes and the server (~250 ms), a saturated connection, or Chef server 11.
I’ll wait for the fix, then.


Grégoire
