I’m still failing to get chef-client to run on it’s own without failing due
to timeouts. I have 4 CentOS 6.4 nodes in QA, overnight usually 2 will fail
and show as ‘red’ the next day - it runs hourly, so they’ll be down for 10+
hours. I login to the node, stop chef-client, run it manually, and it
works.
Why is it timing out on NTP, and even failing as it retries 4 times in a
row?
Why doesn’t it fix itself on the next run? It seems to fail, and then
never recover.
On Tuesday, November 19, 2013 at 10:13 AM, Phil Cryer wrote:
I'm still failing to get chef-client to run on it's own without failing due to timeouts. I have 4 CentOS 6.4 nodes in QA, overnight usually 2 will fail and show as 'red' the next day - it runs hourly, so they'll be down for 10+ hours. I login to the node, stop chef-client, run it manually, and it works.
Why is it timing out on NTP, and even failing as it retries 4 times in a row?
Why doesn't it fix itself on the next run? It seems to fail, and then never recover.
That’s been fixed in master and should get released with the next Chef 11.x release. You could try building the gem manually and installing it on one of your systems to see if it fixes the problem which would be greatly appreciated.
On Tue, Nov 19, 2013 at 12:20 PM, Daniel DeLeo dan@kallistec.com wrote:
On Tuesday, November 19, 2013 at 10:13 AM, Phil Cryer wrote:
I'm still failing to get chef-client to run on it's own without failing
due to timeouts. I have 4 CentOS 6.4 nodes in QA, overnight usually 2 will
fail and show as 'red' the next day - it runs hourly, so they'll be down
for 10+ hours. I login to the node, stop chef-client, run it manually,
and it works.
Why is it timing out on NTP, and even failing as it retries 4 times in
a row?
Why doesn't it fix itself on the next run? It seems to fail, and then
never recover.
That’s been fixed in master and should get released with the next Chef
11.x release. You could try building the gem manually and installing it on
one of your systems to see if it fixes the problem which would be greatly
appreciated.
I may push a prerelease version a little later today (chatting with the team about this now). If that happens you would just do /opt/chef/embedded/bin/gem install mixlib-shellout —pre (assuming omnibus). I’ll let you know what we decide.
Ok, I've got that in place on my qa01 - I did a chef-client stop, then a ps
to make sure all chef processes were gone, so we'll see how this one goes.
If you do a prerelease I can install it on another node, just let me know -
and thanks.
On Tue, Nov 19, 2013 at 2:06 PM, Daniel DeLeo dan@kallistec.com wrote:
On Tuesday, November 19, 2013 at 11:54 AM, Phil Cryer wrote:
I may push a prerelease version a little later today (chatting with the
team about this now). If that happens you would just do
/opt/chef/embedded/bin/gem install mixlib-shellout —pre (assuming omnibus).
I’ll let you know what we decide.
On Tuesday, November 19, 2013 at 12:30 PM, Phil Cryer wrote:
Ok, I've got that in place on my qa01 - I did a chef-client stop, then a ps to make sure all chef processes were gone, so we'll see how this one goes. If you do a prerelease I can install it on another node, just let me know - and thanks.
We decided to do the prerelease, so you can install that with:
[root@qa02 ~]# /opt/chef/embedded/bin/gem install mixlib-shellout --pre
Fetching: mixlib-shellout-1.3.0.rc.0.gem (100%)
Successfully installed mixlib-shellout-1.3.0.rc.0
1 gem installed
Installing ri documentation for mixlib-shellout-1.3.0.rc.0...
Installing RDoc documentation for mixlib-shellout-1.3.0.rc.0...
On Tue, Nov 19, 2013 at 2:32 PM, Daniel DeLeo dan@kallistec.com wrote:
On Tuesday, November 19, 2013 at 12:30 PM, Phil Cryer wrote:
Ok, I've got that in place on my qa01 - I did a chef-client stop, then a
ps to make sure all chef processes were gone, so we'll see how this one
goes. If you do a prerelease I can install it on another node, just let me
know - and thanks.
We decided to do the prerelease, so you can install that with:
Daniel, sorry to say this still fails with the new mixlib-shellout. It
fails, then fails every other time chef-client runs and never works. I
think it may still be an issue with the yum.pid getting in the way. To fix,
I ssh in and do:
-> stop chef-client
/etc/init.d/chef-client stop
Stopping chef-client: [ OK ]
-> make sure all chef process are done
ps -fe|grep chef
root 14303 14279 0 13:29 pts/0 00:00:00 grep chef
-> remove the stale yum pid
rm /var/run/yum.pid
-> run chef-client
chef-client
And it just works. So, here are the logs from when it was failing, notice
it continued to try every hour...
=====
[2013-11-21T11:48:37-06:00] WARN: Current
template[/opt/splunkforwarder/etc/system/local/outputs.conf]:
/var/chef/cache/cookbooks/logging/recipes/default.rb:51:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for file[/etc/httpd/conf.d/ssl.conf] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous file[/etc/httpd/conf.d/ssl.conf]: /var/chef/cache/cookbooks/apache2/recipes/default.rb:84:in block in
from_file'
[2013-11-21T11:48:37-06:00] WARN: Current
file[/etc/httpd/conf.d/ssl.conf]:
/var/chef/cache/cookbooks/apache2/recipes/mod_ssl.rb:28:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/httpd/ports.conf] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/httpd/ports.conf]: /var/chef/cache/cookbooks/apache2/recipes/default.rb:185:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current template[/etc/httpd/ports.conf]:
/var/chef/cache/cookbooks/apache2/recipes/mod_ssl.rb:34:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/varnish/default.vcl] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/varnish/default.vcl]: /var/chef/cache/cookbooks/chef-varnish/recipes/default.rb:60:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current
template[/etc/varnish/default.vcl]:
/var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:133:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for service[moxi-server] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous service[moxi-server]: /var/chef/cache/cookbooks/couchbase/recipes/moxi.rb:42:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current service[moxi-server]:
/var/chef/cache/cookbooks/couchbase/recipes/moxi.rb:56:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for package[postfix] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous package[postfix]: /var/chef/cache/cookbooks/base/recipes/default.rb:82:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current package[postfix]:
/var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:186:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for package[rsyslog] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous package[rsyslog]: /var/chef/cache/cookbooks/base/recipes/default.rb:90:in block in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current package[rsyslog]:
/var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:189:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/php.ini] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/php.ini]: /var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:96:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current template[/etc/php.ini]:
/var/chef/cache/cookbooks/php/recipes/package.rb:27:in `from_file'
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.profile]
action create (base::default line 14)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.curlrc] action
create (base::default line 20)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.bash_profile]
action create (base::default line 26)
[2013-11-21T11:48:37-06:00] INFO: Processing service[iptables] action
disable (base::default line 32)
[2013-11-21T11:48:37-06:00] INFO: Processing service[iptables] action stop
(base::default line 32)
[2013-11-21T11:48:37-06:00] INFO: Processing
template[/etc/profile.d/proxy.sh] action create (base::default line 35)
[2013-11-21T11:48:37-06:00] INFO: Processing
cookbook_file[/etc/selinux/config] action create (base::default line 41)
[2013-11-21T11:48:37-06:00] INFO: Processing
cookbook_file[/etc/yum/pluginconf.d/fastestmirror.conf] action create
(base::default line 57)
[2013-11-21T11:48:37-06:00] INFO: Processing
cookbook_file[/etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6] action create
(base::default line 63)
[2013-11-21T11:48:37-06:00] INFO: Processing yum_key[RPM-GPG-KEY-EPEL-6]
action add (yum::epel line 22)
[2013-11-21T11:48:37-06:00] INFO: Processing yum_repository[epel] action
add (yum::epel line 27)
[2013-11-21T11:48:37-06:00] INFO: Processing package[ntp] action install
(base::default line 77)
[2013-11-21T12:03:52-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:03:52-06:00] INFO: Retrying execution of package[ntp], 3
attempt(s) left
[2013-11-21T12:19:21-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:19:21-06:00] INFO: Retrying execution of package[ntp], 2
attempt(s) left
[2013-11-21T12:34:50-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:34:50-06:00] INFO: Retrying execution of package[ntp], 1
attempt(s) left
[2013-11-21T12:50:19-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:50:19-06:00] INFO: Retrying execution of package[ntp], 0
attempt(s) left
[2013-11-21T13:05:48-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T13:05:48-06:00] INFO: Running queued delayed notifications
before re-raising exception
[2013-11-21T13:05:48-06:00] ERROR: Running exception handlers
[2013-11-21T13:05:48-06:00] ERROR: Exception handlers complete
[2013-11-21T13:05:48-06:00] FATAL: Stacktrace dumped to
/var/chef/cache/chef-stacktrace.out
[2013-11-21T13:05:48-06:00] ERROR: Chef::Exceptions::ChildConvergeError:
Chef run process exited unsuccessfully (exit code 1)
[2013-11-21T13:05:48-06:00] ERROR: Sleeping for 1800 seconds before trying
again
[root@qa02 ~]# /opt/chef/embedded/bin/gem install mixlib-shellout --pre
Fetching: mixlib-shellout-1.3.0.rc.0.gem (100%)
Successfully installed mixlib-shellout-1.3.0.rc.0
1 gem installed
Installing ri documentation for mixlib-shellout-1.3.0.rc.0...
Installing RDoc documentation for mixlib-shellout-1.3.0.rc.0...
On Tue, Nov 19, 2013 at 2:32 PM, Daniel DeLeo dan@kallistec.com wrote:
On Tuesday, November 19, 2013 at 12:30 PM, Phil Cryer wrote:
Ok, I've got that in place on my qa01 - I did a chef-client stop, then a
ps to make sure all chef processes were gone, so we'll see how this one
goes. If you do a prerelease I can install it on another node, just let me
know - and thanks.
We decided to do the prerelease, so you can install that with:
On Thursday, November 21, 2013 at 11:35 AM, Phil Cryer wrote:
Daniel, sorry to say this still fails with the new mixlib-shellout. It fails, then fails every other time chef-client runs and never works. I think it may still be an issue with the yum.pid getting in the way. To fix, I ssh in and do:
-> stop chef-client
/etc/init.d/chef-client stop
Stopping chef-client: [ OK ]
-> make sure all chef process are done
ps -fe|grep chef
root 14303 14279 0 13:29 pts/0 00:00:00 grep chef
-> remove the stale yum pid
rm /var/run/yum.pid
-> run chef-client
chef-client
And it just works. So, here are the logs from when it was failing, notice it continued to try every hour...
=====
[2013-11-21T11:48:37-06:00] WARN: Current template[/opt/splunkforwarder/etc/system/local/outputs.conf]: /var/chef/cache/cookbooks/logging/recipes/default.rb:51:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for file[/etc/httpd/conf.d/ssl.conf] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous file[/etc/httpd/conf.d/ssl.conf]: /var/chef/cache/cookbooks/apache2/recipes/default.rb:84:in block in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current file[/etc/httpd/conf.d/ssl.conf]: /var/chef/cache/cookbooks/apache2/recipes/mod_ssl.rb:28:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/httpd/ports.conf] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/httpd/ports.conf]: /var/chef/cache/cookbooks/apache2/recipes/default.rb:185:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current template[/etc/httpd/ports.conf]: /var/chef/cache/cookbooks/apache2/recipes/mod_ssl.rb:34:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/varnish/default.vcl] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/varnish/default.vcl]: /var/chef/cache/cookbooks/chef-varnish/recipes/default.rb:60:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current template[/etc/varnish/default.vcl]: /var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:133:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for service[moxi-server] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous service[moxi-server]: /var/chef/cache/cookbooks/couchbase/recipes/moxi.rb:42:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current service[moxi-server]: /var/chef/cache/cookbooks/couchbase/recipes/moxi.rb:56:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for package[postfix] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous package[postfix]: /var/chef/cache/cookbooks/base/recipes/default.rb:82:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current package[postfix]: /var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:186:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for package[rsyslog] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous package[rsyslog]: /var/chef/cache/cookbooks/base/recipes/default.rb:90:in block in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current package[rsyslog]: /var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:189:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/php.ini] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/php.ini]: /var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:96:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current template[/etc/php.ini]: /var/chef/cache/cookbooks/php/recipes/package.rb:27:in `from_file'
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.profile] action create (base::default line 14)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.curlrc] action create (base::default line 20)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.bash_profile] action create (base::default line 26)
[2013-11-21T11:48:37-06:00] INFO: Processing service[iptables] action disable (base::default line 32)
[2013-11-21T11:48:37-06:00] INFO: Processing service[iptables] action stop (base::default line 32)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/etc/profile.d/proxy.sh] action create (base::default line 35)
[2013-11-21T11:48:37-06:00] INFO: Processing cookbook_file[/etc/selinux/config] action create (base::default line 41)
[2013-11-21T11:48:37-06:00] INFO: Processing cookbook_file[/etc/yum/pluginconf.d/fastestmirror.conf] action create (base::default line 57)
[2013-11-21T11:48:37-06:00] INFO: Processing cookbook_file[/etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6] action create (base::default line 63)
[2013-11-21T11:48:37-06:00] INFO: Processing yum_key[RPM-GPG-KEY-EPEL-6] action add (yum::epel line 22)
[2013-11-21T11:48:37-06:00] INFO: Processing yum_repository[epel] action add (yum::epel line 27)
[2013-11-21T11:48:37-06:00] INFO: Processing package[ntp] action install (base::default line 77)
[2013-11-21T12:03:52-06:00] ERROR: /opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py exceeded timeout 900
[2013-11-21T12:03:52-06:00] INFO: Retrying execution of package[ntp], 3 attempt(s) left
[2013-11-21T12:19:21-06:00] ERROR: /opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py exceeded timeout 900
[2013-11-21T12:19:21-06:00] INFO: Retrying execution of package[ntp], 2 attempt(s) left
[2013-11-21T12:34:50-06:00] ERROR: /opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py exceeded timeout 900
[2013-11-21T12:34:50-06:00] INFO: Retrying execution of package[ntp], 1 attempt(s) left
[2013-11-21T12:50:19-06:00] ERROR: /opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py exceeded timeout 900
[2013-11-21T12:50:19-06:00] INFO: Retrying execution of package[ntp], 0 attempt(s) left
[2013-11-21T13:05:48-06:00] ERROR: /opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py exceeded timeout 900
[2013-11-21T13:05:48-06:00] INFO: Running queued delayed notifications before re-raising exception
[2013-11-21T13:05:48-06:00] ERROR: Running exception handlers
[2013-11-21T13:05:48-06:00] ERROR: Exception handlers complete
[2013-11-21T13:05:48-06:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2013-11-21T13:05:48-06:00] ERROR: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
[2013-11-21T13:05:48-06:00] ERROR: Sleeping for 1800 seconds before trying again
=====
Thanks
I would expect these error messages to look different with the new version of mixlib-shellout. Can you run this recipe:
require 'mixlib/shellout/version'
log "the shellout version is #{Mixlib::ShellOut::VERSION}" do
level :warn
end
-> I threw this in the end of one of my recipes:
require 'mixlib/shellout/version'
log "the shellout version is #{Mixlib::ShellOut::VERSION}" do
level :warn
end
-> ran it, and the log says:
log[the shellout version is 1.3.0.rc.0] action
write[2013-11-21T19:42:11-06:00] WARN: the shellout version is 1.3.0.rc.0
happy to test it again, or another gem, whatever - just let me know. Thanks
On Thu, Nov 21, 2013 at 1:51 PM, Daniel DeLeo dan@kallistec.com wrote:
On Thursday, November 21, 2013 at 11:35 AM, Phil Cryer wrote:
Daniel, sorry to say this still fails with the new mixlib-shellout. It
fails, then fails every other time chef-client runs and never works. I
think it may still be an issue with the yum.pid getting in the way. To fix,
I ssh in and do:
-> stop chef-client
/etc/init.d/chef-client stop
Stopping chef-client: [ OK ]
-> make sure all chef process are done
ps -fe|grep chef
root 14303 14279 0 13:29 pts/0 00:00:00 grep chef
-> remove the stale yum pid
rm /var/run/yum.pid
-> run chef-client
chef-client
And it just works. So, here are the logs from when it was failing, notice
it continued to try every hour...
=====
[2013-11-21T11:48:37-06:00] WARN: Current
template[/opt/splunkforwarder/etc/system/local/outputs.conf]:
/var/chef/cache/cookbooks/logging/recipes/default.rb:51:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for file[/etc/httpd/conf.d/ssl.conf] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous file[/etc/httpd/conf.d/ssl.conf]: /var/chef/cache/cookbooks/apache2/recipes/default.rb:84:in block in
from_file'
[2013-11-21T11:48:37-06:00] WARN: Current
file[/etc/httpd/conf.d/ssl.conf]:
/var/chef/cache/cookbooks/apache2/recipes/mod_ssl.rb:28:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/httpd/ports.conf] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/httpd/ports.conf]: /var/chef/cache/cookbooks/apache2/recipes/default.rb:185:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current
template[/etc/httpd/ports.conf]:
/var/chef/cache/cookbooks/apache2/recipes/mod_ssl.rb:34:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/varnish/default.vcl] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/varnish/default.vcl]: /var/chef/cache/cookbooks/chef-varnish/recipes/default.rb:60:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current
template[/etc/varnish/default.vcl]:
/var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:133:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for service[moxi-server] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous service[moxi-server]: /var/chef/cache/cookbooks/couchbase/recipes/moxi.rb:42:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current service[moxi-server]:
/var/chef/cache/cookbooks/couchbase/recipes/moxi.rb:56:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for package[postfix] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous package[postfix]: /var/chef/cache/cookbooks/base/recipes/default.rb:82:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current package[postfix]:
/var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:186:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for package[rsyslog] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous package[rsyslog]: /var/chef/cache/cookbooks/base/recipes/default.rb:90:in block in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current package[rsyslog]:
/var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:189:in from_file' [2013-11-21T11:48:37-06:00] WARN: Cloning resource attributes for template[/etc/php.ini] from prior resource (CHEF-3694) [2013-11-21T11:48:37-06:00] WARN: Previous template[/etc/php.ini]: /var/chef/cache/cookbooks/app-savviscom/recipes/default.rb:96:in from_file'
[2013-11-21T11:48:37-06:00] WARN: Current template[/etc/php.ini]:
/var/chef/cache/cookbooks/php/recipes/package.rb:27:in `from_file'
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.profile]
action create (base::default line 14)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.curlrc]
action create (base::default line 20)
[2013-11-21T11:48:37-06:00] INFO: Processing template[/root/.bash_profile]
action create (base::default line 26)
[2013-11-21T11:48:37-06:00] INFO: Processing service[iptables] action
disable (base::default line 32)
[2013-11-21T11:48:37-06:00] INFO: Processing service[iptables] action stop
(base::default line 32)
[2013-11-21T11:48:37-06:00] INFO: Processing
template[/etc/profile.d/proxy.sh] action create (base::default line 35)
[2013-11-21T11:48:37-06:00] INFO: Processing
cookbook_file[/etc/selinux/config] action create (base::default line 41)
[2013-11-21T11:48:37-06:00] INFO: Processing
cookbook_file[/etc/yum/pluginconf.d/fastestmirror.conf] action create
(base::default line 57)
[2013-11-21T11:48:37-06:00] INFO: Processing
cookbook_file[/etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6] action create
(base::default line 63)
[2013-11-21T11:48:37-06:00] INFO: Processing yum_key[RPM-GPG-KEY-EPEL-6]
action add (yum::epel line 22)
[2013-11-21T11:48:37-06:00] INFO: Processing yum_repository[epel] action
add (yum::epel line 27)
[2013-11-21T11:48:37-06:00] INFO: Processing package[ntp] action install
(base::default line 77)
[2013-11-21T12:03:52-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:03:52-06:00] INFO: Retrying execution of package[ntp], 3
attempt(s) left
[2013-11-21T12:19:21-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:19:21-06:00] INFO: Retrying execution of package[ntp], 2
attempt(s) left
[2013-11-21T12:34:50-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:34:50-06:00] INFO: Retrying execution of package[ntp], 1
attempt(s) left
[2013-11-21T12:50:19-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T12:50:19-06:00] INFO: Retrying execution of package[ntp], 0
attempt(s) left
[2013-11-21T13:05:48-06:00] ERROR:
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.6.0/lib/chef/provider/package/yum-dump.py
exceeded timeout 900
[2013-11-21T13:05:48-06:00] INFO: Running queued delayed notifications
before re-raising exception
[2013-11-21T13:05:48-06:00] ERROR: Running exception handlers
[2013-11-21T13:05:48-06:00] ERROR: Exception handlers complete
[2013-11-21T13:05:48-06:00] FATAL: Stacktrace dumped to
/var/chef/cache/chef-stacktrace.out
[2013-11-21T13:05:48-06:00] ERROR: Chef::Exceptions::ChildConvergeError:
Chef run process exited unsuccessfully (exit code 1)
[2013-11-21T13:05:48-06:00] ERROR: Sleeping for 1800 seconds before trying
again
Thanks
I would expect these error messages to look different with the new version
of mixlib-shellout. Can you run this recipe:
require 'mixlib/shellout/version'
log "the shellout version is #{Mixlib::ShellOut::VERSION}" do
level :warn
end
On Thursday, November 21, 2013 at 5:47 PM, Phil Cryer wrote:
-> I threw this in the end of one of my recipes:
require 'mixlib/shellout/version'
log "the shellout version is #{Mixlib::ShellOut::VERSION}" do
level :warn
end
-> ran it, and the log says:
log[the shellout version is 1.3.0.rc.0] action write[2013-11-21T19:42:11-06:00] WARN: the shellout version is 1.3.0.rc.0
happy to test it again, or another gem, whatever - just let me know. Thanks
On Thursday, November 21, 2013 at 11:35 AM, Phil Cryer wrote:
Daniel, sorry to say this still fails with the new mixlib-shellout. It fails, then fails every other time chef-client runs and never works. I think it may still be an issue with the yum.pid getting in the way. To fix, I ssh in and do:
-> stop chef-client
/etc/init.d/chef-client stop
Stopping chef-client: [ OK ]
-> make sure all chef process are done
ps -fe|grep chef
root 14303 14279 0 13:29 pts/0 00:00:00 grep chef
-> remove the stale yum pid
rm /var/run/yum.pid
-> run chef-client
chef-client
And it just works. So, here are the logs from when it was failing, notice it continued to try every hour...
So is there a live process that has the yum lock? Or it’s truly stale and the pid in the file is a dead process? Also is this machine tight on RAM, and is there anything in the OOM killer log? At first glance I can’t think of what chef could do automatically for that case. Perhaps it could read the pid file and check if the process is alive, but that seems fraught with peril and prone to race conditions.
In any case if there are not any stale yum-dump.py processes hanging around then this behavior isn’t caused by MIXLIB-16, so I might have sent you on a wild goose chase.
so nothing I can see is locking yum, assuming it's something chef is
kicking off. What I do (when things aren't working) is login, stop
chef-client, kill any chef processes hanging around and make sure there's
no /var/run/yum.pid. Then I manually run chef-client, it works fine. I do
it again to be sure it can still work, that's fine, then I loggout. hours
later, it's stuck, in this condition again. This is across 4 VMware VMs,
each are just running apache and php, each has 2Gig RAM. They're CentOS
6.4. I want to get it fixed because I want chef-client to check in
regularly so I have all green when I run knife status. Anything else you
can think of to debug what's happening to make it fail (seemingly)
randomly? Thanks
On Thu, Nov 21, 2013 at 8:02 PM, Daniel DeLeo dan@kallistec.com wrote:
On Thursday, November 21, 2013 at 5:47 PM, Phil Cryer wrote:
-> I threw this in the end of one of my recipes:
require 'mixlib/shellout/version'
log "the shellout version is #{Mixlib::ShellOut::VERSION}" do
level :warn
end
-> ran it, and the log says:
log[the shellout version is 1.3.0.rc.0] action
write[2013-11-21T19:42:11-06:00] WARN: the shellout version is 1.3.0.rc.0
happy to test it again, or another gem, whatever - just let me know. Thanks
On Thu, Nov 21, 2013 at 1:51 PM, Daniel DeLeo dan@kallistec.com wrote:
On Thursday, November 21, 2013 at 11:35 AM, Phil Cryer wrote:
Daniel, sorry to say this still fails with the new mixlib-shellout. It
fails, then fails every other time chef-client runs and never works. I
think it may still be an issue with the yum.pid getting in the way. To fix,
I ssh in and do:
-> stop chef-client
/etc/init.d/chef-client stop
Stopping chef-client: [ OK ]
-> make sure all chef process are done
ps -fe|grep chef
root 14303 14279 0 13:29 pts/0 00:00:00 grep chef
-> remove the stale yum pid
rm /var/run/yum.pid
-> run chef-client
chef-client
And it just works. So, here are the logs from when it was failing, notice
it continued to try every hour...
So is there a live process that has the yum lock? Or it’s truly stale and
the pid in the file is a dead process? Also is this machine tight on RAM,
and is there anything in the OOM killer log? At first glance I can’t think
of what chef could do automatically for that case. Perhaps it could read
the pid file and check if the process is alive, but that seems fraught with
peril and prone to race conditions.
In any case if there are not any stale yum-dump.py processes hanging
around then this behavior isn’t caused by MIXLIB-16, so I might have sent
you on a wild goose chase.
On Friday, November 22, 2013 at 10:09 AM, Phil Cryer wrote:
so nothing I can see is locking yum, assuming it's something chef is kicking off. What I do (when things aren't working) is login, stop chef-client, kill any chef processes hanging around and make sure there's no /var/run/yum.pid. Then I manually run chef-client, it works fine. I do it again to be sure it can still work, that's fine, then I loggout. hours later, it's stuck, in this condition again. This is across 4 VMware VMs, each are just running apache and php, each has 2Gig RAM. They're CentOS 6.4. I want to get it fixed because I want chef-client to check in regularly so I have all green when I run knife status. Anything else you can think of to debug what's happening to make it fail (seemingly) randomly? Thanks
But the constant here is that there’s always a /var/run/yum.pid and there is no process matching that pid (or it’s been recycled)? If that’s the case, the next step is to figure out what process that is, and why it’s dying without cleaning up after itself.
On Thursday, November 21, 2013 at 5:47 PM, Phil Cryer wrote:
-> I threw this in the end of one of my recipes:
require 'mixlib/shellout/version'
log "the shellout version is #{Mixlib::ShellOut::VERSION}" do
level :warn
end
-> ran it, and the log says:
log[the shellout version is 1.3.0.rc.0] action write[2013-11-21T19:42:11-06:00] WARN: the shellout version is 1.3.0.rc.0
happy to test it again, or another gem, whatever - just let me know. Thanks
On Thursday, November 21, 2013 at 11:35 AM, Phil Cryer wrote:
Daniel, sorry to say this still fails with the new mixlib-shellout. It fails, then fails every other time chef-client runs and never works. I think it may still be an issue with the yum.pid getting in the way. To fix, I ssh in and do:
-> stop chef-client
/etc/init.d/chef-client stop
Stopping chef-client: [ OK ]
-> make sure all chef process are done
ps -fe|grep chef
root 14303 14279 0 13:29 pts/0 00:00:00 grep chef
-> remove the stale yum pid
rm /var/run/yum.pid
-> run chef-client
chef-client
And it just works. So, here are the logs from when it was failing, notice it continued to try every hour...
So is there a live process that has the yum lock? Or it’s truly stale and the pid in the file is a dead process? Also is this machine tight on RAM, and is there anything in the OOM killer log? At first glance I can’t think of what chef could do automatically for that case. Perhaps it could read the pid file and check if the process is alive, but that seems fraught with peril and prone to race conditions.
In any case if there are not any stale yum-dump.py processes hanging around then this behavior isn’t caused by MIXLIB-16, so I might have sent you on a wild goose chase.