Segfault with CentOS 5.4/5.5 + ruby 1.8.7 + chef 0.10.0

On Thu, Jun 9, 2011 at 5:12 PM, Daniel DeLeo dan@kallistec.com wrote:

On Thursday, June 9, 2011 at 5:55 AM, Sergio Rubio wrote:

On Thu, Jun 9, 2011 at 12:44 PM, Sergio Rubio <rubiojr@frameos.org> wrote:

I've opened a bug in case anyone is interested:

Backport #4856: Random segfaults running opscode chef-client 0.10 - Backport187 - Ruby Issue Tracking System

Let's hope we can get some assistance from devs.

Rgds.

I've made some progress tracing this. I've replaced the popen4 call in the
yum provider with simple shell quotes, and my previous test no longer
crashes ruby.

If you are interested in testing the fix (albeit incomplete), the yum.rb
provider patch is here:

Fixes ruby 1.8.7 segfaulting in CentOS 5.6 i386 · GitHub

I have not had a chance to test Matthew Kent's patches, but I'd like to have
a look at them if time permits. If required, we can patch the RBEL RPMs
rather than wait for the 0.10.2 release.

Rgds.
I ran into similar issues when I developed Chef::ShellOut, which I found
were caused by object allocation during GC. Since I did not have the option
of forcing people to upgrade to a ruby without the issue, I disabled GC for
the affected portion of the code. You could try replacing popen4 with
shell_out and see if this fixes the issue.

https://github.com/opscode/chef/blob/master/chef/lib/chef/shell_out.rb
https://github.com/opscode/chef/blob/master/chef/lib/chef/shell_out/unix.rb
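
The workaround described above (turning GC off around the affected code and turning it back on afterwards) might look roughly like the sketch below. This is only an illustration: `without_gc` is a hypothetical helper name, not something taken from Chef, and the actual shell_out code linked above may be structured differently.

```ruby
# Hypothetical helper (not part of Chef): run a block with the garbage
# collector switched off, then restore the previous GC state.
def without_gc
  # GC.disable returns true if GC was already disabled before this call.
  was_disabled = GC.disable
  yield
ensure
  # Re-enable GC only if this helper was the one that disabled it,
  # even when the block raises.
  GC.enable unless was_disabled
end

# Example use: spawn a child process while GC is off.
output = without_gc { `echo hello` }
puts output
```

The `ensure` clause matters here: if the command fails and raises, GC is still re-enabled, so the process does not keep running with collection turned off.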

Awesome.

Disabling GC and re-enabling it after the loop also fixes the segfault. I'm
not sure whether playing with GC is better than replacing popen4 with
something else, as I don't know the code base at all.

If you guys feel like both approaches are valid, I can add a patch to the
Chef RPM to alleviate some of the pain the CentOS users are dealing with.

Rgds.

--

Dan DeLeo

On Thursday, June 9, 2011 at 8:32 AM, Sergio Rubio wrote:

Disabling GC and re-enabling it after the loop also fixes the segfault. I'm not sure whether playing with GC is better than replacing popen4 with something else, as I don't know the code base at all.

If you guys feel like both approaches are valid, I can add a patch to the Chef RPM to alleviate some of the pain the CentOS users are dealing with.

Rgds.
Yeah, that would be great. I'm in favor of switching popen4 to shell_out if you can since the API is cleaner and it has built-in support for nice error messages when a command fails as well as live updating of output to a tty in some conditions.

Feel free to hop on to #chef-hacking on freenode.net (http://freenode.net) if you have development questions, or you can mail the chef-dev list as well.

Dan DeLeo
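
To illustrate the kind of error reporting mentioned above, here is a rough sketch of a command runner that captures output and raises a descriptive error when the command fails. It is not Chef::ShellOut and does not use its API: `run_checked` and `CommandFailed` are hypothetical names, and it relies on `Open3.capture3` from the standard library of Ruby 1.9 and later, so it would not run on the 1.8.7 interpreter discussed in this thread.

```ruby
require "open3"

# Hypothetical error class for this sketch (not a Chef class).
class CommandFailed < StandardError; end

# Hypothetical helper: run a command, return its stdout, and raise an
# error that includes the exit status and captured output on failure.
def run_checked(command)
  stdout, stderr, status = Open3.capture3(command)
  unless status.success?
    raise CommandFailed,
          "`#{command}` exited with #{status.exitstatus}\n" \
          "STDOUT: #{stdout}\nSTDERR: #{stderr}"
  end
  stdout
end

puts run_checked("echo ok")
```

A caller gets the command's output on success and a single exception carrying the exit status, stdout, and stderr on failure, instead of having to inspect pipes and `$?` by hand.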


On Thu, Jun 9, 2011 at 5:36 PM, Daniel DeLeo dan@kallistec.com wrote:

On Thursday, June 9, 2011 at 8:32 AM, Sergio Rubio wrote:

If you guys feel like both approaches are valid, I can add a patch to the
Chef RPM to alleviate some of the pain the CentOS users are dealing with.

Rgds.
Yeah, that would be great. I'm in favor of switching popen4 to shell_out if
you can since the API is cleaner and it has built-in support for nice error
messages when a command fails as well as live updating of output to a tty in
some conditions.

Feel free to hop on to #chef-hacking on freenode.net (http://freenode.net)
if you have development questions, or you can mail the chef-dev list as
well.

Alright. I've modified the patch to use shell_out:

Fixes ruby 1.8.7 segfaulting in CentOS 5.6 i386 · GitHub

And the full yum.rb file in case anyone wants to test it:

patched chef yum.rb provider · GitHub

(replaces
/usr/lib/ruby/gems/1.8/gems/chef-0.10.0/lib/chef/provider/package/yum.rb)

My plan is to do some more testing and, if everything goes well, push an
updated RPM soon after that.

I've seen that there are a lot of changes coming to the yum provider in the
next release, so I guess submitting a pull request is not necessary, right?

Rgds.


On Thursday, June 9, 2011 at 9:07 AM, Sergio Rubio wrote:

Alright. I've modified the patch to use shell_out:

Fixes ruby 1.8.7 segfaulting in CentOS 5.6 i386 · GitHub

And the full yum.rb file in case anyone wants to test it:

patched chef yum.rb provider · GitHub

(replaces /usr/lib/ruby/gems/1.8/gems/chef-0.10.0/lib/chef/provider/package/yum.rb)

My plan is to do some more testing and, if everything goes well, push an updated RPM soon after that.

I've seen that there are a lot of changes coming to the yum provider in the next release, so I guess submitting a pull request is not necessary, right?

Rgds.

It would be best if you could create a ticket on our JIRA, tickets.opscode.com. Let me know the ticket number and I'll fast-track it for this release, since this is a blocking issue for many of you.

--
Dan DeLeo


On Thu, Jun 9, 2011 at 6:12 PM, Daniel DeLeo dan@kallistec.com wrote:

It would be best if you could create a ticket on our JIRA,
tickets.opscode.com. Let me know the ticket number and I'll fast-track it
for this release, since this is a blocking issue for many of you.

Sure, no prob: CHEF-2413 http://tickets.opscode.com/browse/CHEF-2413

Thanks.


On Fri, Jun 10, 2011 at 9:37 AM, Sergio Rubio rubiojr@frameos.org wrote:

On Thu, Jun 9, 2011 at 6:12 PM, Daniel DeLeo dan@kallistec.com wrote:

It would be best if you could create a ticket on our JIRA,
tickets.opscode.com. Let me know the ticket number and I'll fast-track it
for this release, since this is a blocking issue for many of you.

Sure, no prob: CHEF-2413 http://tickets.opscode.com/browse/CHEF-2413

Thanks.

Fix is now available in RBEL testing:

yum upgrade rubygem-chef --enablerepo rbel5-testing

[root@localhost ~]# rpm -qa|grep rubygem-chef
rubygem-chef-0.10.0-5.el5
