Hi, James,
Thank you for this great idea.
I experimented with different builds for chef 0.10:
ruby version [1.8.7-334]
ruby install type [package]
ruby source [own, based on rbel]
CentOS version [5.5]
CentOS source [iso]
Is prelink enabled? [1]
What resource bombed? Our own resource for managing zabbix
Does the error ever occur on the first run? No
When does the first error tend to occur? It isn't predictable; chef
can run for two hours or two days before it happens
After the error occurs, do future runs work? Yes, but the segfault recurs
after some time
Chef segfault message:
/usr/lib/ruby/gems/1.8/gems/chef-0.10.2/bin/../lib/chef/shell_out/unix.rb:212:
[BUG] Segmentation fault
ruby 1.8.7 (2011-02-18 patchlevel 334) [x86_64-linux]
ruby version [1.8.7-334]
ruby install type [package]
ruby source [own, based on rbel]
CentOS version [5.5]
CentOS source [iso]
Is prelink enabled? [1]
What resource bombed? Our own resource for managing sysctl
Does the error ever occur on the first run? No
When does the first error tend to occur? It isn't predictable; chef
can run for two hours or two days before it happens
After the error occurs, do future runs work? Yes, but the segfault recurs
after some time
Chef segfault message:
/usr/lib/ruby/gems/1.8/gems/chef-0.10.2/bin/../lib/chef/shell_out/unix.rb:22:
[BUG] Segmentation fault
ruby 1.8.7 (2011-02-18 patchlevel 334) [x86_64-linux]
ruby version [ree-1.8.7-2011.03]
ruby install type [package]
ruby source [own, with standard install script, installed to /opt]
CentOS version [5.5]
CentOS source [iso]
Is prelink enabled? [1]
Everything works fine. It was tested for about a week.
ruby version [ree-1.8.7-2011.03]
ruby install type [package]
ruby source [own, based on rbel specs, replaces system ruby]
CentOS version [5.5]
CentOS source [iso]
Is prelink enabled? [1]
What resource bombed? resource yum
Does the error ever occur on the first run? Yes
When does the first error tend to occur?
After the error occurs, do future runs work? Yes
Chef segfault message:
/usr/lib/ruby/gems/1.8/gems/chef-0.10.0/bin/../lib/chef/provider/package/yum.rb:555:
[BUG] Segmentation fault
ruby 1.8.7 (2011-02-18 patchlevel 334) [x86_64-linux], MBARI 0x6770,
Ruby Enterprise Edition 2011.03
ruby version [1.8.7-352]
ruby install type [package]
ruby source [own, based on rbel]
CentOS version [5.5]
CentOS source [iso]
Is prelink enabled? [1]
What resource bombed? service resource
Does the error ever occur on the first run? No
When does the first error tend to occur? It isn't predictable; chef
can run for two hours or two days before it happens
After the error occurs, do future runs work? Yes, but the segfault recurs
after some time
Chef segfault message:
/usr/lib/ruby/gems/1.8/gems/chef-0.10.0/bin/../lib/chef/mixin/command/unix.rb:190:
[BUG] Segmentation fault
ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
ruby version [1.8.7-352]
ruby install type [package]
ruby source [aegisco]
CentOS version [5.5]
CentOS source [iso]
Is prelink enabled? [1]
What resource bombed? It crashed during the start phase, at "Sending HTTP
Request via GET to api.opscode.com:443/organizations/****/search/node"
Does the error ever occur on the first run? No
When does the first error tend to occur? It isn't predictable; chef
can run for two hours or two days before it happens
After the error occurs, do future runs work? Yes, but the segfault recurs
after some time
Chef segfault message:
/usr/lib/ruby/1.8/net/protocol.rb:135: [BUG] Segmentation fault
ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
I hope this information helps somebody explain the problems.
It seems the crashes have several different causes, depending on the build.
On 11 August 2011 00:07, James js@aegisco.com wrote:
That last question was phrased awkwardly; let's expand it to three:
ruby version [e.g., 1.8.7-352]
ruby install type [e.g., package, source]
ruby source [e.g., ruby.org, aegisco, rbel]
CentOS version [e.g., 5.5, 5.6]
CentOS source [e.g., AMI, iso, veewee]
Is prelink enabled? [1]
What resource bombed?
Does the error ever occur on the first run?
When does the first error tend to occur?
After the error occurs, do future runs work?
On Wed, Aug 10, 2011 at 12:45 PM, James js@aegisco.com wrote:
Thanks very much for this, I don't know if I would have found it...
I have to say that I'm confounded; we need more information. I'm not able
to reproduce these problems on the public CentOS AMIs I have used for
testing, and I don't have CentOS systems in production. I believe the most
important information we need is:
ruby version [e.g., 1.8.7-352]
ruby install type [e.g., package, source]
ruby source [e.g., ruby.org, aegisco, rbel]
CentOS version [e.g., 5.5, 5.6]
CentOS source [e.g., AMI, iso, veewee]
Is prelink enabled? [1]
What resource bombed?
Does the error never occur on the first run, but occur intermittently on
later runs?
It would also be ideal to get traces for these segfaults.
James
[1] Beware prelink and compiling ruby from source - Ruby - Ruby-Forum
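One way to capture more than the one-line [BUG] message is to let the crashing ruby process leave a core file that can be inspected afterwards. The following is a minimal sketch, assuming the interpreter aborts on the segfault and that the hard core-size limit on the node is above zero; the helper name allow_core_dumps is hypothetical and not part of chef.

```ruby
# Sketch only: raise this process's soft core-file limit to the hard limit
# so that a crash can leave a core file for later inspection (e.g. with gdb).
# Child processes started afterwards inherit the limit.
def allow_core_dumps
  _soft, hard = Process.getrlimit(Process::RLIMIT_CORE)
  Process.setrlimit(Process::RLIMIT_CORE, hard, hard)
  hard
end

hard = allow_core_dumps
if hard == 0
  warn "hard core limit is 0; core files stay disabled for this process"
else
  limit = hard == Process::RLIM_INFINITY ? "unlimited" : "#{hard} bytes"
  puts "core files enabled (limit: #{limit})"
end
```

A wrapper script could call this before starting the chef run, so that the limit applies to the run itself.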
On Wed, Aug 10, 2011 at 12:14 PM, John E. Vincent (lusis)
lusis.org+chef-list@gmail.com wrote:
There's a bug on RHEL5 with prelinking. I think it was x86_64 only but
I'm not positive at this point.
I had an RPM of 1.9.2 that I built monolithically and packaged with fpm.
It installed in the system path (we explicitly didn't install RHEL ruby
at all).
Every morning at 4AM all our ruby-based cronjobs and chef itself would
stop working. Prelink was the only cronjob of importance that ran at
that time, so a bit of googling led to this (iirc):
http://www.tsheffler.com/blog/?p=491
I opted to just remove prelink from the system since this isn't the
first time I've seen it cause issues. In your case, you'd probably add
the exclusion as part of the RPM.
It's easy enough to test: uninstall or disable the prelink cron job and
see if it still happens.
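To check this on a given node, something like the following Ruby sketch could report whether prelinking is switched on and whether a ruby binary is excluded from it. It assumes the stock RHEL/CentOS 5 layout, where /etc/sysconfig/prelink carries a PRELINKING=yes or PRELINKING=no setting and /etc/prelink.conf lists excluded paths as "-b <path>" lines. The helper names and the /usr/bin/ruby path are illustrative only, and glob patterns in prelink.conf are not handled.

```ruby
# Sketch only: file locations and formats are assumptions based on a stock
# RHEL/CentOS 5 install; adjust them for your own systems.

# Returns true when the last active PRELINKING= line in the text says "yes".
def prelinking_enabled?(sysconfig_text)
  enabled = false
  sysconfig_text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    key, value = line.split("=", 2)
    next unless key.strip == "PRELINKING"
    enabled = value.to_s.strip.delete(%q{"'}).downcase == "yes"
  end
  enabled
end

# Returns true when the prelink.conf text excludes the given path, or a
# directory containing it, with a "-b" entry.
def prelink_blacklisted?(prelink_conf_text, path)
  prelink_conf_text.each_line.any? do |line|
    flag, entry = line.strip.split(/\s+/, 2)
    next false unless flag == "-b" && entry
    entry = entry.strip
    path == entry || path.start_with?(entry.chomp("/") + "/")
  end
end

if __FILE__ == $PROGRAM_NAME
  read = lambda { |file| File.exist?(file) ? File.read(file) : "" }
  sysconfig = read.call("/etc/sysconfig/prelink")
  conf      = read.call("/etc/prelink.conf")
  puts "prelinking enabled: #{prelinking_enabled?(sysconfig)}"
  puts "ruby excluded:      #{prelink_blacklisted?(conf, '/usr/bin/ruby')}"
end
```

If prelinking is enabled and the ruby binary is not excluded, that node is a candidate for the test described above.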
On Wed, Aug 10, 2011 at 2:44 PM, James js@aegisco.com wrote:
John,
Please elaborate?
Thanks,
James
On Wed, Aug 10, 2011 at 5:07 AM, John E. Vincent (lusis)
lusis.org+chef-list@gmail.com wrote:
After a day? Sounds like the prelinking bug.
On Aug 10, 2011 6:04 AM, "Titov Alexander" titoff.a@gmail.com wrote:
Hi James,
The problem still persists:
[Tue, 09 Aug 2011 15:13:15 +0000] DEBUG: Sending HTTP Request via GET
to api.opscode.com:443/organizations/qik/search/node
/usr/lib/ruby/1.8/net/protocol.rb:135: [BUG] Segmentation fault
ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
It happens after chef has been running for a day. The new binaries didn't
help =(.
On 8 August 2011 22:25, James js@aegisco.com wrote:
Titov,
We use different build methods, but largely the same spec files, for all
of the dependencies. Some of the versions of dependencies are different.
The spec files for the rubygems are going to be different as well. Sergio
uses gem2rpm whereas I use fpm.
These have so far only been tested on my own test systems, I would really
appreciate other people testing them because I'll move them to the stable
repo and start building out the rest of the Server dependencies, as well
as builds for el6 and potentially FC.
James
On Mon, Aug 8, 2011 at 4:50 AM, Titov Alexander titoff.a@gmail.com wrote:
What is the difference between the spec files for your ruby rpm and
RBEL's? How long did you test these rpms for segfaults?
On 6 August 2011 01:06, James js@aegisco.com wrote:
All of the el5 i386 and x86_64 ruby, chef, and chef dependency packages
have been rebuilt, along with dependencies for 0.10.4/5. These are
currently in the testing.aegisco.com repo, and will be moved to the stable
one after vetting.
The rubygem-chef 0.10.5 package itself is held up on what I think is an
incompatibility between rubygem's pessimistic version constraint [1] and
rpm's lack of support for this feature. If someone can confirm this and
write a translation layer for fpm, that would be wonderful. In the
interim, I'll build these rpms by hand with modifications to the
fpm-generated spec files.
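For illustration, such a translation could be sketched roughly as follows. This is not fpm's code or API: the function names are hypothetical, and the sketch assumes the usual reading of the pessimistic operator, where "~> 1.2.3" means ">= 1.2.3" and "< 1.3". Only purely numeric versions are handled.

```ruby
# Sketch only: expands a rubygems pessimistic constraint ("~> x.y.z") into
# the pair of plain comparisons that an rpm spec file can express.
# Prerelease or otherwise non-numeric versions are rejected.
def expand_pessimistic(constraint)
  match = constraint.strip.match(/\A~>\s*(\d+(?:\.\d+)*)\z/)
  unless match
    raise ArgumentError, "not a numeric ~> constraint: #{constraint.inspect}"
  end

  lower = match[1]
  segments = lower.split(".").map { |s| s.to_i }
  segments.pop if segments.size > 1   # drop the last segment...
  segments[-1] += 1                   # ...and bump the one before it
  upper = segments.join(".")

  [">= #{lower}", "< #{upper}"]
end

# Renders rpm-style "Requires:" lines. The rubygem-<name> naming scheme
# follows the package list in this thread; the gem name passed in is up to
# the caller.
def rpm_requires(gem_name, constraint)
  expand_pessimistic(constraint).map do |comparison|
    "Requires: rubygem-#{gem_name} #{comparison}"
  end
end

# Hypothetical dependency, used only to show the output shape.
puts rpm_requires("example", "~> 1.2.3")
```

The two resulting lines would replace the single "~>" dependency in a generated spec file.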
I have tested the i386 install for the segfaulting issues we were seeing
previously, and they seem to be resolved. Further confirmation on this
would be helpful as well.
Example bootstrap:
https://gist.github.com/1128513
List of packages:
autoconf-2.68-2
flex-2.5.35-7
gecode-3.5.0-1
gecode-devel-3.5.0-1
gecode-doc-3.5.0-1
gecode-examples-3.5.0-1
m4-1.4.16-2
ruby-1.8.7.352-1
ruby-devel-1.8.7.352-1
ruby-irb-1.8.7.352-1
ruby-libs-1.8.7.352-1
ruby-rdoc-1.8.7.352-1
ruby-ri-1.8.7.352-1
ruby-static-1.8.7.352-1
ruby-tcltk-1.8.7.352-1
rubygem-bunny-0.7.1
rubygem-bunny-0.7.4
rubygem-chef-0.10.2
rubygem-erubis-2.7.0
rubygem-highline-1.6.2
rubygem-json-1.5.2
rubygem-json-1.5.3
rubygem-mime-types-1.16
rubygem-mixlib-authentication-1.1.4
rubygem-mixlib-cli-1.2.0
rubygem-mixlib-config-1.1.2
rubygem-mixlib-log-1.3.0
rubygem-moneta-0.6.0
rubygem-net-ssh-2.1.4
rubygem-net-ssh-gateway-1.1.0
rubygem-net-ssh-multi-1.0.1
rubygem-net-ssh-multi-1.1
rubygem-ohai-0.6.4
rubygem-polyglot-0.3.1
rubygem-polyglot-0.3.2
rubygem-rest-client-1.6.3
rubygem-systemu-2.2.0
rubygem-treetop-1.4.10
rubygem-treetop-1.4.9
rubygem-uuidtools-2.1.2
rubygem-yajl-ruby-0.8.2
rubygems-1.8.5-1
[1] http://docs.rubygems.org/read/chapter/16
--
Titov Alexander
--
Best regards,
Titov Alexander