Chef servers crashing w/ kernel panics? This may help

My chef-controlled EC2 instances have been crashing pretty routinely, and I finally got a log of the kernel panic from one of them. It turns out to be this issue: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/999755

As you can see in the link, it’s caused by a bad interaction among Ruby 1.9.3 (which is bundled w/ recent Omnibus installations), ohai, and fairly recent Linux kernels.

So if you’re getting kernel panics on your chef-controlled Linux boxes, make sure your kernel is patched against this bug. For Ubuntu, that means kernel package version 3.2.0-29.46 or later (3.2 series), or 3.0.0-24.40 or later (3.0 series).
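If you want to check a bunch of boxes against the version thresholds above, something like this rough Python sketch could do the comparison. The function names (`parse_version`, `is_patched`) are made up for illustration, and it only knows about the two kernel series mentioned in this post; anything else returns None so you go read the bug report instead of trusting a guess.

```python
import re

# Minimum fixed versions quoted in the post, keyed by kernel series.
# Any other series is NOT covered here (assumption: check the bug report).
FIXED = {
    (3, 2, 0): (29, 46),
    (3, 0, 0): (24, 40),
}

def parse_version(text):
    """Find a version like '3.2.0-29.46' anywhere in text.

    Returns ((3, 2, 0), (29, 46)), or None if nothing matches.
    """
    m = re.search(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)", text)
    if m is None:
        return None
    nums = tuple(int(g) for g in m.groups())
    return nums[:3], nums[3:]

def is_patched(text):
    """True/False for the two series named in the post, None otherwise."""
    parsed = parse_version(text)
    if parsed is None:
        return None
    series, build = parsed
    minimum = FIXED.get(series)
    if minimum is None:
        return None  # series not covered by the post
    # Tuple comparison: (29, 46) >= (29, 46), (27, 43) < (29, 46), etc.
    return build >= minimum
```

Note that `uname -r` only prints something like 3.2.0-29-generic, without the second number. On Ubuntu the full version string is usually in /proc/version_signature (worth verifying on your own image), and you can pass that line straight to `is_patched`.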

On EC2 I needed to run apt-get dist-upgrade to pull down the “held back” kernel packages and then reboot. I’m now running a patched kernel, so hopefully the problem will go away.

I hope this helps someone else, as it was driving me a little bit nuts. Happy cooking!

Wes

Thanks for the tips.

(In reply to Wes Morgan's message of Sat, Oct 6, 2012 at 2:36 AM.)