Chef Client Run Failures When Installing Packages


Hi,

I am using a private AMI with some packages prebaked into it. These
packages include RVM and Chef. When the instance boots up from the AMI for
the first time, the pre-installed Chef Client registers itself with
our private Chef Server. Depending on what the instance is supposed to be,
the Chef Client pulls down the appropriate cookbooks from the Chef
Server.

We run Ubuntu 14.04 on AWS, and we use both Packer 0.8.5’s remote
shell provisioner and chef-client provisioner to install packages before
baking the AMI.

Sometimes one of the cookbooks that gets pulled down requires the
installation of additional packages. This is the scenario in which I
consistently see fatal failures in my first Chef Client run:

https://gist.github.com/yhuang/d405c72830870c0f8140

In the above example, the following two packages need to be installed:

logentries
logentries-daemon

I have seen other package installations trigger the same error, so I do
not think the error is particular to these two logentries-related packages.

Based on the log, I seem to be hitting an error during the execution of
this statement:

apt-get update -o Dir::Etc::sourcelist='sources.list.d/logentries.list' -o
Dir::Etc::sourceparts='-' -o APT::Get::List-Cleanup='0'

The specific error seems to be:

STDERR: E: Could not get lock /var/lib/dpkg/lock - open (11: Resource
temporarily unavailable)
2015-08-16 21:16:10,755 P1579 [INFO] E: Unable to lock the administration
directory (/var/lib/dpkg/), is another process using it?

Many others seem to have run into similar issues and recommended the
removal of various lock files:

I use Packer to build my images, and right before I bake an image, I always
perform the following steps:

1. remove the lock files;
2. run dpkg --configure -a after the removal of the lock files;
3. reboot.

https://gist.github.com/yhuang/461a94ecffd803c14101#file-gistfile1-txt-L54-L60

All of these steps were suggested in the thread on Ubuntu.
Unfortunately, I still get the same error:

STDERR: E: Could not get lock /var/lib/dpkg/lock - open (11: Resource
temporarily unavailable)
2015-08-16 21:16:10,755 P1579 [INFO] E: Unable to lock the administration
directory (/var/lib/dpkg/), is another process using it?
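
To tell a stale lock file apart from a lock that some process really holds, I could probe the lock directly. The sketch below assumes the lock is an fcntl-style advisory lock on the file, which I have not confirmed against the dpkg source but which would be consistent with errno 11 (EAGAIN) in the message above. lock_is_held is a hypothetical helper name, and opening the real lock file read-write would need root.

```python
import errno
import fcntl
import os


def lock_is_held(path="/var/lib/dpkg/lock"):
    """Return True if another process appears to hold an fcntl lock on path.

    Assumption: the lock is an fcntl-style advisory lock, so the mere
    existence of the file says nothing about whether it is locked.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Non-blocking attempt to take the exclusive lock ourselves.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                return True  # some other process holds the lock right now
            raise
        # We got the lock, so nobody else had it; give it back immediately.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

If this returned True during the first Chef Client run, it would suggest that another process is using dpkg or apt at that moment, and that deleting lock files before baking would not change the outcome.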

What makes this error especially baffling is that if I trigger another
run of the Chef Client on the new instance, the second run consistently
completes without error.

One workaround I have been using is to pre-install whatever missing packages
the Chef Client run requires in my base image; however, I do not feel
this workaround is sustainable in the long run. I really would like to
understand what makes this first Chef Client run different, and why I am
seeing this dpkg lock error every time with my baked AMIs.
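
Since a second run always succeeds, another stopgap I could imagine is retrying the failing command only when its stderr contains the lock message. This is a minimal sketch of that idea, not something I have run on my instances. The name run_with_lock_retry, the attempt count, and the delay are all illustrative assumptions, and it would hide the root cause rather than explain it.

```python
import subprocess
import time

# Substring taken from the error message quoted in this post.
LOCK_ERROR = "Could not get lock /var/lib/dpkg/lock"


def run_with_lock_retry(cmd, attempts=5, delay=10.0,
                        run=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying only while it fails with the dpkg lock error.

    run and sleep are injectable so the retry logic can be exercised
    without touching apt or actually waiting.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        # Stop on success, or on any failure that is not the lock error.
        if result.returncode == 0 or LOCK_ERROR not in result.stderr:
            return result
        if attempt < attempts:
            sleep(delay)
    return result  # still failing with the lock error after all attempts
```

For example, run_with_lock_retry(["apt-get", "update"]) would try up to five times, ten seconds apart, and return the last subprocess result either way.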

Here is my /var/log/chef.log:

https://gist.github.com/yhuang/7c002fa926756c348047

Thank you for your help.

Jimmy