RE: Not detecting service start failure

On top of the script in /etc/init.d that starts the service, try adding the following line

. /etc/init.d/function

Chris

Sent from my iPad using Mail+ for Outlookhttp://taps.io/mailplus

From: Yoshi Spendiff
Sent: 5/11/15, 9:57 PM
To: chef
Subject: [chef] Not detecting service start failure
Hi,

I’m having trouble understanding why Chef isn’t detecting that a service is failing to start. I’m attempting to get Pound (proxy) to work. The package installs fine and I have a service declaration that says the service is started when it’s not. I’m running via chef solo at the moment (it’s a vagrant box) and running a chef-solo run with debug logging produces the following:

[2015-05-12T01:45:50+00:00] DEBUG: service[pound] you have specified a status command, running…
pound is stopped
pound 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Starting Pound: starting…
[FAILED]
Starting Pound: starting…
[FAILED]
Starting Pound: starting…
[FAILED]
Starting Pound: starting…
[FAILED]
Starting Pound: starting…
[FAILED]
[2015-05-12T01:45:52+00:00] INFO: service[pound] started

  • start service service[pound]

This are my recipe declaration:

Install pound and deploy configuration file from template

package ‘Pound’ do
options '–enablerepo=epel’
end

template ‘/etc/pound.cfg’ do
source 'pound.cfg.erb’
mode 0644
end

service ‘pound’ do
supports :restart => false, :reload => false, :status => true
action [:start, :enable]
end

I have tried using the init_command and status_command options, as well as disabling and enabling the supports :status => true option

If I put the above service declaration into a test file and run it with chef-apply then it fails as it should:
[2015-05-12T01:56:28+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: service[pound] ((chef-apply cookbook)::(chef-apply recipe) line 5) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received ‘1’
---- Begin output of /sbin/service pound start ----
STDOUT: Starting Pound: [FAILED]
STDERR: starting…
---- End output of /sbin/service pound start ----
Ran /sbin/service pound start returned 1

The pound init script is set up to run correctly, a non-zero exit code is returned on unsuccessful start of pound.

There are no processes called pound running on the system and there are no other service declarations called pound in all of my cookbooks.

What am I missing?

Cheers


Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.commailto:yoshi.spendiff@indochino.com

The init script already includes /etc/rc.d/init.d/functions, which is the
same thing.

Why is chef-apply working but not Chef solo?

On Mon, May 11, 2015 at 9:12 PM, Fouts, Chris Chris.Fouts@sensus.com
wrote:

On top of the script in /etc/init.d that starts the service, try adding
the following line

. /etc/init.d/function

Chris

Sent from my iPad using Mail+ for Outlook http://taps.io/mailplus

From: Yoshi Spendiff
Sent: 5/11/15, 9:57 PM
To: chef
Subject: [chef] Not detecting service start failure
Hi,

I'm having trouble understanding why Chef isn't detecting that a service
is failing to start. I'm attempting to get Pound (proxy) to work. The
package installs fine and I have a service declaration that says the
service is started when it's not. I'm running via chef solo at the moment
(it's a vagrant box) and running a chef-solo run with debug logging
produces the following:

  • [2015-05-12T01:45:50+00:00] DEBUG: service[pound] you have specified a
    status command, running.. pound is stopped pound 0:off
    1:off 2:on 3:on 4:on 5:on 6:off Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] [2015-05-12T01:45:52+00:00] INFO: service[pound] started - start
    service service[pound]*

This are my recipe declaration:

*# Install pound and deploy configuration file from template package
'Pound' do options '--enablerepo=epel' end template '/etc/pound.cfg' do
source 'pound.cfg.erb' mode 0644 end service 'pound' do supports
:restart => false, :reload => false, :status => true action [:start,
:enable] end *
I have tried using the init_command and status_command options, as well
as disabling and enabling the supports :status => true option

If I put the above service declaration into a test file and run it with *chef-apply
*then it fails as it should:

[2015-05-12T01:56:28+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed:
service[pound] ((chef-apply cookbook)::(chef-apply recipe) line 5) had an
error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with
[0], but received '1' ---- Begin output of /sbin/service pound start ----
STDOUT: Starting Pound: [FAILED] STDERR: starting... ---- End output of
/sbin/service pound start ---- Ran /sbin/service pound start returned 1

The pound init script is set up to run correctly, a non-zero exit code is
returned on unsuccessful start of pound.

There are no processes called pound running on the system and there are
no other service declarations called pound in all of my cookbooks.

What am I missing?

Cheers

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

Same thing happens if I boot up an instance and use chef-client against a
Chef server. Says it's starting, says it supports status, says it's
running, it's not.

chef-apply with the same service declaration works correctly on that
instance as well.

[2015-05-12T15:44:11+00:00] DEBUG: service[pound] supports status,
running[2015-05-12T15:44:13+00:00] INFO: service[pound] started

But running *service pound status *returns an exit code of 3 and there are
no processes called 'pound'

Is Chef actually using the init scripts? I thought if *supports :status =>
true *was set then essentially *service pound status *was run, and if supports
:status => false
then Chef checks for a process name (based on the
declaration name?). In both cases it should fail?

On Tue, May 12, 2015 at 8:13 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

The init script already includes /etc/rc.d/init.d/functions, which is the
same thing.

Why is chef-apply working but not Chef solo?

On Mon, May 11, 2015 at 9:12 PM, Fouts, Chris Chris.Fouts@sensus.com
wrote:

On top of the script in /etc/init.d that starts the service, try adding
the following line

. /etc/init.d/function

Chris

Sent from my iPad using Mail+ for Outlook http://taps.io/mailplus

From: Yoshi Spendiff
Sent: 5/11/15, 9:57 PM
To: chef
Subject: [chef] Not detecting service start failure
Hi,

I'm having trouble understanding why Chef isn't detecting that a service
is failing to start. I'm attempting to get Pound (proxy) to work. The
package installs fine and I have a service declaration that says the
service is started when it's not. I'm running via chef solo at the moment
(it's a vagrant box) and running a chef-solo run with debug logging
produces the following:

  • [2015-05-12T01:45:50+00:00] DEBUG: service[pound] you have specified a
    status command, running.. pound is stopped pound 0:off
    1:off 2:on 3:on 4:on 5:on 6:off Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] [2015-05-12T01:45:52+00:00] INFO: service[pound] started - start
    service service[pound]*

This are my recipe declaration:

*# Install pound and deploy configuration file from template package
'Pound' do options '--enablerepo=epel' end template '/etc/pound.cfg' do
source 'pound.cfg.erb' mode 0644 end service 'pound' do supports
:restart => false, :reload => false, :status => true action [:start,
:enable] end *
I have tried using the init_command and status_command options, as well
as disabling and enabling the supports :status => true option

If I put the above service declaration into a test file and run it with *chef-apply
*then it fails as it should:

[2015-05-12T01:56:28+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed:
service[pound] ((chef-apply cookbook)::(chef-apply recipe) line 5) had an
error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with
[0], but received '1' ---- Begin output of /sbin/service pound start ----
STDOUT: Starting Pound: [FAILED] STDERR: starting... ---- End output of
/sbin/service pound start ---- Ran /sbin/service pound start returned 1

The pound init script is set up to run correctly, a non-zero exit code
is returned on unsuccessful start of pound.

There are no processes called pound running on the system and there are
no other service declarations called pound in all of my cookbooks.

What am I missing?

Cheers

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

This is doing my nut in.

The redhat service provider is used, I can see that from the debug output:

    • service[pound] action start[2015-05-12T16:35:51+00:00] INFO:
      Processing service[pound] action start (indochino-role::web-server line
      94)[2015-05-12T16:35:51+00:00] DEBUG: providers for generic service
      resource enabled on node include: [Chef::Provider::Service::Init,
      Chef::Provider::Service::Redhat,
      Chef::Provider::Service::Upstart][2015-05-12T16:35:51+00:00] DEBUG:
      providers that refused resource service[pound] were:
      [Chef::Provider::Service::Upstart][2015-05-12T16:35:51+00:00] DEBUG:
      providers that support resource service[pound] include:
      [Chef::Provider::Service::Init,
      Chef::Provider::Service::Redhat][2015-05-12T16:35:51+00:00] DEBUG:
      providers that survived replacement include:
      [Chef::Provider::Service::Redhat][2015-05-12T16:35:51+00:00] DEBUG:
      service[pound] you have specified a status command, running..pound
      0:off 1:off 2:on 3:on 4:on 5:on 6:offStarting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF
      1[2015-05-12T16:35:53+00:00] INFO: service[pound] started*

I've tried to follow the stack as best I could, with my admittedly meager
ruby grokking skills, and as far as I can work out an exception should be
raised if the service fails to start.

So from what I can tell:

  1. The redhat provider (Chef::Provider::Service::Init::Redhat) is loaded,
    which sets the following parameter @init_command = "/sbin/service
    #{@new_resource.service_name}"
    , which is correct
  2. The service is loaded using load_current_resource in
    Chef::Provider::Service::Simple. The status is then verified either by a
    custom command if specified or the default using the default_init_script
    from the provider above.
  3. If the condition for starting is met (i.e. the service is not running)
    and the start action is specified then the action_start function in
    Chef::Provider::Service is run, which adds the start_service function to
    convergence using 'converge_by'. The start_service function comes from
    either Chef::Provider::Service::Simple or Chef::Provider::Service::Init
    depending on if a custom start command is specified. Either way the command
    is sent to a shell out.
  4. The start command is execute via the shell_out_with_systems_locale!
    function in Chef::Mixin::ShellOut which calls
    shell_out_with_systems_locale, which in turn calls shell_out_command, which
    in turn creates an instance of Mixlib::ShellOut to run the command
  5. The .error! method of Mixlib::ShellOut is called against the instance
    after the command is run, which should generate a ShellCommandFailed
    exception with the message invalid!("Expected process to exit with
    #{valid_exit_codes.inspect}, but received '#{exitstatus}'") if error?

The last is the message I see if I run the recipe by chef-apply (and what
I'm pretty sure I normally see when a service fails). Adding some hacky
debugging to the chef code on a test instance I can see that
Mixlib::ShellOut picks up the non-zero exit code, confirms it's not a valid
exit code and I can see the error status is set to generate an error if the
.error! method is called, which it is.
I can only assume that whatever in the convergence causes the service start
command to loop 5 times also catches the exception and handles it somehow,
I can't work that bit out.

This is the debugging fwiw (unsure why action_start and start_service
aren't called the first time on the loop):

    • service[pound] action start[2015-05-12T19:35:34+00:00] INFO:
      shell_out_command method called[2015-05-12T19:35:34+00:00] INFO: ShellOut
      exit status was 1. Allowed exit codes are [0][2015-05-12T19:35:34+00:00]
      INFO: shell_out_command error status is true[2015-05-12T19:35:34+00:00]
      INFO: action_start method called[2015-05-12T19:35:34+00:00] INFO:
      start_service method called[2015-05-12T19:35:34+00:00] INFO:
      shell_out_with_systems_locale! method called[2015-05-12T19:35:34+00:00]
      INFO: shell_out_with_systems_locale method
      called[2015-05-12T19:35:34+00:00] INFO: shell_out_command method
      called[2015-05-12T19:35:35+00:00] INFO: ShellOut exit status was 1. Allowed
      exit codes are [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error
      status is true[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale! error status is
      true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:35+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:35+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:36+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:36+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:36+00:00] INFO: start_service method
      called[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:36+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:36+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:36+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:36+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:36+00:00] INFO: service[pound] started -
      start service service[pound]*

This is where I need someone to tell me it's either working as intended and
I'm doing it wrong or it's a bug, because I'm outta talent.

On Tue, May 12, 2015 at 8:52 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

Same thing happens if I boot up an instance and use chef-client against a
Chef server. Says it's starting, says it supports status, says it's
running, it's not.

chef-apply with the same service declaration works correctly on that
instance as well.

[2015-05-12T15:44:11+00:00] DEBUG: service[pound] supports status,
running[2015-05-12T15:44:13+00:00] INFO: service[pound] started

But running *service pound status *returns an exit code of 3 and there
are no processes called 'pound'

Is Chef actually using the init scripts? I thought if *supports :status
=> true *was set then essentially *service pound status *was run, and if supports
:status => false
then Chef checks for a process name (based on the
declaration name?). In both cases it should fail?

On Tue, May 12, 2015 at 8:13 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

The init script already includes /etc/rc.d/init.d/functions, which is the
same thing.

Why is chef-apply working but not Chef solo?

On Mon, May 11, 2015 at 9:12 PM, Fouts, Chris Chris.Fouts@sensus.com
wrote:

On top of the script in /etc/init.d that starts the service, try
adding the following line

. /etc/init.d/function

Chris

Sent from my iPad using Mail+ for Outlook http://taps.io/mailplus

From: Yoshi Spendiff
Sent: 5/11/15, 9:57 PM
To: chef
Subject: [chef] Not detecting service start failure
Hi,

I'm having trouble understanding why Chef isn't detecting that a
service is failing to start. I'm attempting to get Pound (proxy) to work.
The package installs fine and I have a service declaration that says the
service is started when it's not. I'm running via chef solo at the moment
(it's a vagrant box) and running a chef-solo run with debug logging
produces the following:

  • [2015-05-12T01:45:50+00:00] DEBUG: service[pound] you have specified a
    status command, running.. pound is stopped pound 0:off
    1:off 2:on 3:on 4:on 5:on 6:off Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] [2015-05-12T01:45:52+00:00] INFO: service[pound] started - start
    service service[pound]*

This are my recipe declaration:

*# Install pound and deploy configuration file from template package
'Pound' do options '--enablerepo=epel' end template '/etc/pound.cfg' do
source 'pound.cfg.erb' mode 0644 end service 'pound' do supports
:restart => false, :reload => false, :status => true action [:start,
:enable] end *
I have tried using the init_command and status_command options, as well
as disabling and enabling the supports :status => true option

If I put the above service declaration into a test file and run it with *chef-apply
*then it fails as it should:

[2015-05-12T01:56:28+00:00] FATAL:
Mixlib::ShellOut::ShellCommandFailed: service[pound] ((chef-apply
cookbook)::(chef-apply recipe) line 5) had an error:
Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0],
but received '1' ---- Begin output of /sbin/service pound start ----
STDOUT: Starting Pound: [FAILED] STDERR: starting... ---- End output of
/sbin/service pound start ---- Ran /sbin/service pound start returned 1

The pound init script is set up to run correctly, a non-zero exit code
is returned on unsuccessful start of pound.

There are no processes called pound running on the system and there are
no other service declarations called pound in all of my cookbooks.

What am I missing?

Cheers

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

Looks like it's not just the pound service either. I changed my recipe to
remove all Listen entries for the ports.conf file in apache configuration
(so they can be pushed out via vhost), which generates an error when
attempting to start httpd, and httpd failed silently also.

On Tue, May 12, 2015 at 1:01 PM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

This is doing my nut in.

The redhat service provider is used, I can see that from the debug output:

    • service[pound] action start[2015-05-12T16:35:51+00:00] INFO:
      Processing service[pound] action start (indochino-role::web-server line
      94)[2015-05-12T16:35:51+00:00] DEBUG: providers for generic service
      resource enabled on node include: [Chef::Provider::Service::Init,
      Chef::Provider::Service::Redhat,
      Chef::Provider::Service::Upstart][2015-05-12T16:35:51+00:00] DEBUG:
      providers that refused resource service[pound] were:
      [Chef::Provider::Service::Upstart][2015-05-12T16:35:51+00:00] DEBUG:
      providers that support resource service[pound] include:
      [Chef::Provider::Service::Init,
      Chef::Provider::Service::Redhat][2015-05-12T16:35:51+00:00] DEBUG:
      providers that survived replacement include:
      [Chef::Provider::Service::Redhat][2015-05-12T16:35:51+00:00] DEBUG:
      service[pound] you have specified a status command, running..pound
      0:off 1:off 2:on 3:on 4:on 5:on 6:offStarting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF
      1[2015-05-12T16:35:53+00:00] INFO: service[pound] started*

I've tried to follow the stack as best I could, with my admittedly meager
ruby grokking skills, and as far as I can work out an exception should be
raised if the service fails to start.

So from what I can tell:

  1. The redhat provider (Chef::Provider::Service::Init::Redhat) is loaded,
    which sets the following parameter @init_command = "/sbin/service
    #{@new_resource.service_name}"
    , which is correct
  2. The service is loaded using load_current_resource in
    Chef::Provider::Service::Simple. The status is then verified either by a
    custom command if specified or the default using the default_init_script
    from the provider above.
  3. If the condition for starting is met (i.e. the service is not running)
    and the start action is specified then the action_start function in
    Chef::Provider::Service is run, which adds the start_service function to
    convergence using 'converge_by'. The start_service function comes from
    either Chef::Provider::Service::Simple or Chef::Provider::Service::Init
    depending on if a custom start command is specified. Either way the command
    is sent to a shell out.
  4. The start command is execute via the shell_out_with_systems_locale!
    function in Chef::Mixin::ShellOut which calls
    shell_out_with_systems_locale, which in turn calls shell_out_command, which
    in turn creates an instance of Mixlib::ShellOut to run the command
  5. The .error! method of Mixlib::ShellOut is called against the instance
    after the command is run, which should generate a ShellCommandFailed
    exception with the message invalid!("Expected process to exit with
    #{valid_exit_codes.inspect}, but received '#{exitstatus}'") if error?

The last is the message I see if I run the recipe by chef-apply (and what
I'm pretty sure I normally see when a service fails). Adding some hacky
debugging to the chef code on a test instance I can see that
Mixlib::ShellOut picks up the non-zero exit code, confirms it's not a valid
exit code and I can see the error status is set to generate an error if the
.error! method is called, which it is.
I can only assume that whatever in the convergence causes the service
start command to loop 5 times also catches the exception and handles it
somehow, I can't work that bit out.

This is the debugging fwiw (unsure why action_start and start_service
aren't called the first time on the loop):

    • service[pound] action start[2015-05-12T19:35:34+00:00] INFO:
      shell_out_command method called[2015-05-12T19:35:34+00:00] INFO: ShellOut
      exit status was 1. Allowed exit codes are [0][2015-05-12T19:35:34+00:00]
      INFO: shell_out_command error status is true[2015-05-12T19:35:34+00:00]
      INFO: action_start method called[2015-05-12T19:35:34+00:00] INFO:
      start_service method called[2015-05-12T19:35:34+00:00] INFO:
      shell_out_with_systems_locale! method called[2015-05-12T19:35:34+00:00]
      INFO: shell_out_with_systems_locale method
      called[2015-05-12T19:35:34+00:00] INFO: shell_out_command method
      called[2015-05-12T19:35:35+00:00] INFO: ShellOut exit status was 1. Allowed
      exit codes are [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error
      status is true[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale! error status is
      true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:35+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:35+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:36+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:36+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:36+00:00] INFO: start_service method
      called[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:36+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:36+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:36+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:36+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:36+00:00] INFO: service[pound] started -
      start service service[pound]*

This is where I need someone to tell me it's either working as intended
and I'm doing it wrong or it's a bug, because I'm outta talent.

On Tue, May 12, 2015 at 8:52 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

Same thing happens if I boot up an instance and use chef-client against a
Chef server. Says it's starting, says it supports status, says it's
running, it's not.

chef-apply with the same service declaration works correctly on that
instance as well.

[2015-05-12T15:44:11+00:00] DEBUG: service[pound] supports status,
running[2015-05-12T15:44:13+00:00] INFO: service[pound] started

But running *service pound status *returns an exit code of 3 and there
are no processes called 'pound'

Is Chef actually using the init scripts? I thought if *supports :status
=> true *was set then essentially *service pound status *was run, and if supports
:status => false
then Chef checks for a process name (based on the
declaration name?). In both cases it should fail?

On Tue, May 12, 2015 at 8:13 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

The init script already includes /etc/rc.d/init.d/functions, which is
the same thing.

Why is chef-apply working but not Chef solo?

On Mon, May 11, 2015 at 9:12 PM, Fouts, Chris Chris.Fouts@sensus.com
wrote:

On top of the script in /etc/init.d that starts the service, try
adding the following line

. /etc/init.d/function

Chris

Sent from my iPad using Mail+ for Outlook http://taps.io/mailplus

From: Yoshi Spendiff
Sent: 5/11/15, 9:57 PM
To: chef
Subject: [chef] Not detecting service start failure
Hi,

I'm having trouble understanding why Chef isn't detecting that a
service is failing to start. I'm attempting to get Pound (proxy) to work.
The package installs fine and I have a service declaration that says the
service is started when it's not. I'm running via chef solo at the moment
(it's a vagrant box) and running a chef-solo run with debug logging
produces the following:

  • [2015-05-12T01:45:50+00:00] DEBUG: service[pound] you have specified
    a status command, running.. pound is stopped pound 0:off
    1:off 2:on 3:on 4:on 5:on 6:off Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] [2015-05-12T01:45:52+00:00] INFO: service[pound] started - start
    service service[pound]*

This are my recipe declaration:

*# Install pound and deploy configuration file from template package
'Pound' do options '--enablerepo=epel' end template '/etc/pound.cfg' do
source 'pound.cfg.erb' mode 0644 end service 'pound' do supports
:restart => false, :reload => false, :status => true action [:start,
:enable] end *
I have tried using the init_command and status_command options, as
well as disabling and enabling the supports :status => true option

If I put the above service declaration into a test file and run it
with *chef-apply *then it fails as it should:

[2015-05-12T01:56:28+00:00] FATAL:
Mixlib::ShellOut::ShellCommandFailed: service[pound] ((chef-apply
cookbook)::(chef-apply recipe) line 5) had an error:
Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0],
but received '1' ---- Begin output of /sbin/service pound start ----
STDOUT: Starting Pound: [FAILED] STDERR: starting... ---- End output of
/sbin/service pound start ---- Ran /sbin/service pound start returned 1

The pound init script is set up to run correctly, a non-zero exit code
is returned on unsuccessful start of pound.

There are no processes called pound running on the system and there
are no other service declarations called pound in all of my cookbooks.

What am I missing?

Cheers

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

Happens in chef client 12.3 and 12.0.3

Is there a global ignore_failure true that can be/is set that could be on
accidentally?

Has this behaviour changed in Chef 12+?

On Tue, May 12, 2015 at 1:25 PM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

Looks like it's not just the pound service either. I changed my recipe to
remove all Listen entries for the ports.conf file in apache configuration
(so they can be pushed out via vhost), which generates an error when
attempting to start httpd, and httpd failed silently also.

On Tue, May 12, 2015 at 1:01 PM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

This is doing my nut in.

The redhat service provider is used, I can see that from the debug output:

    • service[pound] action start[2015-05-12T16:35:51+00:00] INFO:
      Processing service[pound] action start (indochino-role::web-server line
      94)[2015-05-12T16:35:51+00:00] DEBUG: providers for generic service
      resource enabled on node include: [Chef::Provider::Service::Init,
      Chef::Provider::Service::Redhat,
      Chef::Provider::Service::Upstart][2015-05-12T16:35:51+00:00] DEBUG:
      providers that refused resource service[pound] were:
      [Chef::Provider::Service::Upstart][2015-05-12T16:35:51+00:00] DEBUG:
      providers that support resource service[pound] include:
      [Chef::Provider::Service::Init,
      Chef::Provider::Service::Redhat][2015-05-12T16:35:51+00:00] DEBUG:
      providers that survived replacement include:
      [Chef::Provider::Service::Redhat][2015-05-12T16:35:51+00:00] DEBUG:
      service[pound] you have specified a status command, running..pound
      0:off 1:off 2:on 3:on 4:on 5:on 6:offStarting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF 1Starting Pound:
      starting...[FAILED]POUND INIT COMMAND EXITED WITH CODE OF
      1[2015-05-12T16:35:53+00:00] INFO: service[pound] started*

I've tried to follow the stack as best I could, with my admittedly meager
ruby grokking skills, and as far as I can work out an exception should be
raised if the service fails to start.

So from what I can tell:

  1. The redhat provider (Chef::Provider::Service::Init::Redhat) is loaded,
    which sets the following parameter @init_command = "/sbin/service
    #{@new_resource.service_name}"
    , which is correct
  2. The service is loaded using load_current_resource in
    Chef::Provider::Service::Simple. The status is then verified either by a
    custom command if specified or the default using the default_init_script
    from the provider above.
  3. If the condition for starting is met (i.e. the service is not running)
    and the start action is specified then the action_start function in
    Chef::Provider::Service is run, which adds the start_service function to
    convergence using 'converge_by'. The start_service function comes from
    either Chef::Provider::Service::Simple or Chef::Provider::Service::Init
    depending on if a custom start command is specified. Either way the command
    is sent to a shell out.
  4. The start command is execute via the shell_out_with_systems_locale!
    function in Chef::Mixin::ShellOut which calls
    shell_out_with_systems_locale, which in turn calls shell_out_command, which
    in turn creates an instance of Mixlib::ShellOut to run the command
  5. The .error! method of Mixlib::ShellOut is called against the instance
    after the command is run, which should generate a ShellCommandFailed
    exception with the message invalid!("Expected process to exit with
    #{valid_exit_codes.inspect}, but received '#{exitstatus}'") if error?

The last is the message I see if I run the recipe by chef-apply (and what
I'm pretty sure I normally see when a service fails). Adding some hacky
debugging to the chef code on a test instance I can see that
Mixlib::ShellOut picks up the non-zero exit code, confirms it's not a valid
exit code and I can see the error status is set to generate an error if the
.error! method is called, which it is.
I can only assume that whatever in the convergence causes the service
start command to loop 5 times also catches the exception and handles it
somehow, I can't work that bit out.

This is the debugging fwiw (unsure why action_start and start_service
aren't called the first time on the loop):

    • service[pound] action start[2015-05-12T19:35:34+00:00] INFO:
      shell_out_command method called[2015-05-12T19:35:34+00:00] INFO: ShellOut
      exit status was 1. Allowed exit codes are [0][2015-05-12T19:35:34+00:00]
      INFO: shell_out_command error status is true[2015-05-12T19:35:34+00:00]
      INFO: action_start method called[2015-05-12T19:35:34+00:00] INFO:
      start_service method called[2015-05-12T19:35:34+00:00] INFO:
      shell_out_with_systems_locale! method called[2015-05-12T19:35:34+00:00]
      INFO: shell_out_with_systems_locale method
      called[2015-05-12T19:35:34+00:00] INFO: shell_out_command method
      called[2015-05-12T19:35:35+00:00] INFO: ShellOut exit status was 1. Allowed
      exit codes are [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error
      status is true[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale! error status is
      true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:35+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:35+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:35+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:35+00:00] INFO: start_service method
      called[2015-05-12T19:35:35+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:35+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:35+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:36+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:36+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:36+00:00] INFO: start_service method
      called[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale!
      method called[2015-05-12T19:35:36+00:00] INFO:
      shell_out_with_systems_locale method called[2015-05-12T19:35:36+00:00]
      INFO: shell_out_command method called[2015-05-12T19:35:36+00:00] INFO:
      ShellOut exit status was 1. Allowed exit codes are
      [0][2015-05-12T19:35:36+00:00] INFO: shell_out_command error status is
      true[2015-05-12T19:35:36+00:00] INFO: shell_out_with_systems_locale! error
      status is true[2015-05-12T19:35:36+00:00] INFO: service[pound] started -
      start service service[pound]*

This is where I need someone to tell me it's either working as intended
and I'm doing it wrong or it's a bug, because I'm outta talent.

On Tue, May 12, 2015 at 8:52 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

Same thing happens if I boot up an instance and use chef-client against
a Chef server. Says it's starting, says it supports status, says it's
running, it's not.

chef-apply with the same service declaration works correctly on that
instance as well.

[2015-05-12T15:44:11+00:00] DEBUG: service[pound] supports status,
running[2015-05-12T15:44:13+00:00] INFO: service[pound] started

But running *service pound status *returns an exit code of 3 and there
are no processes called 'pound'

Is Chef actually using the init scripts? I thought if *supports :status
=> true *was set then essentially *service pound status *was run, and
if supports :status => false then Chef checks for a process name
(based on the declaration name?). In both cases it should fail?

On Tue, May 12, 2015 at 8:13 AM, Yoshi Spendiff <
yoshi.spendiff@indochino.com> wrote:

The init script already includes /etc/rc.d/init.d/functions, which is
the same thing.

Why is chef-apply working but not Chef solo?

On Mon, May 11, 2015 at 9:12 PM, Fouts, Chris Chris.Fouts@sensus.com
wrote:

On top of the script in /etc/init.d that starts the service, try
adding the following line

. /etc/init.d/function

Chris

Sent from my iPad using Mail+ for Outlook http://taps.io/mailplus

From: Yoshi Spendiff
Sent: 5/11/15, 9:57 PM
To: chef
Subject: [chef] Not detecting service start failure
Hi,

I'm having trouble understanding why Chef isn't detecting that a
service is failing to start. I'm attempting to get Pound (proxy) to work.
The package installs fine and I have a service declaration that says the
service is started when it's not. I'm running via chef solo at the moment
(it's a vagrant box) and running a chef-solo run with debug logging
produces the following:

  • [2015-05-12T01:45:50+00:00] DEBUG: service[pound] you have specified
    a status command, running.. pound is stopped pound 0:off
    1:off 2:on 3:on 4:on 5:on 6:off Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] Starting Pound: starting... [FAILED] Starting Pound: starting...
    [FAILED] [2015-05-12T01:45:52+00:00] INFO: service[pound] started - start
    service service[pound]*

This are my recipe declaration:

*# Install pound and deploy configuration file from template package
'Pound' do options '--enablerepo=epel' end template '/etc/pound.cfg' do
source 'pound.cfg.erb' mode 0644 end service 'pound' do supports
:restart => false, :reload => false, :status => true action [:start,
:enable] end *
I have tried using the init_command and status_command options, as
well as disabling and enabling the supports :status => true option

If I put the above service declaration into a test file and run it
with *chef-apply *then it fails as it should:

[2015-05-12T01:56:28+00:00] FATAL:
Mixlib::ShellOut::ShellCommandFailed: service[pound] ((chef-apply
cookbook)::(chef-apply recipe) line 5) had an error:
Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0],
but received '1' ---- Begin output of /sbin/service pound start ----
STDOUT: Starting Pound: [FAILED] STDERR: starting... ---- End output of
/sbin/service pound start ---- Ran /sbin/service pound start returned 1

The pound init script is set up to run correctly, a non-zero exit
code is returned on unsuccessful start of pound.

There are no processes called pound running on the system and there
are no other service declarations called pound in all of my cookbooks.

What am I missing?

Cheers

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com

--
Yoshi Spendiff
Ops Engineer
Indochino
Mobile: +1 778 952 2025
Email: yoshi.spendiff@indochino.com