Chef 10.14.0.rc.1 fails on disabling non-existent services


#1

No one has commented on an issue I submitted a couple of weeks ago, so I
thought I would ask about it on the list. Here is the issue:
http://tickets.opscode.com/browse/CHEF-3380.
I’m wondering if it’s expected new behavior with the new release of Chef?
I’ll have to refactor quite a bit of Chef code if it’s expected behavior
and will be staying.


John Alberts


#2

Definitely a bug and probably an easy fix. I’ll make sure it gets attention.


Daniel DeLeo



#3

That’s great to hear. Thanks


John Alberts


#4

On Wed, Aug 29, 2012 at 10:26 PM, Daniel DeLeo dan@kallistec.com wrote:

Definitely a bug and probably an easy fix. I’ll make sure it gets
attention.

I commented on the ticket—I strongly believe the current behavior is
correct. Here it is for convenience:

There have been questions before, people wondering why Chef doesn’t
create an init script—and the general consensus was that the service
resource should assume everything is set up correctly, including an
init script.

Following that line of reasoning, if I ask Chef to stop a service and
the init script is missing, Chef won’t be able to stop it—and that’s a
fatal error if there ever was one.
The stop action cannot assume the recipe is also calling disable—maybe
the recipe is stopping it to apply some changes and then start it
again. If the service doesn’t stop, all bets are off; subsequent
actions may corrupt data and so on.

The disable action is different.
On most distros it doesn’t depend on the init script at all, in which
case I agree it can ignore the absence of the init script.
The bottom line is, the service shouldn’t swallow exceptions.

I think that’s already the case, so here’s my vote for rejecting this ticket.

(Note that I’m not arguing the implementation is perfect—if a
distro-specific init system doesn’t need the init script to disable
the service yet it fails, then the distro-specific provider should be
fixed. I’m only arguing this shouldn’t leak into the generic provider)
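
To make the distinction above concrete, here is a minimal Ruby sketch of the semantics being argued for: stop fails loudly when the init script is missing, while disable treats a missing script as nothing to do. This is not Chef's provider code. The class SketchService, the error MissingInitScript, and the idea of checking a single init directory are all assumptions made up for illustration.

```ruby
require "tmpdir"

# Hypothetical error type, not part of Chef.
class MissingInitScript < StandardError; end

# Hypothetical stand-in for a service provider. It only models the policy
# discussed above and does not run any init system commands.
class SketchService
  attr_reader :name, :log

  def initialize(name, init_dir)
    @name = name
    @init_dir = init_dir
    @log = []
  end

  def init_script
    File.join(@init_dir, @name)
  end

  # Stopping needs the init script, so a missing script is a fatal error.
  def stop
    unless File.exist?(init_script)
      raise MissingInitScript, "#{init_script} not found, cannot stop #{@name}"
    end
    @log << "stopped #{@name}"
  end

  # Disabling a service that has no init script is treated as already done.
  def disable
    if File.exist?(init_script)
      @log << "disabled #{@name}"
    else
      @log << "#{@name} already disabled - nothing to do"
    end
  end
end

# Usage: an empty temporary directory plays the role of the init directory.
Dir.mktmpdir do |init_dir|
  svc = SketchService.new("telnetd", init_dir)
  svc.disable
  begin
    svc.stop
  rescue MissingInitScript => e
    puts "stop failed: #{e.message}"
  end
  puts svc.log
end
```

The point of the sketch is only that the two actions can reasonably differ: the error is raised by the action that truly cannot proceed, and nothing is swallowed elsewhere.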


#5

On Wednesday, August 29, 2012 at 11:36 PM, Andrea Campi wrote:

[…snip…]

From the error message, it looks like it’s failing on action ‘disable’, so it looks like you and John agree on what the behavior should be in this case.

We’ll have to see what the behavior is for the ‘stop’ action in this case. Your argument about the desired behavior for that case is convincing; however, we don’t intend to introduce breaking changes in this release, so we’ll have to make the behavior for 10.14 match that of 10.12. If we want to make a change, it could go into master, which will become Chef 11.0.0.


Daniel DeLeo


#6

On 8/30/12 9:00 AM, Daniel DeLeo wrote:

[…snip…]

The behavior is currently (pre-10.14) that both :stop and :disable
ignore the non-existence of init scripts on Ubuntu 10.04. Here is the
output with -l debug, with a bunch of ps output suppressed:

This is with only action :disable

[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: Processing service[lkdjsflksjd] on opscode-ubuntu-10-04.opscode.us
[Thu, 30 Aug 2012 18:46:27 +0000] INFO: Processing service[lkdjsflksjd] action disable (test::default line 2)
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[lkdjsflksjd] falling back to process table inspection
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[lkdjsflksjd] attempting to match 'lkdjsflksjd' (/lkdjsflksjd/) against process list
UID PID PPID C STIME TTY TIME CMD
[…snip…]
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[lkdjsflksjd] running: false
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[lkdjsflksjd] already disabled - nothing to do

This is with only action :stop

[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: Processing service[alsdkjfalkdsjfla] on opscode-ubuntu-10-04.opscode.us
[Thu, 30 Aug 2012 18:46:27 +0000] INFO: Processing service[alsdkjfalkdsjfla] action stop (test::default line 6)
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[alsdkjfalkdsjfla] falling back to process table inspection
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[alsdkjfalkdsjfla] attempting to match 'alsdkjfalkdsjfla' (/alsdkjfalkdsjfla/) against process list
UID PID PPID C STIME TTY TIME CMD
[…snip…]
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[alsdkjfalkdsjfla] running: false
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[alsdkjfalkdsjfla] already stopped - nothing to do

This is with action [:stop, :disable]

[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: Processing service[dfglkj3b23n] on opscode-ubuntu-10-04.opscode.us
[Thu, 30 Aug 2012 18:46:27 +0000] INFO: Processing service[dfglkj3b23n] action stop (test::default line 10)
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] falling back to process table inspection
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] attempting to match 'dfglkj3b23n' (/dfglkj3b23n/) against process list
UID PID PPID C STIME TTY TIME CMD
[…snip…]
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] running: false
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] already stopped - nothing to do
[Thu, 30 Aug 2012 18:46:27 +0000] INFO: Processing service[dfglkj3b23n] action disable (test::default line 10)
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] falling back to process table inspection
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] attempting to match 'dfglkj3b23n' (/dfglkj3b23n/) against process list
UID PID PPID C STIME TTY TIME CMD
[…snip…]
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] running: false
[Thu, 30 Aug 2012 18:46:27 +0000] DEBUG: service[dfglkj3b23n] already disabled - nothing to do

I know that at my previous job I left behind a lot of Chef code that
assumed the last case behaves exactly like that, and it will break
with the new behavior. I rambled at length in the ticket about why I
disagree and believe the current (pre-10.14) behavior is more desirable.
Generally, what I’m trying to write is policy language along the lines of
"please make sure telnetd is never running anywhere", and if the init
script (or upstart, systemd, or xinetd config) simply doesn’t exist,
then I’m fine with that. On some systems the service is named telnetd
instead of telnet, on some it’s run from inetd instead of upstart or
sysv, and some may not have the package installed at all. Throwing
exceptions and making me deal with all of that just forces me to think
about edge cases that I don’t care about to get my job done. Once you
have thousands or tens of thousands of servers to manage, this just
becomes abusive ‘shoulds’: at that scale, while you ‘should’ know the
starting state of all your systems, the reality is that you don’t, not
completely anyway.
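
For recipes that need the "make sure it is never running, and ignore it if it is not there" policy regardless of how the provider ends up behaving, one possible approach is to guard the resource with only_if. The sketch below is an assumption-laden illustration, not something proposed in the ticket: the candidate names telnetd and telnet, the SysV path /etc/init.d, and the shim at the top are all hypothetical. The shim exists only so the snippet runs as plain Ruby; in a real recipe, service, action and only_if come from Chef. Whether the guard is evaluated before the check that fails in 10.14.0.rc.1 has not been verified here.

```ruby
# Shim so this sketch runs outside chef-client. It only records what was
# declared. In a real recipe, drop everything above the "recipe body" marker.
class ShimService
  attr_reader :name, :actions, :guard

  # Every resource declared through the shim is recorded here.
  def self.declared
    @declared ||= []
  end

  def initialize(name)
    @name = name
    @actions = []
    @guard = nil
  end

  def action(value)
    @actions = Array(value)
  end

  def only_if(&block)
    @guard = block
  end
end

def service(name, &block)
  resource = ShimService.new(name)
  resource.instance_eval(&block)
  ShimService.declared << resource
  resource
end

# --- recipe body ---
# Only declare stop/disable work for names that have an init script.
# The names and the /etc/init.d path are assumptions for illustration.
%w[telnetd telnet].each do |svc|
  service svc do
    action [:stop, :disable]
    only_if { ::File.exist?("/etc/init.d/#{svc}") }
  end
end
```

The cost of this approach is exactly the one complained about above: the recipe has to know where each init system keeps its scripts, which is the edge-case knowledge the policy author wanted to avoid.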