CHEF-3506 - Don't save the node object when using an override run list?

https://tickets.opscode.com/browse/CHEF-3506

This ticket proposes not saving the recipes and roles attributes when
using an override run list. Dan wondered if we should save the node
object at all. I can see being surprised in both situations: when your
node object changes on the next normal client run (or gives non-normal
search results in between) and when your node object doesn’t change on
an override run, but when you’re using override you’re heading down
the path of someone who just asked for things to be a little sideways.

Anyone have a use case that would be upset by the node object being
not saved at the end of an override run?


Bryan McLellan | opscode | technical program manager, open source
(c) 206.607.7108 | (t) @btmspox | (b) http://blog.loftninjas.org

I know the following doesn't directly answer the question at hand, but
being able to override the run_list has always seemed like an anti-feature
to me. It breaks a central tenet of configuration management in that you
should be able to reproduce your infrastructure from the committed state of
your source repository. In essence, there is nothing separating a for-loop
around ssh and an override run other than the DSL. You've made a
non-reproducible snowflake all the same, or you've re-introduced run books.
Both poor outcomes.

Whilst I am certainly not immune to practicality, a lot of the
justifications for override run_lists I've attempted to make ("I want to do
this one thing on this one box once to test", etc.) are simply excuses
being made for poor process and an incomplete ability to test changes
before they are applied against production.

So, whilst not perhaps the answer that is wanted, my reply would be "none
of the above, remove the ability to override". It smacks of enabling a
rewrite of bash into the Chef DSL whilst missing the real point behind what
we're all trying to do with configuration management.

Sam Pointer
Lead Consultant
www.opsunit.com


Wouldn’t a change to the current behavior break bootstrapping? I believe it relies on being able to override the run_list, where the overridden run_list is responsible for the initial setup of the node.

Kevin Keane

The NetTech

760-721-8339

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html


On Tue, Sep 17, 2013 at 7:13 AM, Kevin Keane Subscription
subscription@kkeane.com wrote:

> Wouldn't a change to the current behavior break bootstrapping? I believe it
> relies on being able to override the run_list, where the overridden run_list
> is responsible for the initial setup of the node.

The current bootstrap procedure writes a run_list to first-boot.json
and then loads it with "chef-client -j" -- not quite the same thing as
"-o".

- Julian
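
The distinction between "-j" and "-o" drawn above can be sketched in a few lines of Ruby. This is only an illustration: the run list contents are hypothetical, and the /etc/chef/first-boot.json path is assumed to be where the bootstrap writes the file.

```ruby
require "json"

# Hypothetical run list for a freshly bootstrapped node.
FIRST_BOOT = { "run_list" => ["role[base]", "recipe[ntp]"] }.freeze

# Bootstrap style: the run list is written to a JSON file and read with -j,
# so it is applied as the node's own run list.
FIRST_BOOT_JSON = JSON.pretty_generate(FIRST_BOOT)
BOOTSTRAP_CMD = "chef-client -j /etc/chef/first-boot.json"

# Override style: the run list is passed with -o and is meant for this run only.
OVERRIDE_CMD = "chef-client -o 'recipe[ntp]'"

puts FIRST_BOOT_JSON
puts BOOTSTRAP_CMD
puts OVERRIDE_CMD
```

Under these assumptions, changing what happens at the end of a "-o" run would leave the "-j" bootstrap path untouched.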

Sam,

While I absolutely see your point, I disagree that removing the feature is
the right path.

This may not be a best practice, but I employ a pattern where for every
recipe I write, I write an "uninstall" recipe in parallel that sort of
unwinds the install. In cases where we are running physical boxes, having
the ability to use Chef to uninstall something is great, and not something
I want to muck around with node run_lists to do - hence, having the
override run_list feature really is helpful. Being able to kick off that
uninstall recipe in cases where the preferred "kill off the VM and
recreate" path is not available and without having to jump through hoops is
really awesome.

I don't necessarily see any value in saving the run_list back to the node
object (or any part of the node object, actually), so I'm in the "don't
bother saving in an override situation" crowd.

~Adam


I will throw my two pence into the fire.

I don't think we should save the override run_list back to the node.

Here's why. I have some level 1 techs that do some fairly basic
troubleshooting. They work a 24/7 shift and have administrative access to
our 1k+ nodes. There are times when we want to roll out some
changes/upgrades/packages late at night for various reasons, and want to
provide them a one-liner, something like
chef-client -o 'recipe[dosomething::awesome]', that they and we can be
confident will be idempotent and won't break because of xyz. But in our
testing, with the run_list getting saved back to the node, we weren't
confident that this was our solution, as we didn't want to disrupt node data
that may or may not affect other nodes.

So I will toss in a vote for "don't save the run_list back to the node in
an override run_list situation".


--
Elvin Abordo
Mobile: (845) 475-8744

On Tuesday, September 17, 2013 at 6:57 AM, Elvin Abordo wrote:

> I will throw my two pence into the fire.
>
> I don't think we should save the override run_list back to the node.

The current behavior is that the original run_list gets set back on the node before Chef saves it at the end of the run. There is a bug where manually saving the node during a recipe will overwrite the run_list, and we've accepted a patch for that.

What we're asking here is whether all node saves should be skipped when using an override run_list. The reasoning is that node attributes that other recipes rely on for search may get removed by running Chef with an override run_list. While we can special-case a few important ones, we cannot know in advance all of the attributes that are important to your particular cookbooks, so in any case, saving the node at the end of the run could lead to incorrect configuration elsewhere.

--
Daniel DeLeo
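
The two behaviors described above can be compared with a toy model in plain Ruby. Everything here is a simplification for illustration: ToyNode, finish_run, fresh_server, override_node and the attribute names are hypothetical and are not Chef's real classes or API. The sketch also assumes that an override run only rebuilds the attributes of the recipes it actually ran.

```ruby
# Toy model only: none of these names come from Chef itself.
ToyNode = Struct.new(:run_list, :attributes)

# `server` is a Hash standing in for the Chef server's stored node data.
def finish_run(server, name, node, original_run_list, override:, skip_save:)
  # Proposed behavior: an override run never writes the node back.
  return :skipped if override && skip_save

  # Current behavior: restore the original run_list, then save everything.
  node.run_list = original_run_list.dup if override
  server[name] = { run_list: node.run_list, attributes: node.attributes }
  :saved
end

# Stored state before the override run: both cookbooks have published attributes.
def fresh_server
  { "web1" => { run_list: ["recipe[ntp]", "recipe[haproxy]"],
                attributes: { "ntp" => { "servers" => ["0.pool.ntp.org"] },
                              "haproxy" => { "port" => 80 } } } }
end

# Node state at the end of an override run of recipe[ntp] alone.
def override_node
  ToyNode.new(["recipe[ntp]"], { "ntp" => { "servers" => ["0.pool.ntp.org"] } })
end

original = ["recipe[ntp]", "recipe[haproxy]"]

saving = fresh_server
finish_run(saving, "web1", override_node, original, override: true, skip_save: false)

skipping = fresh_server
finish_run(skipping, "web1", override_node, original, override: true, skip_save: true)

# Saving restores the run_list but drops the haproxy attributes others may search on.
puts saving["web1"][:attributes].key?("haproxy")
# Skipping the save leaves the stored node exactly as it was.
puts skipping["web1"][:attributes].key?("haproxy")
```

In this model the restored run_list is not the problem; the lost attributes are, which is why the question is about skipping the save entirely.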

On Tue, Sep 17, 2013 at 6:49 AM, Sam Pointer sam.pointer@opsunit.com wrote:

> So, whilst not perhaps the answer that is wanted, my reply would be "none of
> the above, remove the ability to override". It smacks of enabling a rewrite
> of bash into the Chef DSL whilst missing the real point behind what we're
> all trying to do with configuration management.

Chef often hands you a big hammer and asks that you handle it with
respect. One of the fundamentals of Chef is that it doesn't force you
into configuring your systems using a fixed model. We, the Chef
developers, don't know your infrastructure and your problems. As time
goes on and those problems become more defined, we extend Chef to be
able to solve them more directly, as best we can.

A number of use cases came up in the discussion around CHEF-2988 [1],
especially on the mailing list thread. [2] Some may be solved in the
future by Push [3] or failure zones [4]. Even so, in the interim
override run lists are helping folks get work done.

--
Bryan McLellan | opscode | technical program manager, open source
(c) 206.607.7108 | (t) @btmspox | (b) http://blog.loftninjas.org

[1] https://tickets.opscode.com/browse/CHEF-2988
[2] chef-dev mailing list thread, "Re: CHEF-2988 allowed_recipes, restricted_recipes, and override_recipes"
[3] http://www.youtube.com/watch?v=yHub6E4DNvg
[4] https://tickets.opscode.com/browse/CHEF-2070

Let's ask this question:

What is the general use case for using an override run_list?
And during such a run, would you expect that the run_list items be
available for any other purpose in your ecosystem, not just this node
during this run?

These are open questions, and I think my answers are:
I barely ever use the override, and would expect it to only be "transient"
state - if I wanted the stuff to be around for longer, I'd make it
persistent.

-M


We use the override run_list feature and take advantage of the fact that
the run_list is left unsaved only when the node already has a non-empty one.
We run with an override run_list when a node is discovered, or after a (long)
installation process, to quickly make sure that the chef-client service is
running.

The nice feature is that if the discovered node is "new" and has no
run_list, the data is saved and the run_list is permanently set. If the node
is known and has a run_list, we don't want to touch anything except to make
sure the bare minimum is set up (chef-client running).

I don't know if I'm being clear, but we don't mind the node not being saved
as long as it already has a run_list and node data. We want it saved if it's
new.
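
The rule described in this message can be written down as a small predicate. This is a sketch of the policy only; save_after_override? is a hypothetical helper and not something Chef provides.

```ruby
# Hypothetical policy helper: decide whether an override run should write the
# node back, based on the run_list the server already holds for that node.
def save_after_override?(stored_run_list)
  stored_run_list.nil? || stored_run_list.empty?
end

# Newly discovered node with no run_list yet: save it.
puts save_after_override?([])
# Known node that already has a run_list: leave it untouched.
puts save_after_override?(["role[base]"])
```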

I'm torn here. I make EXTENSIVE use of overriding the run_list for one-shot
tasks in our installer, things like "recipe[foo::enable]" or
"recipe[foo::disable]", but we do this all with chef-solo.
I'm generally of the opinion that if I want to save any critical state
back, I'd call node.save. I was burned by the
bootstrap-failure-not-saving-back change a long time ago and I have scars.
Ugly nasty scars.

The things I generally use an override run_list for are things that I know
have no lasting side effects. I don't intend it to blow away the state of
the node. Like a one-shot.

Then again, I'm pretty diligent about ensuring that the things I use it for
don't have side effects like that.


I use a simple chef-client -o 'recipe[ntp]' or a similar non-intrusive operation when switching nodes from one Chef server to another, especially clustered servers, to make the node information available to the monitoring systems and cluster configuration tools. That allows me to do the switchover outside the standard maintenance window, when other operations, such as planned and enabled configuration changes awaiting the scheduled maintenance window, would be unwelcome.



OTOH, properly constructed CM should be idempotent and orthogonal. If I
need to do a software deployment, I don't generally need to touch ntp or
smtp or anything outside of the cookbook that deploys the software. In
fact if I --why-run a software deployment I don't expect to see anything
else change (and non-idempotent stuff that changes on every run I expect
to be convergent, so that the software deployment doesn't really depend
on that). I should be able to roll out software more quickly by using an
override run list to only run the software deployment cookbook, and the
rest of the state of the server should be orthogonal and idempotent. If
this does not work, then you've architected some kind of a mess, and
should really refactor your infrastructure. Since it works, there's no
good reason to not utilize it.


A bit late to this party, but our team is definitely in favor of removing the save when using an override. From the discussion on this list it seems nearly unanimous.
