Ad-hoc tasks and ad-hoc sources


#1

Are there any best practices that have emerged around executing ad-hoc
tasks via Chef? A use case that seems to come up often for me is performing
code deployments using the application cookbook without executing any other
configuration tasks that would usually be executed by the run_list. I
understand that with Chef one typically considers a deployment to be
inclusive of the prerequisite package installations, etc., but humor me for
a bit. :)

Anyway, a couple of ways that I thought the above might be handled are to
either set the run_list on the fly prior to the chef-client run, or to
execute chef-client on the target nodes with the -j option, passing it an
ad hoc run_list in the JSON. The former would require the previous run_list
to be saved, then restored after the chef-client run (assuming it is
successful). The latter appears to set the run_list on the Chef server
after the chef-client run. Both options seem to make it difficult to use
the nodes’ run_list settings on the Chef server as a source of truth for a
node’s function, so I’m not really happy with either one.
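
For anyone who wants to try the -j route, here is a minimal Ruby sketch of generating the JSON file. The recipe name my_app::deploy and the file name adhoc-deploy.json are made up for illustration. The script only prints the chef-client command instead of running it, and the caveat described above (the run_list appearing to be written back to the Chef server after the run) would still apply.

```ruby
require 'json'
require 'tmpdir'

# Build the attributes hash for a one-off run. Only the "run_list" key is set.
def adhoc_attributes(run_list)
  { 'run_list' => run_list }
end

# Write the attributes to a JSON file that chef-client can read with -j.
def write_adhoc_json(path, run_list)
  File.write(path, JSON.pretty_generate(adhoc_attributes(run_list)))
  path
end

# 'recipe[my_app::deploy]' is a hypothetical recipe name, not one from this thread.
json_path = write_adhoc_json(File.join(Dir.tmpdir, 'adhoc-deploy.json'),
                             ['recipe[my_app::deploy]'])

# Print the command rather than executing it, so the sketch has no side effects
# beyond writing the JSON file.
puts "chef-client -j #{json_path}"
```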

I’d love to hear how you all have handled the above problem. Or maybe I’m
just looking at this the wrong way? In any case, thanks in advance for any
advice.

Also on this topic, I noticed the following on
http://wiki.opscode.com/display/chef/FAQ:
"…by making Chef’s Resources (http://wiki.opscode.com/display/chef/Resources)
capable of taking action from ad-hoc sources. (This isn’t released yet,
but it will be soon - and trust us, it is awesome.)"

The above seems promising for the use case I described, and is interesting
in general. Can anyone provide any more detail on this? Thanks everyone!


#2

This has come up in my travels as well, and I never came up with a viable ad-hoc solution, so the community’s input is of interest to me, as well.

I have found that a well-crafted run_list can re-execute really fast if only one cookbook has changed, but it requires being extremely diligent about service notifications. That gets you almost the same effect as an ad-hoc recipe might.

Best,

Vincent



#3


This may come out the wrong way but…you’re looking at it the wrong
way. Two things:

  1. The Chef application cookbook is opinionated.
  2. Client runs are idempotent

Pretty much it boils down to this. If you don’t trust your client run
then you shouldn’t be using the application cookbook. If you trust
your client run, the only penalty from using the application cookbook
is dealing with its opinions and the fact that a full run will take a
little longer than if you had run just the application cookbook.

If the problem is time it takes to deploy then stop using the
application cookbook. If the problem is that you “might lose changes”,
then you’ve got a more fundamental problem.

The main reason I don’t use, and have never used, the application
cookbook is that I didn’t like its opinions. That doesn’t mean there
isn’t something to learn from it. At the previous gig, we did all
deploys via Jenkins. Worked fine for us.

However if you’re insistent on trying to run JUST the application
cookbook, you can either use chef-solo or you can jump through the
hoops you mentioned. However if you use chef-solo, you’ve now got to
build data bags that have all the information that you might need from
searches.


#4


Hi, I prefer not saving node data at all over saving and restoring the
run_list. If you like, you can easily achieve this with this simple code:

# Reopen Chef::Client and turn the node save at the end of the run into a no-op.
class Chef::Client
  def save_updated_node
  end
end
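
To show what the patch above does without needing Chef installed, here is a small sketch using a stand-in class. The stand-in Chef::Client below is an assumption made purely for illustration (the real class does far more, and its save goes to the Chef server); only the method name save_updated_node comes from the post. How the patch gets loaded into a real chef-client run is not shown.

```ruby
# Stand-in for the real classes so this runs without Chef installed.
class Chef
  class Client
    attr_reader :saves

    def initialize
      @saves = 0
    end

    # Stand-in for the original behaviour: count each "save to the server".
    def save_updated_node
      @saves += 1
    end
  end
end

# The patch from the post: reopen the class and replace the method with a no-op.
class Chef::Client
  def save_updated_node
  end
end

client = Chef::Client.new
client.save_updated_node # the no-op runs, so nothing is counted as saved
puts client.saves
```

Because Ruby classes are open, the second definition replaces the first, so the save never happens for any run in that process.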

Regards,
Denis


#5

I’m actually quite interested in ad hoc resource execution as well,
particularly as applied to more advanced orchestration uses. For command
and control, full chef-client runs are expensive and awkward, particularly
when I need to run one-off resources/recipes. I love having all my code in
one place, and utilizing all the existing knowledge/state that the
chef-server keeps about my infrastructure, so having to use some external
system (like mcollective) to handle this kind of thing isn’t appealing.

I’d love to hear what others have come up with / hacked together. I’ve
started investigating using RightLink (
https://github.com/rightscale/right_link ) and particularly its
RemoteRecipe feature (without using RightScale’s SaaS platform). I’m also
considering writing my own (much simpler) message command --> recipe
execution layer, but I still need a decent way to execute arbitrary
Recipes/Resources on a node without mucking about with run_list hacks.

-a



#6

On Mar 21, 2012, at 11:49 AM, Alon Rohter wrote:

I’m actually quite interested in ad hoc resource execution as well, particularly as applied to more advanced orchestration uses. For command and control, full chef-client runs are expensive and awkward, particularly when I need to run one-off resources/recipes. I love having all my code in one place, and utilizing all the existing knowledge/state that the chef-server keeps about my infrastructure, so having to use some external system (like mcollective) to handle this kind of thing isn’t appealing.

I’d love to hear what others have come up with / hacked together. I’ve started investigating using RightLink ( https://github.com/rightscale/right_link ) and particularly its RemoteRecipe feature (without using RightScale’s saas platform). I’m also considering writing my own (much simpler) message command --> recipe execution layer, but I still need a decent way to execute arbitrary Recipes/Resources on a node without mucking about with run_list hacks.

There’s been some discussion on chef-dev recently about this as it relates to http://tickets.opscode.com/browse/CHEF-2988 which I’ve had a hand in testing.

The forked branch in question adds 3 run list modifiers that could help accomplish the type of partial chef runs you seem to be looking for.

Partial run list support is a contentious thing, and there are definitely valid points on both sides as to whether or not it’s really a good approach. It’s up to you to decide if the risks are worth it for your infrastructure.

-sean



#7

Thanks for the link. Pretty close to what I need, although my primary use
case is adding a recipe to the run_list as a one-time run (i.e.
--add-recipes).

I can certainly see how run_list manipulation could be controversial, but
having some flexibility for power users who know/handle the risk would be
nice. I’ll be watching to see how this pull request goes :)

-a



#8

On Mar 21, 2012, at 6:49 PM, Alon Rohter wrote:

I’m actually quite interested in ad hoc resource execution as well, particularly as applied to more advanced orchestration uses. For command and control, full chef-client runs are expensive and awkward, particularly when I need to run one-off resources/recipes. I love having all my code in one place, and utilizing all the existing knowledge/state that the chef-server keeps about my infrastructure, so having to use some external system (like mcollective) to handle this kind of thing isn’t appealing.

If your full chef-client runs are too expensive and awkward, then my view is that they have not been written correctly. Or, at the very least, you’re going to have to do a lot of work to convince me that they have been written as well as they can be, and that there are some operations that just take too long.

Even the default Zenoss cookbook will take nearly 30 minutes to run when it does a full re-model, but it’s smart and keeps some internal timers so that only happens every six hours or so. And even that could be potentially moved outside of chef, so that chef is just responsible for installation and configuration of which nodes need to be monitored, and all the rest is done asynchronously to the chef-client run.

If you need some orchestration hooks, then you can use heavier tools like zookeeper, or lighter ones like noah. That would at least get you the near real-time callback mechanism, which could then be hooked into a chef run.

I remain extremely skeptical that there is a real need for running ad-hoc individual recipes outside of a proper chef run. In my book, this falls into the category of extraordinary claims. Feel free to provide extraordinary proof in order to convince me.


Brad Knowles bknowles@ihiji.com
SAGE Level IV, Chef Level 0.0.1


#9

My recipes/roles are fairly efficient, and most of my chef-client runs
complete in 15sec or less, so having to execute the full run_list isn’t
terribly bad today, although the engineer in me cries a little to see many
dozens of resources called when only one resource execution is actually
needed.

My real need is the ability to execute recipes/resources not in the normal
run_list, as ad-hoc tasks (that I don’t want run every 30 minutes by the daemon).
Things like triggering the deployment of a new application build, creating
and submitting a debug dump, exporting or importing a dataset, etc. Today
I handle this by incorporating all these tasks into the run_list recipes,
and then use file or databag guards/flags to prevent them from running
until I want them to. It works, but is inelegant and becomes awkward to
scale.
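
The flag-file variant of the guard approach described above can be sketched in plain Ruby as follows. This is not Chef's DSL: run_if_flagged and the deploy_requested file name are hypothetical, and in a real recipe the same check would more likely sit in an only_if guard on the resource.

```ruby
require 'fileutils'
require 'tmpdir'

# Run the block only if the flag file exists, then clear the flag so the
# task does not fire again on the next scheduled run.
def run_if_flagged(flag_path)
  return false unless File.exist?(flag_path)

  yield
  FileUtils.rm_f(flag_path)
  true
end

flag_dir = Dir.mktmpdir
flag = File.join(flag_dir, 'deploy_requested') # hypothetical flag name
ran = []

first  = run_if_flagged(flag) { ran << :deploy } # no flag yet: skipped
FileUtils.touch(flag)                            # an operator requests a deploy
second = run_if_flagged(flag) { ran << :deploy } # flag present: runs once
third  = run_if_flagged(flag) { ran << :deploy } # flag cleared: skipped again

FileUtils.rm_rf(flag_dir)
```

Every ad-hoc task needs its own flag and its own guarded block in the run_list, which is the part that gets awkward as the number of tasks grows.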

I’m looking for more direct interaction and control of chef executions.
This is an area where RightScale’s RightLink has been particularly
interesting to me, as I believe it’s trying to achieve similar goals in
terms of elegant chef recipe runs, which boils down to direct recipe
executions triggered by pushed (AMQP type) messages. Basically I’m looking
to move away from static & monolithic ssh’d ./chef-client command line
runs, towards something more directly programmatic.

This may not persuade you, but for me Chef gets me 90% of the awesomeness I
need, so finding a solution for the other 10% is my goal.

-a
