Installing large numbers of packages

I’m getting my first chef recipe in order, a simple one to prep a
workstation – which is mostly “install this giant list of packages”.

I was surprised to find that the simple, obvious way to do this:

package %w{foo bar baz} do
    action :install
end

Didn’t work. Resources don’t accept an array as the namevar to do the
obvious thing. Instead, I’ve got to do either:

package "foo" do
    action :install
end

package "bar" do
    action :install
end

package "baz" do
    action :install
end

Which is ridiculously verbose, or else:

%w{foo bar baz}.each do |pkg|
    package pkg do
        action :install
    end
end

Which doesn’t do anything for my “but you don’t need to learn much
Ruby” claims to the rest of the team, and still isn’t as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

Thanks,

  • Matt

On Aug 31, 2011, at 6:53 PM, Matt Palmer wrote:

I'm getting my first chef recipe in order, a simple one to prep a
workstation -- which is mostly "install this giant list of packages".

That's going to be one of the first real steps that I'm going to be doing for our infrastructure overhaul.

Which doesn't do anything for my "but you don't need to learn much
Ruby" claims to the rest of the team, and still isn't as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

I haven't actually done any real Chef yet, but my understanding was that you wanted to put as much information as possible in databags, and then have your rules reference the databags. So, I would expect the code from your first example would not be that different from the version that pulls the list of packages out of the appropriate databag.
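As a rough, hypothetical sketch of that shape (the bag name and layout below are invented; in a real recipe the hash would come from data_bag_item rather than an inline string):

```ruby
require 'json'

# Parse a JSON blob shaped the way a data bag item might be
# (in a recipe you would get this hash from data_bag_item instead).
bag_json = '{"id": "workstation", "packages": ["vim", "strace", "git"]}'
item = JSON.parse(bag_json)

# The recipe code stays the same no matter how long the list grows;
# inside a recipe the body of this loop would just be: package pkg
item["packages"].each do |pkg|
  puts "package #{pkg}"
end
```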

--
Brad Knowles bknowles@ihiji.com

On Thu, Sep 1, 2011 at 10:00 AM, Brad Knowles bknowles@ihiji.com wrote:

On Aug 31, 2011, at 6:53 PM, Matt Palmer wrote:

I'm getting my first chef recipe in order, a simple one to prep a
workstation -- which is mostly "install this giant list of packages".

That's going to be one of the first real steps that I'm going to be doing for our infrastructure overhaul.

I suspect that'd be most people's first target... no production risk,
and usually pretty well understood.

Which doesn't do anything for my "but you don't need to learn much
Ruby" claims to the rest of the team, and still isn't as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

I haven't actually done any real Chef yet, but my understanding was that you wanted to put as much information as possible in databags, and then have your rules reference the databags. So, I would expect the code from your first example would not be that different from the version that pulls the list of packages out of the appropriate databag.

Well, chef-solo doesn't do databags, and chef's server looks like such
a hairball I'm going to avoid it for as long as I possibly can.
There's also resilience issues, a strong aversion to centralisation,
and too many painful memories of Puppet scaling nightmares to get
over.

Even without that, though, I'm having trouble working out how it's
better to have a list of packages in one place, and a resource
specification that installs those packages somewhere else. I can
almost convince myself that putting attributes in an external JSON
file makes sense for roles (although I think it's codifying the same
mistake that practically everyone makes using Puppet, where you define
a pile of global variables and cross your fingers that everything
works, rather than having locally-passed parameters that define how
you want to use something Here and Now), but making a list somewhere
external just so I can avoid having to walk an array is insane. My
recipe says "this is how you configure a workstation", and the list of
packages you have to install in order to do that should be in that
recipe.

  • Matt

On Aug 31, 2011, at 7:08 PM, Matt Palmer wrote:

Well, chef-solo doesn't do databags, and chef's server looks like such
a hairball I'm going to avoid it for as long as I possibly can.

I can't speak for chef-solo, but I did do a chef repo/chef-client install for Hosted Chef, and I can tell you that with the omnibus installer, that process was about as painless as any I've ever had. The only remaining issue I have with that is outlined in the CHEF-2578 ticket.

So far as I know, this omnibus installer is intended for use with all types of Chef installations, whether that be chef-solo, chef-client, chef-server, Hosted Chef, etc....

There's also resilience issues, a strong aversion to centralisation,
and too many painful memories of Puppet scaling nightmares to get
over.

Well, all CM systems are about centralization, regularization, categorization, and management of information, so I don't think you're going to get away from that. In the case of Chef, what you're doing is trying to get all this information about your internal systems & network infrastructure recorded into a reliable and version-controlled CM system.

Even without that, though, I'm having trouble working out how it's
better to have a list of packages in one place, and a resource
specification that installs those packages somewhere else. I can
almost convince myself that putting attributes in an external JSON
file makes sense for roles (although I think it's codifying the same
mistake that practically everyone makes using Puppet, where you define
a pile of global variables and cross your fingers that everything
works, rather than having locally-passed parameters that define how
you want to use something Here and Now), but making a list somewhere
external just so I can avoid having to walk an array is insane. My
recipe says "this is how you configure a workstation", and the list of
packages you have to install in order to do that should be in that
recipe.

I think it comes down to a separation of "code" from "data". You should put your "code" into a code repository, but when the data that the code is going to be operating on needs to be changed, you shouldn't necessarily have to change the code just to accommodate the change in the data.

The installation script should be simple and easy to read, regardless of how many packages are being installed -- that information should come from the database. And when all that is changing is the data, it should work as you want with the existing code that is already in place on all your nodes.

When you're talking about a small number of nodes to be managed, I'm not sure that this makes such a difference. But as you try to scale up, it's going to become more and more important that you keep this separation between code & data.

So, do you want to learn the right way from Day One, or do you want to learn a single-file method that you will have to unlearn as you try to scale up?

Maybe your problem with CM systems isn't with the systems themselves, but with the way you're trying to use them -- or maybe misuse them?

Anyway, that's just food for thought from the peanut gallery. I've been down the scaling road before, but not with Chef. It's going to be interesting to see how this works out.

--
Brad Knowles bknowles@ihiji.com

On Thu, Sep 1, 2011 at 10:34 AM, Brad Knowles bknowles@ihiji.com wrote:

On Aug 31, 2011, at 7:08 PM, Matt Palmer wrote:

There's also resilience issues, a strong aversion to centralisation,
and too many painful memories of Puppet scaling nightmares to get
over.

Well, all CM systems are about centralization, regularization, categorization, and management of information, so I don't think you're going to get away from that. In the case of Chef, what you're doing is trying to get all this information about your internal systems & network infrastructure recorded into a reliable and version-controlled CM system.

Configuration management doesn't imply reliability on a giant central
server infrastructure that's going to have to be scaled and managed
itself.

Even without that, though, I'm having trouble working out how it's
better to have a list of packages in one place, and a resource
specification that installs those packages somewhere else. I can
almost convince myself that putting attributes in an external JSON
file makes sense for roles (although I think it's codifying the same
mistake that practically everyone makes using Puppet, where you define
a pile of global variables and cross your fingers that everything
works, rather than having locally-passed parameters that define how
you want to use something Here and Now), but making a list somewhere
external just so I can avoid having to walk an array is insane. My
recipe says "this is how you configure a workstation", and the list of
packages you have to install in order to do that should be in that
recipe.

I think it comes down to a separation of "code" from "data". You should put your "code" into a code repository, but when the data that the code is going to be operating on needs to be changed, you shouldn't necessarily have to change the code just to accommodate the change in the data.

My system configuration is all data... "this is what I want to
happen". A list of packages is no more code or data than the fact
that I want those packages installed, and not removed. I want to
revision control it all.

The installation script should be simple and easy to read, regardless of how many packages are being installed -- that information should come from the database. And when all that is changing is the data, it should work as you want with the existing code that is already in place on all your nodes.

The installation script is simple and easy to read, if the syntax is
appropriate.

When you're talking about a small number of nodes to be managed, I'm not sure that this makes such a difference. But as you try to scale up, it's going to become more and more important that you keep this separation between code & data.

So, do you want to learn the right way from Day One, or do you want to learn a single-file method that you will have to unlearn as you try to scale up?

And this is where I feel like I should stop listening to you, because
you assume I've never managed large scale systems. I've done 500+
nodes with Puppet, and 2,500+ systems under management in bodgy
semi-manual ways.

If there are better ways to do it with Chef, I'm open to them, but I
do have plenty of experience in this field, and so far my
experiences are telling me that the way Chef does it is a monumental
pain in the arse at scale. However, I'm willing to learn that I'm
wrong, so point me at the documentation that explains clearly and
simply why the Chef way works better.

Maybe your problem with CM systems isn't with the systems themselves, but with the way you're trying to use them -- or maybe misuse them?

Perhaps. Feel free to show me where I'm wrong, but with specifics,
not platitudes.

  • Matt

On Aug 31, 2011, at 7:47 PM, Matt Palmer wrote:

Configuration management doesn't imply reliability on a giant central
server infrastructure that's going to have to be scaled and managed
itself.

You don't necessarily need a giant central server infrastructure to support a good Infrastructure CM, whether that's Puppet, Chef, or any other such tool. If these tools are doing their job, and if they are being used appropriately, they should be able to be used to manage large numbers of servers without themselves needing a great deal of horsepower to accomplish that job.

Of course, a lot depends on what you're asking them to do and how you're asking them to do that. But Chef is using RabbitMQ in Erlang to handle some of the most timing-critical message passing and work queue handling, and that is extremely efficient and very low-latency, in addition to being very high reliability. That kind of stuff can scale up about as big as anything on the Internet, and without a great deal of its own internal overhead.

My system configuration is all data... "this is what I want to
happen". A list of packages is no more code or data than the fact
that I want those packages installed, and not removed. I want to
revision control it all.

You're installing a large list of packages, each of which has its own major and minor version numbers that also need to be tracked. You want to update a big hairy long list of code every single time one of those packages has been updated, and you want to push that out to all your machines? And do you maintain different versions of this code for different platforms that might need slightly different sets of versions of the packages that are installed?

If you want to do it that way, I guess that's possible.

Personally, I would consider that to be quite painful as compared to updating the information in a databag and having the remote chef clients figure out which systems are impacted by a major or minor version update for one of the packages they might or might not be using. And I'd have different lists of packages in the databag for each set of production, development, QA/test systems, etc.... So, my production list of packages would not change very often at all, but my development or QA/test sets of packages might change more often -- and I'd have the same recipe running on all these sets of machines, with each type of machine knowing that it needs to pull different data out of the databag based on the role that particular machine was filling.

But maybe that's just me.
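A hypothetical sketch of that per-environment layout (all names invented, not from the thread): one data bag item carries a list per class of machine, and each node pulls the list matching its role.

```ruby
require 'json'

# One item, several lists -- the node's role picks which list applies.
bag_json = <<~JSON
  {
    "id": "packages",
    "production":  ["foo", "bar"],
    "development": ["foo", "bar", "baz", "gdb"]
  }
JSON
item = JSON.parse(bag_json)

role = "development"   # in a recipe, this would be derived from the node's role
wanted = item[role]    # inside a recipe: wanted.each { |pkg| package pkg }
```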

The installation script is simple and easy to read, if the syntax is
appropriate.

If you've got a long hairy list of packages to install that is included inside the code that is supposed to install those packages, I wouldn't call that simple or easy to read. But again, maybe that's just me.

And this is where I feel like I should stop listening to you, because
you assume I've never managed large scale systems. I've done 500+
nodes with Puppet, and 2,500+ systems under management in bodgy
semi-manual ways.

My large scale experience goes back to AOL in the mid-90s. At that time, tools like Chef and Puppet didn't exist. The only thing we had was a very early version of cfengine, and that was seriously painful. Fortunately for me, I didn't have to maintain it, and I was only personally responsible for maintaining 100+ systems that made minimal use of the CM system, versus the many thousands of other servers that we had throughout the service.

My more recent experience with Infrastructure CM systems comes from using cobbled together Kickstart/Jumpstart scripts front-ended with m4 pre-processing, when I was working at UT Austin a couple of years ago. We were replacing all that stuff with bcfg2, and again I was one of the early adopters for the projects I was working on, but again they were just going to be CM clients, and only a couple dozen at that. I was a few levels removed from having to support the 50K+ students, the 20K+ faculty & staff, and I didn't have much in the way of responsibilities for helping to do management on the other few hundred servers that we had based on those cobbled-together Jumpstart/Kickstart scripts.

My experience here is even smaller, at least to date. We're starting with a couple of small VMs for the next-generation back-end server infrastructure, but we want to be able to easily scale our systems up to supporting one or more hardware appliance (or software equivalent) in every single household throughout the country and ultimately the world, so I think that puts us on the scale of at least hundreds of millions of appliance installs. Chef would not be used to manage the appliance installs directly, at least not initially. We want to get experience with using it to support our back-end server infrastructure before we start looking at the really big fish.

If there are better ways to do it with Chef, I'm open to them, but I
do have plenty of experience in this field, and so far my
experiences are telling me that the way Chef does it is a monumental
pain in the arse at scale. However, I'm willing to learn that I'm
wrong, so point me at the documentation that explains clearly and
simply why the Chef way works better.

And this is where I have to step back myself, because I do not yet know enough about Chef in this particular respect to be able to provide any further guidance. I will be very interested to see/hear what you find out.

--
Brad Knowles bknowles@ihiji.com

If you want a one-liner...

%w{ foo bar baz }.each do |pkg| { package pkg }

install is the default action; no need to explicitly declare it.

On Wed, Aug 31, 2011 at 6:53 PM, Matt Palmer matt.palmer@freelancer.com wrote:

I'm getting my first chef recipe in order, a simple one to prep a
workstation -- which is mostly "install this giant list of packages".

I was surprised to find that the simple, obvious way to do this:

package %w{foo bar baz} do
    action :install
end

Didn't work. Resources don't accept an array as the namevar to do the
obvious thing. Instead, I've got to do either:

package "foo" do
    action :install
end

package "bar" do
    action :install
end

package "baz" do
    action :install
end

Which is ridiculously verbose, or else:

%w{foo bar baz}.each do |pkg|
    package pkg do
        action :install
    end
end

Which doesn't do anything for my "but you don't need to learn much
Ruby" claims to the rest of the team, and still isn't as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

Thanks,

  • Matt

Ergh. Got my syntax wrong there; do is replaced by the braces.

%w{ foo bar baz }.each { |pkg| package pkg }

On Wed, Aug 31, 2011 at 8:27 PM, Charles Duffy charles@dyfis.net wrote:

If you want a one-liner...

%w{ foo bar baz }.each do |pkg| { package pkg }

install is the default action; no need to explicitly declare it.

On Wed, Aug 31, 2011 at 6:53 PM, Matt Palmer matt.palmer@freelancer.com wrote:

I'm getting my first chef recipe in order, a simple one to prep a
workstation -- which is mostly "install this giant list of packages".

I was surprised to find that the simple, obvious way to do this:

package %w{foo bar baz} do
    action :install
end

Didn't work. Resources don't accept an array as the namevar to do the
obvious thing. Instead, I've got to do either:

package "foo" do
    action :install
end

package "bar" do
    action :install
end

package "baz" do
    action :install
end

Which is ridiculously verbose, or else:

%w{foo bar baz}.each do |pkg|
    package pkg do
        action :install
    end
end

Which doesn't do anything for my "but you don't need to learn much
Ruby" claims to the rest of the team, and still isn't as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

Thanks,

  • Matt

To answer your original question, you use a loop, not pass an array to the resource.

The only reason that the namevar being an array is "the obvious thing" is because you are a previous puppet user :-)

I'm not going to get involved in what appears to be a kind of crazy CM trench war for the rest of the thread. Suffice it to say, in my opinion and experience, you are likely to benefit from approaching things in Chef differently than you did in puppet. The big ones:

  1. Much more of what you do will be data driven, and use the way we compose node attributes to your benefit. This will be less common in Solo, but even then, it's very handy.

  2. Several of the big bottlenecks for scaling Puppet are solved with Chef Server - namely, we don't execute any arbitrary code or deal with compiling the catalog, rendering templates, or anything else on the server. We basically pass data in and back out again. That said, rock on with whatever way works best for you.

  3. Chef solo does support data bags as of 0.10.4.

Best,
Adam

Opscode, Inc.
Adam Jacob, Chief Product Officer
T: (206) 619-7151 E: adam@opscode.com

On Wednesday, August 31, 2011 at 5:47 PM, Matt Palmer wrote:

On Thu, Sep 1, 2011 at 10:34 AM, Brad Knowles <bknowles@ihiji.com (mailto:bknowles@ihiji.com)> wrote:

On Aug 31, 2011, at 7:08 PM, Matt Palmer wrote:

There's also resilience issues, a strong aversion to centralisation,
and too many painful memories of Puppet scaling nightmares to get
over.

Well, all CM systems are about centralization, regularization, categorization, and management of information, so I don't think you're going to get away from that. In the case of Chef, what you're doing is trying to get all this information about your internal systems & network infrastructure recorded into a reliable and version-controlled CM system.

Configuration management doesn't imply reliability on a giant central
server infrastructure that's going to have to be scaled and managed
itself.

Even without that, though, I'm having trouble working out how it's
better to have a list of packages in one place, and a resource
specification that installs those packages somewhere else. I can
almost convince myself that putting attributes in an external JSON
file makes sense for roles (although I think it's codifying the same
mistake that practically everyone makes using Puppet, where you define
a pile of global variables and cross your fingers that everything
works, rather than having locally-passed parameters that define how
you want to use something Here and Now), but making a list somewhere
external just so I can avoid having to walk an array is insane. My
recipe says "this is how you configure a workstation", and the list of
packages you have to install in order to do that should be in that
recipe.

I think it comes down to a separation of "code" from "data". You should put your "code" into a code repository, but when the data that the code is going to be operating on needs to be changed, you shouldn't necessarily have to change the code just to accommodate the change in the data.

My system configuration is all data... "this is what I want to
happen". A list of packages is no more code or data than the fact
that I want those packages installed, and not removed. I want to
revision control it all.

The installation script should be simple and easy to read, regardless of how many packages are being installed -- that information should come from the database. And when all that is changing is the data, it should work as you want with the existing code that is already in place on all your nodes.

The installation script is simple and easy to read, if the syntax is
appropriate.

When you're talking about a small number of nodes to be managed, I'm not sure that this makes such a difference. But as you try to scale up, it's going to become more and more important that you keep this separation between code & data.

So, do you want to learn the right way from Day One, or do you want to learn a single-file method that you will have to unlearn as you try to scale up?

And this is where I feel like I should stop listening to you, because
you assume I've never managed large scale systems. I've done 500+
nodes with Puppet, and 2,500+ systems under management in bodgy
semi-manual ways.

If there are better ways to do it with Chef, I'm open to them, but I
do have plenty of experience in this field, and so far my
experiences are telling me that the way Chef does it is a monumental
pain in the arse at scale. However, I'm willing to learn that I'm
wrong, so point me at the documentation that explains clearly and
simply why the Chef way works better.

Maybe your problem with CM systems isn't with the systems themselves, but with the way you're trying to use them -- or maybe misuse them?

Perhaps. Feel free to show me where I'm wrong, but with specifics,
not platitudes.

  • Matt

On Sep 1, 2011, at 7:54 AM, Adam Jacob wrote:

To answer your original question, you use a loop, not pass an array to the resource.

and I think this is how it should be. this is dsl vs. language. using chef is programming ruby, using puppet is using the puppet dsl (ok, things are changing)

package %w{foo bar baz} do
    action :install
end

might be a good shortcut in the dsl, but this comes with syntax clutter

imagine you would like to pass in exact versions for each package, then package would have to
support arrays of arrays (in this implementation), and if it does not you have to patch the resource

package [["foo","0.3"], ["bar","0.4.5"], ["baz",nil]] do
    action :install
end

if you allow for ruby in the recipe you can leave that up to the user:

[["foo","0.3"], ["bar","0.4.5"], ["baz",nil]].each do |pkg,pkg_version|
    package pkg do
        action :install
        version pkg_version if pkg_version
    end
end

all ruby, no special chef syntax

this is one reason why I favor the chef way
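The point checks out with nothing but plain Ruby: the recipe loop above relies only on ordinary block destructuring of [name, version] pairs, which you can verify without Chef at all. A small sketch:

```ruby
# Each [name, version] pair destructures into the block parameters,
# exactly as in the recipe loop above; nil means "no pinned version".
pkgs = [["foo", "0.3"], ["bar", "0.4.5"], ["baz", nil]]
resolved = pkgs.map do |pkg, pkg_version|
  pkg_version ? "#{pkg}-#{pkg_version}" : pkg
end
# resolved is ["foo-0.3", "bar-0.4.5", "baz"]
```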

The only reason that the namevar being an array is "the obvious thing" is because you are a previous puppet user :slight_smile:

I'm not going to get involved in what appears to be a kind of crazy CM trench war for the rest of the thread. Suffice it to say, in my opinion and experience, you are likely to benefit from approaching things in Chef differently than you did in puppet. The big ones:

  1. Much more of what you do will be data driven, and use the way we compose node attributes to your benefit. This will be less common in Solo, but even then, it's very handy.

  2. Several of the big bottlenecks for scaling Puppet are solved with Chef Server - namely, we don't execute any arbitrary code or deal with compiling the catalog, rendering templates, or anything else on the server. We basically pass data in and back out again. That said, rock on with whatever way works best for you.

  3. Chef solo does support data bags as of 0.10.4.

Best,
Adam

Opscode, Inc.
Adam Jacob, Chief Product Officer
T: (206) 619-7151 E: adam@opscode.com

On Wednesday, August 31, 2011 at 5:47 PM, Matt Palmer wrote:

On Thu, Sep 1, 2011 at 10:34 AM, Brad Knowles <bknowles@ihiji.com (mailto:bknowles@ihiji.com)> wrote:

On Aug 31, 2011, at 7:08 PM, Matt Palmer wrote:

There's also resilience issues, a strong aversion to centralisation,
and too many painful memories of Puppet scaling nightmares to get
over.

Well, all CM systems are about centralization, regularization, categorization, and management of information, so I don't think you're going to get away from that. In the case of Chef, what you're doing is trying to get all this information about your internal systems & network infrastructure recorded into a reliable and version-controlled CM system.

Configuration management doesn't imply reliability on a giant central
server infrastructure that's going to have to be scaled and managed
itself.

Even without that, though, I'm having trouble working out how it's
better to have a list of packages in one place, and a resource
specification that installs those packages somewhere else. I can
almost convince myself that putting attributes in an external JSON
file makes sense for roles (although I think it's codifying the same
mistake that practically everyone makes using Puppet, where you define
a pile of global variables and cross your fingers that everything
works, rather than having locally-passed parameters that define how
you want to use something Here and Now), but making a list somewhere
external just so I can avoid having to walk an array is insane. My
recipe says "this is how you configure a workstation", and the list of
packages you have to install in order to do that should be in that
recipe.

I think it comes down to a separation of "code" from "data". You should put your "code" into a code repository, but when the data that the code is going to be operating on needs to be changed, you shouldn't necessarily have to change the code just to accommodate the change in the data.

My system configuration is all data... "this is what I want to
happen". A list of packages is no more code or data than the fact
that I want those packages installed, and not removed. I want to
revision control it all.

The installation script should be simple and easy to read, regardless of how many packages are being installed -- that information should come from the database. And when all that is changing is the data, it should work as you want with the existing code that is already in place on all your nodes.

The installation script is simple and easy to read, if the syntax is
appropriate.

When you're talking about a small number of nodes to be managed, I'm not sure that this makes such a difference. But as you try to scale up, it's going to become more and more important that you keep this separation between code & data.

So, do you want to learn the right way from Day One, or do you want to learn a single-file method that you will have to unlearn as you try to scale up?

And this is where I feel like I should stop listening to you, because
you assume I've never managed large scale systems. I've done 500+
nodes with Puppet, and 2,500+ systems under management in bodgy
semi-manual ways.

If there are better ways to do it with Chef, I'm open to them, but I
do have plenty of experience in this field, and so far my
experiences are telling me that the way Chef does it is a monumental
pain in the arse at scale. However, I'm willing to learn that I'm
wrong, so point me at the documentation that explains clearly and
simply why the Chef way works better.

Maybe your problem with CM systems isn't with the systems themselves, but with the way you're trying to use them -- or maybe misuse them?

Perhaps. Feel free to show me where I'm wrong, but with specifics,
not platitudes.

  • Matt

--
DI Edmund Haselwanter, edmund@haselwanter.com, http://edmund.haselwanter.com/
http://www.iteh.at | Facebook | http://at.linkedin.com/in/haselwanteredmund

One thing I noticed when starting with Chef is this tendency to
install a bunch of packages together. I suspect it’s because it is a
simple/common resource so you start there and figure “let Chef handle
the package and I’ll add more logic for fancy config stuff later”.
This is probably a reasonable sentiment, but you end up with a big
list of packages to install which I suspect will not be what you want
at the end of the day. Once you really get into it, a package isn’t
very interesting. It probably needs some config files tweaked and
maybe a service started up after you install it before it can do
something useful.

So you have your big loop of package installations. Now are you going
to start putting additional config bits in the recipe after that loop?
You then end up with one big huge recipe doing a bunch of stuff. That
doesn’t seem good. The recipe has no clarity of purpose.

So what if you make task specific recipes to do all the config
tweaking? But they all depend on having the package present so you’re
still referring back to the recipe that runs your big loop. That
doesn’t seem much better.

Like I said, this happened to us starting out. So what we did instead
was refactor and get rid of that big loop and instead create different
recipes for the various services. Each recipe installs the packages it
needs and then configures them all together. Keeps everything within
one recipe limited to accomplishing a single task. Then you can
include the recipes into a container recipe or role to make them easy
to apply all together if you want. This makes future maintenance of
the code much easier.
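Sketched out, that refactor might look something like this recipe fragment (the cookbook and recipe names here are invented, not from the thread):

```ruby
# recipes/workstation.rb -- a container recipe that composes the
# single-purpose service recipes instead of one giant package loop
include_recipe "workstation::editors"       # installs and configures editors
include_recipe "workstation::dev_tools"     # compilers, debuggers, etc.
include_recipe "workstation::misc_packages" # the small leftover package list
```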

Now at the end, we did still have a recipe called “packages” where we
have a loop of packages that really were just things we wanted to have
around on the systems but required no configuration or other tweaks of
any kind. However, by the time we’d refactored everything, the list
was very small, like 3-4 packages.

KC

If you'd like to be less verbose, you can take advantage of the fact
that you don't have to specify a resource's default action, which in
the case of package, is :install.

package "foo"
package "bar"
package "bazz"
package "buzz"
package "bizz"
package "bim"
package "bam"
package "bop"

-s

On Wed, Aug 31, 2011 at 7:53 PM, Matt Palmer matt.palmer@freelancer.com wrote:

I'm getting my first chef recipe in order, a simple one to prep a
workstation -- which is mostly "install this giant list of packages".

I was surprised to find that the simple, obvious way to do this:

package %w{foo bar baz} do
    action :install
end

Didn't work. Resources don't accept an array as the namevar to do the
obvious thing. Instead, I've got to do either:

package "foo" do
    action :install
end

package "bar" do
    action :install
end

package "baz" do
    action :install
end

Which is ridiculously verbose, or else:

%w{foo bar baz}.each do |pkg|
    package pkg do
        action :install
    end
end

Which doesn't do anything for my "but you don't need to learn much
Ruby" claims to the rest of the team, and still isn't as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

Thanks,

  • Matt

The "iterate over a list of packages" approach seems to be pretty standard,
regardless of whether the list itself be hard-coded, from an attribute, or
from a data bag.

If you don't like having the explicit loop inside your recipes, you could
wrap it in an LWRP, maybe something like this:

package_list "my-package-list" do
    packages %w{foo bar baz}
    action :install
end

The provider could look something like this:

[:install, :upgrade, :remove, :purge].each do |act|
    action act do
        new_resource.packages.each do |pkg|
            package pkg do
                action act
            end
        end
    end
end

Of course, you'd need more code if you wanted it to be more generally
useful, to handle attributes, notifies, etc.

It seems overkillish to me, though. I find lines like "%w{...}.each
{ |p| package p }" to be pretty short, sweet and clear.

On Wednesday, August 31, 2011, Matt Palmer matt.palmer@freelancer.com
wrote:

I'm getting my first chef recipe in order, a simple one to prep a
workstation -- which is mostly "install this giant list of packages".

I was surprised to find that the simple, obvious way to do this:

package %w{foo bar baz} do
    action :install
end

Didn't work. Resources don't accept an array as the namevar to do the
obvious thing. Instead, I've got to do either:

package "foo" do
    action :install
end

package "bar" do
    action :install
end

package "baz" do
    action :install
end

Which is ridiculously verbose, or else:

%w{foo bar baz}.each do |pkg|
    package pkg do
        action :install
    end
end

Which doesn't do anything for my "but you don't need to learn much
Ruby" claims to the rest of the team, and still isn't as compact and
clean as it could be.

Am I missing something obvious, or is one of the above options really
the recommended way to create big lists of resources?

Thanks,

  • Matt

On Thu, Sep 1, 2011 at 5:51 PM, KC Braunschweig
kcbraunschweig@gmail.com wrote:

One thing I noticed when starting with Chef is this tendency to
install a bunch of packages together. I suspect it's because it is a
simple/common resource so you start there and figure "let Chef handle
the package and I'll add more logic for fancy config stuff later".
This is probably a reasonable sentiment, but you end up with a big
list of packages to install which I suspect will not be what you want
at the end of the day. Once you really get into it, a package isn't
very interesting. It probably needs some config files tweaked and
maybe a service started up after you install it before it can do
something useful.

Of the 47 packages I've got in my "install these onto new
workstations" list, none of them require any sort of post-installation
configuration or service initialisation. They are all packages that
provide additional functionality (like a full version of vim, strace,
etc) that just need to be "on the system".

Yes, sometimes you really do just need a big list of packages.

  • Matt