New user feedback and questions


#1

Hello,

I’ve been dabbling a bit with Chef, and I thought I would give a bit of
new-user feedback, in case you’re interested :slight_smile:

(1) Something which confused me at first was the use of code blocks. In Ruby
these are usually callbacks to be executed zero or more times at some point
in the future (look at how Rake uses code blocks, for example).

But in Chef, it took me a while to realise that these are just executed
immediately as syntactic sugar for setting parameters. Take this example:

service "apache" do
  action :enable
end

directory "/tmp/something" do
  recursive true
  action :delete
end

It becomes much clearer (to me) if I translate this mentally to:

service "apache", :action => :enable

directory "/tmp/something",
  :recursive => true,
  :action => :delete

Now I see these as just a series of actions with parameters which are
executed one after the other, not tasks to be deferred. Therefore the issues
about ordering of actions vanish: if you want action A to be executed before
action B, you just write A before B :slight_smile:

It’s now clear to me (but wasn’t at first) that there are no dependencies
apart from those which you set up with notifies/subscribes. For example: if
you define a task using ‘execute’, it will be run every time your recipe
is run, unless you set up a precondition like ‘creates’ or ‘only_if’.

I also found the concept of ‘resource’ a bit unclear at first. For example,
I may want to create a file using a script. But in that case, the ‘resource’
is not a ‘file’ (the thing I want), it is an ‘execute’ (the way to create
it).

(2) Based on the above, I think there’s a gap in the documentation between
installing/getting started (which installs your first cookbook without
really understanding it) and the detail of the individual components. But
maybe I was just being dumb.

I found the detailed documentation on resources/providers etc was fine, and
in any case it’s easy enough to dig into the code once you get down to that
level.

I found another gap at
http://wiki.opscode.com/display/chef/Anatomy+of+a+Chef+Run
where it talks about “saving the state of a node”. What exactly is saved?
For example, does every ‘file’ resource keep a backup of the given file?
Or is it just a dump of the node state from Ohai?

(Right now I’m just using chef solo rather than a server)

(3) In the environment where I work, the most likely need for chef is in
distributing config files [*].

So I’m looking at ways we can make this as safe and simple as possible
for the sysadmins to use.

Have you any pointers to best practices on testing new recipes? I don’t
really want to set up a whole dev/test environment just for developing and
testing each recipe change - it would be too cumbersome. But equally I don’t
want invalid recipes being pushed out, or worse, ones which destroy data.

One possibility would be to distribute new versions of cookbooks to
machines, but only activate them in a dry-run way (like make -n). Only once
the expected actions are checked would I then enable the new cookbook to be
run live.

If this sort of functionality already exists, I’d be very happy to receive a
pointer to it. Otherwise, how are other people handling this?

Anyway, that’s about it. This looks like a cool framework for machine
management, so thanks for releasing it!

Regards,

Brian.

[*] This is because most of the other functions are handled directly or
indirectly by the existing package manager. RH Satellite Server gets the
right packages installed on the boxes, and the packages themselves have
post-install scripts which create uids/gids and data directories with the
correct permissions and ownership. I don’t want or need to duplicate this
logic in chef recipes.

However I can also see a need for a resource to fetch a particular svn/git
tag of source code, compile it, and install it. This would make it much
easier to deploy a head or near-head version of a rapidly developing
application such as CouchDB, without having to keep re-packaging it.


#2

On Mon, Sep 7, 2009 at 3:57 AM, Brian Candler B.Candler@pobox.com wrote:

Hello,

I’ve been dabbling a bit with Chef, and I thought I would give a bit of
new-user feedback, in case you’re interested :slight_smile:

Hell yes we’re interested! :slight_smile: Thank you so much for taking the time, Brian.

(1) Something which confused me at first was the use of code blocks. In Ruby
these are usually callbacks to be executed zero or more times at some point
in the future (look at how Rake uses code blocks, for example).

Or they are just closures/lambdas, which is what we’re using them for.

But in Chef, it took me a while to realise that these are just executed
immediately as syntactic sugar for setting parameters. Take this example:

service "apache" do
  action :enable
end

directory "/tmp/something" do
  recursive true
  action :delete
end

It becomes much clearer (to me) if I translate this mentally to:

service "apache", :action => :enable

directory "/tmp/something",
  :recursive => true,
  :action => :delete

Now I see these as just a series of actions with parameters which are
executed one after the other, not tasks to be deferred. Therefore the issues
about ordering of actions vanish: if you want action A to be executed before
action B, you just write A before B :slight_smile:

The tricky part is, they are being executed later. When you write:

service "apache" do
  action :enable
end

You are creating a new resource, and adding it to the
ResourceCollection. It’s more accurate to think of it this way:

r = service "apache"
r.action :enable
collection << r

The block that follows is just being evaluated in the context of the
resource created by ‘service “apache”’. It then pushes the new
resource onto the ResourceCollection. Chef then walks the
ResourceCollection and executes every action, sends notifications,
etc. This two-phase action (compiling the ResourceCollection, then
executing it) allows you a crazy high degree of flexibility (you can
go back and inspect the resource collection, for example, or modify a
resource later, etc.)
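
A minimal sketch of that two-phase idea in plain Ruby, for illustration only. ToyResource and ToyRecipe are made-up names, not Chef's classes, and the real ResourceCollection does much more (notifications, providers, and so on). The point is only that the block is evaluated right away to configure the resource, while the action runs later when the collection is walked:

```ruby
# Toy model of the compile/converge split; this is NOT Chef's real code.
class ToyResource
  attr_reader :type, :name

  def initialize(type, name)
    @type = type
    @name = name
    @action = :nothing
  end

  # With an argument this sets the action; without one it reads it back,
  # mimicking the attribute style used inside recipe blocks.
  def action(value = nil)
    @action = value unless value.nil?
    @action
  end
end

class ToyRecipe
  attr_reader :collection

  def initialize
    @collection = []
  end

  # Phase 1 (compile): build the resource, evaluate the block in its
  # context, and append it to the collection. Nothing is executed yet.
  def service(name, &block)
    resource = ToyResource.new(:service, name)
    resource.instance_eval(&block) if block
    @collection << resource
    resource
  end

  # Phase 2 (converge): walk the collection in order and "run" each action.
  def converge
    @collection.map { |r| "#{r.type}[#{r.name}] action #{r.action}" }
  end
end

recipe = ToyRecipe.new
recipe.service "apache" do
  action :enable
end
recipe.service "ntp" do
  action :start
end

# Between the two phases the collection can still be inspected or modified.
recipe.collection.first.action :disable

log = recipe.converge
puts log
```

Because nothing runs until converge, the change made to the first resource after both blocks were evaluated is the one that takes effect.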

Also, when you are in the block you are executing code in the context
of an instance of the resource class - which means you can insert any
arbitrary code you might want in there. (And people do - for example,
using a case statement to set an attribute to two different values.)

I took that approach early on in the design of Chef’s recipe syntax,
and I found I much preferred the block syntax for a few reasons:

  1. It’s fewer characters
  2. It doesn’t require commas after every attribute
  3. It lets me use arbitrary ruby to set attributes

We’re always open to changing things up, but I for one strongly prefer
the current syntax.

It’s now clear to me (but wasn’t at first) that there are no dependencies
apart from those which you set up with notifies/subscribes. For example: if
you define a task using ‘execute’, it will be run every time your recipe
is run, unless you set up a precondition like ‘creates’ or ‘only_if’.

Right - Chef does what you tell it to do, in the order you tell it to
do it. Idempotency (whether a thing should/should not be done)
happens at the Resource level. For most resources, we take care of
this for you (we detect whether a package is installed and at the
correct version, say). For some it’s impossible - things like execute
or script resources, or you may know better what the conditions are.
In that case, you use not_if/only_if.
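
As a rough illustration of how such guards make an otherwise unconditional command idempotent, here is a toy model in plain Ruby. ToyExecute and its methods are assumptions made up for this sketch, not Chef's implementation, and the "machine" is just an array of commands that have been run:

```ruby
# Toy model of not_if/only_if guards; names are illustrative assumptions.
class ToyExecute
  attr_reader :command

  def initialize(command)
    @command = command
    @guards = []
  end

  # Run only when the block returns true.
  def only_if(&block)
    @guards << block
    self
  end

  # Skip when the block returns true.
  def not_if(&block)
    @guards << proc { !block.call }
    self
  end

  # Guards are checked at run time, immediately before the action.
  def run(system_state)
    return :skipped unless @guards.all?(&:call)

    system_state << command
    :ran
  end
end

state = [] # stands in for the machine: a list of commands already run

build = ToyExecute.new("make install")
build.not_if { state.include?("make install") }

first  = build.run(state) # nothing installed yet, so the command runs
second = build.run(state) # the guard now sees the earlier run and skips
puts "first: #{first}, second: #{second}"
```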

I also found the concept of ‘resource’ a bit unclear at first. For example,
I may want to create a file using a script. But in that case, the ‘resource’
is not a ‘file’ (the thing I want), it is an ‘execute’ (the way to create
it).

Right - the resource abstraction provides a way to easily declare that
you want a particular thing to be in a particular state at a given
point in the Chef run.

(2) Based on the above, I think there’s a gap in the documentation between
installing/getting started (which installs your first cookbook without
really understanding it) and the detail of the individual components. But
maybe I was just being dumb.

I found the detailed documentation on resources/providers etc was fine, and
in any case it’s easy enough to dig into the code once you get down to that
level.

Have any suggestions on where in the documentation we should insert
some more knowledge for folks?

I found another gap at
http://wiki.opscode.com/display/chef/Anatomy+of+a+Chef+Run
where it talks about “saving the state of a node”. What exactly is saved?
For example, does every ‘file’ resource keep a backup of the given file?
Or is it just a dump of the node state from Ohai?

The attributes and run list of the node. That means data from Ohai,
but also any other attributes you set on the node. This lets you do
some pretty fun things - like have a recipe that upgrades your kernel,
sets an attribute, reboots, and takes specific action after the
reboot.

(Right now I’m just using chef solo rather than a server)

So the above does not apply. :slight_smile:

(3) In the environment where I work, the most likely need for chef is in
distributing config files [*].

So I’m looking at ways we can make this as safe and simple as possible
for the sysadmins to use.

Have you any pointers to best practices on testing new recipes? I don’t
really want to set up a whole dev/test environment just for developing and
testing each recipe change - it would be too cumbersome. But equally I don’t
want invalid recipes being pushed out, or worse, ones which destroy data.

Testing recipes without taking the actions described is basically an
impossible task to get 100% correct. You can do dry runs, but any
time you check the run-time state of the system and that state could
have been modified previously in the run, you’re going to get a false
positive.
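
A small, hypothetical illustration of why a dry run can mislead (the Step struct, the step names, and the state hash are invented for this sketch): the second step checks state that the first step would have changed, so a dry run and a real run disagree about it.

```ruby
# Hypothetical two-step run; names and structure are for illustration only.
Step = Struct.new(:name, :guard, :effect)

def run_steps(steps, state, dry_run:)
  steps.map do |step|
    if step.guard.call(state)
      step.effect.call(state) unless dry_run # a dry run never changes state
      "#{step.name}: #{dry_run ? 'would run' : 'ran'}"
    else
      "#{step.name}: #{dry_run ? 'would skip' : 'skipped'}"
    end
  end
end

steps = [
  Step.new("write config",
           ->(s) { !s[:config_written] },
           ->(s) { s[:config_written] = true }),
  Step.new("restart service",
           ->(s) { s[:config_written] }, # depends on the step above
           ->(s) { s[:restarted] = true })
]

dry  = run_steps(steps, {}, dry_run: true)
real = run_steps(steps, {}, dry_run: false)

puts dry
puts real
```

The dry run reports that the restart would be skipped, while the real run performs it, because only the real run lets the first step change the state the second step inspects.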

We’ve been working on ways to let you write real integration tests,
which would go a long way towards allowing you to do real testing in a
VM. As it stands, the best practice is to have a staging environment
that gets your new changes first, and only prop changes to production
when you are comfortable.

One possibility would be to distribute new versions of cookbooks to
machines, but only activate them in a dry-run way (like make -n). Only once
the expected actions are checked would I then enable the new cookbook to be
run live.

This is on the road-map, but see what I said above for the fact that
it might not actually tell you the whole truth.

Anyway, that’s about it. This looks like a cool framework for machine
management, so thanks for releasing it!

You are most welcome. Thank you for taking the time to write such
detailed feedback, and I look forward to hearing more about your
experiences with Chef!

However I can also see a need for a resource to fetch a particular svn/git
tag of source code, compile it, and install it. This would make it much
easier to deploy a head or near-head version of a rapidly developing
application such as CouchDB, without having to keep re-packaging it.

A source package provider is on the way, and an SCM resource with SVN
and Git providers is also in the works.

Best,
Adam


Opscode, Inc.
Adam Jacob, CTO
T: (206) 508-7449 E: adam@opscode.com


#3

On Mon, Sep 7, 2009 at 11:12 AM, Adam Jacob adam@opscode.com wrote:

On Mon, Sep 7, 2009 at 3:57 AM, Brian Candler B.Candler@pobox.com wrote:

(3) In the environment where I work, the most likely need for chef is in
distributing config files [*].

So I’m looking at ways we can make this as safe and simple as possible
for the sysadmins to use.

Have you any pointers to best practices on testing new recipes? I don’t
really want to set up a whole dev/test environment just for developing and
testing each recipe change - it would be too cumbersome. But equally I don’t
want invalid recipes being pushed out, or worse, ones which destroy data.

Testing recipes without taking the actions described is basically an
impossible task to get 100% correct. You can do dry runs, but any
time you check the run-time state of the system and that state could
have been modified previously in the run, you’re going to get a false
positive.

We’ve been working on ways to let you write real integration tests,
which would go a long way towards allowing you to do real testing in a
VM. As it stands, the best practice is to have a staging environment
that gets your new changes first, and only prop changes to production
when you are comfortable.

vmware has been really helpful to me here.

Joe


#4

On Mon, Sep 07, 2009 at 11:12:02AM -0700, Adam Jacob wrote:

Now I see these as just a series of actions with parameters which are
executed one after the other, not tasks to be deferred. Therefore the issues
about ordering of actions vanish: if you want action A to be executed before
action B, you just write A before B :slight_smile:

The tricky part is, they are being executed later. When you write:

service "apache" do
  action :enable
end

You are creating a new resource, and adding it to the
ResourceCollection.

OK - and at the end, the ResourceCollection is executed in order.

This makes sense. You don’t want to start executing potentially destructive
actions, only to find part way through that something is missing a
parameter, say.

Also, when you are in the block you are executing code in the context
of an instance of the resource class - which means you can insert any
arbitrary code you might want in there. (And people do - for example,
using a case statement to set an attribute to two different values.)

Although of course you can insert arbitrary code in a normal method call
with parameters, e.g.

foo "bar",
  :attr => (case wibble
            when 1
              "baz"
            end)

or

myattr = …
foo "bar", :attr => myattr

Arguably less pretty though.

Have any suggestions on where in the documentation we should insert
some more knowledge for folks?

I think at “Cooking School”:
http://wiki.opscode.com/display/chef/Cooking+with+Chef
or else under “Cookbooks”.

Some of this info already exists under “Recipes”, like the fact that
resources are executed in order - it would be good to make it more explicit
at this point, just as you described above (i.e. resources are created,
setup using the block, added to a collection, and then the collection is
executed in sequence).

However by the time you get here you’ve already drilled down to a pretty low
level, and the tree is quite wide, so it’s very easy to miss this:

                                        +- attributes
                    +- nodes            +- definitions
                    +- roles            +- files
                    +- cookbooks -------+
Cooking with Chef --+                   +- libraries
                    +- resources        +- recipes    **!here!
                    +- providers        +- templates
                    +- search indexes   +- metadata

So I think an overview document which explains how recipes are parsed and
run would be useful. Also including some info from and/or link to “Anatomy
of a Chef Run”, which isn’t in this subtree at all.

I think that the “recipe” is the level at which most people will start
working and hence need to understand what’s happening: at first they’ll use
existing resources and providers, and won’t need to deal with nodes, roles,
definitions etc. At least, that’s how it was for me.

I’m still not clear exactly how “attributes” and “recipes” are parsed and
run, since it’s clear that setting an attribute can affect a recipe, but
also the examples show how attributes can add recipes to the collection (is
the collection empty at this point, which means these recipes will be
executed first, or has it already been parsed?) That would also be useful to
include.

Since the concepts of “attributes” and “recipes” are defined under
“cookbooks”, then perhaps “cookbooks” is the right place to describe how
these elements interact with each other.

(Right now I’m just using chef solo rather than a server)

So the above does not apply. :slight_smile:

Sure, I was just trying to avoid the “why don’t you try it and see?” answer.
I need to set up a VM test environment for this, as I’m a bit scared of the
amount of changes it appears chef server will make to my system by
bootstrapping itself.

We’ve been working on ways to let you write real integration tests,
which would go a long way towards allowing you to do real testing in a
VM. As it stands, the best practice is to have a staging environment
that gets your new changes first, and only prop changes to production
when you are comfortable.

OK, that was the conclusion I was coming to anyway.

Many thanks for your prompt and detailed answers.

Regards,

Brian.


#5

Hey All.

I just installed chef yesterday after trying puppet and automateit and
if anybody is interested in the first impressions of an absolute
beginner here they are.

The chef server install is not accurate and is highly disruptive if
you already have rails sites. Here are the issues I remember.

It undoes your passenger.conf (in my case it added passenger.conf and
left my mod_rails.conf). This is not too bad of an issue.
It rewrites your default.conf. In my case this was disruptive.
It rewrites ports.conf and apache2.conf, setting the user to www-data
(in my case that was disruptive; the current way to do it is to use the
$APACHE_RUN_USER env variable).

A better way to muck with apache would be to drop files into the
/etc/apache2/conf.d directory IMHO

It doesn’t set the proper permissions in the /var/log/chef directory
(should be chef, chef).

I don’t like the fact that the default place for cookbooks is
/srv/chef but I am sure I can get used to it. It seems to me /etc/chef or
/var/lib/chef are more appropriate.

Speaking of which I guess I don’t see the need for the runit
dependency either but maybe I’ll like it and use it for other things
too. Same goes for couchdb (why not dbdb?)

The client install did not put any of the executables in the path.
They are all in /var/lib/gems… directory. Not that big of a deal but
the documentation should reflect that. I am going to try the apt
repos today and see if they put the binaries in /usr/local/bin

I think the process of publishing cookbooks is cumbersome. You write
them on your machine. Check them out on the server and then push them
to the server (do I have to restart apache?). This process has too
many steps. Maybe the rake install should be smarter?

Speaking of which the documentation is too scattered (same can be said
of puppet BTW). The wiki style documentation doesn’t seem to lend
itself well to these kinds of projects. Something more linear and
hierarchical would be better for me (I realize everybody is
different).

Yesterday I got the server up and the client (in a VM) going. Today I
am going to see if I can push a couple of cookbooks. Let’s see what
happens.


#6

Hello Tim!

On Sep 8, 2009, at 3:55 PM, Tim Uckun wrote:

The chef server install is not accurate and is highly disruptive if
you already have rails sites. Here are the issues I remember.

It undoes your passenger.conf (in my case it added passenger.conf and
left my mod_rails.conf). This is not too bad of an issue.
It rewrites your default.conf. In my case this was disruptive.
It rewrites ports.conf and apache2.conf, setting the user to www-data
(in my case that was disruptive; the current way to do it is to use the
$APACHE_RUN_USER env variable).

A better way to muck with apache would be to drop files into the
/etc/apache2/conf.d directory IMHO

The chef-server installation/bootstrap is meant to be done on a
dedicated server, not on an existing Rails application
server. The instructions mention pre-installation planning, where you
pick your ‘server’, and all other systems are ‘clients’. This would
include existing Rails app servers. You could of course manage your
Rails applications with Chef.

It doesn’t set the proper permissions in the /var/log/chef directory
(should be chef, chef).

When using the RubyGems installation method and the bootstrap
cookbook, my chef-server has:

drwxrwxr-x 2 chef chef 4.0K 2009-06-28 21:19 /var/log/chef

When using the Debian/Ubuntu package based installation:

drwxr-xr-x 2 root root 4096 2009-09-06 06:25 /var/log/chef

The Debian packages do not create or otherwise manage a Chef user
[yet], so the ownership is root.

I don’t like the fact that the default place for cookbooks is
/srv/chef but I am sure I can get used to it. It seems to me /etc/chef or
/var/lib/chef are more appropriate.

Here’s the rationale, in a nutshell:

The /srv/chef location, as well as /etc/chef or /var/lib/chef are all
appropriately “FHS Compliant.” The idea of the /srv location is “files
served by this system.” In the case of a chef-server, the cookbooks
and associated assets (templates, remote files/directories, roles,
etc) are served by the chef-server. Per FHS, /etc/chef would be where
the chef-related configuration files are located, which is what the
directory is used for - the client, server and other config files as
appropriate but not the configuration cookbooks for your client nodes.
The /var/lib/chef location is appropriate for many of the file paths
used by Chef, and this is reflected in the Debian packaging. This may
take preference in the future.

Speaking of which I guess I don’t see the need for the runit
dependency either but maybe I’ll like it and use it for other things
too. Same goes for couchdb (why not dbdb?)

Runit is a dependency as a cross platform way to manage services. When
Chef was released, we didn’t have init scripts, and we (Opscode)
really like runit as a service management platform. It works very
well, and we recommend it. That said, we do have Debian and Red Hat
style init scripts in the distro directory of the main Chef source and
we’re working on updating the bootstrap to account for the option of
using init scripts instead of runit.

The client install did not put any of the executables in the path.
They are all in /var/lib/gems… directory. Not that big of a deal but
the documentation should reflect that. I am going to try the apt
repos today and see if they put the binaries in /usr/local/bin

This is one of the reasons why we recommend installing RubyGems from
source, which is mentioned on the installation documentation. If you
used the Debian/Ubuntu rubygems package, it does not install binaries
in /usr/bin for “FHS Compliance” reasons. See

http://pkg-ruby-extras.alioth.debian.org/rubygems.html

for the Debian position on RubyGems.

I think the process of publishing cookbooks is cumbersome. You write
them on your machine. Check them out on the server and then push them
to the server (do I have to restart apache?). This process has too
many steps. Maybe the rake install should be smarter?

This workflow isn’t required, but we follow it because we always
manage our cookbooks in a version control system. The workflow
reflects a commonly cited best practice, where sysadmins store all
system configuration files in a VCS/SCM.

You don’t need to restart the chef-server (or apache w/ Passenger) for
the new cookbook changes to be effective, but you do need to restart
it if you created or updated any roles using the Ruby DSL or JSON
roles files.

Speaking of which the documentation is too scattered (same can be said
of puppet BTW). The wiki style documentation doesn’t seem to lend
itself well to these kinds of projects. Something more linear and
hierarchical would be better for me (I realize everybody is
different).

Please open ticket(s) about documentation structural improvements.
While it is a wiki that you can edit yourself if you’re logged in,
large sweeping changes should probably be reviewed before
implementing, since the ‘structure’ is fairly well known at this point.

Yesterday I got the server up and the client (in a VM) going. Today I
am going to see if I can push a couple of cookbooks. Let’s see what
happens.

Cool! We’d love to hear about your successes with Chef. If you have
any issues, post to the list, or ask on IRC.

Thank you for using Chef, and for the feedback!


Opscode, Inc
Joshua Timberman, Senior Solutions Engineer
C: 720.878.4322 E: joshua@opscode.com


#7

Cool! We’d love to hear about your successes with Chef. If you have any
issues, post to the list, or ask on IRC.

Today I got the client set up from the apt repository using this
bootstrap script

sudo echo "deb http://apt.opscode.com/ jaunty universe" > /etc/apt/sources.list.d/opscode.list
wget -qO - http://apt.opscode.com/packages@opscode.com.gpg.key | sudo apt-key add -
apt-get update
apt-get -y install ohai chef

chef-client --server https://chef.myserver.com

That installed all the dependencies and put chef-client and ohai in /usr/bin.

I then downloaded the opscode cookbooks to study them and wrote my own
“chef-client” cookbook just to try it. I copied the opscode cookbook
and hacked away at it all day.

Being an ornery sort of person I decided to really change the way the
recipe was written to see what would happen and how chef would react
to really broken cookbooks.

Long story short…

I was able to get the client and the server talking to each other.
I was able to deploy my cookbooks
My cookbooks all passed the syntax check.
My cookbook is failing to run on the client with a cryptic message
which I have yet to hunt down.

/usr/lib/ruby/1.8/chef/mixin/template.rb:33:in `render_template': undefined local variable or method `client' for #<Erubis::Context:0xb76d2010> (Chef::Mixin::Template::TemplateError)
        from /usr/lib/ruby/1.8/chef/provider/template.rb:102:in `action_create'
        from /usr/lib/ruby/1.8/chef/runner.rb:85:in `send'
        from /usr/lib/ruby/1.8/chef/runner.rb:85:in `converge'
        from /usr/lib/ruby/1.8/chef/runner.rb:83:in `each'
        from /usr/lib/ruby/1.8/chef/runner.rb:83:in `converge'
        from /usr/lib/ruby/1.8/chef/resource_collection.rb:58:in `each'
        from /usr/lib/ruby/1.8/chef/resource_collection.rb:57:in `each'
        from /usr/lib/ruby/1.8/chef/runner.rb:61:in `converge'
        from /usr/lib/ruby/1.8/chef/client.rb:382:in `converge'
        from /usr/lib/ruby/1.8/chef/client.rb:82:in `run'
        from /usr/lib/ruby/1.8/chef/application/client.rb:186:in `run_application'
        from /usr/lib/ruby/1.8/chef/application/client.rb:178:in `loop'
        from /usr/lib/ruby/1.8/chef/application/client.rb:178:in `run_application'
        from /usr/lib/ruby/1.8/chef/application.rb:57:in `run'
        from /usr/bin/chef-client:25

Obviously one of my templates has an error in it.

BTW can chef email me the error? I would prefer that every time a
client had a problem it email me the message.


#8

On Sep 8, 2009, at 10:50 PM, Tim Uckun wrote:

I was able to get the client and the server talking to each other.
I was able to deploy my cookbooks
My cookbooks all passed the syntax check.
My cookbook is failing to run on the client with a cryptic message
which I have yet to hunt down.

/usr/lib/ruby/1.8/chef/mixin/template.rb:33:in `render_template': undefined local variable or method `client' for #<Erubis::Context:0xb76d2010> (Chef::Mixin::Template::TemplateError)

Obviously one of my templates has an error in it.

Yes, that’s an error from the template generator (Erubis) in parsing
the template itself. Can you paste your template resource from the
recipe, and the template contents (.erb) either in a gist/pastie or in
a response?
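
For illustration, the same class of error can be reproduced with Ruby's standard ERB library, used here only as a stand-in for Erubis. The template text, the `render` helper, and the value given for `client` are all made up for this sketch and are not taken from the cookbook in question. A template that mentions a name nobody passed in fails at render time, not at syntax-check time:

```ruby
require "erb"

# Render a template string with only the given variables visible to it.
def render(source, vars = {})
  ERB.new(source).result_with_hash(vars)
end

source = "server <%= client %>\n"

# The template refers to `client`, but nothing named `client` is passed in.
begin
  render(source)
  outcome = :rendered
rescue NameError => e
  outcome = :name_error
  puts "render failed: #{e.message}"
end

# Supplying the value the template expects lets the same template render.
fixed = render(source, { client: "chef.example.com" })
puts fixed
```

This is also why a syntax check can pass while the run still fails: the template is valid, but the name it uses only gets looked up when it is rendered.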

BTW can chef email me the error? I would prefer that every time a
client had a problem it email me the message.

You could use Ruby’s Net::SMTP in a cookbook library to do this, or
set up a log parsing tool that reads the client output for messages
and emails you when it finds something interesting. For example I’ve
used logsurfer+ in the past for similar functionality (though not w/
Chef).

Net::SMTP RDoc:
http://www.ruby-doc.org/stdlib/libdoc/net/smtp/rdoc/index.html

Logsurfer+
http://www.crypt.gen.nz/logsurfer/
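
A rough sketch of the Net::SMTP route, under several assumptions: the addresses, node name, SMTP host and port are placeholders, the helper names are invented, and how you hook this into a Chef run (for example from a cookbook library) is left open. Only `Net::SMTP.start` and `send_message` come from the standard library; the actual send is commented out so the snippet runs without a mail server:

```ruby
# Sketch only: addresses, host and port are placeholders, and wiring this
# into a Chef run is left out.
def build_failure_mail(from:, to:, node_name:, error:)
  <<~MAIL
    From: #{from}
    To: #{to}
    Subject: chef-client failed on #{node_name}

    #{error.class}: #{error.message}
  MAIL
end

def mail_failure(message, from:, to:, host: "localhost", port: 25)
  require "net/smtp" # loaded lazily so building the message needs no SMTP
  Net::SMTP.start(host, port) do |smtp|
    smtp.send_message(message, from, to)
  end
end

mail = nil
begin
  # Stand-in for whatever failed during the run.
  raise NameError, "undefined local variable or method `client'"
rescue StandardError => e
  mail = build_failure_mail(from: "chef@example.com", to: "ops@example.com",
                            node_name: "web1", error: e)
  # Uncomment to actually send (requires a reachable SMTP server):
  # mail_failure(mail, from: "chef@example.com", to: "ops@example.com")
end

puts mail
```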


Opscode, Inc
Joshua Timberman, Senior Solutions Engineer
C: 720.878.4322 E: joshua@opscode.com


#9

On Sep 7, 2009, at 10:05 PM, Joe Van Dyk wrote:

On Mon, Sep 7, 2009 at 11:12 AM, Adam Jacob adam@opscode.com wrote:
On Mon, Sep 7, 2009 at 3:57 AM, Brian Candler B.Candler@pobox.com wrote:

(3) In the environment where I work, the most likely need for chef is in
distributing config files [*].

So I’m looking at ways we can make this as safe and simple as possible
for the sysadmins to use.

Have you any pointers to best practices on testing new recipes? I don’t
really want to set up a whole dev/test environment just for developing and
testing each recipe change - it would be too cumbersome. But equally I don’t
want invalid recipes being pushed out, or worse, ones which destroy data.

Testing recipes without taking the actions described is basically an
impossible task to get 100% correct. You can do dry runs, but any
time you check the run-time state of the system and that state could
have been modified previously in the run, you’re going to get a false
positive.

We’ve been working on ways to let you write real integration tests,
which would go a long way towards allowing you to do real testing in a
VM. As it stands, the best practice is to have a staging environment
that gets your new changes first, and only prop changes to production
when you are comfortable.

vmware has been really helpful to me here.

Ditto.

Our chef setup and workflow at Fotopedia is basically the following:

  • the production grid is 100% chef-managed, on several instances with
    some servers having the same role for scalability/redundancy.

  • the testing grid is very similar, with a lower instance count as
    scalability/redundancy is not a problem

  • everybody in the team runs an “infrabox” which is basically the
    whole infrastructure stack running in a VMWare instance. This is
    useful both for regular development of bits and pieces of our app, and
    chef/infrastructure work

    • when running in infrabox, the cookbook attributes are slightly
      tweaked (avoid port conflicts, development mode for some components,
      some cookbooks omitted). Some components are checked out by chef as in
      the testing/prod grids (through chef-deploy), and some can be manually
      switched to a shared folder for manual control (and textmate
      editing :slight_smile: ). Ditto for chef cookbooks (edited on the mac, tested on
      the VM).
    • changing things in chef on the infrabox is the first level of
      testing/dev, after which we check it’s working on the testing grid in
      real multi-server mode, and then on prod.

We might write a blog post about this some day (but we first need to
update our 0.6 setup to 0.7).

Ol.

Fotonauts
Director, Server Software