Chef Environments - Logical vs. Physical

My company has a number of “Physical” environments (“clusters”?), by
which I mean groups of VMs, each running a copy of the main company
application for isolated use by a team or purpose - e.g. “Prod”, “Staging”,
“Regression”, “NFT”, “DevTeam1”, “DevTeam2”. They differ slightly in things
such as domains, passwords, URLs and base software policy, which I want as
attributes for cookbooks to consume. Additionally, most Chef searches for
other nodes, e.g. to auto-find the database server, will need to be
constrained to the current physical environment.

Hence I started off using Chef Environment objects & their attributes for
this, but have come across advice that a one-to-one mapping between what
most non-Chef people think of as an “environment” and the Chef concept is
not such a good thing. So I want to explore using fewer "Logical" (Chef)
environments like “Live”, “Test” and “Dev”, so I can flip individual VMs
between them as required by the cluster/physical environment’s users.
Logical environments would probably have only version constraints, and few
attributes if any.

  1. Anybody tried the same shift to Logical environments & care to share
    experiences?
  2. How to represent the physical environment a node is in - e.g. a
    "cluster" attribute?
  3. How to represent the extra attributes - a data bag keyed by cluster name?
    A cookbook which sets attributes via code (a lot of them are very similar
    except for URLs containing the cluster name)?
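
One possible shape for questions 2 and 3, as a rough sketch in plain Ruby rather than real Chef DSL. It assumes a top-level "cluster" attribute on each node and derives the near-identical values (such as URLs) from the cluster name. The cluster names, domains, the "policy" key and both helper methods are made up for illustration.

```ruby
# Hypothetical per-cluster settings: only the values that really differ.
CLUSTER_OVERRIDES = {
  'prod'     => { 'domain' => 'example.com' },
  'devteam1' => { 'domain' => 'dev.example.net', 'policy' => 'relaxed' }
}.freeze

# Build the attribute hash for one cluster; the URL is derived from the name.
def cluster_attributes(cluster)
  overrides = CLUSTER_OVERRIDES.fetch(cluster) do
    raise ArgumentError, "unknown cluster: #{cluster}"
  end
  defaults = {
    'cluster' => cluster,
    'policy'  => 'standard',
    'app_url' => "https://app.#{cluster}.#{overrides['domain']}"
  }
  defaults.merge(overrides)
end

# Query string that limits a node search to one cluster.
def cluster_search_query(role, cluster)
  "role:#{role} AND cluster:#{cluster}"
end
```

In a recipe, the resulting string could be handed to Chef's search(:node, query), with the cluster name read from the node's own attribute, so that lookups stay inside the current cluster without relying on the Chef environment.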

Thanks,

Tristan.

In reply to Tristan Keen's message of Wednesday, January 22, 2014 at 3:16 AM:

The original intention of environments was that they’d map to things like “dev”, “stage”, “prod”. That said, there’s a fundamental assumption in the design that a given cookbook version will work correctly for all of the different server types you have in those environments. For stuff that’s relatively self-contained and maps well to the “logical” environment (centralized logging, user accounts, etc.), this is fine. However, for cookbooks that you combine to build a working application service, the interactions between different cookbooks (attribute names, assumptions about where files are, etc.) are often more important than whether a cookbook “works” in isolation. This means that you need to ensure that an updated cookbook works with all of the server types that belong to a chef environment when you promote that cookbook.

Depending on your workflow, the above might be fine, but you might also want to consider mapping environments to systems’ functional roles plus logical environment, e.g., loadbalancer+production. This blog post describes how to do that with Berkshelf, but you could adapt the ideas to your own workflow if you prefer: http://vialstudios.logdown.com/posts/166848-the-environment-cookbook-pattern

--
Daniel DeLeo

Thanks for your pointers. We'd have some trouble with the full Environment
Cookbook pattern as we host multiple lightweight service apps per large
(Windows) server, so we don't have one node/server per application, or a
single app repo to put the Env Cookbook in. I also think the name
"Environment Cookbook" is an odd choice - it seems more a "Role" cookbook
to me; it just acts on Chef environments & is the only one allowed to
contain the exact version data.

However you're both basically saying that just three sets of version
numbers (dev/stage/prod) aren't enough, and that versions should be
controlled per server type/role as well as per logical environment
(dev/stage/prod). This I might accept as battle-hardened advice, though I'd
worry that human caution would allow ancient versions of low-level
cookbooks to continue to be used, leading to issues when they finally have
to integrate the latest code.

Mechanically, one way to implement this is the micro-environment - e.g.
"loadbalancer+prod", alongside "loadbalancer+stage", "dbserver+prod", etc.,
with exact version locks set via "berks apply" from the Berksfile.lock of
the top-level cookbook. Given I have Chef Server (a lot of Berkshelf seems
aimed at helping chef-solo setups), would I not get the same result by
setting exact version numbers on all dependencies in my top-level
cookbook's depends metadata, i.e. not using environments for this at all?

Tristan.

(In reply to Daniel DeLeo's message of 24 January 2014 17:32.)

Hi Tristan,

The way many(?) people use Chef environments today is counter-intuitive
and does not map to "prod", "staging", "dev", etc., unfortunately.

If you start using Chef environments for locking the whole dependency graph
of cookbook versions, that is wrong IMHO. Unfortunately that's what many
people do and what tools like Berkshelf promote via berks apply and the
environment cookbook pattern. It solves the problem of locking cookbook
dependencies, but IMHO it does so with the wrong Chef primitive, and thus
takes away your ability to use environments as they were originally
intended.

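There was a controversial discussion about the environment cookbook pattern
in the comments on the blog post mentioned above. The Disqus comments no
longer show up, but I have restored the conversation on GitHub under the
title "Restored comments from DISQUS discussion about the environment
cookbook pattern:
http://vialstudios.logdown.com/posts/166848-the-environment-cookbook-pattern".

If you follow the conversation, that might give you some new ideas.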

I firmly believe that the environment cookbook pattern is a smell and there
is a better solution for it, i.e. simply using the cookbook's metadata.rb.
In essence, the idea is:

  1. to use one "top-level" cookbook per node which locks all its transitive
    dependencies
  2. in the environment you specify only the top-level cookbook's version

This gives you some nice benefits, e.g.:

  • you can use environments for environments (e.g. "dev", "staging",
    "prod"), not micro-environments just for locking cookbook versions
  • environments are not polluted with "implementation details" (i.e. all the
    transitive cookbooks and versions)
  • promoting a cookbook in an environment means flipping exactly one
    version constraint in your environment
  • you can have multiple nodes with different versions of a dependent
    cookbook in the same environment (e.g. mysql cookbook v3.0.12 on one node
    and v4.0.20 on another node)
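
To make the last point concrete, here is a toy model in plain Ruby (not how the Chef Server's resolver is implemented). The cookbook names shop_db and report_db and their versions are hypothetical; the mysql versions are the ones from the example above. The environment pins only the top-level cookbooks, and each top-level release carries its own exact locks.

```ruby
# Hypothetical top-level cookbooks; each release carries its own exact locks,
# standing in for the depends lines in that release's metadata.rb.
TOP_LEVEL_LOCKS = {
  ['shop_db', '1.0.0']   => { 'mysql' => '3.0.12' },
  ['report_db', '2.1.0'] => { 'mysql' => '4.0.20' }
}.freeze

# The environment names only the top-level cookbooks and their versions.
PROD_ENVIRONMENT = { 'shop_db' => '1.0.0', 'report_db' => '2.1.0' }.freeze

# Look up the full locked dependency set for a node's top-level cookbook.
def locked_versions(environment, top_level)
  version = environment.fetch(top_level)
  TOP_LEVEL_LOCKS.fetch([top_level, version])
end
```

Two nodes in the same "prod" environment, one running shop_db and one running report_db, end up with different mysql cookbook versions, and promoting a new release means changing a single entry in the environment.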

The only downside is that berkshelf (which is an awesome tool btw) won't
support you very much with that(*)

Cheers,
Torben

(*) but you can easily use berks list to get the whole dependency graph
from Berksfile.lock and transform that into something you can paste into
your top-level cookbook's metadata.rb
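
A minimal sketch of that transformation in plain Ruby, assuming the berks list output shows each cookbook on a line shaped like "  * name (1.2.3)". The exact output format may differ between Berkshelf versions, so treat the pattern as an assumption. The method name, the skip parameter (for leaving out the top-level cookbook itself, should it appear in the list) and the sample versions are all made up for illustration.

```ruby
# Turn `berks list`-style lines into exact-version `depends` lines.
# Assumes each cookbook appears as "  * name (1.2.3)"; other lines are skipped.
def depends_lines(berks_list_output, skip: [])
  lines = berks_list_output.each_line.map do |line|
    match = line.match(/^\s*\*\s+(\S+)\s+\(([^)\s]+)\)/)
    next unless match

    name, version = match.captures
    next if skip.include?(name)

    "depends '#{name}', '= #{version}'"
  end
  lines.compact
end

# Hypothetical sample input, shaped like the assumed output format.
sample = <<~OUTPUT
  Cookbooks installed by your Berksfile:
    * apt (2.3.4)
    * mysql (4.0.20)
    * myapp (0.1.0)
OUTPUT

puts depends_lines(sample, skip: ['myapp'])
```

The printed lines can be pasted into the top-level cookbook's metadata.rb, where "= x.y.z" is the exact-match version constraint.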

(In reply to Tristan Keen's message of Sun, Jan 26, 2014 at 12:59 PM.)

Thanks Torben - that does give me a few new ideas, but I hope Chef 12 comes
out soon with better primitives to help with the confusion. In the meantime
I'll probably make up a cookbook to hold the physical-env-like data (e.g.
the address of the "production" mail server). This is because I'm hearing
talk from others in the company of add-on mini-environments with nodes that
want to use an environment's data but not be in it (for search purposes),
and other "special cases", which makes me think the Chef environment
primitive will be too strict a boundary to keep things DRY.

Tristan.

(In reply to Torben Knerr's message of 26 January 2014 12:59.)