Environments vs. Metadata vs. Policyfile for locking cookbook dependencies

Hi everybody,

we recently started a discussion on the different ways you can lock the
cookbook dependency graph for a given node:

  1. use Chef environments
  2. use metadata.rb
  3. use Policyfile (work in progress)

The discussion of environments vs. metadata started here on the list...
http://lists.opscode.com/sympa/arc/chef/2014-05/msg00324.html

...then continued on GitHub with @danielsdeleo, proposing a new mechanism
called Policyfile:
Add `berks apply_metadata` command · Issue #724 · berkshelf/berkshelf · GitHub

Looking at the Policyfile approach, I like how the sketched terminal
session reads [1], but I'm also afraid that it will add even more
possibilities on how to center your workflow around Chef. It would be the
next competitor to "roles" vs. "environment cookbooks" vs. "metadata" vs.
"policyfile".

So I'm wondering: can't we solve the problem with the tools at hand or do
we have to invent something new for it?

Almost everything that is described in the design principles [2] could be
easily solved by using cookbooks + metadata.rb today. You just have to make
sure that the contents of Berksfile.lock get translated into metadata.rb
depends statements. That's what I was calling "top-level" cookbook but you
could also call it "policy cookbook" if you will.

Think of this:

  • for each node you have 1 top-level cookbook
  • this top-level cookbook has all the pinned versions from Berksfile.lock
    via "depends" statements in its metadata.rb
  • the node's run_list contains just the default recipe (or any other) of
    the top-level cookbook, which defines the actual "run_list" that you would
    otherwise have in roles via "include_recipe"
  • your top-level cookbook has a version, of course, like any other cookbook
  • in your environments you only pin the top-level cookbook's version (all
    transitive dependencies are pinned via the top-level cookbook's metadata)
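
To make the "translate Berksfile.lock into depends statements" step concrete, here is a minimal sketch in plain Ruby. It assumes the Berkshelf 3 lockfile layout, where locked cookbooks appear in a GRAPH section as "name (x.y.z)"; the helper name depends_lines and the sample cookbook versions are made up for illustration and are not part of any existing tool.

```ruby
# Hypothetical helper: turn the locked versions from a Berksfile.lock into
# exact-version `depends` lines for a top-level cookbook's metadata.rb.
# Assumption: locked cookbooks are listed in a GRAPH section as
# "  name (x.y.z)", with their own constraints indented below them.
def depends_lines(lock_text)
  # Only look at what follows the GRAPH header.
  graph = lock_text.split(/^GRAPH$/, 2)[1].to_s
  # Two-space indented "name (x.y.z)" lines are the locked cookbooks;
  # deeper-indented lines are per-cookbook constraints and are skipped.
  graph.scan(/^  (\S+) \((\d+\.\d+\.\d+)\)$/).map do |name, version|
    "depends '#{name}', '= #{version}'"
  end
end

# Sample lockfile content (invented for this example).
SAMPLE_LOCK = <<~LOCK
  DEPENDENCIES
    apache2

  GRAPH
    apache2 (1.10.4)
      iptables (>= 0.0.0)
      logrotate (>= 0.0.0)
    iptables (0.13.2)
    logrotate (1.5.0)
LOCK

puts depends_lines(SAMPLE_LOCK)
```

The printed lines could be pasted (or generated) into the top-level cookbook's metadata.rb, so that the environment only has to pin that one cookbook.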

What would be missing from this approach?

The only new thing proposed in [2] was the ability to uniquely identify
non-released development versions. The proposal suggests to use a hash over
the cookbook contents instead of name + version. I see the need for
properly identifying in-development versions as well, but IMO using a hash
would just obscure things. Instead I would rather see prerelease and build
identifiers being supported by the Chef cookbook versioning scheme (see
CHEF-4027 [3]) -> as we have for Ruby gems today.
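
For comparison, this is roughly how RubyGems treats prerelease identifiers today; a small sketch using the standard Gem::Version class. The version strings are invented for illustration, and nothing here implies that Chef cookbook versions accept them (that is what CHEF-4027 asks for).

```ruby
require "rubygems"

# Illustrative version strings, not real cookbook releases.
release    = Gem::Version.new("2.0.0")
prerelease = Gem::Version.new("2.0.0.pre.1")

# A version containing letters is treated as a prerelease...
puts prerelease.prerelease?          # true
puts release.prerelease?             # false

# ...and sorts before the release it leads up to.
puts prerelease < release            # true

# The release a prerelease belongs to can be derived from it.
puts prerelease.release == release   # true
```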

Just saying: let's start simplifying and improve the tools around the
concepts we currently have, not inventing additional and competing concepts
that make everything more complex. Reuse the existing concepts, establish
conventions, and foster them by making sure the tools we use promote them.

That was quite a lot to digest... I hope it still makes sense :)

Cheers,
Torben

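[1] https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/demo-script.txt
[2] https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/design-principles.md
[3] https://tickets.opscode.com/browse/CHEF-4027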

On Friday, June 13, 2014 at 7:19 AM, Torben Knerr wrote:

Hi everybody,

we recently started a discussion on the different ways you can lock the cookbook dependency graph for a given node:

  1. use Chef environments
  2. use metadata.rb
  3. use Policyfile (work in progress)

The discussion of environments vs. metadata started here on the list...
chef - [chef] Re: Re: Re: Re: Re: Re: More on Cookbook Design Patterns

...then continued on GitHub with @danielsdeleo, proposing a new mechanism called Policyfile:
Add `berks apply_metadata` command · Issue #724 · berkshelf/berkshelf · GitHub

Looking at the Policyfile approach, I like how the sketched terminal session reads [1], but I'm also afraid that it will add even more possibilities on how to center your workflow around Chef. It would be the next competitor to "roles" vs. "environment cookbooks" vs. "metadata" vs. "policyfile".
Long term, Policyfile should be the default workflow. It eliminates roles vs. role cookbooks as a point of contention by solving the primary flaw of roles, which is the fact that you can mutate them on live production nodes accidentally.

So I'm wondering: can't we solve the problem with the tools at hand or do we have to invent something new for it?

Almost everything that is described in the design principles [2] could be easily solved by using cookbooks + metadata.rb today. You just have to make sure that the contents of Berksfile.lock get translated into metadata.rb depends statements. That's what I was calling "top-level" cookbook but you could also call it "policy cookbook" if you will.

Think of this:

  • for each node you have 1 top-level cookbook
  • this top-level cookbook has all the pinned versions from Berksfile.lock via "depends" statements in its metadata.rb
  • the node's run_list contains just the default recipe (or any other) of the top-level cookbook, which defines the actual "run_list" that you would otherwise have in roles via "include_recipe"
  • your top-level cookbook has a version, of course, like any other cookbook
  • in your environments you only pin the top-level cookbook's version (all transitive dependencies are pinned via the top-level cookbook's metadata)

What would be missing from this approach?
You give up a significant amount of control over the order in which cookbooks are run. Everyone using role cookbooks eventually hits the problem where computed attributes in attributes files have incorrect values because role cookbooks need dependencies to be loaded after themselves. This is unavoidable because cookbooks aren’t roles. That said, I won’t say that role cookbooks are bad per se, I just think Chef doesn’t give you the tools to compose behaviors without exposing you to unnecessary risk. Policyfiles fix that.

The only new thing proposed in [2] was the ability to uniquely identify non-released development versions. The proposal suggests to use a hash over the cookbook contents instead of name + version. I see the need for properly identifying in-development versions as well, but IMO using a hash would just obscure things. Instead I would rather see prerelease and build identifiers being supported by the Chef cookbook versioning scheme (see CHEF-4027 [3]) -> as we have for Ruby gems today.
What you’re saying isn’t actually how people really use rubygems today. If I need to run chef-client in development with a development version of ohai, I don’t release Ohai-7.5.0-dans_crazy_idea.5 to rubygems.org (also, you don’t have push rights for ohai on rubygems.org so you can’t do that), I point my Gemfile at a path or git source. You should be able to do the same with Chef and let Chef figure out how to put the pieces together on a remote node.

As I replied on the ticket, we understand that some random hex string is pretty meaningless to a human, which is why the Policyfile.lock will contain a fair amount of contextual information about cookbooks. This includes the source (local disk, community site, chef server, github), the semver version of the cookbook, and where relevant, git info such as the commit SHA, whether or not the repo is dirty, the git remote, and whether commits are synchronized to the remote. Before you apply this Policyfile.lock to any nodes, you can review a diff of all of this.

On the ticket, you said 'to me it sounds still like a "build identifier”’, which it is. chef-server and chef-client will talk to each other in terms of build identifiers. You also said, 'When talking to my colleagues I'd personally rather talk about apache-2.0.0 rather than f59ee7a5bca6a4e606b67f7f856b768d847c39bb’. You can totally have both. Part of the workflow we’re designing is that you compute the lockfile and can look at it, commit it to source control if desired, etc. before you apply it to anything. And the lockfile contains more than enough information for you to talk about the cookbook in human-understandable terms. For example, look at the lockfiles in the tests: Initial specification of policyfile builder by danielsdeleo · Pull Request #53 · chef-boneyard/chef-dk · GitHub (there is a tiny bit of indirection there to make the tests more maintainable). You can talk with your colleagues about “apache 2.0.0 that came from this github repo” or a cookbook that was uploaded but not committed to source control, or you can see in a diff that you changed from the mainline version of a cookbook to your own fork.

And all of this happens automatically. You don’t have to spend your own time editing version numbers to encode this information. That’s both crap work and error prone.
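
As a rough illustration of the idea (not the algorithm chef-dk actually uses), a content-based identifier can be sketched as a digest over a cookbook's file paths and contents, stored next to the human-readable version and source. The function name, the in-memory file layout, and the lock entry keys below are all assumptions made for this example.

```ruby
require "digest/sha1"

# Hypothetical: compute an identifier from a cookbook's contents.
# `files` maps relative paths to file contents (kept in memory so the
# example is self-contained).
def content_identifier(files)
  digest = Digest::SHA1.new
  # Sort by path so the identifier does not depend on insertion order.
  files.sort.each do |path, content|
    digest << path << "\0" << content << "\0"
  end
  digest.hexdigest
end

released = {
  "metadata.rb"        => "name 'apache'\nversion '2.0.0'\n",
  "recipes/default.rb" => "package 'apache2'\n"
}
# Same version number, but with an uncommitted local change.
patched = released.merge("recipes/default.rb" => "package 'apache2'\nservice 'apache2'\n")

# Hypothetical lock entry: the hash identifies the exact contents, the
# other fields keep it readable for humans.
lock_entry = {
  "version"    => "2.0.0",
  "identifier" => content_identifier(patched),
  "source"     => "local disk"
}

puts content_identifier(released) == content_identifier(patched) # false
puts lock_entry["version"]                                       # 2.0.0
```

Both variants still read as "apache 2.0.0" in conversation, while the identifier tells the server and client which of the two was actually meant.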

Just saying: let's start simplifying and improve the tools around the concepts we currently have, not inventing additional and competing concepts that make everything more complex. Reuse the existing concepts, establish conventions, and foster them by making sure the tools we use promote them.

I don’t think the existing concepts are good enough. I say this as someone who helped to design them. The way people used and thought about chef, cookbooks, etc. when we created environments, for example, was totally different than the understanding we have today. Based on what we’ve learned from other people and new tools, we can now see a way to make chef and chef-server work in a better, safer, and more humane way. So we should.

That was quite a lot to digest... I hope it still makes sense :)

Cheers,
Torben

[1] https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/demo-script.txt
[2] https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/design-principles.md
[3] https://tickets.opscode.com/browse/CHEF-4027

--
Daniel DeLeo

On Fri, Jun 13, 2014 at 5:06 PM, Daniel DeLeo dan@kallistec.com wrote:

On Friday, June 13, 2014 at 7:19 AM, Torben Knerr wrote:

Hi everybody,

we recently started a discussion on the different ways you can lock the
cookbook dependency graph for a given node:

  1. use Chef environments
  2. use metadata.rb
  3. use Policyfile (work in progress)

The discussion of environments vs. metadata started here on the list...
chef - [chef] Re: Re: Re: Re: Re: Re: More on Cookbook Design Patterns

...then continued on GitHub with @danielsdeleo, proposing a new
mechanism called Policyfile:
Add `berks apply_metadata` command · Issue #724 · berkshelf/berkshelf · GitHub

Looking at the Policyfile approach, I like how the sketched terminal
session reads [1], but I'm also afraid that it will add even more
possibilities on how to center your workflow around Chef. It would be the
next competitor to "roles" vs. "environment cookbooks" vs. "metadata" vs.
"policyfile".
Long term, Policyfile should be the default workflow. It eliminates roles
vs. role cookbooks as a point of contention by solving the primary flaw of
roles, which is the fact that you can mutate them on live production nodes
accidentally.

Sounds good.

I still can't see the full picture of Policyfile yet, so more
questions follow inline...

So I'm wondering: can't we solve the problem with the tools at hand or
do we have to invent something new for it?

Almost everything that is described in the design principles [2] could
be easily solved by using cookbooks + metadata.rb today. You just have to
make sure that the contents of Berksfile.lock get translated into
metadata.rb depends statements. That's what I was calling "top-level"
cookbook but you could also call it "policy cookbook" if you will.

Think of this:

  • for each node you have 1 top-level cookbook
  • this top-level cookbook has all the pinned versions from
    Berksfile.lock via "depends" statements in its metadata.rb
  • the node's run_list contains just the default recipe (or any other)
    of the top-level cookbook, which defines the actual "run_list" that you
    would otherwise have in roles via "include_recipe"
  • your top-level cookbook has a version, of course, like any other
    cookbook
  • in your environments you only pin the top-level cookbook's version
    (all transitive dependencies are pinned via the top-level cookbook's
    metadata)

What would be missing from this approach?
You give up a significant amount of control over the order in which
cookbooks are run. Everyone using role cookbooks eventually hits the
problem where computed attributes in attributes files have incorrect values
because role cookbooks need dependencies to be loaded after themselves.
This is unavoidable because cookbooks aren’t roles. That said, I won’t say
that role cookbooks are bad per se, I just think Chef doesn’t give you the
tools to compose behaviors without exposing you to unnecessary risk.
Policyfiles fix that.

Nice catch. Indeed it looks like you have full control over the ordering
of cookbooks by just calling include_recipe in the order you want, but this
is just the run order, not the load order that is relevant for the computed
attributes.

In the meantime I know how to fix computed attributes (see
http://docs.opscode.com/chef/essentials_cookbook_recipes.html#reload-attributes)
but it's both a) not nice and b) a sure pitfall...
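
The load-order problem can be shown with a plain-Ruby simulation (no Chef required). This is only a sketch: the two lambdas stand in for the attributes files of a dependency cookbook and of a role cookbook, and the attribute names are invented. The one fact it relies on is that a dependency's attributes are loaded before those of the cookbook that depends on it.

```ruby
# Stand-in for node attributes.
node = {}

# Attributes file of a dependency cookbook (hypothetical names).
dependency_attributes = lambda do |n|
  n["port"] ||= 80                               # default value
  n["url"] = "http://localhost:#{n['port']}"     # computed attribute
end

# Attributes file of the role cookbook that wants a different port.
role_cookbook_attributes = lambda do |n|
  n["port"] = 8080
end

# Load order: the dependency comes first, the role cookbook afterwards.
dependency_attributes.call(node)
role_cookbook_attributes.call(node)
stale_url = node["url"]
puts stale_url   # http://localhost:80 (computed before the port changed)

# The "reload attributes" workaround amounts to evaluating the
# dependency's attributes file a second time.
dependency_attributes.call(node)
fresh_url = node["url"]
puts fresh_url   # http://localhost:8080
```

Changing the order of include_recipe calls would not help here, because the stale value is produced while attributes are loaded, before any recipe runs.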

The only new thing proposed in [2] was the ability to uniquely identify
non-released development versions. The proposal suggests to use a hash over
the cookbook contents instead of name + version. I see the need for
properly identifying in-development versions as well, but IMO using a hash
would just obscure things. Instead I would rather see prerelease and build
identifiers being supported by the Chef cookbook versioning scheme (see
CHEF-4027 [3]) -> as we have for Ruby gems today.
What you’re saying isn’t actually how people really use rubygems today. If
I need to run chef-client in development with a development version of
ohai, I don’t release Ohai-7.5.0-dans_crazy_idea.5 to rubygems.org (also,
you don’t have push rights for ohai on rubygems.org so you can’t do
that), I point my Gemfile at a path or git source. You should be able to do
the same with Chef and let Chef figure out how to put the pieces together
on a remote node.


You are talking about the dependencies of a "top-level" thing here (still
have no better name). In the PR linked below these would be foo, bar,
baz and dep_of_bar which are dependencies of basic_example. These
might be your own or other people's community cookbooks. And yes, you will
likely make modifications to them that are not released or published to
rubygems.org and just live in your fork of the git repo.

It's still unclear to me how the Policyfile itself would be versioned,
published, and referenced (e.g. from within an environment).

The Policyfile before compilation looks like a combination of Berksfile +
Role (see
https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/demo-script.txt#L20-37).
After compilation the Policyfile.lock has everything we need to uniquely
identify these dependencies, looking quite similar to Berksfile.lock + Role.
Can you say that Policyfile ~= Roles + Versioned Dependencies?

As I replied on the ticket, we understand that some random hex string is
pretty meaningless to a human, which is why the Policyfile.lock will
contain a fair amount of contextual information about cookbooks. This
includes the source (local disk, community site, chef server, github), the
semver version of the cookbook, and where relevant, git info such as the
commit SHA, whether or not the repo is dirty, the git remote, and whether
commits are synchronized to the remote. Before you apply this
Policyfile.lock to any nodes, you can review a diff of all of this.

As a human I'm rather concerned about how to edit the uncompiled Policyfile.
I guess it's like in a Berksfile where you specify {name + version +
source} but then the compiled Policyfile.lock contains all the additional
info (e.g. the locked git rev)?

On the ticket, you said 'to me it sounds still like a "build identifier”’,
which it is. chef-server and chef-client will talk to each other in terms
of build identifiers. You also said, 'When talking to my colleagues I'd
personally rather talk about apache-2.0.0 rather than
f59ee7a5bca6a4e606b67f7f856b768d847c39bb’. You can totally have both. Part
of the workflow we’re designing is that you compute the lockfile and can
look at it, commit it to source control if desired, etc. before you apply
it to anything. And the lockfile contains more than enough information for
you to talk about the cookbook in human-understandable terms. For example,
look at the lockfiles in the tests:
Initial specification of policyfile builder by danielsdeleo · Pull Request #53 · chef-boneyard/chef-dk · GitHub
(there is a tiny bit of indirection there to make the tests more
maintainable). You can talk with your colleagues about “apache 2.0.0 that
came from this github repo” or a cookbook that was uploaded but not
committed to source control, or you can see in a diff that you changed from
the mainline version of a cookbook to your own fork.

And all of this happens automatically. You don’t have to spend your own
time editing version numbers to encode this information. That’s both crap
work and error prone.

Thanks, that answered a lot.

In summary, what's still unclear to me is this:

  1. How do you specify a specific version of a cookbook dependency in the
    (uncompiled) Policyfile? Simply {name + version + source} like in a
    Berksfile, or do you work with hashes here already?

  2. How would you specify a version for the top-level basic_example thing?
    Can you assign a dotted numeric X.Y.Z version like for cookbooks, or will
    it get a computed hash as well?

  3. How are the Policyfiles versioned, published, and referenced?

  4. Can you reference a Policyfile from within an environment?

Just saying: let's start simplifying and improve the tools around the
concepts we currently have, not inventing additional and competing concepts
that make everything more complex. Reuse the existing concepts, establish
conventions, and foster them by making sure the tools we use promote them.

I don’t think the existing concepts are good enough. I say this as someone
who helped to design them. The way people used and thought about chef,
cookbooks, etc. when we created environments, for example, was totally
different than the understanding we have today. Based on what we’ve learned
from other people and new tools, we can now see a way to make chef and
chef-server work in a better, safer, and more humane way. So we should.

+1

I assume (even though it is not explicitly mentioned) that the new Policyfile
mechanism would work for chef-solo as well. Does it?

Thanks for all the lengthy details, the picture gets clearer... :)

Cheers,
Torben

That was quite a lot to digest... I hope it still makes sense :)

Cheers,
Torben

[1]
https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/demo-script.txt
[2]
https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/design-principles.md
[3] https://tickets.opscode.com/browse/CHEF-4027

--
Daniel DeLeo

On Friday, June 13, 2014 at 9:16 AM, Torben Knerr wrote:

On Fri, Jun 13, 2014 at 5:06 PM, Daniel DeLeo <dan@kallistec.com> wrote:

On Friday, June 13, 2014 at 7:19 AM, Torben Knerr wrote:

Hi everybody,

we recently started a discussion on the different ways you can lock the cookbook dependency graph for a given node:

  1. use Chef environments
  2. use metadata.rb
  3. use Policyfile (work in progress)

The discussion of environments vs. metadata started here on the list...
chef - [chef] Re: Re: Re: Re: Re: Re: More on Cookbook Design Patterns

...then continued on GitHub with @danielsdeleo, proposing a new mechanism called Policyfile:
Add `berks apply_metadata` command · Issue #724 · berkshelf/berkshelf · GitHub

Looking at the Policyfile approach, I like how the sketched terminal session reads [1], but I'm also afraid that it will add even more possibilities on how to center your workflow around Chef. It would be the next competitor to "roles" vs. "environment cookbooks" vs. "metadata" vs. "policyfile".
Long term, Policyfile should be the default workflow. It eliminates roles vs. role cookbooks as a point of contention by solving the primary flaw of roles, which is the fact that you can mutate them on live production nodes accidentally.

Sounds good.

I still can't see the full picture of Policyfile yet, so more questions follow inline...

So I'm wondering: can't we solve the problem with the tools at hand or do we have to invent something new for it?

Almost everything that is described in the design principles [2] could be easily solved by using cookbooks + metadata.rb today. You just have to make sure that the contents of Berksfile.lock get translated into metadata.rb depends statements. That's what I was calling "top-level" cookbook but you could also call it "policy cookbook" if you will.

Think of this:

  • for each node you have 1 top-level cookbook
  • this top-level cookbook has all the pinned versions from Berksfile.lock via "depends" statements in its metadata.rb
  • the node's run_list contains just the default recipe (or any other) of the top-level cookbook, which defines the actual "run_list" that you would otherwise have in roles via "include_recipe"
  • your top-level cookbook has a version, of course, like any other cookbook
  • in your environments you only pin the top-level cookbook's version (all transitive dependencies are pinned via the top-level cookbook's metadata)

What would be missing from this approach?
You give up a significant amount of control over the order in which cookbooks are run. Everyone using role cookbooks eventually hits the problem where computed attributes in attributes files have incorrect values because role cookbooks need dependencies to be loaded after themselves. This is unavoidable because cookbooks aren’t roles. That said, I won’t say that role cookbooks are bad per se, I just think Chef doesn’t give you the tools to compose behaviors without exposing you to unnecessary risk. Policyfiles fix that.

Nice catch. Indeed it looks like you have full control over the ordering of cookbooks by just calling include_recipe in the order you want, but this is just the run order, not the load order that is relevant for the computed attributes.

In the meantime I know how to fix computed attributes (see http://docs.opscode.com/chef/essentials_cookbook_recipes.html#reload-attributes) but it's both a) not nice and b) a sure pitfall...

The only new thing proposed in [2] was the ability to uniquely identify non-released development versions. The proposal suggests to use a hash over the cookbook contents instead of name + version. I see the need for properly identifying in-development versions as well, but IMO using a hash would just obscure things. Instead I would rather see prerelease and build identifiers being supported by the Chef cookbook versioning scheme (see CHEF-4027 [3]) -> as we have for Ruby gems today.
What you’re saying isn’t actually how people really use rubygems today. If I need to run chef-client in development with a development version of ohai, I don’t release Ohai-7.5.0-dans_crazy_idea.5 to rubygems.org (also, you don’t have push rights for ohai on rubygems.org so you can’t do that), I point my Gemfile at a path or git source. You should be able to do the same with Chef and let Chef figure out how to put the pieces together on a remote node.


You are talking about the dependencies of a "top-level" thing here (still have no better name). In the PR linked below these would be foo, bar, baz and dep_of_bar which are dependencies of basic_example. These might be your own or other people's community cookbooks. And yes, you will likely make modifications to them that are not released or published to rubygems.org and just live in your fork of the git repo.

It's still unclear to me how the Policyfile itself would be versioned, published, and referenced (e.g. from within an environment).

The Policyfile before compilation looks like a combination of Berksfile + Role (see https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/demo-script.txt#L20-37)
After compilation the Policyfile.lock has everything we need to uniquely identify these dependencies, looking quite similar to Berksfile.lock + Role

Can you say that Policyfile ~= Roles + Versioned Dependencies?
Policyfiles provide one of the features of roles, which is to delegate a node’s run list to another object. In the uncompiled Policyfile.rb, you’ll be able to specify a run list including roles if you want to use roles to define composable bits of functionality. These will be expanded when the compiled Policyfile.lock is generated which protects you from breaking unrelated systems when updating roles.

As I replied on the ticket, we understand that some random hex string is pretty meaningless to a human, which is why the Policyfile.lock will contain a fair amount of contextual information about cookbooks. This includes the source (local disk, community site, chef server, github), the semver version of the cookbook, and where relevant, git info such as the commit SHA, whether or not the repo is dirty, the git remote, and whether commits are synchronized to the remote. Before you apply this Policyfile.lock to any nodes, you can review a diff of all of this.

As a human I'm rather concerned about how to edit the uncompiled Policyfile. I guess it's like in a Berksfile where you specify {name + version + source} but then the compiled Policyfile.lock contains all the additional info (e.g. the locked git rev)?
It’s just like a Berksfile, you ask for cookbooks using the concepts that make sense for the systems that store them. For a system that stores released artifacts according to a name and X.Y.Z version number, that’s version constraints, for git it’s SHA/branch/tag, for your filesystem it’s a path, etc.
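
For readers less familiar with version constraint operators, here is a small sketch of how they narrow down a set of released versions. It uses RubyGems' Gem::Requirement purely as a stand-in (Chef and Berkshelf have their own constraint classes); the helper name and the list of available versions are invented for illustration.

```ruby
require "rubygems"

# Hypothetical helper: which of the available versions satisfy a constraint?
def satisfying(constraint, available)
  requirement = Gem::Requirement.new(constraint)
  available.select { |v| requirement.satisfied_by?(Gem::Version.new(v)) }
end

# Invented list of released versions of some cookbook.
available = %w[1.9.0 2.0.0 2.0.3 2.1.0 3.0.0]

puts satisfying("~> 2.0.0", available).inspect # ["2.0.0", "2.0.3"]
puts satisfying(">= 2.0.3", available).inspect # ["2.0.3", "2.1.0", "3.0.0"]
puts satisfying("= 2.1.0", available).inspect  # ["2.1.0"]
```

A constraint like this only makes sense for sources that store artifacts by name and X.Y.Z version; for git or path sources the equivalent "constraint" is the SHA, branch, tag, or directory.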

On the ticket, you said 'to me it sounds still like a "build identifier”’, which it is. chef-server and chef-client will talk to each other in terms of build identifiers. You also said, 'When talking to my colleagues I'd personally rather talk about apache-2.0.0 rather than f59ee7a5bca6a4e606b67f7f856b768d847c39bb’. You can totally have both. Part of the workflow we’re designing is that you compute the lockfile and can look at it, commit it to source control if desired, etc. before you apply it to anything. And the lockfile contains more than enough information for you to talk about the cookbook in human-understandable terms. For example, look at the lockfiles in the tests: Initial specification of policyfile builder by danielsdeleo · Pull Request #53 · chef-boneyard/chef-dk · GitHub (there is a tiny bit of indirection there to make the tests more maintainable). You can talk with your colleagues about “apache 2.0.0 that came from this github repo” or a cookbook that was uploaded but not committed to source control, or you can see in a diff that you changed from the mainline version of a cookbook to your own fork.

And all of this happens automatically. You don’t have to spend your own time editing version numbers to encode this information. That’s both crap work and error prone.

Thanks, that answered a lot.

In summary, what's still unclear to me is this:

  1. How do you specify a specific version of a cookbook dependency in the (uncompiled) Policyfile? Simply {name + version + source} like in a Berksfile, or do you work with hashes here already?
    You use version constraint operators in the uncompiled Policyfile.rb, just like with berks.
  2. How would you specify a version for the top-level basic_example thing? Can you assign a dotted numeric X.Y.Z version like for cookbooks, or will it get a computed hash as well?
    Policies are probably not going to have explicit versioning support. We want to have a strong relation between nodes and policies where a node unambiguously belongs to exactly one policy, which when combined with the strong relation between policies and cookbooks will make it a lot easier to answer questions like “what code will node ‘foo’ run when I run chef-client?”. If you want to apply an updated policy to some nodes, you can either have an explicit canary group in your process or create a new container and migrate nodes from the old one to the new one.
  3. How are the Policyfiles versioned, published, and referenced?
    Policies won’t have versions (aside from what you do on your own in your version control system). The exact mechanism by which a node is assigned to a policy hasn’t been decided yet. We have some prototype code that uses data bags to store the policies, see:
    https://github.com/opscode/chef/blob/master/lib/chef/policy_builder/policyfile.rb
    https://github.com/opscode/chef/blob/master/spec/unit/policy_builder/policyfile_spec.rb
    https://github.com/opscode/chef/blob/master/lib/chef/config.rb#L343-351

    In the current prototype, you’d name your policies something like $functional_role-$deployment_group where $functional_role is something like appserver/load balancer/database/etc. and $deployment_group could map to your environments, groups within environments (for example, if you deploy to production on a cluster-by-cluster basis, you could have prod-cluster-a, prod-cluster-b, etc.) or whatever makes sense to you. Whether or not the deployment group concept becomes a bit more first class when we implement the APIs is something we haven’t decided yet.
  4. Can you reference a Policyfile from within an environment?
    With policies, all the environment version specification stuff goes away. We may keep them around as a place to store attributes, and it’s possible that nodes will associate to policies by policyname plus environment. Contrarily, we might use a different name for policy containers, or design the system in such a way that the “containers” are implicit.

Just saying: let's start simplifying and improve the tools around the concepts we currently have, not inventing additional and competing concepts that make everything more complex. Reuse the existing concepts, establish conventions, and foster them by making sure the tools we use promote them.

I don’t think the existing concepts are good enough. I say this as someone who helped to design them. The way people used and thought about chef, cookbooks, etc. when we created environments, for example, was totally different than the understanding we have today. Based on what we’ve learned from other people and new tools, we can now see a way to make chef and chef-server work in a better, safer, and more humane way. So we should.

+1

I assume (even though it is not explicitly mentioned) that the new Policyfile mechanism would work for chef-solo as well. Does it?
Exactly how it works with chef-solo is to be determined. Since chef-solo gets cookbooks from local disk, the question of supporting multiple versions with solo is pretty awkward.

Thanks for all the lengthy details, the picture gets clearer... :)

Cheers,
Torben

That was quite a lot to digest... I hope it still makes sense :)

Cheers,
Torben

[1] https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/demo-script.txt
[2] https://github.com/danielsdeleo/chef-workflow2-prototype/blob/master/docs/design-principles.md
[3] https://tickets.opscode.com/browse/CHEF-4027

--
Daniel DeLeo

Hi Daniel,

thanks for all the explanation, that makes my picture of Policyfile much
clearer.

A few comments inline...

On Mon, Jun 16, 2014 at 5:20 PM, Daniel DeLeo dan@kallistec.com wrote:

  3. How are the Policyfiles versioned, published, and referenced?
    Policies won’t have versions (aside from what you do on your own in your
    version control system). The exact mechanism by which a node is assigned to
    a policy hasn’t been decided yet. We have some prototype code that uses
    data bags to store the policies, see:

https://github.com/opscode/chef/blob/master/lib/chef/policy_builder/policyfile.rb

https://github.com/opscode/chef/blob/master/spec/unit/policy_builder/policyfile_spec.rb
https://github.com/opscode/chef/blob/master/lib/chef/config.rb#L343-351

In the current prototype, you’d name your policies something like
$functional_role-$deployment_group where $functional_role is something like
appserver/load balancer/database/etc. and $deployment_group could map to
your environments, groups within environments (for example, if you deploy
to production on a cluster-by-cluster basis, you could have prod-cluster-a,
prod-cluster-b, etc.) or whatever makes sense to you. Whether or not the
deployment group concept becomes a bit more first class when we implement
the APIs is something we haven’t decided yet.

Does that mean that when the policy is named "appserver-prod" it would
automatically apply the "prod" environment to the node?

Or alternatively: will it be possible to bootstrap a node with a Policyfile
AND a Chef environment (e.g. knife bootstrap --policy "appservers" --environment "prod" ...)?

  1. Can you reference a Policyfile from within an environment?
    With policies, all the environment version specification stuff goes away.
    We may keep them around as a place to store attributes, and it’s possible
    that nodes will associate to policies by policyname plus environment.
    Contrarily, we might use a different name for policy containers, or design
    the system in such a way that the “containers” are implicit.

So basically that means deprecating the cookbook and cookbook_versions
from environments, right?

I would still vote for keeping environments though, because they allow you
to set common attributes across a set of arbitrary nodes which might have
totally different policies.

I assume (even though not explicitly mentioned) that the new Policyfile
mechanism would work for chef-solo as well, wouldn't it?
Exactly how it works with chef-solo is to be determined. Since chef-solo
gets cookbooks from local disk, the question of supporting multiple
versions with solo is pretty awkward.

As a long-time, happy Chef solo user, I would hope that it works much like
a Berksfile does today: just as berks install collects all cookbook versions
from Berksfile.lock and puts them into a separate directory that can be used
as the cookbook repo for Chef solo, I would expect Policyfile to work in a
similar way.

My main use case is Chef solo with Vagrant plus the awesome vagrant-omnibus
and vagrant-berkshelf plugins.

Btw: what's the role of Berkshelf with Policyfile? Will it still be used
for resolving the dependency graph?

Cheers,
Torben

On Thursday, June 19, 2014 at 5:54 AM, Torben Knerr wrote:

Hi Daniel,

thanks for all the explanations, that makes my picture of Policyfile much clearer.

A few comments inline...

On Mon, Jun 16, 2014 at 5:20 PM, Daniel DeLeo <dan@kallistec.com (mailto:dan@kallistec.com)> wrote:

  1. How are the Policyfiles versioned, published, and referenced?
    Policies won’t have versions (aside from what you do on your own in your version control system). The exact mechanism by which a node is assigned to a policy hasn’t been decided yet. We have some prototype code that uses data bags to store the policies, see:

https://github.com/opscode/chef/blob/master/lib/chef/policy_builder/policyfile.rb
https://github.com/opscode/chef/blob/master/spec/unit/policy_builder/policyfile_spec.rb
https://github.com/opscode/chef/blob/master/lib/chef/config.rb#L343-351

In the current prototype, you’d name your policies something like $functional_role-$deployment_group where $functional_role is something like appserver/load balancer/database/etc. and $deployment_group could map to your environments, groups within environments (for example, if you deploy to production on a cluster-by-cluster basis, you could have prod-cluster-a, prod-cluster-b, etc.) or whatever makes sense to you. Whether or not the deployment group concept becomes a bit more first class when we implement the APIs is something we haven’t decided yet.

Does that mean that when the policy is named "appserver-prod" it would automatically apply the "prod" environment to the node?

Or alternatively: will it be possible to bootstrap a node with a Policyfile AND a Chef environment (e.g. knife bootstrap --policy "appservers" --environment "prod" ...)?
In the current “compatibility mode” implementation, you cannot use environments and Policyfiles at the same time. You must identify your policies with a single string, so you’d name them as $functional_role-$deployment_stage. This decision was forced by the requirement to use existing data bags as the storage mechanism and won’t necessarily be what we do in the final implementation.
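
As an illustration of the single-string naming scheme described above, here is a minimal Ruby sketch. The helper policy_name is hypothetical and is not part of Chef or the prototype; it only shows how a functional role and a deployment group could be combined into one identifier.

```ruby
# Hypothetical helper, not part of Chef: builds the single-string policy
# identifier "$functional_role-$deployment_group" discussed in the thread.
def policy_name(functional_role, deployment_group)
  parts = [functional_role, deployment_group].map { |part| part.to_s.strip }
  # Both halves are needed, otherwise the name would be ambiguous.
  raise ArgumentError, "both parts are required" if parts.any?(&:empty?)
  parts.join("-")
end

puts policy_name("appserver", "prod")
puts policy_name("database", "prod-cluster-a")
```

Because the deployment group may itself contain hyphens (prod-cluster-a), a name built this way cannot be split back into its two parts reliably, which is one reason a single string is a limitation rather than a design goal.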

  1. Can you reference a Policyfile from within an environment?
    With policies, all the environment version specification stuff goes away. We may keep them around as a place to store attributes, and it’s possible that nodes will associate to policies by policyname plus environment. Contrarily, we might use a different name for policy containers, or design the system in such a way that the “containers” are implicit.

So basically that means deprecating the cookbook and cookbook_versions from environments, right?
At the minimum, yes. But the deprecation process is going to be a long one. This means there’s a long period where it could be confusing if you set version constraints in an environment and they’re completely ignored.

I would still vote for keeping environments though, because they allow you to set common attributes across a set of arbitrary nodes which might have totally different policies.
I understand the use case. Again, we haven’t made a decision here, but the things we’re thinking about are:

  • Environments are global, so any change to them immediately affects all nodes in an environment. Is this a good thing?
  • How exactly do policies get promoted from stage to stage? Are they completely static or can they be customized as they’re promoted?
  • Reducing the number of ways you can set attributes would make Chef easier to understand and debug.
  • Can we do something other than environments that provides flexibility for use cases that require it? For example, some users need to customize data or behavior by both deployment stage and data center. If you make environments that are the conjunction of both (e.g., production-us-east), that causes a lot of the same problems that you have with “micro environments” (duplicated data, etc.).

I assume (even though not explicitly mentioned) that the new Policyfile mechanism would work for chef-solo as well, wouldn't it?
Exactly how it works with chef-solo is to be determined. Since chef-solo gets cookbooks from local disk, the question of supporting multiple versions with solo is pretty awkward.

As a long-time, happy Chef solo user, I would hope that it works much like a Berksfile does today: just as berks install collects all cookbook versions from Berksfile.lock and puts them into a separate directory that can be used as the cookbook repo for Chef solo, I would expect Policyfile to work in a similar way.

My main use case is Chef solo with Vagrant plus the awesome vagrant-omnibus and vagrant-berkshelf plugins.
We’re planning to integrate ChefDK with chef-metal, which will do all the same things and provide a tunneled connection to a local chef-zero server.

Btw: what's the role of Berkshelf with Policyfile? Will it still be used for resolving the dependency graph?
We’re integrating Berkshelf’s code into ChefDK, so the command line will be chef, but much of the underlying code is the same.

Cheers,
Torben

--
Daniel DeLeo

I am not sure I understand the Policy system correctly, but we clearly work
by environments. Actually, we wanted all our cookbooks pinned by environment,
so we went all the way to having a separate chef server for each one, where
the Berksfile.lock of the git branch corresponds to the pinned cookbooks.

This way promoting is simply a release of our development branch to
preprod/prod.

We really don't want to have heterogeneity within the environment. If we
upgrade a basic cookbook like apt it must go everywhere. This is because we
consider our chef repo as a specific app within the company that gets
released just like any other app. This is actually very different than use
cases where a role is tied to an app and you want to upgrade the app's apt
cookbook without impacting other apps.

Not sure if I make sense here.

On Friday, June 20, 2014 at 10:36 AM, Maxime Brugidou wrote:

I am not sure I understand the Policy system correctly, but we clearly work by environments. Actually, we wanted all our cookbooks pinned by environment, so we went all the way to having a separate chef server for each one, where the Berksfile.lock of the git branch corresponds to the pinned cookbooks.
This way promoting is simply a release of our development branch to preprod/prod.
We really don't want to have heterogeneity within the environment. If we upgrade a basic cookbook like apt it must go everywhere. This is because we consider our chef repo as a specific app within the company that gets released just like any other app. This is actually very different than use cases where a role is tied to an app and you want to upgrade the app's apt cookbook without impacting other apps.
Not sure if I make sense here.

Is there a specific requirement that leads you to consider the entirety of the chef repo as the shippable artifact that you release? The idea of the policyfile is that you have shippable artifacts at a functional role level of granularity.

That said, you can still create a workflow and process that results in homogeneity at the environment level out of a set of tools that allow heterogeneity at the environment level, but trying to do the opposite doesn’t work well.

Also, we’re exposing all of the building blocks that the high level policyfile experience is built on so you can create something totally different if that works for you. For example, it would be relatively easy to just convert your chef-repo into a policy lock, upload it and move the exact cookbook versions around between your deployment stages.

--
Daniel DeLeo

On Fri, Jun 20, 2014 at 6:18 PM, Daniel DeLeo dan@kallistec.com wrote:

In the current prototype, you’d name your policies something like
$functional_role-$deployment_group where $functional_role is something like
appserver/load balancer/database/etc. and $deployment_group could map to
your environments, groups within environments (for example, if you deploy
to production on a cluster-by-cluster basis, you could have prod-cluster-a,
prod-cluster-b, etc.) or whatever makes sense to you. Whether or not the
deployment group concept becomes a bit more first class when we implement
the APIs is something we haven’t decided yet.

Does that mean that when the policy is named "appserver-prod" it would
automatically apply the "prod" environment to the node?

Or alternatively: will it be possible to bootstrap a node with a
Policyfile AND a Chef environment (e.g. knife bootstrap --policy "appservers" --environment "prod" ...)?
In the current “compatibility mode” implementation, you cannot use
environments and Policyfiles at the same time. You must identify your
policies with a single string, so you’d name them as
$functional_role-$deployment_stage. This decision was forced by the
requirement to use existing data bags as the storage mechanism and won’t
necessarily be what we do in the final implementation.

Guess that means that you are using environments via berks apply for
locking a Policy's dependency graph under the hood?

I would still vote for keeping environments though, because they allow
you to set common attributes across a set of arbitrary nodes which might
have totally different policies.
I understand the use case. Again, we haven’t made a decision here, but the
things we’re thinking about are:

  • Environments are global, so any change to them immediately affects all
    nodes in an environment. Is this a good thing?

IMO yes. If something in the environment changes (e.g. ntp server) it
affects all nodes in that environment. That should also be the criterion
when deciding whether to put something into an environment vs. into a
wrapper cookbook (which requires re-releasing the cookbook to change
attributes).

  • How exactly do policies get promoted from stage to stage? Are they
    completely static or can they be customized as they’re promoted?

From my understanding the Policy should be the static part which describes
the functional role as a coherent whole, but not any environmental stuff
surrounding it. Policies should be self-contained and independent of the
environmentals IMO.

  • Reducing the number of ways you can set attributes would make Chef
    easier to understand and debug.
  • Can we do something other than environments that provides flexibility
    for use cases that require it? For example, some users need to customize
    data or behavior by both deployment stage and data center. If you make
    environments that are the conjunction of both (e.g., production-us-east),
    that causes a lot of the same problems that you have with “micro
    environments” (duplicated data, etc.).

For me environments (for environmental stuff affecting all nodes in the
env) and policies (functional roles with a locked dep graph) are two
completely separate and orthogonal concepts. We should keep them distinct
and not mix them up; then they will be composable.

One way to prevent "micro environments" would be to allow for a single
node to be in multiple environments (e.g. --environments prod,us-east,foo), i.e. extending the current 1:1 relation to 1:N.

That would btw also let you use environments as the underlying mechanism
for locking the dependency graph (think berks apply) while still
providing the ability to express "deployment groups" like "prod" and
"us-east" via the environment mechanism.

Making nodes:environments a 1:N relation and deprecating
cookbook_versions and cookbook from environments (combined with
Policyfile for sure) would probably be the most minimal change that would
solve all the problems and use cases we discussed here and on github.
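
To make the multiple-environments idea a bit more concrete, here is a minimal Ruby sketch of how attributes from several environments might be combined for one node. Everything in it is an assumption for illustration: the environment names and attribute keys are made up, and the rule that later environments override earlier ones is just one possible choice that the proposal does not specify.

```ruby
# Hypothetical environment attribute sets; names and keys are illustrative.
ENVIRONMENTS = {
  "prod"    => { "ntp" => { "servers" => ["ntp.prod.example.com"] }, "log_level" => "warn" },
  "us-east" => { "ntp" => { "servers" => ["ntp.us-east.example.com"] }, "region" => "us-east" }
}.freeze

# Recursively merges two hashes; on a conflict the right-hand value wins.
def deep_merge(left, right)
  left.merge(right) do |_key, l, r|
    l.is_a?(Hash) && r.is_a?(Hash) ? deep_merge(l, r) : r
  end
end

# Merges the named environments in order, so later names take precedence
# (an assumed precedence rule, not something the thread defines).
def attributes_for(environment_names)
  environment_names.reduce({}) do |merged, name|
    deep_merge(merged, ENVIRONMENTS.fetch(name))
  end
end

p attributes_for(%w[prod us-east])
```

The sketch also shows where the hard part lies: as soon as two environments set the same key, some ordering or conflict rule has to decide which one wins.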

What do you think?

Cheers,
Torben

The requirement of having the entirety of the chef repo as the deployable
artifact comes naturally when you have it in git. You want to ship that
specific branch/commit.

This actually led us to have multiple chef repos (and chef servers, called
"perimeters") because we needed to deal with large teams and different
deployments.

So basically we are enforcing the policy system to the extreme by having
one chef repo/server (we currently have 6 of them (+preprods) and try not
to add more unless a team is dedicated to maintaining it).

I don't see how we could work it out with a proper git workflow if we didn't
split the repos. The best solution would have been using chef organizations
(not open source) to have a single chef server.

The current workflow is actually very satisfying: we just have a simple
internal tool using the former knife-essentials to sync all objects from git
to chef. We could have gone with pure role cookbooks and locking all versions
but that seemed way too dangerous. We needed the guarantee of segregating
at least perimeters.

We never encounter dependency management issues. We just need to update the
Berksfile.lock.

On Friday, June 20, 2014 at 2:21 PM, Torben Knerr wrote:

On Fri, Jun 20, 2014 at 6:18 PM, Daniel DeLeo <dan@kallistec.com (mailto:dan@kallistec.com)> wrote:

In the current prototype, you’d name your policies something like $functional_role-$deployment_group where $functional_role is something like appserver/load balancer/database/etc. and $deployment_group could map to your environments, groups within environments (for example, if you deploy to production on a cluster-by-cluster basis, you could have prod-cluster-a, prod-cluster-b, etc.) or whatever makes sense to you. Whether or not the deployment group concept becomes a bit more first class when we implement the APIs is something we haven’t decided yet.

Does that mean that when the policy is named "appserver-prod" it would automatically apply the "prod" environment to the node?

Or alternatively: will it be possible to bootstrap a node with a Policyfile AND a Chef environment (e.g. knife bootstrap --policy "appservers" --environment "prod" ...)?
In the current “compatibility mode” implementation, you cannot use environments and Policyfiles at the same time. You must identify your policies with a single string, so you’d name them as $functional_role-$deployment_stage. This decision was forced by the requirement to use existing data bags as the storage mechanism and won’t necessarily be what we do in the final implementation.

Guess that means that you are using environments via berks apply for locking a Policy's dependency graph under the hood?
No. chef-client gets a static list of cookbook versions and its run list from a data bag. The server does not apply any dependency constraints whatsoever, and the client never calls any API that looks at any environment. To minimize confusion while the feature is experimental, chef-client will refuse to run if you have an environment configured and enable the policyfile mode (but note that this doesn’t imply any decision has been made on how it will work in the end). See: https://github.com/opscode/chef/blob/master/lib/chef/policy_builder/policyfile.rb
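
The behaviour described above can be sketched roughly as follows. This is not the prototype's code: the data bag item layout, the key names, the PolicyConflict error and the resolve_policy helper are all hypothetical, and treating "_default" as "no environment configured" is an assumption of this sketch.

```ruby
# Hypothetical shape of a policy stored as a data bag item; the key names
# are illustrative and not taken from the prototype.
POLICY_ITEM = {
  "id" => "appserver-prod",
  "run_list" => ["recipe[myapp::default]"],
  "cookbook_versions" => { "myapp" => "1.2.0", "apt" => "2.4.0" }
}.freeze

# Hypothetical error raised when an environment and a policy are combined.
class PolicyConflict < StandardError; end

# Returns the run list and the exact cookbook versions from the policy.
# No constraint solving happens; the versions are used exactly as listed.
def resolve_policy(policy_item, environment: nil)
  # Assumption: "_default" counts as "no environment configured".
  unless environment.nil? || environment == "_default"
    raise PolicyConflict,
          "policyfile mode cannot be combined with environment '#{environment}'"
  end
  {
    run_list: policy_item.fetch("run_list"),
    cookbooks: policy_item.fetch("cookbook_versions")
  }
end

p resolve_policy(POLICY_ITEM)
```

The point of the sketch is that the client only reads what the policy lists; there is no step where a server-side solver or an environment could change the selected versions.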

I would still vote for keeping environments though, because they allow you to set common attributes across a set of arbitrary nodes which might have totally different policies.
I understand the use case. Again, we haven’t made a decision here, but the things we’re thinking about are:

  • Environments are global, so any change to them immediately affects all nodes in an environment. Is this a good thing?

IMO yes. If something in the environment changes (e.g. ntp server) it affects all nodes in that environment. That should also be the criterion when deciding whether to put something into an environment vs. into a wrapper cookbook (which requires re-releasing the cookbook to change attributes).
The flip side of this is that there’s no way to make certain changes without impacting every single node at once. Suppose you now have two NTP servers so you change the attribute from a string to an array of strings. All your chef-client runs fail. We could, instead, bake environmental attributes into the policyfile with some switching mechanism in the client. The more you bake in when the policyfile is compiled, the more certainty you can have that it’ll work when you promote it to production.
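
To illustrate the NTP example, here is a small Ruby sketch of cookbook-side code that tolerates both shapes of the attribute. The attribute layout and the helper names are hypothetical. It only softens this particular failure; it does not change the fact that an environment edit reaches every node at once.

```ruby
# Hypothetical attribute values before and after the change described above.
old_style = { "ntp" => { "servers" => "ntp1.example.com" } }
new_style = { "ntp" => { "servers" => ["ntp1.example.com", "ntp2.example.com"] } }

# Kernel#Array wraps a lone string in an array and returns an array
# unchanged, so callers see the same type in both cases.
def ntp_servers(attributes)
  Array(attributes.dig("ntp", "servers"))
end

# Renders one "server <host>" line per configured NTP server.
def render_ntp_conf(attributes)
  ntp_servers(attributes).map { |host| "server #{host}" }.join("\n")
end

puts render_ntp_conf(old_style)
puts render_ntp_conf(new_style)
```

Code that instead called a String method directly on the attribute would raise as soon as the environment was edited, on every node at the same time.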

And if you really do need globally mutable config data, you can use data bag items.

Just to be clear, I’m explaining the sort of tradeoffs we’ve got in the back of our minds that we’ll consider when the time comes to make a decision. We want to learn from real-world usage where the rough edges are and pick the best set of tradeoffs based on experience.

  • How exactly do policies get promoted from stage to stage? Are they completely static or can they be customized as they’re promoted?

From my understanding the Policy should be the static part which describes the functional role as a coherent whole, but not any environmental stuff surrounding it. Policies should be self-contained and independent of the environmentals IMO.

  • Reducing the number of ways you can set attributes would make Chef easier to understand and debug.
  • Can we do something other than environments that provides flexibility for use cases that require it? For example, some users need to customize data or behavior by both deployment stage and data center. If you make environments that are the conjunction of both (e.g., production-us-east), that causes a lot of the same problems that you have with “micro environments” (duplicated data, etc.).

For me environments (for environmental stuff affecting all nodes in the env) and policies (functional roles with a locked dep graph) are two completely separate and orthogonal concepts. We should keep them distinct and not mix them up; then they will be composable.

One way to prevent "micro environments" would be to allow for a single node to be in multiple environments (e.g. --environments prod,us-east,foo), i.e. extending the current 1:1 relation to 1:N.

That would btw also let you use environments as the underlying mechanism for locking the dependency graph (think berks apply) while still providing the ability to express "deployment groups" like "prod" and "us-east" via the environment mechanism.

Making nodes:environments a 1:N relation and deprecating cookbook_versions and cookbook from environments (combined with Policyfile for sure) would probably be the most minimal change that would solve all the problems and use cases we discussed here and on github.

What do you think?
We do want to do something that has the same end result as that. Personally, I’d like to use a more abstract name than environments, since people have conflicting ideas of what makes an environment (and a cluster-id or datacenter doesn’t really fit that definition anyway). And again, we will look very hard at the tradeoffs associated with how changes propagate to nodes once we’re in a position to try things in real world scenarios.

Cheers,
Torben

--
Daniel DeLeo

On Sat, Jun 21, 2014 at 1:05 AM, Daniel DeLeo dan@kallistec.com wrote:

On Friday, June 20, 2014 at 2:21 PM, Torben Knerr wrote:

On Fri, Jun 20, 2014 at 6:18 PM, Daniel DeLeo <dan@kallistec.com
(mailto:dan@kallistec.com)> wrote:

In the current prototype, you’d name your policies something like
$functional_role-$deployment_group where $functional_role is something like
appserver/load balancer/database/etc. and $deployment_group could map to
your environments, groups within environments (for example, if you deploy
to production on a cluster-by-cluster basis, you could have prod-cluster-a,
prod-cluster-b, etc.) or whatever makes sense to you. Whether or not the
deployment group concept becomes a bit more first class when we implement
the APIs is something we haven’t decided yet.

Does that mean that when the policy is named "appserver-prod" it
would automatically apply the "prod" environment to the node?

Or alternatively: will it be possible to bootstrap a node with a
Policyfile AND a Chef environment (e.g. knife bootstrap --policy "appservers" --environment "prod" ...)?
In the current “compatibility mode” implementation, you cannot use
environments and Policyfiles at the same time. You must identify your
policies with a single string, so you’d name them as
$functional_role-$deployment_stage. This decision was forced by the
requirement to use existing data bags as the storage mechanism and won’t
necessarily be what we do in the final implementation.

Guess that means that you are using environments via berks apply for
locking a Policy's dependency graph under the hood?
No. chef-client gets a static list of cookbook versions and its run list
from a data bag. The server does not apply any dependency constraints
whatsoever, and the client never calls any API that looks at any
environment. To minimize confusion while the feature is experimental,
chef-client will refuse to run if you have an environment configured and
enable the policyfile mode (but note that this doesn’t imply any decision
has been made on how it will work in the end). See:
https://github.com/opscode/chef/blob/master/lib/chef/policy_builder/policyfile.rb

Interesting. I thought that using environments was not supported because
they were already used for locking the dep graph, but I guess it's really
only to prevent people from setting cookbook_versions in the environment,
which might conflict with the Policyfile...

I would still vote for keeping environments though, because they
allow you to set common attributes across a set of arbitrary nodes which
might have totally different policies.
I understand the use case. Again, we haven’t made a decision here, but
the things we’re thinking about are:

  • Environments are global, so any change to them immediately affects
    all nodes in an environment. Is this a good thing?

IMO yes. If something in the environment changes (e.g. ntp server) it
affects all nodes in that environment. That should also be the criterion
when deciding whether to put something into an environment vs. into a
wrapper cookbook (which requires re-releasing the cookbook to change
attributes).
The flip side of this is that there’s no way to make certain changes
without impacting every single node at once. Suppose you now have two NTP
servers so you change the attribute from a string to an array of strings.
All your chef-client runs fail. We could, instead, bake environmental
attributes into the policyfile with some switching mechanism in the client.
The more you bake in when the policyfile is compiled, the more certainty
you can have that it’ll work when you promote it to production.

And if you really do need globally mutable config data, you can use data
bag items.

Good point. I never found a really good rule for when it's better to use
environments vs. data bags. My rule of thumb as of today is: environmental
stuff with a more static nature goes into environments (e.g. external
services we are reliant on), while the more dynamic non-environmental stuff
where frequent changes are expected goes into data bags (e.g. users,
deployment artifacts, etc.).

Just to be clear, I’m explaining the sort of tradeoffs we’ve got in the
back of our minds that we’ll consider when the time comes to make a
decision. We want to learn from real-world usage where the rough edges are
and pick the best set of tradeoffs based on experience.

  • How exactly do policies get promoted from stage to stage? Are they
    completely static or can they be customized as they’re promoted?

From my understanding, the Policy should be the static part which
describes the functional role as a coherent whole, but not any
environmental stuff surrounding it. Policies should be self-contained and
independent of the environmentals IMO.

  • Reducing the number of ways you can set attributes would make Chef
    easier to understand and debug.
  • Can we do something other than environments that provides
    flexibility for use cases that require it? For example, some users need to
    customize data or behavior by both deployment stage and data center. If you
    make environments that are the conjunction of both (e.g.,
    production-us-east), that causes a lot of the same problems that you have
    with “micro environments” (duplicated data, etc.).

For me, environments (for environmental stuff affecting all nodes in the
env) and policies (functional roles with a locked dep graph) are two
completely separate and orthogonal concepts. We should keep them distinct
and not mix them up; then they will be composable.

One way to prevent "micro environments" would be to allow a single
node to be in multiple environments (e.g. --environments prod,us-east,foo),
i.e. extending the current 1:1 relation to 1:N.

That would btw also let you use environments as the underlying mechanism
for locking the dependency graph (think berks apply) while still
providing the ability to express "deployment groups" like "prod" and
"us-east" via the environment mechanism.

Making nodes:environments a 1:N relation and deprecating
cookbook_versions and cookbook from environments (combined with
Policyfile for sure) would probably be the most minimal change that would
solve all the problems and use cases we discussed here and on github.
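
A minimal sketch of what such a 1:N lookup could look like, in plain Ruby.
Everything here is an assumption for illustration: deep_merge and
attributes_for are hypothetical helpers, the group names and attribute
values are made up, and "later group wins on conflict" is just one possible
precedence rule, not something Chef implements:

```ruby
# Hypothetical: recursively merge two attribute hashes. On conflict the
# right-hand (later) value wins; nested hashes are merged key by key.
def deep_merge(left, right)
  left.merge(right) do |_key, l, r|
    l.is_a?(Hash) && r.is_a?(Hash) ? deep_merge(l, r) : r
  end
end

# Hypothetical: resolve a node's attributes from an ordered list of groups.
def attributes_for(groups, names)
  names.reduce({}) { |acc, name| deep_merge(acc, groups.fetch(name)) }
end

# Made-up example data standing in for "prod" and "us-east".
GROUPS = {
  'prod' => {
    'log_level' => 'warn',
    'ntp' => { 'servers' => ['ntp.prod.example.com'] }
  },
  'us-east' => {
    'datacenter' => 'us-east',
    'ntp' => { 'servers' => ['ntp.us-east.example.com'] }
  }
}.freeze

merged = attributes_for(GROUPS, %w[prod us-east])
puts merged.inspect
```

With an explicit order, the stage data and the data center data live in
separate places and no combined "production-us-east" object is needed.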

What do you think?

We do want to do something that has the same end result as that.
Personally, I’d like to use a more abstract name than environments, since
people have conflicting ideas of what makes an environment (and a
cluster-id or datacenter don’t really fit that definition anyway). And
again, we will look very hard at the tradeoffs associated with how changes
propagate to nodes once we’re in a position to try things in real world
scenarios.

"Deployment group" sounds like a much better name for it then. I was just
thinking that it would be even more confusing if we had two overlapping
concepts (environments and deployment groups) supported at the same time.

Excited to see how things will develop!

Thanks for all the patience and details :)

Cheers,
Torben

On Friday, June 20, 2014 at 3:56 PM, Maxime Brugidou wrote:

The requirement of having the entirety of the chef repo as a deployable artifact comes naturally when you have it in git. You want to ship that specific branch/commit.
This actually led us to have multiple chef repos (and chef servers, called "perimeters") because we needed to deal with large teams and different deployments.
So basically we are enforcing the policy system to the extreme by having one chef repo/server per perimeter (we currently have 6 of them (+preprods) and try not to add more unless a team is dedicated to maintaining it).

No. We at CHEF still have some stuff in the single chef-repo format, and lots of users do too. Single repo per cookbook is nice, but it can be hard to get there without disrupting everyone’s productivity (assuming you want to get there as an end goal, which maybe you don't). We’ll be paying special attention to making the policy stuff usable with a single chef repo.

I don't see how we could work it out with proper git workflow if we didn't split the repos. The best solution would have been using chef organizations (not open source) to have a single chef server.
The current workflow is actually very satisfying, we just have a simple internal tool using former knife-essentials to sync all objects from git to chef. We could have gone with pure role cookbooks and locking all versions but that seemed way too dangerous. We needed the guarantee of segregating at least perimeters.

Where does the danger come from? Are you using roles? Uploading cookbooks without freezing them? The policyfile guarantees you’ll have exactly the same cookbooks by assigning them identifiers based on the hash of the content in them. And it guarantees you’ll have exactly the same run list by “baking in” roles when the policyfile is compiled into the policyfile lock. Then you promote the lock through whatever stages you have in your lifecycle (e.g., integration test, dev, stage, prod, whatever).
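
To illustrate the idea of content-based identifiers, here is a rough sketch
in plain Ruby. It is not Chef's actual algorithm: cookbook_identifier is a
hypothetical helper, and hashing sorted relative paths plus file digests
with SHA-256 is an assumption chosen only for the illustration:

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Hypothetical sketch, NOT Chef's real identifier scheme: derive one
# identifier for a cookbook directory from the paths and contents of
# all files in it.
def cookbook_identifier(dir)
  files = Dir.glob('**/*', File::FNM_DOTMATCH, base: dir)
             .select { |rel| File.file?(File.join(dir, rel)) }
             .sort # stable order so the result does not depend on the filesystem
  digest = Digest::SHA256.new
  files.each do |rel|
    digest << rel << "\0"
    digest << Digest::SHA256.file(File.join(dir, rel)).hexdigest << "\n"
  end
  digest.hexdigest
end

# Demo on a throwaway cookbook: changing one recipe changes the identifier.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'recipes'))
  File.write(File.join(dir, 'metadata.rb'), "name 'demo'\n")
  File.write(File.join(dir, 'recipes', 'default.rb'), "package 'ntp'\n")
  before = cookbook_identifier(dir)

  File.write(File.join(dir, 'recipes', 'default.rb'), "package 'chrony'\n")
  after = cookbook_identifier(dir)

  puts "identifier changed: #{before != after}"
end
```

Because the identifier depends only on content, two uploads with the same
version number but different files can no longer be confused with each other.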

We never encounter dependency management issues. We just need to upgrade the Berksfile.lock.

--
Daniel DeLeo