Policyfiles and chef provision

I’ve been experimenting with policyfile support lately, and I’m hoping
someone can clarify some thinking around policyfile support in the “chef
provision” command.

chef provision POLICY_GROUP --policy-name POLICY_NAME lets me specify ONE
policyfile and run a provisioning recipe. Policyfiles define a run_list,
but one of my typical chef-provisioning recipes contains multiple machines
with different run_lists.
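To make the mismatch concrete, here is a simplified sketch of the kind of recipe I mean (the driver, machine names, and cookbooks are all made-up placeholders, not my real setup):

```ruby
# Simplified chef-provisioning recipe: two machines, two different run_lists.
# Everything here (driver, machine names, cookbooks) is a hypothetical placeholder.
require 'chef/provisioning'

with_driver 'vagrant'

machine 'dbserver' do
  run_list ['recipe[my_db_cookbook]']
end

machine 'appserver' do
  run_list ['recipe[my_app_cookbook]']
end
```

A single Policyfile defines a single run_list, so it isn't obvious how one chef provision invocation maps onto a recipe like this.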

I’m not sure whether to take this as a suggestion that provisioning recipes
should only do one machine each, or whether the tooling is just not quite
meshing yet (I’m aware it’s all very new and beta), or whether there is
something conceptual missing from my thinking.

Anyone else experimenting with this combination yet?

Hey,

Sorry, I might be wrong, but from what I understand a Policyfile-based workflow
is pretty much like versioning a role with an associated Berksfile.lock.

I am not sure exactly how things are intended to be used, but the associated
provision cookbook should probably provision nodes from the given Policy
name and nothing else. Each Policyfile would have its own provision
cookbook (and maybe even its own git repo, since they are versioned separately).
This seems a bit extreme to me, but it could be greatly improved if we leverage
named run lists in the Policyfile: then we can have multiple run_lists
under the same Policyfile. I haven't tested that yet.
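Untested, but I imagine the named run list version would look something like this (all the names and cookbooks here are made up):

```ruby
# Policyfile.rb -- one versioned policy, several named run lists.
# Every name here is hypothetical; I haven't actually tried this yet.
name 'my_system'
default_source :supermarket

# The default run_list, plus one named run_list per kind of node
run_list 'my_common::default'
named_run_list :dbserver, 'my_common::default', 'my_db::default'
named_run_list :appserver, 'my_common::default', 'my_app::default'
```

That way every kind of node in the system would share the same locked cookbook set, while still converging different recipes.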

Maxime
On Jul 22, 2015 7:30 PM, "Chris Sibbitt" csibbitt@thinkingphones.com
wrote:


I had the same reaction re the chef provision command... it seems targeted
at letting me set up one or more 'identical-ish' nodes, whereas my use of
provisioning is typically to set up a variety of nodes that form a working
system or solution.

Is there a way within a chef provisioning recipe to say 'set this machine
up using this policyfile'? I couldn't see one. Maybe we need a policyfile
resource (to load policies) and a policy attribute on the machine resource.

Regards,
Christine

On Wed, Jul 22, 2015 at 1:16 PM, Maxime Brugidou maxime.brugidou@gmail.com
wrote:


--
ThirdWave Insights, LLC | (512) 971-8727 | www.ThirdWaveInsights.com | P.O. Box 500134 | Austin, TX 78750

On Wednesday, July 22, 2015 at 5:57 PM, Christine Draper wrote:

I had the same reaction re the chef provision command... it seems targeted at letting me set up one or more 'identical-ish' nodes, whereas my use of provisioning is typically to set up a variety of nodes that form a working system or solution.

I did think about clustering scenarios when I was writing chef provision, but it turns out that it can be complicated depending on what the exact use case is. Do you want a “throwaway” cluster to integration-test your cookbooks as a whole? How do you keep different developers’ throwaway clusters from conflicting with each other? How do you know what machines (or other resources) to destroy when the user requests to destroy stuff? How much responsibility should be placed on the user if they update their provisioning code and then run the destroy operation (which could leave a stray EBS volume, for example)? None of these problems are insurmountable, but it will take a lot of thinking to do this in a way that provides a great experience.

So, for the policyfile focused stuff, I opted to make it work more like knife bootstrap and knife cloud create, except using Chef Provisioning, so hopefully you don’t have the CLI options issues (i.e., you need 15 options you can never remember) that those commands have.

That said, I also felt that, regardless of whether you’re using Policyfiles or not, it wasn’t easy to pass “argument” type information into Chef Provisioning with any of the existing methods (environment variables being the easiest, but they have some unhelpful failure modes when you typo things). So I added the chef provision --no-policy mode of operation which loads up Chef and Chef Provisioning and then just runs your recipes. If you’re already happily creating clusters with Chef Provisioning, this might be the best way for you to use chef provision right now.

Is there a way within a chef provisioning recipe to say 'set this machine up using this policyfile'? I couldn't see one. Maybe we need a policyfile resource (to load policies) and a policy attribute on the machine resource.

What chef provision does right now is add policy_name and policy_group settings to the client.rb file via Chef Provisioning’s built-in option for this. You can do this manually as well.

At the moment this is the only way to tell chef-client which policyfile you want, because the node object doesn’t yet have these fields. In a future release of Chef Client and Chef Server, we will add these fields to the node object. After that’s done, it will be possible to add these to Chef Provisioning.

Policyfiles will be a bit trickier, since the process for generating the JSON and uploading everything to the server involves a lot of things (dependency resolution, local caching of cookbooks from supermarket, multiple cookbook uploads and then finally uploading the policy JSON) and there are some imperative operations involved (i.e., the user has to decide if they want to update dependencies or just take the lockfile as-is). I’m not ruling it out, but right now my thinking is there’s a bit of an impedance mismatch between the Chef Provisioning way of thinking and the way Policyfiles work otherwise. Maybe some more insight into user expectations here would be helpful.

Anyone else experimenting with this combination yet?
I addressed this above: you can use the --no-policy option to skip the policyfile part, which is probably the best way to do clusters. You’d have to manage uploading the policies and such yourself via the command line.


--
Daniel DeLeo

Dan,

I guess 'chef provision' is more like 'chef policy provision' than a
general interface for people wanting a simpler on-ramp to provisioning?

The way I currently pass in arguments to chef provisioning is using the
chef-client -j option, i.e. I treat them as node attributes on the
provisioning node. I guess the --no-policy option would let me specify
them on the command line and access them through the opts object without
storing them as node attributes, which could be useful.

I'm still struggling to get a mental picture of using policyfiles with
provisioning, even for simple multi-node systems. Say something as simple
as an appserver and a dbserver that I want to provision multiple times when
a tester needs them. They need different runlists and attributes, but they
should be using the same set of cookbook versions. I want to bring up the
dbserver first so I can configure the appserver with its IP address. My
natural inclination is to write a provisioning recipe that brings up the
machines and sets their attributes/runlists and environment (to control
cookbook versions). What would the policyfile version of this scenario be?
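Concretely, my natural (non-policyfile) inclination looks something like this sketch. All the names, cookbooks, and the 'test1' environment are hypothetical, and the lazy/search lookup is just one way to defer reading the dbserver's IP until the machine exists:

```ruby
# Sketch of the provisioning recipe I'd naturally write today, without
# policyfiles. Names, cookbooks, and the environment are placeholders.
require 'chef/provisioning'

machine 'dbserver' do
  run_list ['recipe[my_db]']
  chef_environment 'test1'   # the environment pins the cookbook versions
end

machine 'appserver' do
  run_list ['recipe[my_app]']
  chef_environment 'test1'
  # Look up the dbserver's IP at converge time, after it has been created
  attribute %w(my_app db_host),
            lazy { search(:node, 'name:dbserver').first['ipaddress'] }
end
```

It's this combination of per-machine run_lists plus a shared version pin that I can't picture in policyfile terms yet.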

Regards,
Christine

On Thu, Jul 23, 2015 at 12:01 PM, Daniel DeLeo dan@kallistec.com wrote:


Thanks for explaining your way of thinking.

As a user, I would love to use Policyfiles to independently
manage/version/release "subsets" of nodes, where a subset is more of a
logical group of nodes than an actual identical run list. For example, my
company would ship a new version of the Hadoop policy without touching the
others. But Hadoop has many different kinds of nodes.

The chef provision system would definitely match this specific case: I
would have a separate git repo for each policy, with a specific provision
cookbook for my Hadoop policy. What I was looking for is a way to describe
my policy (like Hadoop) in a provisioning recipe and instantiate it across
various environments (aka policy groups): CI, where a dedicated Vagrant
cluster is built and destroyed; staging on AWS; and prod on bare metal, all
leveraging the same provisioning recipe. I would need to add some logic to
manage the customizations across environments, like instance types or
number of nodes, but all of this is easily doable with Ruby, and that is why
I think this tool could be powerful.

Overall, the combination of provisioning + Policyfiles looks very promising
to me.
On Jul 23, 2015 7:01 PM, "Daniel DeLeo" dan@kallistec.com wrote:


On Thursday, July 23, 2015 at 11:02 AM, Christine Draper wrote:

Dan,

I guess 'chef provision' is more like 'chef policy provision' than a general interface for people wanting a simpler on-ramp to provisioning?

The way I currently pass in arguments to chef provisioning is using the chef-client -j option, i.e. I treat them as node attributes on the provisioning node. I guess the --no-policy option would let me specify them on the command line and access them through the opts object without storing them as node attributes, which could be useful.
Yep, that’s the intention of the “no policy” mode. -j is probably fine in some cases, but if you’re changing the values a lot it’s probably not the best experience.

I'm still struggling to get a mental picture of using policyfiles with provisioning, even for simple multi-node systems. Say something as simple as an appserver and a dbserver that I want to provision multiple times when a tester needs them. They need different runlists and attributes, but they should be using the same set of cookbook versions.
Where does this constraint of "same set of cookbook versions" come from? I know that in a non-policyfile world, having this constraint makes a lot of sense, because environments can only have one set of cookbook constraints; therefore, to have two versions of a cookbook in production at the same time, you have to get “creative” (e.g., the environment cookbook pattern in Berkshelf, or other solutions). Policyfiles make it easy and safe to have multiple versions in a given lifecycle stage at the same time, because each kind of machine gets the versioning information that’s baked into the policy, and that never changes without you explicitly asking for it to happen.

That said, allowing multiple versions can be a double-edged sword, because one app or team may get stuck on an older version for a long time (I call this “sandbagging”), creating a high level of tech debt that has to be paid all at once when some circumstance forces an upgrade (and that can come at a really inconvenient time, such as when trying to apply a security patch to the underlying software). Chef Delivery doesn’t allow you to sandbag for precisely this reason.

At this time, Policyfiles don’t have a general mechanism to enforce a “use the same versions throughout an environment” policy. If you’re using a monolithic repo, you can set default_source :chef_repo, "path/to/cookbooks" in your Policyfile.rb(s) and then run chef update against all your policies to force them to those versions. Once you bring community cookbooks into the mix, though, you could have different version constraints in your metadata that pull in different cookbooks from Supermarket.

Side note: all of the policyfile commands take a path to a Ruby policyfile, so you can have a policies/ directory in your monolithic repo and then have policies/database.rb, policies/application.rb, etc.
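So a hypothetical policies/database.rb in a monolithic repo might be as small as this (the cookbook name is invented):

```ruby
# policies/database.rb -- sketch only; the run_list contents are placeholders
name 'database'

# Resolve cookbooks from the repo's own cookbooks/ directory
default_source :chef_repo, '../cookbooks'

run_list 'my_db::default'
```

If I have the commands right, you’d then run something like chef update policies/database.rb to (re)generate the lock and chef push POLICY_GROUP policies/database.rb to upload it.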

I want to bring up the dbserver first so I can configure the appserver with its IP address. My natural inclination is to write a provisioning recipe that brings up the machines and sets their attributes/runlists and environment (to control cookbook versions). What would the policyfile version of this scenario be?

As Chef Provisioning doesn’t have any policyfile stuff yet, you’d first get your Policyfile.lock.json(s) into the correct state with chef update commands, then do a chef push POLICY_GROUP to upload the cookbooks. If you have a standardized place where these things live, you can automate that with execute resources.

From there, you need to set the policy_group and policy_name on the nodes via the client.rb (not sure how well this is documented, but you can pass configuration as a string with { chef_config: "your client.rb content" } as convergence_options). At the moment, "use_policyfile true" is also required to turn on policyfile mode in chef-client. Aside from that, you can just use Chef Provisioning as you would otherwise, and run chef provision in the no-policy mode.
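To make the client.rb part concrete, here is a plain-Ruby sketch of assembling the string you’d hand to convergence_options. The helper name and the policy/group values are invented, not part of any Chef API:

```ruby
# Sketch: build the extra client.rb content that switches chef-client into
# policyfile mode. The helper name and the example values are made up.
def policyfile_client_rb(policy_name, policy_group)
  <<~CONFIG
    use_policyfile true
    policy_name "#{policy_name}"
    policy_group "#{policy_group}"
  CONFIG
end

# In a chef-provisioning recipe, this string would go into the machine's
# convergence options, e.g.:
#   with_machine_options convergence_options: {
#     chef_config: policyfile_client_rb('appserver', 'acceptance')
#   }
puts policyfile_client_rb('appserver', 'acceptance')
```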

Any feedback on how that process works for you, how you’re using it, etc. would help define the future of the feature.


--
Daniel DeLeo

Dan,

I didn't mean 'all nodes being used for testing need the same cookbook
versions'. I wanted to communicate:

  • A particular topology (dbserver + appserver) should have the same
    cookbook versions
  • Different topologies may have different cookbook versions - depending on
    what the tester is working on

That's actually part of why I'm interested in policyfiles - trying to
achieve this with environments would be painful.
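What I’m vaguely picturing, building on the named run list idea mentioned earlier in the thread (completely untested, and all names invented), is one Policyfile per topology:

```ruby
# tester_topology/Policyfile.rb -- one lock per topology, so the dbserver and
# appserver share cookbook versions; another topology gets its own lock.
# All names and version pins here are invented.
name 'tester_topology'
default_source :supermarket

run_list 'my_app::default'
named_run_list :dbserver, 'my_db::default'
named_run_list :appserver, 'my_app::default'

# Pin versions for this topology; a second topology's Policyfile could pin
# different ones, e.g. cookbook 'my_app', '~> 2.0'
cookbook 'my_app', '~> 1.2'
cookbook 'my_db',  '~> 3.0'
```

That would give each topology its own lock, while the machines within a topology share one cookbook set.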

The mechanics of the process sound plausibly workable. However, I thought
policyfiles set the run_list and attributes on the node as well as
constraining cookbooks, so wouldn't that conflict with setting the run_list
and attributes in the provisioning recipe?

Regards,
Christine

On Thu, Jul 23, 2015 at 2:01 PM, Daniel DeLeo dan@kallistec.com wrote:

On Thursday, July 23, 2015 at 11:02 AM, Christine Draper wrote:

Dan,

I guess 'chef provision' is more like 'chef policy provision' than a
general interface for people wanting a simpler on-ramp to provisioning in
general?

The way I currently pass in arguments to chef provisioning is using the
chef-client -j option, i.e. I treat them as node attributes on the
provisioning node. I guess the --no-policy option would let me specify them
on the command line and access them through the opts object without storing
them as node attributes, which could be useful.
Yep, that’s the intention of the “no policy” mode. -j is probably fine
in some cases, but if you’re changing the values a lot it’s probably not
the best experience.

I'm still struggling to get a mental picture of using policyfiles with
provisioning, even for simple multi-node systems. Say something as simple
as an appserver and a dbserver that I want to provision multiple times when
a tester needs them. They need different runlists and attributes, but they
should be using the same set of cookbook versions.
Where does this constraint of "same set of cookbook versions” come from?
I know that in a non-policyfile world, having this constraint makes a lot
of sense, because environments can only have one set of cookbook
constraints, therefore to have to versions of a cookbook in production at
the same time, you have to get “creative” (e.g., environment cookbook
pattern in berkshelf or other solutions). Policyfiles make it easy and safe
to have mutliple versions in a given lifecycle stage at the same time,
because each kind of machine gets the versioning information that’s baked
in to the policy, and that never changes without you explicitly asking for
that to happen.

That said, allowing multiple versions can be a double edged sword, because
one app or team may get stuck on an older version for a long time (I call
this “sandbagging”), creating a high level of tech debt that has to be paid
all at once when some circumstance forces an upgrade (and that can be at a
really inconvenient time, such as when trying to apply a security patch to
the underlying software). Chef Delivery doesn’t allow you to sandbag for
precisely this reason.

At this time, Policyfiles don’t have a general mechanism to enforce a “use
the same versions throughout an environment” policy. If you’re using a
monolithic repo, you can set default_source :chef_repo, “path/to/cookbooks” in your Policyfile.rb(s) and then run chef update
against all your policies to force them to those versions. Once you bring
community cookbooks into the mix, though, you could have different version
constraints in your metadata that pull in different cookbooks from
supermarket, though.

Side note, all of the policyfile commands take a path to a ruby
policyfile, so you can have a policies/ directory in your monolithic repo
and then have policies/database.rb policies/application.rb, etc.

I want to bring up the dbserver first so I can configure the appserver
with its IP address. My natural inclination is to write a provisioning
recipe that brings up the machines and sets their attributes/runlists and
environment (to control cookbook versions). What would the policyfile
version of this scenario be?

As Chef Provisioning doesn’t have any policyfile stuff yet, you’d first
get your Policyfile.lock.json(s) into the correct state with chef update
commands, then do a chef push POLICY_GROUP to upload the cookbooks. If
you have a standardized place where these things live, you can automate
that with execute resources.

From there, you need to set the policy_group and policy_name on the nodes
via the client.rb (not sure how well this is documented, but you can pass
configuration as a string with { chef_config: "your client.rb content" } as
convergence_options). At the moment, "use_policyfile true" is also required
to turn on policyfile mode in chef-client. Aside from that, you can just
use Chef Provisioning as you would otherwise, and run chef provision in
no-policy mode.
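As a sketch of that idea (the helper name and policy values below are my own, and the exact convergence_options plumbing may vary by driver), the client.rb content can just be built as a plain Ruby string:

```ruby
# Build a client.rb snippet that switches chef-client into policyfile
# mode. The policy_group/policy_name values are hypothetical examples.
def policyfile_client_rb(policy_group:, policy_name:)
  <<~CONFIG
    use_policyfile true
    policy_document_native_api true
    policy_group "#{policy_group}"
    policy_name "#{policy_name}"
  CONFIG
end

# In a chef-provisioning recipe this string would be handed to the
# machine resource, e.g. (sketch, not verified against every driver):
#
#   machine 'db1' do
#     machine_options convergence_options: {
#       chef_config: policyfile_client_rb(policy_group: 'production',
#                                         policy_name: 'database')
#     }
#   end

puts policyfile_client_rb(policy_group: 'production', policy_name: 'database')
```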

Any feedback on how that process works for you, how you’re using it, etc.
would be helpful in defining the future of the feature.

Regards,
Christine

--
Daniel DeLeo

--
ThirdWave Insights, LLC | (512) 971-8727 | www.ThirdWaveInsights.com |
P.O. Box 500134 | Austin, TX 78750

On Thursday, July 23, 2015 at 1:19 PM, Christine Draper wrote:

Dan,

I didn't mean 'all nodes being used for testing need the same cookbook versions'. I wanted to communicate:

  • A particular topology (dbserver + appserver) should have the same cookbook versions

Are you using community cookbooks at all? Would it violate a policy if the dbserver and app server used different versions of a cookbook like apt or yum or build-essentials? Just want to be sure I’m understanding your constraints.

  • Different topologies may have different cookbook versions - depending on what the tester is working on

That's actually part of why I'm interested in policyfiles - trying to achieve this with environments would be painful.

The mechanics of the process sound plausibly workable. However, I thought policyfiles set the run_list and attributes on the node as well as constraining cookbooks, so wouldn't that conflict with setting the run_list and attributes in the provisioning recipe?

Node-specific attributes are not affected by policyfiles; they work the same as they always have. Policyfiles allow you to set attributes in the same way as roles (defaults and overrides, which have the same precedence as role attributes). You can pick and choose between setting the attributes on the node via provisioning and default and override attributes in the policyfile as you like.
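For reference, in a Policyfile.rb those role-style attributes look like the following (the attribute names are made up for illustration):

```ruby
# Inside a Policyfile.rb (hypothetical attribute names). These merge at
# the same precedence as role default/override attributes; node-specific
# (normal) attributes set via provisioning are unaffected.
default['myapp']['port'] = 8080
override['myapp']['bind_address'] = '0.0.0.0'
```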

When you enable policyfile mode on a chef-client, it will wipe the run list from the node and replace it with the one from the policyfile. Whatever you set in Chef Provisioning would be silently ignored.

Regards,
Christine

HTH,

--
Daniel DeLeo

Thank you very much Christine and Daniel for your helpful discussion. I
think this line is really what answers my question:

So, for the policyfile focused stuff, I opted to make it work more like
knife bootstrap and knife cloud create

That means you specifically were looking at the case where a provisioning
recipe handles just one machine. I don't think I will bother with the "chef
provision" tool in that case, since I don't want to have to enforce that
"one machine per recipe" rule in our code, and the allure of
chef-provisioning for us is being able to use it to bring up whole "Stacks"
of related machines (what Christine called 'topologies' I think).

I did think about clustering scenarios when I was writing chef provision,
but it turns out that it can be complicated depending on what the exact use
case is. Do you want a “throwaway” cluster to integration-test your
cookbooks as a whole? How do you keep different developers’ throwaway
clusters from conflicting with each other?

So far my solution to this (though we haven't worked with it for long
enough to say if it's a good one long term) is to require that
ENV['DEPLOYMENTID'] exists and use it in the name of every chef_provisioning
resource. That way, each deployment of a stack has its own machine names
and can be managed as a distinct entity of related machines.
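A minimal sketch of that convention (the helper name is my invention, not part of chef-provisioning):

```ruby
# Require a DEPLOYMENTID and use it to namespace every machine name,
# so each deployment of a stack is a distinct set of related machines.
def deployment_machine_name(base, env = ENV)
  id = env['DEPLOYMENTID']
  raise 'DEPLOYMENTID must be set' if id.nil? || id.empty?
  "#{base}-#{id}"
end

# In a provisioning recipe (sketch):
#   machine deployment_machine_name('mymachine') do
#     ...
#   end
puts deployment_machine_name('mymachine', { 'DEPLOYMENTID' => 'test42' })
# -> mymachine-test42
```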

The workflow I've settled on is similar to how Daniel describes it:

  1. Set up the policyfile
  2. Install/Update/Push the policyfile
  3. Write chef-provisioning code to set up machines to use a named policyfile
  4. Run the provisioning recipes with something like chef-client -z -o my-provisioning::mystack

This thread was about whether I should be using 'chef provision' in step #4
because running chef-zero or chef-solo on my provisioning node feels weird.
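Spelled out as commands, that workflow might look roughly like this (the policy group, policyfile, and recipe names are placeholders):

```shell
# 1-2. Create or update the lock, then upload it with its cookbooks
chef install policies/mystack.rb    # first time (writes the .lock.json)
chef update policies/mystack.rb     # when changing the Policyfile later
chef push mypolicygroup policies/mystack.rb

# 3-4. Converge the provisioning recipe locally against chef-zero
chef-client -z -o my-provisioning::mystack
```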

For step 3, I've done a more cookbook-heavy version of Daniel's suggestion:

From there, you need to set the policy_group and policy_name on the nodes
via the client.rb (not sure how well this is documented, but you can pass
configuration as a string with { chef_config: "your client.rb content" } as
convergence_options)

Here's what I do:

default['my-provisioning']['node_attrs'] = {
  :chef_client => {
    :config => {
      :use_policyfile => true,
      :policy_document_native_api => true,
      :policy_group => 'mypolicygroup',
      :policy_name => 'mypolicy'
    }
  }
}

machine "mymachine-#{ENV['DEPLOYMENTID']}" do
  run_list ['recipe[chef-client]', 'recipe[chef-client::config]']
  attributes node['my-provisioning']['node_attrs'].merge(
    :mycookbooks => {
      :config => {
        :stuff => 'values'
      }
    }
  )
end

This uses the chef-client cookbook to set up the policyfile mode for the
node. Actual control of the run_list is in the policyfile.

There are some downsides to this:

  • You need to run a second converge on the node in order to get it running
    the right run_list; so that can either happen automatically via chef-client
    service config, or you can add the second converge right in your
    chef-provisioning recipe.
  • IIRC, subsequent executions of the provisioning recipe don't actually
    run the real run_list, they just ensure that it's still set up for the
    right policy
  • You need to serve a version of the chef-client cookbook without the
    benefit of policyfile locking
  • Your run_list is separate from your machine definition, which is a bit
    awkward, leaving all your machines with the same uninformative run_list
    specification in your provisioning recipes

I think I will experiment with using chef_config: "your client.rb
content" instead of the chef-client cookbook; it might cut down a bit on
complexity.

Thanks again, this is exactly what I was looking for from this thread. I
would be interested in hearing any additional thoughts about my workflow
and especially any insight into how you see the coordination between these
tools evolving in the next year or so.


--

Chris Sibbitt | Infrastructure Systems Architect

http://www.thinkingphones.com/

P: 613.686.1590
csibbitt@thinkingphones.com

On Friday, July 24, 2015 at 11:29 AM, Chris Sibbitt wrote:

Thank you very much Christine and Daniel for your helpful discussion. I think this line is really what answers my question:

So, for the policyfile focused stuff, I opted to make it work more like knife bootstrap and knife cloud create

That means you specifically were looking at the case where a provisioning recipe handles just one machine. I don't think I will bother with the "chef provision" tool in that case, since I don't want to have to enforce that "one machine per recipe" rule in our code, and the allure of chef-provisioning for us is being able to use it to bring up whole "Stacks" of related machines (what Christine called 'topologies' I think).

I did think about clustering scenarios when I was writing chef provision, but it turns out that it can be complicated depending on what the exact use case is. Do you want a “throwaway” cluster to integration-test your cookbooks as a whole? How do you keep different developers’ throwaway clusters from conflicting with each other?

So far my solution to this (though we haven't worked with it for long enough to say if it's a good one long term) is to require that ENV['DEPLOYMENTID'] exists and use it in the name of every chef_provisioning resource. That way, each deployment of a stack has its own machine names and can be managed as a distinct entity of related machines.

I should have been a bit more clear on this. I definitely do want chef provision --no-policy to be the best way to run Chef Provisioning in the “I know what I’m doing, just let me do it” case, including when you’re creating full stacks/clusters/etc. Despite the name (which I’m now thinking is a bit misleading), you can definitely use chef provision this way to create machines that will be managed via policyfiles.

The part that I punted on was stitching one or more policyfile operations together with use cases where you would expect the tool to have some concept of “ephemeral clusters.” For example, if you have updates to the policy for both your HTTP application servers and your database servers, and you want to spin up a test cluster which you’ll later tear down. In that case, you might need an ephemeral policy group for the cluster, a cluster ID so you can have unique machine names, you’d have to push multiple policies to the ephemeral policy group, and you should have an easy way to clean everything up when you’re done (and in the ideal case, the tool should be robust to changes in your provisioning code so you don’t accidentally orphan resources that you’re paying for by the hour).

I’d really like to add such a feature in the future, but I think a lot of design is needed to make the experience delightful, so I opted to skip it in order to get the basic part of the feature shipped. Just to be clear about expectations, this advanced clustering isn’t a priority right now so it might not happen at all unless there’s a “pull” for it from a customer or the community.
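Purely as an illustration of the bookkeeping such a feature would imply (all names are hypothetical), an ephemeral policy group per test cluster could be derived from a cluster ID:

```ruby
# Derive a unique, easy-to-clean-up policy group name for a throwaway
# test cluster. Illustrative only; no such helper exists in the tools.
def ephemeral_policy_group(base_group, cluster_id)
  "#{base_group}-test-#{cluster_id}"
end

group = ephemeral_policy_group('acceptance', 'csibbitt-42')
puts group
# -> acceptance-test-csibbitt-42

# Each policy in the cluster (appserver, database, ...) would then be
# pushed to this group, and teardown would delete the group, its
# machines, and any other per-cluster resources.
```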

The workflow I've settled on is similar to how Daniel describes it:

  1. Set up the policyfile
  2. Install/Update/Push the policyfile
  3. Write chef-provisioning code to set up machines to use a named policyfile
  4. Run the provisioning recipes with something like chef-client -z -o my-provisioning::mystack

This thread was about whether I should be using 'chef provision' in step #4 because running chef-zero or chef-solo on my provisioning node feels weird.

That’s exactly what chef provision --no-policy was designed to do, along with giving you more natural ways to pass input (like ENV['DEPLOYMENTID'] that you’re doing now) from the command line. For example, you can pass any arbitrary option to chef provision with -o key=value.

For step 3, I've done a more cookbook-heavy version of Daniel's suggestion:

From there, you need to set the policy_group and policy_name on the nodes via the client.rb (not sure how well this is documented, but you can pass configuration as a string with { chef_config: "your client.rb content" } as convergence_options)

Here's what I do:

default['my-provisioning']['node_attrs'] = {
  :chef_client => {
    :config => {
      :use_policyfile => true,
      :policy_document_native_api => true,
      :policy_group => 'mypolicygroup',
      :policy_name => 'mypolicy'
    }
  }
}

machine "mymachine-#{ENV['DEPLOYMENTID']}" do
  run_list ['recipe[chef-client]', 'recipe[chef-client::config]']
  attributes node['my-provisioning']['node_attrs'].merge(
    :mycookbooks => {
      :config => {
        :stuff => 'values'
      }
    }
  )
end

This uses the chef-client cookbook to set up the policyfile mode for the node. Actual control of the run_list is in the policyfile.

There are some downsides to this:

  • You need to run a second converge on the node in order to get it running the right run_list; so that can either happen automatically via chef-client service config, or you can add the second converge right in your chef-provisioning recipe.
  • IIRC, subsequent executions of the provisioning recipe don't actually run the real run_list, they just ensure that it's still set up for the right policy
  • You need to serve a version of the chef-client cookbook without the benefit of policyfile locking

Using the chef_config option in provisioning should fix all of these.

  • Your run_list is separate from your machine definition, which is a bit awkward, leaving all your machines with the same uninformative run_list specification in your provisioning recipes

In the future, node objects will have fields for policy_name and policy_group; once that’s done, Chef Provisioning can learn to set those, which will solve this. That will also make the chef_config stuff redundant. This is towards the top of my backlog but I don’t have any ETA.

I think I will experiment with using chef_config: "your client.rb content" instead of the chef-client cookbook; it might cut down a bit on complexity.

Thanks again, this is exactly what I was looking for from this thread. I would be interested in hearing any additional thoughts about my workflow and especially any insight into how you see the coordination between these tools evolving in the next year or so.

HTH,

--
Daniel DeLeo

Hi,
Just wondering if anyone has tried solving the clustering problems using chef-provisioning and policyfiles since this thread was last updated. @Chris_Sibbitt, do you have any updates on how your workflow has turned out?
-Maciej