Policyfile workflow/mindset questions

I’ve made a few casual, aborted attempts at updating our monolithic Chef repo to
be able to use Policyfiles, and I’m ready to question whether our Chef usage
patterns would be compatible with Policyfiles as they are currently implemented,
or whether I’m just approaching Policyfiles in the “wrong” way.

I’ve read multiple times in multiple places that Policyfiles are meant to enable
an opinionated workflow, and I’m willing to accept that the workflow just doesn’t
jibe with how things are currently done around here, or equally willing to check
my assumptions at the door and adopt a new mindset if the resulting workflow
truly is better.

Current Chef usage

  • Team of one (me), first introduced infrastructure-as-code/automation at this
    company last July - previous infrastructure was all ad-hoc
  • Worked ~15 months with Chef at previous job and carried many (hopefully good)
    practices forward
  • Merging/splitting businesses over company history, along with varying levels
    of security and performance requirements, led to multiple AWS accounts with no
    peering or external-to-external connectivity allowed (at this time) as well as
    some bare-metal servers at a local data center
  • Currently have three separate Chef servers spread across these AWS accounts
    with their own single orgs and their own node inventory
  • Each Chef org/business division has some common needs and some unique needs
  • Each Chef org/business division has some combination of dev/staging/production
    nodes (both in actual real-world purpose and Chef node->environment assignments)
  • Majority of Chef recipe logic for the common needs can be shared as-is between
    nodes belonging to different Chef orgs, with small variances that are mostly
    done via attributes
  • Each Chef Server receives the full complement of our internal cookbooks, and
    most of the same data bags - to be quite honest, there are probably better
    isolation practices to be had here
  • Chef environments are currently respected via conditional logic in our
    internal cookbooks, rather than applying attributes, roles or run lists
    directly themselves
  • Internal cookbooks are version-pinned in production Chef environment for
    controlled roll-out
  • All existing Chef nodes are single purpose (no sharing a host between DB and
    app, etc.)
  • Nodes are provisioned for their intended purpose by composing base/shared
    recipes and single-purpose recipes via the run list
  • No Chef roles are used at this time (my mentor in Chef believed they were an
    anti-pattern, for many of the reasons which led to the development of
    Policyfiles as far as I understand)
  • Almost no community cookbooks are used as-is, but are wrapped first - some of
    the wrappers are merely setting attributes and then including the community
    cookbook
  • Behavioral variances between organizations that are not merely “change this
    attribute to another value” are done via recipes that extend or replace the
    base recipe
  • Merges to the master branch of our monolith Chef repo trigger an upload of
    cookbooks to all three Chef servers, via a root Berksfile/Berksfile.lock that
    walks the cookbooks directory for our internal cookbooks

Policyfile-related questions and thoughts

Some of the above makes Policyfiles seem appealing to me, but I am experiencing
some hurdles for adoption and am trying to determine how many of them are
technical and how many are due to incorrect or incomplete assumptions on my
part, or my continuing use of bad or outdated habits within a newer paradigm.

  • Are Policyfiles, as implemented, required to be the single entrypoint and
    source of truth for what Chef does to a given node? Is this likely to
    continue to be so?

  • If yes to the above, are there any ways to DRY up the inclusion of the
    base/shared recipes which do things like set up our OS user accounts, package
    manager defaults, etc. when using Policyfiles?

  • I currently deploy the same cookbook multiple times with
    dev/staging/production variants, often multiplied by the three Chef
    organizations we have that require slightly different configuration via
    attributes. Am I likely going to need a separate Policyfile per org and/or per
    environment?

  • What does the workflow look like for promoting a given cookbook version from
    dev to staging to production, or whatever their equivalent is at your
    organization?

  • If you’ve implemented Policyfiles at your organization, could you describe
    what your project’s file structure and workflow look like? I’m
    considering adopting a policies/$org/$purpose/Policyfile.rb structure or
    similar and storing them as part of our monolithic Chef repo if I pursue this
    seriously

  • Last I checked, it was not possible to add or override arbitrary attributes in
    TestKitchen when using the policyfile_zero provisioner, like you could with
    the other provisioners, but that might be added eventually. What about in
    "real-world" usage?

  • Has the story about other third-party tools (e.g. Terraform) gotten any better
    yet?

Closing Comments

In particular, I think a few of our simpler wrapper cookbooks could disappear
entirely in favor of Policyfiles.

The main thing blocking my adoption, other than inexperience, is my widespread
use of chef-sugar in multiple internal cookbooks, which will require either an
update to that cookbook or a rewrite to use within a Policyfile due to
this issue.

The monolithic repo model is also proving to make it difficult if not impossible
to do any kind of gradual update and rollout of a shift to Policyfiles. This has
also caused me headaches with our trial of Delivery, so it may be time to beg
the powers that be for permission to create some 30-40 additional private Git
repos.

I'm not entirely sure what you mean by this, but I can say that the Policyfile.lock.json that gets generated is the only place that chef-client will look for a run list, the attributes at the role-level precedence, and the cookbook version set. Currently, the only input into the process that creates the lock JSON is one Ruby file. In the future we're planning to have some kind of "snippet" mechanism where you can have multiple files as inputs to that process. That will be a post-1.0 kind of thing though.

Right now you would have to either accept the duplication or use role cookbook type approaches.

You would have a policy group for dev/stage/prod; the point of the policy group is you can have a different revision of your policy active in each group. What I've done to replace environment attributes is stick those in a data bag item and then copy that to the node via a cookbook. The other way to do that is to put the full matrix of attributes in your policyfile and then select the right set for the environment with logic in a cookbook.
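
A minimal sketch of the second approach (full matrix in the Policyfile, selection in a cookbook), written as plain Ruby so it can be run outside Chef. The attribute names, hostnames, and the helper `attributes_for_group` are hypothetical. In a real recipe the matrix would presumably be read from node attributes set in the Policyfile, and the group name would come from `node.policy_group`, assuming a chef-client recent enough to expose it.

```ruby
# Hypothetical attribute matrix, keyed by policy group. In a Policyfile this
# could be declared as default['some_app']['by_group'] = { ... } so that the
# whole matrix ends up in the lock file.
ATTRIBUTE_MATRIX = {
  'dev'     => { 'log_level' => 'debug', 'db_host' => 'db.dev.example.internal' },
  'staging' => { 'log_level' => 'info',  'db_host' => 'db.staging.example.internal' },
  'prod'    => { 'log_level' => 'warn',  'db_host' => 'db.prod.example.internal' }
}.freeze

# Return the attribute set for one policy group. An unknown group raises
# instead of returning nil, so a typo fails the run loudly.
def attributes_for_group(matrix, policy_group)
  matrix.fetch(policy_group) do
    raise ArgumentError, "no attributes defined for policy group #{policy_group.inspect}"
  end
end

# In a recipe, the second argument would be node.policy_group (assumption).
selected = attributes_for_group(ATTRIBUTE_MATRIX, 'staging')
puts selected['log_level']
```

Keeping the whole matrix in the Policyfile means every value is captured in the lock file, at the cost of every node receiving the settings for every group.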

I have a customer who uses a monolithic repo, with a single policies/ directory. You can name the policies whatever you want (e.g., policies/some-app-frontend.rb) and all the chef commands support an extra argument to reference the policyfile (e.g., chef install policies/some-app-frontend.rb). Though if you would be happier with shorter command lines and using cd to change contexts, what you've described is also ok.

TestKitchen will soon natively support policyfiles the same way it does berks (no custom provisioner required), and we'll address that then.

I'm not personally aware of the problem, but if you have the latest chef-client and chef server, you can set the policy name and policy group in the node json, if that helps.
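
As a rough illustration of the node JSON route, the sketch below renders a first-boot JSON document with the two keys mentioned above. It assumes the provisioning tool can place a file on the node and pass it to chef-client with the `-j` option; the helper name `first_boot_json` and the policy and group values are hypothetical.

```ruby
require 'json'

# Build the JSON document handed to chef-client on first boot
# (chef-client -j FILE). policy_name and policy_group are the two keys
# referred to above; the values passed in are up to the caller.
def first_boot_json(policy_name:, policy_group:)
  JSON.pretty_generate(
    'policy_name'  => policy_name,
    'policy_group' => policy_group
  )
end

# Hypothetical values for illustration only.
puts first_boot_json(policy_name: 'some-app-frontend', policy_group: 'dev')
```

Any tool that can template a small JSON file should be able to produce the same document without Ruby.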

That issue should only apply if you are using the "compatibility mode" which is where cookbooks get uploaded with zany version numbers to emulate the behavior of the cookbook_artifact endpoint that now exists in Chef Server. Unfortunately, Chef Zero doesn't support that API just yet, and Chef Zero is what you use in test kitchen (via the chef-client local mode feature), but that is also coming soon. That bug should go away once Chef Zero supports the new policyfile and cookbook artifact APIs.

You should be able to do full verticals by server type. As long as you don't turn on the compatibility mode, you can have some nodes using policyfiles and some not using them in the same Chef Server and organization. Likewise, storing your Policyfiles in a monolithic repo is an explicitly supported thing in Policyfiles, and you can use the default_source :chef_repo, "path" directive to pull cookbooks from a monolithic repo.
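
For reference, a Policyfile stored inside a monolithic repo might look roughly like the fragment below. This is a sketch, not a tested configuration: the policy name, recipe names, and attribute are hypothetical, and the relative path assumes the file lives in a policies/ directory directly under the repository root, next to cookbooks/.

```ruby
# policies/some-app-frontend.rb (hypothetical file)
name 'some-app-frontend'

# Resolve cookbooks from the surrounding monolithic repo instead of a
# Supermarket. The '..' path is an assumption about the directory layout.
default_source :chef_repo, '..'

# Base/shared recipes first, then the single-purpose recipe.
# The recipe names are placeholders.
run_list 'base::default', 'some_app::frontend'

# Attributes set here land at the role-level precedence in the lock file.
default['some_app']['listen_port'] = 8080
```

With a layout like this, the file would be passed explicitly to the chef commands, as in the `chef install policies/some-app-frontend.rb` example earlier in the thread.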

Thanks for the very thorough reply, Daniel!

I currently deploy the same cookbook multiple times with
dev/staging/production variants, often multiplied by the three Chef
organizations we have that require slightly different configuration via
attributes

Different variants or different versions of the same cookbook? If it is
actually different variants then I feel this is better handled by
environments (or some equivalent like a structure in your policyfile to
supply environment variables) and let the logic in the cookbook code make
the changes for the environment based on those attributes. That way you’re
controlling a common set of cookbooks for all environments (per org).

I feel like this might simplify your implementation (it will at least
eliminate the need for having a separate Policyfile per environment, which
seems counter-intuitive to me), especially if the desire is to test a
Policyfile on an environment and then promote that same policy file to the
next environment as a way of controlling release/change.

The last time I looked at policyfiles (which was admittedly a while ago) it
seemed like environment support was a little lacking/difficult, but it also
isn’t necessarily something that is directly related to Policyfiles.
I think it could solve a few problems if the Policyfile structure could
somehow reference per environment (per policy group?) attributes, either
from a data structure in the policyfile or from an external file, but with
the output lock file containing all static values so it can still be a
versionable object.

I've done something like this with data bags:

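  # APPNAME and LIFECYCLE_STAGE are placeholders for the application name and
  # the stage of the node being converged.
  env_data = Chef::DataBagItem.load('application_environment_APPNAME', LIFECYCLE_STAGE).raw_data
  # Drop the data bag item's own id so it is not copied into node attributes
  env_data.delete('id')
  # Use `env_default` because it will be unused in policyfile mode
  attributes.env_default = env_data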

As you say, the data bag item could be updated out of band, but in this case that wasn't a concern because the data bag item is owned by the same CD system that is managing the cookbooks and policyfiles.

Open to hearing what others have done and adding sugar in Chef core for the most successful patterns.