Whyrun Mode


Ohai chef devs!

Many of you have seen that we have a branch of Chef called "whyrun"
that’s been receiving a lot of activity lately. I wanted to take some
time to give our community developers a little background, and explain
the impact it might have on your custom providers. We expect to
merge these changes to master over the next few weeks.

Whyrun exists to allow you to preview minor convergence updates to a
node before they are applied. Chef will evaluate your recipes, and
emit as much information as it can about the changes it would make to
the node along with the reasons for doing so. I specify “minor”, but
you can run it with larger change sets – however, the larger the
convergence, the less meaningful the results of whyrun execution are
likely to be.

A critical requirement of whyrun was to ensure that a whyrun mode
execution will not take any action to converge a node – it will only
report on the actions it would take. A major part of the work we’ve
done in the last few weeks has been to modify the bundled resource
providers to ensure this behavior – more on this later.

Provider action execution has been split into two discrete steps.
These steps occur in both whyrun and non-whyrun modes:

  • evaluation of provider assertions and assumptions
  • convergence

** Assertions and Assumptions

In whyrun mode, Chef tries to run optimistically by making reasonable
assumptions wherever possible – this allows it to continue with the
run in the case of missing resources and other non-fatal events. For
example: if your recipe declares a service resource but that service
is not present on the system, Chef will assume that the service would
have been installed earlier in the run, and will treat it as
installed but not running.

Each provider has its own set of assertions that determine the
conditions required for a successful convergence of the resource.
Each assertion also declares whether execution can continue in whyrun
mode if the assertion should fail. If an assertion fails in
non-whyrun mode, it will typically stop execution (for example: if the
service mentioned above is not actually installed when a recipe says
to start it).

Assertions are evaluated based on the action being performed, and only
after load_current_resource has completed and @current_resource is
populated and available.
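
To make the assertion behavior above a bit more concrete, here is a minimal sketch in plain Ruby. It is a toy model and not Chef's provider API: Assertion, AssertionFailed and evaluate_assertions are hypothetical names made up for illustration, and the installed_services array stands in for real system state.

```ruby
# Toy model only: these names are hypothetical and are not Chef's provider API.
Assertion = Struct.new(:failure_message, :check, :whyrun_assumption)

class AssertionFailed < StandardError; end

# Evaluates every assertion and returns the assumptions made along the way.
# Normal mode: the first failed assertion raises.
# Whyrun mode: a failed assertion is tolerated only when it carries an
# assumption message; otherwise it still raises.
def evaluate_assertions(assertions, whyrun:)
  assumptions = []
  assertions.each do |assertion|
    next if assertion.check.call

    if whyrun && assertion.whyrun_assumption
      assumptions << assertion.whyrun_assumption
    else
      raise AssertionFailed, assertion.failure_message
    end
  end
  assumptions
end

installed_services = ["sshd"] # stand-in for real system state

service_exists = Assertion.new(
  "service nginx is not installed",
  -> { installed_services.include?("nginx") },
  "assuming service nginx would have been installed earlier in the run"
)

# Whyrun mode: the failure becomes a recorded assumption and the run continues.
assumptions = evaluate_assertions([service_exists], whyrun: true)
puts assumptions
```

Calling evaluate_assertions with whyrun: false on the same assertion raises AssertionFailed instead, which mirrors the "stop execution" case described above.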

** Convergence

In normal execution mode, convergence operates the same as it always
has - it applies the changes necessary to further converge the node.
In whyrun mode, each resource provider documents the actions that it
would take to converge the resource, without actually executing those
actions.

The convergence step occurs after all assertions have been evaluated.
In order to reach convergence, all assertions must have either passed,
or (in whyrun mode) failed with whyrun descriptions and assumptions in
place.
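
As a rough illustration of the convergence step, the sketch below models a converge_by-style helper in plain Ruby. This is an assumption-laden toy, not Chef's implementation: ToyProvider and its log are hypothetical, and the state hash stands in for the node.

```ruby
# Toy model only: a hypothetical stand-in, not Chef's implementation.
class ToyProvider
  attr_reader :log

  def initialize(whyrun:)
    @whyrun = whyrun
    @log = []
  end

  # In whyrun mode, only document the change. Otherwise run the block
  # and then record what was done.
  def converge_by(description)
    if @whyrun
      @log << "WHY RUN: would #{description}"
    else
      yield
      @log << description
    end
  end
end

state = { running: false } # stand-in for the node's real state

preview = ToyProvider.new(whyrun: true)
preview.converge_by("start service nginx") { state[:running] = true }
state_after_preview = state[:running] # the block was never executed

real = ToyProvider.new(whyrun: false)
real.converge_by("start service nginx") { state[:running] = true }
state_after_real = state[:running]

puts preview.log
puts real.log
```

The same action code runs in both modes; only the helper decides whether the block is executed or merely described.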

** Recipe Notes

  • Resource only_if and not_if blocks are executed even in whyrun
    mode, because evaluating them is necessary in order to provide
    useful whyrun output. To run safely in whyrun mode, you will need
    to ensure that your not_if and only_if blocks take no action to
    change the system state.

  • Ruby code embedded in recipes will still execute normally, so the
    same warning applies: to run safely in whyrun mode, you will need
    to ensure your Ruby code does not affect system state.

  • ruby_block resources are whyrun compatible - they will not be
    executed in whyrun mode.

  • Resources created dynamically in your recipes are whyrun compatible.

  • Compile-time execution of resources may cause odd behavior.
    These resources will not converge in whyrun mode - so if you are
    installing them as dependencies in order to allow subsequent recipes
    to evaluate correctly, the subsequent recipes will fail because no
    resource was actually installed. In the example output below, whyrun
    mode states that we would install a package; but then a subsequent
    attempt to use that package within the recipe fails because it was
    never actually installed:

WHY RUN: would install version 0.4.6 of package oauth
[Mon, 07 May 2012 23:42:30 +0000] DEBUG: Re-raising exception: LoadError - no such file to load -- oauth
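
Going back to the first note in the list above: since guard blocks run in whyrun mode too, the difference between a guard that only inspects state and one that changes it matters. The plain-Ruby sketch below is illustrative only; the marker file and both lambdas are hypothetical and stand in for the bodies of only_if / not_if blocks.

```ruby
require "tmpdir"
require "fileutils"

dir = Dir.mktmpdir
marker = File.join(dir, "configured") # hypothetical marker file

# Safe guard: only inspects system state.
safe_guard = -> { File.exist?(marker) }

# Unsafe guard: creates the marker as a side effect of being evaluated.
unsafe_guard = -> { FileUtils.touch(marker); true }

safe_guard.call
changed_by_safe = File.exist?(marker)   # nothing was created

unsafe_guard.call
changed_by_unsafe = File.exist?(marker) # evaluating the guard changed the system

puts "safe guard changed state: #{changed_by_safe}"
puts "unsafe guard changed state: #{changed_by_unsafe}"

FileUtils.remove_entry(dir) # clean up the temporary directory
```

A guard shaped like the second lambda would modify the node during a whyrun execution, which is exactly what the note warns against.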

** LWRP and Custom Provider Compatibility

By default, Chef assumes that LWRPs and custom providers are not
compatible with whyrun mode. When an action is to be invoked, Chef
will block the entire action and only document “would execute action
XYZ”. However, a LWRP or provider can declare whyrun compatibility -
in so doing, it is saying at minimum that invoking any action method
will cause no convergence to occur in whyrun mode. Chef has no way to
verify this, so it must trust any given provider that declares whyrun
mode support.
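
The default-versus-declared behavior described above can be sketched in plain Ruby as follows. Everything here is a hypothetical toy (whyrun_compatible?, run_action and the provider classes are made-up names, not the Chef API); it only shows the dispatch decision, and like Chef it simply trusts the provider's declaration.

```ruby
# Toy model only: hypothetical names, not Chef's provider API.
class ToyProvider
  attr_reader :log, :whyrun

  def initialize(whyrun:)
    @whyrun = whyrun
    @log = []
  end

  # Default: assume the provider is NOT safe to invoke in whyrun mode.
  def whyrun_compatible?
    false
  end
end

# Converges unconditionally, so its action must never run in whyrun mode.
class LegacyProvider < ToyProvider
  def action_create
    @log << "created widget"
  end
end

# Declares compatibility and takes responsibility for not converging.
class AwareProvider < ToyProvider
  def whyrun_compatible?
    true
  end

  def action_create
    @log << (@whyrun ? "WHY RUN: would create widget" : "created widget")
  end
end

# Blocks the whole action for providers that have not declared support.
def run_action(provider, action)
  if provider.whyrun && !provider.whyrun_compatible?
    provider.log << "WHY RUN: would execute action #{action}"
  else
    provider.public_send("action_#{action}")
  end
  provider.log
end

legacy_log = run_action(LegacyProvider.new(whyrun: true), :create)
aware_log = run_action(AwareProvider.new(whyrun: true), :create)
puts legacy_log
puts aware_log
```

The legacy provider only gets the generic message, while the provider that declared compatibility can describe the change it would actually make.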

The chef provider framework has been extended to supply the tools
needed to ensure that providers are compatible with whyrun. Detailed
documentation will be made available at the time we merge into master.
For a sneak preview you can take a look at the following (some
changes will occur, but nothing major is likely prior to the merge
into master):

In the process of making the necessary changes to our providers, the
following patterns emerged:

  • Providers must fail early. As many failure conditions as possible
    should be handled up front, which means moving them out of your
    "action" methods and into the new "define_resource_requirements"
    method in the form of assertions. The more you can front-load your
    failure conditions, the more meaningful and accurate whyrun output for
    your provider will be.

  • When determining whether a failed assertion should stop execution
    in whyrun mode, apply this basic litmus test: “In a non-whyrun
    execution, would it be reasonable to assume some action would have
    previously occurred to prevent this assertion from failing?” If
    so, the assertion can declare that whyrun execution may continue.

  • When determining how to group actions together, try to group
    changes logically. In the file provider, for example, action_create
    actually contains two discrete changes: 1) create or modify the
    file, and 2) set file permissions.

  • Converge actions should not generally be nested: it’s not safe to
    put a converge_by block inside of another converge_by block,
    especially across multiple providers. This means that you’ll need to
    be aware of any nested resource calls your provider makes.

  • Errors can occur in load_current_resource that would normally
    prevent the provider from proceeding. If you want to be able to
    continue in whyrun mode, the best way to do so is to capture the
    failure in load_current_resource using an instance variable and
    without raising an exception; then document it for whyrun by
    declaring an assertion against that variable.
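
The last pattern in the list above (capture the failure in load_current_resource, then assert on it) might look roughly like the plain-Ruby sketch below. It is a toy model under stated assumptions: ToyServiceProvider is hypothetical, the known_services hash stands in for querying the system, and the action calls load_current_resource and define_resource_requirements itself only to keep the example self-contained.

```ruby
# Toy model only: hypothetical names, not Chef's provider API.
class ToyServiceProvider
  attr_reader :messages

  def initialize(name, known_services, whyrun:)
    @name = name
    @known_services = known_services # stand-in for querying the system
    @whyrun = whyrun
    @messages = []
  end

  # Capture the lookup failure in an instance variable instead of raising.
  def load_current_resource
    @running = @known_services.fetch(@name)
  rescue KeyError
    @lookup_failed = true
    @running = false
  end

  # Turn the captured failure into an assertion: fatal in a normal run,
  # a documented assumption in whyrun mode.
  def define_resource_requirements
    return unless @lookup_failed
    raise "service #{@name} does not exist" unless @whyrun

    @messages << "WHY RUN: assuming service #{@name} would have been installed"
  end

  def action_start
    load_current_resource
    define_resource_requirements
    return if @running

    @messages << (@whyrun ? "WHY RUN: would start service #{@name}" : "started service #{@name}")
  end
end

known = { "sshd" => true }

preview = ToyServiceProvider.new("nginx", known, whyrun: true)
preview.action_start
puts preview.messages
```

With whyrun: false the same call raises, because the captured failure is only tolerated when the run is a preview.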

** Feedback

We’re excited to be close to completion of this long-requested
feature, and look forward to your feedback. Because this is not yet a
released feature of the Chef platform, please refrain from filing
tickets against it. In the interim, this mailing list is the best
place for questions, feedback, issues and discussion.

Thanks!

Marc Paradise
Software Development Engineer
Opscode

Twitter: @MarcParadise
irc: mparadise
Skype: marc.paradise