Continuous Integration

How are people testing their cookbooks?

Who has continuous integration set up?

Anyone favor particular approaches, insightful recommendations or
scaffolding to share before I set off to reinvent a square wheel?

This hasn't been invented yet. Look forward to it :)

On 8 Jul 2010 12:26, "Andrew Shafer" andrew@cloudscaling.com wrote:

How are people testing their cookbooks?

Who has continuous integration set up?

Anyone favor particular approaches, insightful recommendations or
scaffolding to share before I set off to reinvent a square wheel?

On Wed, Jul 7, 2010 at 5:27 PM, AJ Christensen aj@junglist.gen.nz wrote:

This hasn't been invented yet. Look forward to it :)

I've got some ideas. You can use SSH to perform remote assertions
(e.g. zero or nonzero return code) on the converged box: for example,
check that a package is installed, a service is running, etc. These
tests could be written in any test framework/language which allows you
to invoke SSH and check the return code (or parse the SSH command output).
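
A minimal Ruby sketch of that SSH assertion idea. The helper names (`ssh_runner`, `remote_ok?`, `failed_checks`), the host name, and the example commands are all assumptions made for illustration; nothing here is part of Chef. The runner is injectable so the logic can be tried without a real box.

```ruby
require "open3"

# Default runner: shells out to the system ssh client and returns
# [combined_output, Process::Status], as Open3.capture2e does.
# BatchMode=yes stops ssh from prompting for a password.
def ssh_runner
  lambda do |host, command|
    Open3.capture2e("ssh", "-o", "BatchMode=yes", host, command)
  end
end

# True when the remote command exits with status zero.
# `runner` is a hypothetical seam so a fake can be substituted.
def remote_ok?(host, command, runner: ssh_runner)
  _output, status = runner.call(host, command)
  status.success?
end

# Runs a named set of checks and returns the names of those that failed.
def failed_checks(host, checks, runner: ssh_runner)
  checks.reject { |_name, command| remote_ok?(host, command, runner: runner) }.keys
end

# Example (assumed Debian-style commands and a made-up host):
#   checks = {
#     "nginx package installed" => "dpkg -s nginx",
#     "nginx service running"   => "service nginx status",
#   }
#   failed_checks("app1.example.com", checks)
```

An empty result from `failed_checks` would be the "green" case; any test framework that can compare arrays could wrap it.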

Thoughts? Anyone done anything along these lines?

-- Chad

I wonder if something like this could be useful (at least for file, dir, template, etc.):
GitHub - fakefs/fakefs: A fake filesystem. Use it in your tests.

I also don't have it handy, but I think Jesse Newland at Rails Machine has some public projects for fully testing Puppet recipes.

On Wed, Jul 7, 2010 at 8:09 PM, Erik Kastner kastner@gmail.com wrote:

I wonder if something like this could be useful (at least for file, dir, template, etc)
GitHub - fakefs/fakefs: A fake filesystem. Use it in your tests.

Ah, you are thinking unit testing, where I was thinking integration testing.

In my experience, unit testing Ops/OS/Deploy code is an exercise in
futility. You can faithfully test drive the implementation as you
THINK it should be, but you invariably find that it doesn't work as
you expected, because of some unexpected behavior/interaction of the
OS or system. So, you change the code to actually work on the real
system, then change your tests to match the code. That is pretty
pointless; it doesn't even buy you the regression safety net of normal
unit tests (because upgrades to the OS/System could break you at any
time, even if your chef code doesn't change). I suppose it does
provide syntax checking, but that's not worth the considerable effort,
in my opinion.

On the other hand, saying "after I run Chef, assert that a real remote
system should have the package X installed and service Y running" is
pretty useful.

-- Chad

It seems what we're all really talking about is some kind of BDD framework
for Chef cookbooks - unit testing (also related to --noop) has proven
unreliable for this kind of thing: you can't mock UNIX.

I know there was work done on this in the past - mikehale's chef-bdd
project (GitHub - mikehale/chef-bdd: Cucumber for your chef cookbooks), for example.

On Wed, Jul 7, 2010 at 9:14 PM, AJ Christensen aj@junglist.gen.nz wrote:

It seems what we're all really talking about is some kind of BDD framework
for Chef cookbooks - unit testing (also related to --noop) has proven
unreliable for this kind of thing: you can't mock UNIX.
I know there was work done on this in the past - mikehale's chef-bdd
project (GitHub - mikehale/chef-bdd: Cucumber for your chef cookbooks), for example.

I don't get what that is for. To only test the filesystem aspects of
cookbooks? Does it somehow mock out other calls to run services and
such? What about stuff like the hostname reference:

chef-bdd/features/step_definitions/custom_steps.rb at master · mikehale/chef-bdd · GitHub

Too bad there isn't a README...

AJ Christensen aj@junglist.gen.nz writes:

It seems what we're all really talking about is some kind of BDD framework
for Chef cookbooks - unit testing (also related to --noop) has proven
unreliable for this kind of thing: you can't mock UNIX.

I know there was work done on this in the past - mikehale's chef-bdd
project (GitHub - mikehale/chef-bdd: Cucumber for your chef cookbooks), for example.

FWIW, R.I.Pienaar has done some work on this in his mcollective tools:

http://www.devco.net/archives/2010/03/27/infrastructure_testing_with_mcollective_and_cucumber.php

Regards,
Daniel

--
✣ Daniel Pittman :email: daniel@rimspace.net :phone: +61 401 155 707
♽ made with 100 percent post-consumer electrons

We had some good conversations in this area at Velocity. What we came
down to is that possibly the path of least resistance is to
leverage the fact that chef recipes are (mostly) idempotent, so that
Chef itself knows if a cookbook has actually worked successfully.
Clearly there are a few resource types for which that's not true - execute
(without not_if/only_if etc) is the major one - but in general a chef
run immediately after a successful chef run should result in zero
actions.

So a very quick Red/Green test would be just to run the compile stage,
not converge it, and check that nothing was expected to happen.

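
One way to approximate that Red/Green check is to look at the output of a second run and confirm it reports zero updated resources. The sketch below is an assumption-heavy illustration: it presumes the run output contains a summary of the form `N/M resources updated` (recent chef-client versions print one; older clients may not), and the helper names are made up.

```ruby
# Extracts the updated/total counts from the text of a chef-client run.
# ASSUMPTION: the output contains a summary such as "3/15 resources updated".
# Returns nil when no such summary is found.
def resources_updated(run_output)
  match = run_output.match(%r{(\d+)/(\d+) resources updated})
  return nil unless match

  { updated: match[1].to_i, total: match[2].to_i }
end

# Green only when a summary was found and it reports zero updates.
# A missing summary counts as red, so a crashed run is never mistaken
# for an idempotent one.
def idempotent_run?(run_output)
  summary = resources_updated(run_output)
  !summary.nil? && summary[:updated].zero?
end
```

Resources such as an unguarded execute will always count as updated, so a cookbook containing them would need those resources guarded (or excluded) before this check can go green.
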
BDD - as was pointed out again at LdnDevOps last week - for cookbooks
could end up being just restating your cookbook in a different
language - there needs to be a compelling way of doing this that
doesn't result in (excuse stupid example):

recipe.rb

package "foo" do
  action :install
end

file "/etc/foo" do
  action :delete
end

bdd.rb

the package "foo" is installed
the file "/etc/foo" is not present

Cheers,
-T

This is something I have an active interest in too as I’m in the
planning/feasibility stages of such a project.

Initially I was thinking of driving the test with expect scripts to
test the system on the ‘first pass’ after it’s been converged by Chef.
That has now grown to include testing with Cucumber and the like.

The plan is to automatically build each particular set of systems in
the production configuration tree in VMs (every night/week) and ensure
that the tests pass, so that we know of any issues pre-deployment.

In addition to this we’d like to run the same tests post-deployment
and produce a report to act as a sign off for the customer/client.

If anyone gets any further with this or has real world experience or
wants to collaborate, I’d be most interested!

Joel


$ echo "kpfmAdpoofdufevq/dp/vl" | perl -pe 's/(.)/chr(ord($1)-1)/ge'

On Thu, Jul 8, 2010 at 12:36 AM, Chad Woolley thewoolleyman@gmail.com wrote:

I don't get what that is for. To only test the filesystem aspects of
cookbooks? Does it somehow mock out other calls to run services and
such? What about stuff like the hostname reference:

chef-bdd/features/step_definitions/custom_steps.rb at master · mikehale/chef-bdd · GitHub

Too bad there isn't a README...

This code is very much a work in progress, and I haven't made any
progress in a while. I started out trying to mock everything, but it
was pointed out to me that mocking UNIX is silly. My current thinking
is to combine some cucumber steps like those in the aforementioned
file with vagrant to auto provision a test cloud that matches a
description. So the cucumber might look something like this:

Given I have cluster X
When I have run chef-client on all the nodes
Then I should see "my load balanced website" on "http://example.com"
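
The last step of that scenario could be backed by a plain Ruby check along these lines. This is only a sketch: `http_fetcher` and `page_shows?` are hypothetical names, and wiring it into an actual Cucumber step definition is left out. The fetcher is injectable so the check can be exercised without a network.

```ruby
require "net/http"
require "uri"

# Default fetcher: performs a real HTTP GET and returns the response body.
def http_fetcher
  ->(url) { Net::HTTP.get(URI(url)) }
end

# True when the page at `url` contains `text`.
# Connection failures and timeouts count as "not shown" rather than
# raising, so an unreachable site simply fails the assertion.
def page_shows?(url, text, fetcher: http_fetcher)
  fetcher.call(url).include?(text)
rescue SocketError, SystemCallError, Timeout::Error
  false
end

# Example (performs a real request):
#   page_shows?("http://example.com", "my load balanced website")
```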

I'm thinking that the useful thing to test is NOT did chef install
some package or setup a user, but rather after chef has run can I
interact with the system as I would expect from an external
perspective. For example:

  • Is the website accessible?
  • Are unused ports blocked?
  • When I send an email through the website does it end up in my inbox?

Capybara (GitHub - teamcapybara/capybara: Acceptance test framework for web applications) enforces this external
perspective for webapp testing:

"Access to session, request and response from the test is not
possible. Maybe we’ll do response headers at some point in the future,
but the others really shouldn’t be touched in an integration test
anyway. "

They only let you interact with screen elements that a user could
interact with. It makes sense because the things that users interact
with are what provides the business value and if that interface
doesn't work it doesn't really matter.

On Thu, Jul 8, 2010 at 5:05 PM, Michael Hale mikehale@gmail.com wrote:

I'm thinking that the useful thing to test is NOT did chef install
some package or setup a user, but rather after chef has run can I
interact with the system as I would expect from an external
perspective. For example:

  • Is the website accessible?
  • Are unused ports blocked?
  • When I send an email through the website does it end up in my inbox?

Capybara (GitHub - teamcapybara/capybara: Acceptance test framework for web applications) enforces this external
perspective for webapp testing:

If you think about it, it kind of makes sense to call tools like
Capybara and Webrat from within Cucumber to help do the heavy lifting
for these very high-level integration tests. At least to get things
going without spending hours and hours over it all.

Then what's still missing is a couple of things:

  1. Having some standard way of hooking into a chef run, so that once a
    change is committed to your git cookbooks, a git post-receive hook
    (probably) can trigger the chef run and the subsequent testing.

As far as I know, the Integrity continuous integration server would be the
best tool for that job.

  2. Secondly, having some common library of test helpers. These would
    be reusable and perform the heavy lifting for several categories of
    common testing. For example, take a nondescript TCP service. It might
    be a webserver, or it could be anything running over TCP/UDP, e.g.
    Samba, NFS, LDAP, etc. We would probably want some low-level network
    service test to assert that a given host is listening on TCP ports
    A, B, C and UDP ports X, Y, Z.

In Cucumber you would have to get the hostnames etc. from the chef
attributes of your test network. So that might require querying the
chef API, I guess.
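
A low-level helper of that kind might look like the Ruby sketch below. The names `tcp_listening?` and `check_tcp_ports` are made up for illustration. It covers TCP only: UDP has no handshake, so a UDP check would need a protocol-specific probe and is not attempted here.

```ruby
require "socket"

# True when a TCP connection to host:port succeeds within `timeout` seconds.
# Refused connections, unreachable hosts, name lookup failures, and
# timeouts are all treated as "not listening".
def tcp_listening?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Splits the given ports into those that accepted a connection and
# those that did not.
def check_tcp_ports(host, ports, timeout: 2)
  open, closed = ports.partition { |port| tcp_listening?(host, port, timeout: timeout) }
  { open: open, closed: closed }
end

# Example (made-up host; ports for HTTP, HTTPS, and LDAP):
#   check_tcp_ports("web1.example.com", [80, 443, 389])
```

The same helper works in the other direction for the "are unused ports blocked?" question: ports that should be closed are expected to land in `:closed`.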

"Access to session, request and response from the test is not
possible. Maybe we’ll do response headers at some point in the future,
but the others really shouldn’t be touched in an integration test
anyway. "

They only let you interact with screen elements that a user could
interact with. It makes sense because the things that users interact
with are what provides the business value and if that interface
doesn't work it doesn't really matter.

On Thu, Jul 8, 2010 at 11:21 AM, Dreamcat4 dreamcat4@gmail.com wrote:

  1. Having some standard way of hooking into a chef run, so that once a
    change is committed to your git cookbooks, a git post-receive hook
    (probably) can trigger the chef run and the subsequent testing.

As far as I know, the Integrity continuous integration server would be the
best tool for that job.

Getting off topic, but I have to point out that git commit hooks are
fundamentally flawed for triggering CI builds. If the request gets
dropped, or your CI server is down, you fail to build for that commit.
If that commit broke the build, you never know until the next commit,
which could be much later. Polling the source control repo log is
failsafe, superior, and supported by all CI tools.
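
To illustrate why polling recovers from a missed cycle where a hook does not, here is a minimal Ruby sketch of one polling step. `poll_once`, `fetch_head`, and `build` are hypothetical names; real CI servers implement this internally.

```ruby
# One polling step: build the current head revision if it differs from
# the last revision that was built. Returns the revision now considered
# built. `fetch_head` and `build` are callables supplied by the caller;
# `fetch_head` returns nil when the repository cannot be reached.
def poll_once(last_built, fetch_head:, build:)
  head = fetch_head.call
  return last_built if head.nil? || head == last_built

  build.call(head)
  head
end

# A fetch_head for git could be written as (assumes git is installed
# and repo_url points at your cookbook repository):
#   fetch_head = -> { `git ls-remote #{repo_url} HEAD`.split.first }
```

Because the only state is "last revision built", a poll that fails while the repository or CI server is down loses nothing: the next successful poll still sees a head that differs from `last_built` and builds it.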

-- Chad

All this capybara talk makes me think that we need some form of "chef_steps"
that can do things like ssh into a server and check that directories are
there or that services are running. I feel like that'd be pretty awesome
both for testing new cookbooks and for ensuring your system was provisioned
correctly.

Thom May thom@clearairturbulence.org writes:

We had some good conversations in this area at Velocity. What we came down
to is that possibly the path of least resistance approach is to leverage the
fact that chef recipes are (mostly) idempotent, so that Chef itself knows if
a cookbook has actually worked successfully.

[...]

BDD - as was pointed out again at LdnDevOps last week - for cookbooks could
end up being just restating your cookbook in a different language

*nod* I will confess, if I was going to look at this in production I would be
aiming for a set of network-wide "does it work" tests ... and have a view to
hooking them to my monitoring system to run routinely, not just when the CMS
made a change.

I don't see substantial value in anything much beyond watching the outcome
of a run after changes, to make sure it did what was expected, good
monitoring, to catch what isn't expected, and good syntax checks.

    Daniel

Here’s my thinking at this point… which could be wrong on every level.

There is really no good way to TDD/BDD configuration management for several
reasons:

  • The recipes are already relatively declarative
  • Mocking is useless because it may not reflect ‘ground truth’
  • The cycle times to really test convergence are relatively long

Trying to test if a package is installed or not is testing the framework,
not the recipe IMHO.

I agree with the general sentiment that the functional service is the true
test.

I’m leaning towards ‘testing’ at that level, ideally with (a superset of?)
what should be used for the production monitoring system.

So the CI builds services, runs all the checks in test, green can go live
and that’s that.

+1 for your summary

Hey guys,
Thought I’d chime in with my experience testing system configuration code @
RightScale so far. What we’ve been building are integration-style cucumber
tests to run a cookbook through its paces on all platforms and OSs that we
support.

First we use our API to spin up ‘fresh’ server clusters in EC2, one for
every platform/OS (variation) that the cookbook will be supporting. The
same could be done using other cloud APIs (anyone else doing this with
VMware, etc.?). Starting from scratch is important because of chef’s
idempotent nature.

Then a cucumber test is run against every variation in parallel. The
cucumber test runs a series of recipes on the cluster, then uses what we call
‘spot checks’ to ensure the cluster is configured and functional. The spot
checks are updated when we find a bug, to cover the bug. An example spot
check would be SSHing to every server and checking the mysql.err file for
bad strings.
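
For illustration only (this is not the RightScale code), a log spot check of that kind could be sketched in Ruby as below. The function name and the example patterns are assumptions; how the log text is fetched (for instance over SSH) is left to the caller.

```ruby
# Returns the log lines that match any of the given patterns.
# The patterns passed in are up to you; the ones in the example below
# are placeholders, not a vetted list of MySQL error strings.
def bad_log_lines(log_text, patterns)
  log_text.each_line.map(&:chomp).select do |line|
    patterns.any? { |pattern| line.match?(pattern) }
  end
end

# Example:
#   log = File.read("mysql.err")   # or text captured from a remote host
#   bad_log_lines(log, [/\[ERROR\]/, /crashed/i])
```

An empty result means the spot check passes. When a new bug turns up, adding its telltale string to the pattern list is how the check grows to cover it.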

These high-level integration tests are long-running but have been very
useful for flushing out bugs.
-J

On Fri, Jul 9, 2010 at 9:30 AM, Andrew Shafer andrew@cloudscaling.com wrote:

Here's my thinking at this point... which could be wrong on every level.
There is really no good way to TDD/BDD configuration management for several
reasons:
The recipes are already relatively declarative
Mocking is useless because it may not reflect 'ground truth'
The cycle times to really test convergence are relatively long
Trying to test if a package is installed or not is testing the framework,
not the recipe IMHO.
I agree with the general sentiment that the functional service is the true
test.
I'm leaning towards 'testing' at that level, ideally with (a superset of?)
what should be used for the production monitoring system.
So the CI builds services, runs all the checks in test, green can go live
and that's that.

Agreed on most points, except I'm not sure about the assertion that
testing package installation is testing the framework.

Your recipe could refer to an incorrect package name, or version, or
the system might not be correctly configured with the source to
download that package.

Those are bugs in your recipe, not the framework.

However, this should be caught by the chef run blowing up. And, you
shouldn't even be attempting to test the converged system if the chef
run failed. Which means that package installation is not worth
testing after all...
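
That ordering (converge first, and only test the converged system if the run succeeded) can be sketched in Ruby as follows. `verify_node`, `converge`, and `checks` are hypothetical stand-ins; in practice `converge` would wrap the chef run's exit status and each check would be a functional test against the live system.

```ruby
# Runs the converge step, then the functional checks only if it succeeded.
# `converge` is a callable returning true/false; `checks` maps a name to
# a callable returning true/false.
def verify_node(converge:, checks:)
  return { converged: false, failures: [] } unless converge.call

  failures = checks.reject { |_name, check| check.call }.keys
  { converged: true, failures: failures }
end

# Example with placeholder callables:
#   verify_node(
#     converge: -> { system("chef-client") },
#     checks: { "site responds" => -> { true } }
#   )
```

A failed converge is reported on its own and no checks run, so package-level failures surface through the run itself and the checks are reserved for things a successful run can still get wrong.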

So, I guess this is leading to the conclusion that the only things
worth testing are the things which could go wrong WITHOUT causing the
actual chef run to fail?