Similar to other stories here, I started with a mono repo, and it quickly
became maintenance hell: managing upstream forks, managing multiple
committers, incompatible merges. All of these things happened at one time
or another. We used braid for a while, then moved to Berkshelf.
Although there are things about berks I am not happy with, overall it has
been easier to manage.
IMO a cookbook is a software project with its own lifecycle. As such it
should be treated as its own project. Managing its versions, deps, and
testing independently in its own repo makes this easier.
A major benefit of breaking out cookbooks was unearthing some hidden
assumptions and tight coupling we had. Tooling for a cookbook repo is simple
to automate with scripts. We use rake[1] tasks to keep a lot of the testing
framework up to snuff by merging in a skeleton cookbook.
Another benefit was re-use across groups. Breaking up cookbooks from a
chef-repo allows every group in our large org to manage their
infrastructure independently, while all teams funnel work back into
the cookbooks. This promotes collaboration, as opposed to forking
everything from the mono repo or having one team own all the infra.
Our workflow is one in which our CI emits tested cookbooks into a
'cookbook' server, and everyone uses that as a source for their berks, so
that we can have confidence in what we pull. Our CI versions[2] our
cookbooks automatically, and in general we 'trust' versions. There are
'repo' and 'integration' tests that run based on the downstream cookbook
jobs.
We are still actively trying to improve our pipeline, especially around
the integration and release process.
[1] https://github.com/cloudware-cookbooks/ktc-base/blob/master/Rakefile#L63-L86
[2] cook-scm-ver.rb · GitHub
Jesse Nelson
On Mon, Mar 10, 2014 at 10:50 AM, steve . leftathome@gmail.com wrote:
We launched our big-company-wide Chef initiative with a single repo o'
cookbooks. The repo wasn't designed for any one particular purpose, but it
contained a number of roles that were designed to exercise many of the
cookbooks in it and yield a working service as a starting point. We
started with a couple dozen cookbooks.
Once the number of contributors to this repository went above five,
managing contributions and releases became a big headache.
As soon as we were able to re-tune our CI to trigger off of individual
repo changes as well as the master repo, we split the cookbooks out (using
a shockingly short git filter-branch command and 'hub' to create the GHE
repos in the cookbooks org) and left forks of the unified repo a Cheffile
as an upgrade path to the split repos.
(This was shortly before everyone decided Berkshelf was the future, but
procedurally generating a Berksfile is just as easy... the formats are
quite similar!)
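As a rough illustration of what "procedurally generating a Berksfile" can look like, here is a minimal sketch. It is not the script used above: the helper name render_berksfile, the git_base parameter, and the example host are all hypothetical, and the sketch assumes each cookbook lives in its own git repo named after the cookbook. Depending on your Berkshelf version you would also prepend the appropriate site/source line.

```ruby
# Hypothetical helper: turn a map of cookbook name => pinned git ref
# into Berksfile text, one `cookbook` line per entry.
def render_berksfile(cookbooks, git_base:)
  lines = cookbooks.sort.map do |name, ref|
    # Assumes the repo is named after the cookbook (an assumption of this sketch).
    %(cookbook "#{name}", git: "#{git_base}/#{name}.git", ref: "#{ref}")
  end
  lines.join("\n") + "\n"
end

# Example pins; names, versions, and host are made up for illustration.
pins = { "nginx" => "v1.2.0", "base" => "v0.3.1" }
puts render_berksfile(pins, git_base: "git@ghe.example.com:cookbooks")
```

Keeping the pins in a plain hash (or a YAML file loaded into one) means the monthly release only has to touch data, not the Berksfile itself.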
The benefits of this approach to the maintainer(s) of an individual
cookbook should be pretty obvious - you have one clearly-defined issue /
pull request queue, you don't have to worry too much about rebasing against
a fast-moving repository, etc. ...
This has also made the mechanics of a "release" much easier - we update
the pinned versions in that central Cheffile, pull together a change log
and send out an e-mail once a month. People who want to stay bleeding edge
on a cookbook can pull in off-cycle versions if they want.
It's also more straightforward from a CI approach - each potential
dependency is in its own repository with its own trigger, so it's a bit
easier to constrain the scope of integration to just what's changed ...
though of course it's still possible to do this in a unified repository.
One CI area we haven't really explored enough internally is getting
successfully-tested/released changes in one cookbook to trigger CI runs in
dependent cookbooks, though. (In the meantime, we're triggering manual
builds in the days/hours before release.)
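For what it's worth, working out which cookbooks would need a downstream CI run is mostly a reverse-dependency walk. The following is a minimal sketch, assuming the dependency map has already been extracted from each cookbook's metadata; the function name dependents_of and the example cookbook names are hypothetical, and nothing here talks to a real CI server.

```ruby
require "set"

# deps maps a cookbook name to the list of cookbooks it depends on.
# Returns every cookbook that depends on `changed`, directly or transitively.
def dependents_of(changed, deps)
  # Invert the map: cookbook => cookbooks that depend on it.
  reverse = Hash.new { |hash, key| hash[key] = [] }
  deps.each { |name, needs| needs.each { |need| reverse[need] << name } }

  seen = Set.new
  queue = [changed]
  until queue.empty?
    current = queue.shift
    reverse[current].each do |dependent|
      next if dependent == changed || seen.include?(dependent)
      seen << dependent
      queue << dependent
    end
  end
  seen.to_a.sort
end

# Example dependency map (made up for illustration).
deps = {
  "app"   => ["nginx", "base"],
  "nginx" => ["base"],
  "db"    => ["base"],
  "base"  => []
}
puts dependents_of("base", deps).inspect
```

A CI job for a cookbook could call something like this after a successful release and queue builds for each name returned.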
In summary, three years ago it might have made sense to have everything in
the same bucket, but I don't think that approach scales up to larger teams
and/or higher frequencies of contributions.
On Mon, Mar 10, 2014 at 8:03 AM, Morgan Blackthorne <stormerider@gmail.com> wrote:
Wanted to bump this thread and see if anyone else had further feedback on
this...
On Friday, February 21, 2014, Morgan Blackthorne stormerider@gmail.com
wrote:
This is actually something that we've been discussing at my workplace.
Right now, we have one master repo for all of our cookbooks, each in their
own subdirectory. Bamboo is polling this repo and will execute a Rake task
on updates to push out new changes via Berkshelf, where the cookbooks are
listed using the 'rel' tag (assuming it passes knife cookbook test on each
of the cookbooks). We also have a secondary scheduled job that runs
foodcritic/rubocop and reports on the results.
Given that we're only using this repo for our own internal cookbooks
which are too specific to be of any use to anyone else (even if Legal would
allow us to share them), what are the pros/cons of this approach? It seems
like we would lower git contention between members of our team if we broke
them out into different repos, but I'm not sure how we would then refactor
the CI jobs. One thing I like about this approach is that the only thing we
have to do in regards to CI is just to add the new cookbook to the
Berksfile and it just works. If we set up Bamboo to monitor multiple repos,
that increases the chance that someone will add a new cookbook and forget
to monitor that new repo in both Bamboo jobs (pushing and linting). Not to
mention that it complicates the jobs themselves which now have to pull in
multiple repos-- Berks will handle that fine, but knife cookbook test will
need them all checked out to execute, as will foodcritic/rubocop on the
linting side, and I definitely like the acceptance criteria of passing
knife before being pushed with berks. I don't like the thought of pushing
it up to the chef server with potentially broken ruby code.
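The acceptance gate described above (syntax-check every cookbook, and push only if all of them pass) can be sketched roughly like this. It is a minimal sketch rather than the actual Rake task from this setup: the function name push_if_clean and the injectable runner parameter are hypothetical, and it assumes "berks upload" is the push step.

```ruby
# Runs `knife cookbook test` for each cookbook and uploads only when every
# check passes. `runner` executes a shell command and returns true on
# success; it is injectable so the gate can be exercised without a Chef server.
def push_if_clean(cookbooks, runner: ->(cmd) { system(cmd) })
  failed = cookbooks.reject { |name| runner.call("knife cookbook test #{name}") }
  unless failed.empty?
    warn "Not uploading; failed syntax check: #{failed.join(', ')}"
    return false
  end
  runner.call("berks upload") == true
end
```

From a Rakefile this could be wired up as a task body that calls push_if_clean with the cookbook names and aborts the build when it returns false.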
Now, we could do per-repo Rakefile/Berksfile setups, but that increases
the overhead of setting up a new cookbook. And the idea of having 20+ jobs
in Bamboo, each for their own cookbook, seems wrong to me.
Thoughts?
--
~~ StormeRider ~~
"Every world needs its heroes [...] They inspire us to be better than we
are. And they protect from the darkness that's just around the corner."
(from Smallville Season 6x1: "Zod")
On why I hate the phrase "that's so lame"... http://bit.ly/Ps3uSS
On Fri, Feb 21, 2014 at 2:46 PM, Pete Cheslock petecheslock@gmail.com wrote:
It's just like the choice of which configuration management tool to use:
my vote is to pick one and go. Starting with a single repo is the easiest
way for beginners to get started. And as you scale you can split out into
separate cookbooks.
On Fri, Feb 21, 2014 at 5:36 PM, Booker Bense bbense@gmail.com wrote:
I doubt there's a hard and fast rule to apply to all situations, but
there has been a lot of experience with using a single repo for the entire
set of chef cookbooks. That was more or less the default recommendation 3
years ago. Almost everyone that started there has changed to a repo per
cookbook.
At this point I think you have to have a really strong reason not to use
a repo per cookbook. Or at least a repo per cookbook suite (a set of
related cookbooks that have interdependencies).
Having a separate repo for each cookbook will make automated testing
easier and it also imposes some discipline on creating dependencies.
Automated config management is a powerful amplifier, but unfortunately it
amplifies stupid just as fast as clever. The more testing you do the
better, and at this point the tools are there to make TDD part of your
Chef workflow.
On Fri, Feb 21, 2014 at 2:14 PM, Alex Myasnikov <
amyasnikov@practicefusion.com> wrote:
Ohai Chefs,
I am trying to understand what advantages (and disadvantages, if any)
there are in having a git repo per cookbook in the chef-repo, as
opposed to having all of one's application cookbooks in a single git repo.
Up to this point I was thinking of a single repo containing all
cookbooks (minus community ones managed by Berkshelf); however, I came
across a few references (below) that mentioned having a git repo per
cookbook. It seems like the latter helps CI, but I am not sure how exactly,
what the tangible benefits are, and what the potential tradeoffs are. Does
having a repo per cookbook that's developed constitute a best
practice?
The first reference is from last year's ChefConf presentation, Getting
More Chefs in the Kitchen - Andrew Gross
http://www.youtube.com/watch?v=ipSudpDYhTM (slide depicting a master repo
consisting of individual repos per cookbook)
And then Nathen Harvey's blog post on MVT had this snippet:
- gem install foodcritic
- Go to Travis CI http://travis-ci.org/ and follow the Sign In
link at the top.
- Activate the GitHub Service Hook for your cookbook's repository
from your TravisCI profile page. Each of your cookbooks has its own
repository, right?!
MVT: Foodcritic and Travis CI - CustomInk Technology Blog
Setup:
Chef Server 11
Berkshelf 2.X
Thanks in advance.