I’m a bit new to this, but surely the dependency possibilities for the cookbooks actually stored on the server are fixed and can be computed in advance, so that any node’s run list can be satisfied very quickly.
But then perhaps I totally misunderstand the order of magnitude of the problem.
In any case, I don’t see how offloading that problem to every node on every run makes sense.
FWIW…
Wazza
Warren Bain
http://ninefold.com
Australia’s cloud
direct: +61 2 8221 7729
mobile: +61 414 867 559
follow: http://twitter.com/thoughtcroft
Daniel DeLeo dan@kallistec.com wrote:
On Saturday, March 30, 2013 at 1:55 PM, Jay Feldblum wrote:
Daniel,
The chef-client could also do the dependency resolution itself rather than asking the chef-server to do it. The only new API the chef-server would need to provide is a batch API to fetch the full metadata of all versions of all cookbooks uploaded to the chef-server, or at least as much of the metadata as is necessary for the chef-client to perform the dependency resolution. The chef-client can then perform its own dependency resolution on that data and the chef-server wouldn’t need to be involved.
I dislike this approach because it requires an ever-growing amount of data to be shipped to the client on every run, while not solving the problem of version clobbering. With hosted chef, we see that the cookbook version API call is slower than all the others by quite a wide margin, but with the gecode-based solver (I have less personal experience with the pure-erlang replacement) the constraint solution usually only takes a few milliseconds; the call time of the request is dominated by the large amount of disk and network IO required to get the necessary information into the constraint solver. If you were to automate patch-version bumps to avoid clobbering, you would automatically exacerbate this issue.
Adding another (potentially much slower) network link in this chain feels like a move in the wrong direction to me.
In fact, perhaps this should be done anyway. Dependency resolution can take exponential time. Nothing on the chef-server should ever be permitted to take exponential time. If dependency resolution takes too long on one chef-client, that is a problem for that chef-client alone; if it knocks out the whole chef-server, it is a problem for every client in the infrastructure.
Part of the point of my proposal is that dependency resolution is moved to the workstation: compile a list of compatible cookbooks by hand or automatically, then upload (if necessary) and use environments to lock some set of nodes to the pre-computed solution. This feels more elegant because it requires the least computation overall and (more importantly) moves it off of production systems, so there are no worries about all your hosts suddenly chewing up CPU due to a gnarly dependency graph.
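To make the workstation-side idea concrete, here is a rough sketch in Python (not Chef's actual solver, and not anything that ships with knife). It assumes a hypothetical in-memory `UNIVERSE` mapping each cookbook to its versions and their dependency constraints, resolves a run list with a naive backtracking search (worst-case exponential, which is exactly why you would rather run it once on a workstation), and then writes the result as exact `cookbook_versions` pins in an environment document. The helper names (`satisfies`, `resolve`, `environment_json`) are made up for illustration, only a few constraint operators are handled, and the JSON contains only a subset of the fields a real environment object carries.

```python
import json
import operator

# Hypothetical cookbook universe: name -> version -> {dependency: constraint}.
UNIVERSE = {
    "app":   {"1.0.0": {"nginx": ">= 1.0.0", "ssl": "= 1.0.0"}},
    "nginx": {"1.0.0": {}, "2.0.0": {"ssl": ">= 2.0.0"}},
    "ssl":   {"1.0.0": {}, "2.0.0": {}},
}

OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}


def parse(version):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, constraint):
    """Check a version against a constraint such as '>= 1.0.0'."""
    op, _, wanted = constraint.partition(" ")
    return OPS[op](parse(version), parse(wanted))


def resolve(universe, pending, chosen=None):
    """Backtracking search over (cookbook, constraint) pairs.

    Returns {cookbook: version} or None if no consistent set exists.
    """
    chosen = dict(chosen or {})
    if not pending:
        return chosen
    (name, constraint), rest = pending[0], pending[1:]
    if name in chosen:
        # Already picked: the earlier pick must also satisfy this constraint.
        if satisfies(chosen[name], constraint):
            return resolve(universe, rest, chosen)
        return None
    # Try the newest candidate first, fall back to older ones on failure.
    for version in sorted(universe.get(name, {}), key=parse, reverse=True):
        if not satisfies(version, constraint):
            continue
        deps = list(universe[name][version].items())
        result = resolve(universe, rest + deps, {**chosen, name: version})
        if result is not None:
            return result
    return None


def environment_json(env_name, solution):
    """Render the solution as exact version pins for an environment."""
    pins = {name: f"= {version}" for name, version in sorted(solution.items())}
    return json.dumps({"name": env_name, "cookbook_versions": pins}, indent=2)


if __name__ == "__main__":
    solution = resolve(UNIVERSE, [("app", ">= 0.0.0")])
    print(environment_json("production", solution))
```

In this toy universe the newest nginx is rejected because it needs a newer ssl than app allows, so the search backs off to the older nginx. Nodes locked to the resulting environment then only ever see one candidate per cookbook, so nothing has to be searched at run time.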
Regarding environments, I’ve added my thoughts here:
https://gist.github.com/danielsdeleo/7c55ebe39639928134df/#comment-808117
If roles become environment-version-able, then the only things left are data bags (and clients, but in practice these are tied pretty closely to nodes, so I’m not clear about the use case). I’ve heard many people say that they’d like data bags to be environment-version-able as well; I’ve always used the environment name as the id for data bag items whose contents differ per environment, so the need for this is a bit foreign to me (not to say it’s invalid, I just don’t understand the use case).
Seth,
Regarding Mixlib::Versioning - cool! I’ve added my thoughts here:
https://github.com/opscode/mixlib-versioning/issues/2
Cheers,
Jay
--
Daniel DeLeo