Re: Re: Re: Re: Mixlib::Versioning 1.0.0 released


#1

I’m a bit new to this, but surely the dependency possibilities for the cookbooks actually stored on the server are fixed and can be computed in advance, such that any node runlist can be satisfied very quickly.

But then perhaps I totally misunderstand the order of magnitude of the problem.

In any case, I don’t see how offloading that problem to every node on every run makes sense.

FWIW…

Wazza

Warren Bain
http://ninefold.com
Australia’s cloud
direct: +61 2 8221 7729
mobile: +61 414 867 559
follow: http://twitter.com/thoughtcroft

Daniel DeLeo dan@kallistec.com wrote:

On Saturday, March 30, 2013 at 1:55 PM, Jay Feldblum wrote:

Daniel,

The chef-client could also do the dependency resolution itself rather than asking the chef-server to do it. The only new API the chef-server would need to provide is a batch API to fetch the full metadata of all versions of all cookbooks uploaded to the chef-server, or at least as much of the metadata as is necessary for the chef-client to perform the dependency-resolution. The chef-client can then perform its own dependency-resolution on that data and the chef-server wouldn’t need to be involved.

I dislike this approach because it requires an ever-growing amount of data to be shipped to the client on every run, while not solving the problem of version clobbering. With hosted chef, we see that the cookbook version API call is slower than all the others by quite a wide margin, but with the gecode-based solver (I have less personal experience with the pure-erlang replacement) the constraint solution usually only takes a few milliseconds; the call time of the request is dominated by the large amount of disk and network IO required to get the necessary information into the constraint solver. If you were to automate patch-version bumps to avoid clobbering, you would automatically exacerbate this issue.

Adding another (potentially much slower) network link in this chain feels like a move in the wrong direction to me.

In fact, perhaps this should be done anyway. Dependency-resolution can take exponential time. Nothing on the chef-server should ever be permitted to take exponential time. While it is a problem for a given chef-client if the dependency-resolution takes too long on the chef-client, it’s a problem for all clients in an infrastructure if that knocks out the whole chef-server rather than just the one chef-client.

Part of the point of my proposal is that dependency resolution is moved to the workstation: compile a list of compatible cookbooks by hand or automatically, then upload (if necessary) and use environments to lock some set of nodes to the pre-computed solution. This feels more elegant because it requires the least computation overall and (more importantly) moves it off of production systems–no worries about all your hosts suddenly chewing up CPU due to a gnarly dependency graph.
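
As a rough illustration of locking an environment to a pre-computed solution, here is a minimal Ruby sketch. It assumes the workstation has already resolved one version per cookbook; the cookbook names, versions, and the `pin_versions` helper are hypothetical. Only the `cookbook_versions` key with `= x.y.z` constraints mirrors the shape of a Chef environment file.

```ruby
require 'json'

# Hypothetical result of resolving dependencies on the workstation:
# cookbook name => the single version chosen for it.
resolved = {
  'app'   => '2.1.0',
  'nginx' => '1.4.2',
  'apt'   => '1.9.0'
}

# Turn a resolved set into exact-match constraints, sorted by name so
# the generated file is stable between runs.
def pin_versions(resolved)
  resolved.sort.to_h { |name, version| [name, "= #{version}"] }
end

# Assumed minimal environment document; a real one carries more fields.
environment = {
  'name' => 'production',
  'cookbook_versions' => pin_versions(resolved)
}

puts JSON.pretty_generate(environment)
```

Because every constraint is an exact match, nodes in that environment have nothing left to solve at run time; the choice was made once, off the production systems.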

Regarding environments, I’ve added my thoughts here:
https://gist.github.com/danielsdeleo/7c55ebe39639928134df/#comment-808117

If roles become environment-version-able then the only thing that’s left are data bags (and clients, but in practice these are tied pretty closely to nodes, so I’m not clear about the use-case). I’ve heard many people say that they’d like data bags to be environment-version-able as well; I’ve always used the environment name as the id for data bag items where the contents differ per-environment, so the need for this is a bit foreign to me (not to say it’s invalid, I just don’t understand the use case).

Seth,

Regarding Mixlib::Versioning - cool! I’ve added my thoughts here:
https://github.com/opscode/mixlib-versioning/issues/2

Cheers,
Jay


Daniel DeLeo


#2

Warren,

Currently, that would require re-resolving every node’s run-list on every
cookbook or role push. There are ways to optimize that process for the
common case, but that would be the requirement. It would also require
re-resolving the node’s run-list on every node push (e.g. changing the
normal attributes, tags, environment, or run-list).
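
To make the invalidation requirement concrete, here is a small Ruby sketch of a pre-computed solution cache. Everything in it is hypothetical (`solution_key`, `SolutionCache`, and the `cookbook_index_revision` counter are not Chef APIs). It only shows that any push that changes an input to the solver changes the cache key and forces a re-resolve.

```ruby
require 'digest'
require 'json'

# Hypothetical cache key: a digest over everything the solver reads.
def solution_key(expanded_run_list:, environment:, cookbook_index_revision:)
  payload = JSON.generate([expanded_run_list, environment, cookbook_index_revision])
  Digest::SHA256.hexdigest(payload)
end

# Hypothetical in-memory cache of solved run-lists.
class SolutionCache
  def initialize
    @solutions = {}
  end

  # Returns the cached solution, or runs the block (the solver) on a miss.
  def fetch(key)
    @solutions.fetch(key) { @solutions[key] = yield }
  end
end

cache = SolutionCache.new
solver_runs = 0
solve = lambda do
  solver_runs += 1
  { 'app' => '2.1.0' } # stand-in for a real solver result
end

key_a = solution_key(expanded_run_list: %w[app], environment: 'production',
                     cookbook_index_revision: 41)
cache.fetch(key_a) { solve.call }
cache.fetch(key_a) { solve.call } # same inputs: served from the cache

# A cookbook push bumps the revision, so the old entry no longer matches.
key_b = solution_key(expanded_run_list: %w[app], environment: 'production',
                     cookbook_index_revision: 42)
cache.fetch(key_b) { solve.call }

puts solver_runs
```

A single global revision is the bluntest possible choice: one cookbook upload invalidates every node’s cached solution, which is the cost described above. Narrowing the key to only the cookbooks a run-list can reach would be one way to optimize the common case.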

If environments were isolated tenants, cookbooks were per-environment, and
chef-server permitted only one version of each cookbook per environment,
then neither the chef-server nor the chef-client would need to do any
dependency-resolution that takes exponential worst-case time.
Dependency-resolution that doesn’t consider versions takes linear
worst-case time because it’s a graph walk, not a constraint problem.
Dependencies with their versions would be resolved on the workstation, and
the resolved set could be synced to the server atomically and
transactionally.
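
A minimal Ruby sketch of the version-free case, assuming each environment holds exactly one version of each cookbook. The `DEPENDENCIES` data and the `expand_cookbooks` helper are made up for illustration and are not part of Chef.

```ruby
require 'set'

# Hypothetical data: one version per cookbook in the environment,
# so each cookbook maps straight to its list of dependency names.
DEPENDENCIES = {
  'app'      => %w[nginx database],
  'nginx'    => %w[apt],
  'database' => %w[apt],
  'apt'      => [],
  'unused'   => []
}.freeze

# Walk the dependency graph depth-first from the run-list's cookbooks.
# Each cookbook is expanded once and each edge is followed once, so the
# cost is linear in the size of the graph. There is no backtracking
# because there are no versions to choose between.
def expand_cookbooks(run_list, dependencies)
  seen = Set.new
  stack = run_list.reverse
  until stack.empty?
    name = stack.pop
    next if seen.include?(name)
    raise ArgumentError, "unknown cookbook: #{name}" unless dependencies.key?(name)

    seen << name
    stack.concat(dependencies[name].reverse)
  end
  seen.to_a
end

p expand_cookbooks(%w[app], DEPENDENCIES)
```

With several versions per cookbook, the same walk would have to pick a version at each step and possibly undo that choice later, which is where the constraint problem and its exponential worst case come from.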

Cheers,
Jay

On Sat, Mar 30, 2013 at 8:30 PM, Warren Bain Warren@ninefold.com wrote:

I’m a bit new to this, but surely the dependency possibilities for the
cookbooks actually stored on the server are fixed and can be computed in
advance, such that any node runlist can be satisfied very quickly.

But then perhaps I totally misunderstand the order of magnitude of the
problem.

In any case, I don’t see how offloading that problem to every node on
every run makes sense.

FWIW…

Wazza
