This thread is interesting since I am trying to introduce a new usage
pattern with bare metal servers and chef-provisioning.
I have one Chef server and whenever we buy a new set of racks in one of our
datacenters, servers boot via PXE and automatically register in Chef in a
specific “firstboot” role.
For now we assign nodes manually, with one node per file in a git repo, so
#1 is avoided by using git to resolve conflicts. #2 is not too bad since we
can re-sync from git if a Chef run happened during a modification.
I would like to stop managing Chef nodes as files and instead use the new
ChefDK provision command with a special driver that would “pick” a node from
the firstboot pool (so basically my “cloud” provider is the pool of firstboot
nodes in Chef). Without dealing with concurrent access to Chef provision,
this seems doable: to allocate a node I can “tag” a firstboot node and
delete it once the machine is ready.
But how do I do this with concurrent access? It seems almost impossible. And
the way things are going, Policyfiles will push towards a separate git
repo and provision cookbook per policy, all sharing the same pool of
firstboot nodes (for now I don’t use Policyfiles).
I wish I could have a way to “lock” a node or something like that.
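To make concrete what I mean by a “lock”, here is a minimal Python sketch of the claim step. Everything in it is hypothetical: `FirstbootPool`, `claim`, `release` and `claim_concurrently` are names I made up, and the pool lives in memory. It is not a Chef API. The `threading.Lock` only stands in for whatever atomic primitive a real driver would need (an external lock service, a database row, etc.), which is exactly the piece I don't have.

```python
import threading


class FirstbootPool:
    """In-memory stand-in for the pool of firstboot nodes (hypothetical)."""

    def __init__(self, node_names):
        # Map node name -> owner tag; None means the node is still free.
        self._owners = {name: None for name in node_names}
        # Stand-in for an atomic primitive; an assumption, not a Chef feature.
        self._lock = threading.Lock()

    def claim(self, owner):
        """Atomically tag one free node for `owner`; return its name or None."""
        with self._lock:
            for name, current in self._owners.items():
                if current is None:
                    self._owners[name] = owner
                    return name
        return None

    def release(self, name):
        """Remove the tag so the node goes back to the free pool."""
        with self._lock:
            self._owners[name] = None


def claim_concurrently(pool, owners):
    """Let every owner try to claim a node at the same time."""
    results = {}

    def worker(owner):
        results[owner] = pool.claim(owner)

    threads = [threading.Thread(target=worker, args=(o,)) for o in owners]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With three free nodes and five provisioners racing, three of them get distinct nodes and two get `None`; no node is handed out twice. Doing the same with a read-then-write against a shared server, without the atomic step, is where the race appears.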
On Jul 2, 2015 1:51 AM, “Daniel DeLeo” email@example.com wrote:
On Wednesday, July 1, 2015 at 3:08 PM, Lamont Granquist wrote:
is the closest thing we have for plans for addressing node editing
conflicts, but it still won’t help you in the case of two admins doing a
knife node edit. It does address a lot of the use cases of #2.
#1 might be addressed better via something like a knife plugin that
converted the task the admins were doing into something more similar to
knife node run_list add 'role[foo]'. By getting it onto the command line
you narrow the race condition between reading the old value and writing the
new value. You could also make it even a bit more
declarative/idempotent/convergent so that running it twice by two admins
didn’t result in duplicated edits (unlike knife node run_list add).
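As a rough illustration of the idempotent/convergent idea, here is a small Python sketch. `ensure_in_run_list` is a hypothetical helper, not part of knife, and the run_list is just a plain list of strings; a real plugin would still have to read the node from the server and write it back.

```python
def ensure_in_run_list(run_list, item):
    """Return a new run_list that contains `item` exactly once.

    Existing order is preserved and the input list is not modified.
    Hypothetical helper for illustration; not a knife or Chef API.
    """
    if item in run_list:
        # Already present: applying the edit again changes nothing.
        return list(run_list)
    return list(run_list) + [item]


# Two admins applying the same edit converge on the same result.
first = ensure_in_run_list(["role[base]"], "role[foo]")
second = ensure_in_run_list(first, "role[foo]")
```

This removes the duplicated edit, but it only narrows the window between the read and the write; it does not close it.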
Policyfiles mitigate the problem by moving the really contentious part
(the run_list) out of the node and into a different object which is shared
between nodes. If you’re making heavy use of node-specific attributes it
won’t help, but I’d recommend avoiding those as much as possible anyway.
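A toy Python model of that split, only to show the shape of the data: the `Policy` and `Node` classes and `effective_run_list` are made-up names for illustration and are not how the Chef server stores these objects. The point is that the node only carries a reference to the policy, so a run_list change is a single edit to the shared object rather than one edit per node.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Shared object that owns the run_list (illustrative model only)."""
    name: str
    run_list: list = field(default_factory=list)


@dataclass
class Node:
    """A node refers to a policy by name and holds no run_list itself."""
    name: str
    policy_name: str


def effective_run_list(node, policies):
    """Look up the run_list a node gets through its policy."""
    return policies[node.policy_name].run_list


policies = {"web": Policy("web", ["recipe[nginx]"])}
nodes = [Node("web01", "web"), Node("web02", "web")]

# One edit to the shared policy; neither node object is touched.
policies["web"].run_list.append("recipe[monitoring]")
```

After the edit, both `web01` and `web02` resolve to the same two-item run_list, and no node was edited to get there.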