How to make one node tell another to do some work

I have an “admin” machine of sorts that runs some reports in a
database, then based on the result of that report, a set of nodes will
need to do some work (customer migrations, updates, etc)

I want the admin machine to tell the end node it has work to do, then
have the end node update whatever that thing is to say it is done.

Options I’ve considered:

  1. databag
    admin machine adds entries to a databag that has sub-hashes for each
    end node, end node updates databag to remove work items
    will it work? yes
    but all machines need to be admin clients to be able to edit the
    databag. I don’t want this.

  2. node attributes
    admin machine adds entries to a hash in each individual node’s attributes.
    will it work? I don’t know… I tried this:
    search(:node, "fqdn:#{machine_to_be_updated}") do |machine|
      machine.set["hash"].push "work_to_do"
    end

and nothing ended up in that hash.

  3. more node attributes
    admin machine adds entries to a hash in its own attributes, end nodes
    look for entries in admin machine’s attributes, do the work, then add
    entries to their own attributes that says they did the work, admin
    machine removes entries from its hash based on whether the end node’s
    attributes hash for completed work contains the work item, end node
    removes entries from completed work hash based on what is no longer on
    admin machine
    will it work? yes, but it is messy!
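
The bookkeeping in option 3 can be sketched in plain Ruby. This is only an illustration under stated assumptions: the method names `end_node_converge` and `admin_converge` are hypothetical, and plain arrays stand in for the attribute hashes on each node, so no Chef API is involved.

```ruby
# Hypothetical sketch of option 3. Arrays stand in for node attributes.

# End node: do any pending work it has not completed yet, then forget
# completed items that the admin machine no longer lists.
def end_node_converge(admin_pending, completed)
  newly_done = admin_pending - completed
  (completed + newly_done) & admin_pending
end

# Admin machine: drop every pending item the end node reports as completed.
def admin_converge(admin_pending, completed)
  admin_pending - completed
end

pending   = ["migrate_site_1", "update_site_2"]
completed = []

completed = end_node_converge(pending, completed) # end node does the work
pending   = admin_converge(pending, completed)    # admin clears its list
completed = end_node_converge(pending, completed) # end node clears its list
```

After the three runs both lists are empty again, which is the state the option describes; each side only ever writes to its own list.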

Any other ideas I’m not thinking of? Is there any way to make #2 work? Should
#2 work as I specified it? Can one node modify another node’s
attributes? Is there a different way to pick up the node besides
search? Should I update that node’s attributes using a databag method
(since node attributes are just a type of databag), or can you even do
that? Is there a node.save() method that I’m not calling?

Thanks!
-Jesse

Chef doesn't feel like the right tool for this.

What you're describing is a distributed work queue, no? Why try to do this
via chef?

There are lots of frameworks, tools, services out there that are designed
for this use case. Perhaps I'm missing something in what you're trying to
do?

--
Denis Haskin

+1. This looks more like an orchestration issue, but there aren't many
tools in that space. MCollective could be one.

Excerpts from Jesse Campbell's message of Fri Sep 28 16:39:37 +0200 2012:

I have an "admin" machine of sorts that runs some reports in a
database, then based on the result of that report, a set of nodes will
need to do some work (customer migrations, updates, etc)

Sounds like you need a work queue. beanstalkd[1] might be the solution for you.

[1] http://kr.github.com/beanstalkd/

I’m moving this from an orchestration system into chef because it does
not seem to me to be a terribly complex problem, and adding a new
orchestration system to the puzzle seems like a lot of work for such a
small task.

Maybe I was too vague in what I’m trying to do?

Each customer DB server knows what databases are running on it.

Sometimes one of the customer DB servers gets overloaded and we have
to migrate a database from one server to another.

Is there another simple way to tell server A to dump the DB to
disk, and tell server B to pick it up?

I don’t think that a work queue will work here, because only one DB is
relevant for the task… setting up 40 work queues for the 40 db
servers seems quite messy, and then I need to maintain yet another
service?

-Jesse

A Nagios event handler, perhaps? Or does whatever monitoring tool you are
using have callbacks on alerts?

So the db servers are managed with chef?

If you really felt you needed to use Chef to do this, my thinking
would be to have a role that corresponds to each
customer's database, e.g. "customerA_db_server", "customerB_db_server", etc.

Then (handwaving like crazy here) you could change the roles for
the 2 nodes, and when the nodes converge, the node that used to
have customer A's db would somehow have to know to dump it somewhere, and the
node that is now responsible for it would have to know to look for the dump
and load it.

This feels really ugly, though, and square-peg-in-round-hole-ish. And
maybe it's just a variant on what you were proposing at first with
attributes and such. I dunno.
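
A rough sketch of the "know what to dump, know what to load" step, in plain Ruby. The helper `plan_moves` and the idea of remembering the previously hosted databases are assumptions made for illustration; nothing in the thread says how a node would record that.

```ruby
# Hypothetical helper: compare the databases a node hosted at its previous
# converge with the ones its roles assign to it now.
def plan_moves(previously_hosted, currently_assigned)
  {
    dump: previously_hosted - currently_assigned, # no longer ours: dump for pickup
    load: currently_assigned - previously_hosted  # newly ours: find the dump and load it
  }
end

plan = plan_moves(%w[customerA customerB], %w[customerB customerC])
plan[:dump].each { |db| puts "would dump #{db}" }
plan[:load].each { |db| puts "would load #{db}" }
```

Databases present in both lists are left alone, so repeated converges with unchanged roles produce an empty plan.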

--
Denis Haskin

Jesse,

after your clarification, I guess it makes sense that you want to use Chef.

However, the stock data structures in Chef don’t lend themselves too
well to this kind of scenario.
So why not move this information outside of Chef?

I’ve been using this: https://gist.github.com/3802182
My recipe can then traverse the returned JSON for a list of “jobs” to
run, e.g. apps to upgrade and so on.

If this sounds useful, I could just publish the cookbook.

Andrea
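
Traversing a JSON document for a list of jobs might look roughly like the sketch below. The payload layout (a top-level "jobs" array with "type" and "app" keys) and the helper `jobs_of_type` are assumptions for illustration; the structure actually returned by the gist is not shown in this thread.

```ruby
require "json"

# Hypothetical payload; the real JSON layout is not given in the thread.
payload = <<~JSON
  {
    "jobs": [
      { "type": "upgrade", "app": "blog" },
      { "type": "migrate", "app": "shop" }
    ]
  }
JSON

# Return the jobs of a given type, tolerating a missing "jobs" key.
def jobs_of_type(json_text, type)
  data = JSON.parse(json_text)
  data.fetch("jobs", []).select { |job| job["type"] == type }
end

upgrades = jobs_of_type(payload, "upgrade")
upgrades.each { |job| puts "would upgrade #{job["app"]}" }
```

Keeping the job list outside Chef means end nodes only need read access to wherever the JSON is served from.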

That does look useful.

I may consider moving to that as I get more and more of the old system
to "go away".
In the meantime, I got my original idea to work, using the "admin"
node to push things into the database nodes' attributes as shown
below.
I've stripped them out, but there are checks to make sure the various
levels of the array exist.
Yes, this might be a messy way to do it, but it gets the job done
without adding any new external dependencies to an already
far-too-complex project.

Maybe it will make it easier for someone else to understand once I
move on. (I've already started my search, and OMG there are a lot of
people happy to get in touch when they see both development and ops
experience AND Chef. I never thought a single specialized tool would
be the ticket to an interview, but Chef seems to be it!)

parsed_json["migrations"].each do |site_id, migration_info|
  search(:node, "fqdn:#{migration_info["source_db"]}") do |src_node|
    src_node.set["sam"]["db_migrations"]["source"].push site_id
    src_node.save
  end
  search(:node, "fqdn:#{migration_info["dest_db"]}") do |dest_node|
    dest_node.set["sam"]["db_migrations"]["dest"].push site_id
    dest_node.save
  end
end
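
The checks that each level exists could be sketched as follows. This is a minimal illustration, not the code that was stripped out: the helper `queue_migration` is hypothetical, and a plain Hash stands in for the node's attributes, so how Chef's own attribute objects behave with `||=` is not verified here.

```ruby
# Hypothetical helper: make sure every level of the nested structure exists
# before pushing, and skip site ids that are already queued.
def queue_migration(attrs, role, site_id)
  sam        = (attrs["sam"] ||= {})
  migrations = (sam["db_migrations"] ||= {})
  list       = (migrations[role] ||= [])
  list.push(site_id) unless list.include?(site_id)
  list
end

# A plain Hash stands in for the node's attributes.
attrs = {}
queue_migration(attrs, "source", "site42")
queue_migration(attrs, "source", "site42") # second call changes nothing
queue_migration(attrs, "dest", "site7")
```

Skipping duplicates matters because the admin recipe runs on every converge and would otherwise queue the same site id repeatedly.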

Wait ... data bags can only be edited by admin nodes? Is this OSS
Chef or Hosted/Private Chef?

(If the latter, that totally torpedoes an idea I have for
databag-driven database provisioning! :( )

On Wed, Oct 10, 2012 at 7:34 AM, steve . leftathome@gmail.com wrote:

Wait ... data bags can only be edited by admin nodes? Is this OSS
Chef or Hosted/Private Chef?

With open source Chef Server, you need admin privileges to modify databags.

"Altering data bags from the node when using the Open Source
chef-server requires giving the node's API client admin privileges. In
most cases, this is not advisable."

http://wiki.opscode.com/display/chef/Data+Bags#DataBags-CreatingandEditingDataBagswithinaRecipe

--
Andy Gale

http://twitter.com/andygale
https://alpha.app.net/andygale

This is with open source Chef. I know there are much better ACLs in
hosted/private Chef, but I don't know if it has the ability to
'easily' give databag write privileges to specific groups of
clients...
