Re: Re: Push Jobs Questions


#1

Great, thanks for the links. Somehow google wasn’t being helpful.

My use case is fairly simple. I want to run chef-client on my nodes in a
particular environment but I want to ensure that,

  • I don’t end up restarting all the services at once
  • A bad config doesn’t take out the whole service

The simplest version of this would just be to split the nodes into X
groups and run chef-client on one group at a time. knife ssh has a
concurrency option that limits the number of SSH connections, which would
achieve the same goal.
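
A rough sketch of the "one group at a time" idea, in Python. It assumes nothing about push jobs or knife: `run_group` is a hypothetical callback that you would supply (for example, something that starts a job on the given nodes and reports whether every node succeeded). The run stops at the first failing group, so a bad config only reaches that one group.

```python
from typing import Callable, List, Sequence


def split_into_groups(nodes: Sequence[str], group_count: int) -> List[List[str]]:
    """Split nodes into at most group_count groups of near-equal size."""
    if group_count < 1:
        raise ValueError("group_count must be at least 1")
    # Round-robin assignment: node i goes to group i % group_count.
    groups = [list(nodes[i::group_count]) for i in range(group_count)]
    # Drop empty groups (happens when there are fewer nodes than groups).
    return [group for group in groups if group]


def rolling_run(
    nodes: Sequence[str],
    group_count: int,
    run_group: Callable[[List[str]], bool],  # hypothetical: True means the group succeeded
) -> List[List[str]]:
    """Run one group at a time and stop at the first failing group.

    Returns the groups that completed successfully.
    """
    completed = []
    for group in split_into_groups(nodes, group_count):
        if not run_group(group):
            break  # leave the remaining groups untouched
        completed.append(group)
    return completed
```

With 5 nodes and 2 groups, `split_into_groups` yields one group of 3 and one group of 2; `rolling_run` then calls `run_group` on each in turn.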

On Fri Nov 21 2014 at 1:45:45 PM AJ Christensen aj@junglistheavy.industries
wrote:

https://github.com/opscode/omnibus-pushy
https://github.com/opscode/oc-pushy-pedant
https://github.com/opscode/omnibus-push-jobs-client
https://github.com/opscode/opscode-pushy-server
https://github.com/opscode/opscode-pushy-simulator
https://github.com/opscode/pushy_common
https://github.com/opscode/knife-push
https://github.com/opscode/opscode-pushy-client

Try searching through the issues, or logging an issue or feature
request or RFC. "don’t run command X on all nodes at the same time"
sounds a little hard to implement. Do you mean mutually exclusive,
configurable contention, locking of commands?

Can you describe your use-case? An independent team of Chef operators
has been evaluating use cases and building tests/example cases for
Pushy and tools in the same field (our cases thus far have been around
Cassandra ring bootstrap, expansion and contraction)

cheers,

–aj

On Sat, Nov 22, 2014 at 8:39 AM, Bryan Baugher bjbq4d@gmail.com wrote:

Hello everyone,

Is the push jobs code available on github anywhere? Also are there any
plans to add a kind of concurrency option (i.e. don’t run command X on all
nodes at the same time)?

Bryan


#2

You may be able to use the Quorum functionality to half-bake this [0]:

knife job start --quorum 90% 'chef-client' --search 'role:webapp'

The minimum number of nodes that match the search criteria, are
available, and acknowledge the job request. This can be expressed as a
percentage (e.g. 50%) or as an absolute number of nodes (e.g. 145).
Default value: 100%

I’d try values like 50% of your available foobars and see how it works
out for ya.

Good Luck!

cheers,

–aj

[0] https://github.com/opscode/knife-push#examples



#3

hi,

Please let me know what values should be given for these push jobs settings:

"package_url": "",
"package_checksum": ""

thanks,
K.Gopalakrishnan



#4

I don’t think the quorum functionality behaves like you think. The quorum essentially says “If at least X% is available, run on all”.

In my opinion, a better pattern here would be to create an orchestration recipe that depends on the push_jobs cookbook (and thus gets access to the push_jobs resource). Within that recipe, you would do a search to grab all of your potential nodes, then loop over them with that resource N nodes at a time (being sure to set the parameter that waits for the job to finish before moving on).

CONFIDENTIALITY NOTICE This message and any included attachments are from Cerner Corporation and are intended only for the addressee. The information contained in this message is confidential and may constitute inside or non-public information under international, federal, or state securities laws. Unauthorized forwarding, printing, copying, distribution, or use of such information is strictly prohibited and may be unlawful. If you are not the addressee, please promptly delete this message and notify the sender of the delivery error by e-mail or you may call Cerner’s corporate offices in Kansas City, Missouri, U.S.A at (+1) (816)221-1024.


#5

Yo,

The quorum functionality behaves like I think:
“The minimum number of nodes that match the search criteria, are
available, and acknowledge the job request.”

What makes you think that specifying a quorum of 50% causes the job to
be accepted by 100% of the nodes? I would assert that if 50% of the
nodes were unavailable and you specified 100%, the job would not
launch. If you specified 50%, the job would launch, but would likely
not run on 100% of machines.

Could you clarify? Are you suggesting that if a job is launched with
a quorum of 50%, 100% of machines will run it, despite
their temporary unavailability?

In your opinion where does that recipe that runs the push_jobs
execute? A workstation? Some authorized God node?

ta,

–aj



#6

AJ, Nathan:

The correct answer is “both”.

Suppose, for example, that quorum is 80% and there are 10 total servers. If
8 of them ping back and say “Available. Please run commands against me!” but
all 10 of them are actually available, then the commands will run against
all 10. But if (for some reason) 2 of them are indeed down, then it’ll still
run against the 8 available because quorum was met.

I will add this example to the docs at docs.getchef.com/push_jobs.html.

james



#7

Sorry for the confusion. I think I’m coloring what I read with my own
understanding when I started researching push-jobs.

Reading the help, I thought it behaved like “Ensure X% availability” -
instead it’s a simple quorum.
If you specify a quorum of 50%, and 75% of the nodes are available, it
will run on the 75% that are available. If 40% of the nodes are
available, then it will not run anywhere. Available is defined as “has
the push-jobs client running and has checked in”.
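
To make the quorum rule concrete, here is a small Python model of it. This is only a sketch of the behavior as described in this thread, not the push jobs server's actual logic. The function names are made up for illustration, and rounding a percentage up to a whole node is an assumption.

```python
import math
from typing import List, Sequence


def required_nodes(quorum: str, total: int) -> int:
    """Translate a quorum such as '50%' or '145' into a node count."""
    quorum = quorum.strip()
    if quorum.endswith("%"):
        percent = float(quorum[:-1])
        # Assumption: a fractional node count is rounded up.
        return math.ceil(total * percent / 100)
    return int(quorum)


def nodes_that_run(
    matched: Sequence[str],
    available: Sequence[str],
    quorum: str = "100%",
) -> List[str]:
    """Return the nodes the job runs on under the simple-quorum model.

    If enough matched nodes are available, the job runs on all of the
    available ones; otherwise it runs nowhere.
    """
    available_set = set(available)
    ready = [node for node in matched if node in available_set]
    needed = required_nodes(quorum, len(matched))
    return ready if len(ready) >= needed else []
```

With 20 matched nodes and a quorum of "50%", 15 available nodes means the job runs on those 15, while 8 available nodes means it runs on none. Either way, nothing here limits how many nodes run at the same time.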

The functionality I was looking for was some way to schedule availability
for things like service cycling - I wanted to ensure that at any given
point in time, only 25% (for example) of my nodes would be out of the
cluster. This quorum concept doesn’t help there.

In the “orchestration cookbook” model, I see two probable implementations:

  1. For specific cookbooks that do one specific thing, that are closely
    tied to the implementation, it could actually be baked into the cookbook
    itself. For example, we have a cluster where you have to wait for the
    services to stabilize for each of the members before you can move onto the
    next member. To query if the cluster is stabilized or not, you must be on
    the cluster master. So in this case, I would bake push-jobs into the
    cluster master recipe. Whenever chef-client executes on the cluster
    master, it does its thing, waits for the cluster to stabilize, then
    issues a push job for the first member node, waits for the cluster to
    stabilize, etc, etc. Because this is a specific implementation, we don’t
    necessarily have to worry about portability, so tightly coupling push-jobs
    isn’t a big deal (however, we would likely give the option to turn off
    this functionality and move back to the “manual orchestration” we’re doing
    today).

  2. For generic cookbooks, the orchestration functionality could live in a
    separate cookbook, or a recipe in the generic cookbook (that would be
    called directly). This cookbook/recipe would be run on an orchestration
    node (In our case, Jenkins), using either one time chef-client runs, or
    chef-apply. This is similar in concept to the model used with
    Chef-Provisioning.
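
The cluster-master flow in the first option could look roughly like the Python sketch below. Everything here is illustrative: `start_job` and `is_stable` are hypothetical callbacks standing in for "issue a push job for this member" and "ask the master whether the cluster has stabilized", and the timeout and poll interval are arbitrary defaults.

```python
import time
from typing import Callable, Iterable, List


def wait_until_stable(
    is_stable: Callable[[], bool],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_stable() until it returns True or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if is_stable():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


def cycle_members(
    members: Iterable[str],
    start_job: Callable[[str], None],  # hypothetical: issue a job for one member
    is_stable: Callable[[], bool],     # hypothetical: cluster health check
    **wait_opts,
) -> List[str]:
    """Cycle members one at a time, waiting for stability between each.

    Returns the members after which the cluster stabilized again.
    """
    cycled: List[str] = []
    # Do not touch any member unless the cluster is healthy to begin with.
    if not wait_until_stable(is_stable, **wait_opts):
        return cycled
    for member in members:
        start_job(member)
        if not wait_until_stable(is_stable, **wait_opts):
            break  # stop here so at most one member is left in a bad state
        cycled.append(member)
    return cycled
```

The same loop would fit the second option as well; the only difference is where it runs (an orchestration node such as Jenkins instead of the cluster master).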

Nathan Cerny
