Chef on Amazon EC2 with auto-scaling

Hi there,

I’m very interested in using Chef for deployment of our web services on
Amazon EC2, and was wondering if someone could help me out with some advice.

I’ve been trying my best to read through the existing information that’s out
there, but I’m just coming up very confused. Basically, the setup I want to
achieve is as follows:

  • Use Amazon auto-scaling to automatically bring up new instances as the
    load requires
  • The servers automatically configure themselves with services via Chef

Is this something that’s actually possible using Chef? From what I see,
it is possible, but I don’t really see any information on the real mechanics
of this, especially with regard to how one manages these servers once they
have been brought up. Are they manageable using Knife (even though they
weren’t launched using it)?

If anyone could point me in the direction of some more information, I’d be
most grateful.

Thanks,
Oliver Beattie

Hi Oliver,
This is definitely possible. You can use Auto Scaling launch configurations
to launch a custom AMI (or a community AMI) that has chef-client installed,
and pass instance-specific tasks as user data. I have used Cloudkick to
auto-provision EC2 instances via Cloudkick's webhooks (they call it an
"address"), which is basically a Sinatra wrapper over knife.

How you do this is pretty flexible and depends on how much logic you
want to bake into your AMIs.

I've only glanced askew at the Amazon autoscaling but the gist is this:

  • You bake in ruby and chef-client
  • You don't have to bake in the validation.pem but you could (This
    would mean respinning your AMIs if you ever had to revoke it. Also
    consider how that impacts things if running multiple chef-servers)
  • Using userdata, you would essentially create a "first-boot.json" and
    run chef-client against it.

The node will autoregister itself. Remember that userdata has a fixed
size (that I can't recall off hand). If there's anything more complex,
you'd probably want to pull everything from a private S3 bucket.

To get an idea of how this would work, look at the bootstrap templates
that knife uses. They're just ERB templates that knife executes via
ssh. knife-ec2 just adds in the ec2 api calls to spin up the instance.

You can see (if you've never looked) what the bootstrap templates look
like here: custom natty bootstrap template for chef · GitHub

The biggest question is how you want to get your validation.pem onto
the server: pull it from a private S3 bucket, or bake it into the AMI?
You'll need some "custom" logic for parsing the run_list out of userdata
as well. All first-boot.json needs is this:

{"run_list":["role[tracker]"]}
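As a rough sketch of that "custom" parsing logic, the Ruby below turns user data into the first-boot.json shown above. It assumes a made-up convention where the user data contains a line such as `run_list=role[tracker],recipe[ntp]`. The key name, the comma-separated format and the function name `first_boot_json` are all illustrative assumptions, not anything Chef or EC2 defines. Fetching the user data from the instance is left out.

```ruby
require 'json'

# Hypothetical convention: the user data contains a line like
#   run_list=role[tracker],recipe[ntp]
# Neither the key name nor the comma-separated format comes from Chef or EC2.
def first_boot_json(user_data)
  line = user_data.lines.map(&:strip).find { |l| l.start_with?('run_list=') }
  raise ArgumentError, 'no run_list= line found in user data' if line.nil?

  # Drop the key, split on commas, and discard empty entries.
  items = line.sub('run_list=', '').split(',').map(&:strip).reject(&:empty?)
  JSON.generate('run_list' => items)
end

puts first_boot_json("run_list=role[tracker]\n")
```

The resulting string could be written to a file (for example /etc/chef/first-boot.json) and passed to chef-client with its -j option.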

Hi John,

Thanks, that's very helpful, and solidifies some of the things I'd read but
couldn't quite see how they fit together.

So I guess I'm left with a couple other questions:

  • As I originally mentioned, what is the procedure for managing these
    servers? Would I just be able to run commands via knife to all my servers?
    How does it keep track of nodes joining (or more importantly leaving) my
    "cluster"?
  • Another (somewhat unrelated question) I had is how does Chef manage OS
    upgrades? Does it manage them at all? For instance, how would I say "go run
    aptitude upgrade on all my production servers"?

—Oliver

On Mon, Jul 18, 2011 at 6:31 AM, Oliver Beattie oliver@luckyvoice.com wrote:

  • As I originally mentioned, what is the procedure for managing these
    servers? Would I just be able to run commands via knife to all my servers?
    How does it keep track of nodes joining (or more importantly leaving) my
    "cluster"?

The servers can still be managed just fine via knife in the future.
They register with the chef-server like any node bootstrapped with
knife.

As for leaving the cluster, that's entirely on you to decide how to
handle it. I'm not sure, off-hand, if the autoscaling issues a
shutdown command or has hooks for calling custom logic.
My thought would be to add something to the shutdown that does a
'knife node delete nodename -y' and 'knife client delete nodename -y'.
However you'll have to have a different set of credentials on the
server somewhere that has permissions to do that. Another option is to
have your servers send some sort of message on shutdown to something
that does the cleanup for you. I'm considering a Noah callback that
does some of the ec2 api interaction but I've just not gotten around
to it. Another option is to use RunDeck or even Jenkins and have the
hook call those guys when it shuts down. It only needs to be a small
message (something like {"node":"nodename","action":"shutdown"}) to a
small service somewhere that can then do the cleanup on the backend
(including removing it from monitoring if need be).
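To make the "small service" idea a bit more concrete, here is a minimal Ruby sketch of its core: it takes a message in the shape suggested above and produces the two knife invocations. The function name `cleanup_commands` and the node-name check are assumptions added for illustration. How the message arrives (HTTP, a queue, RunDeck, Jenkins) and the credentials knife runs with are left open.

```ruby
require 'json'

# Accept only names that look like plain hostnames, so a crafted message
# cannot smuggle extra arguments into the knife command line.
NODE_NAME = /\A[A-Za-z0-9][A-Za-z0-9._-]*\z/

# Turn a shutdown message into the two knife invocations suggested above.
# Returns an array of argv arrays; returns [] for any other action.
def cleanup_commands(message_json)
  msg = JSON.parse(message_json)
  return [] unless msg['action'] == 'shutdown'

  name = msg['node'].to_s
  raise ArgumentError, "suspicious node name: #{name.inspect}" unless name.match?(NODE_NAME)

  [
    ['knife', 'node', 'delete', name, '-y'],
    ['knife', 'client', 'delete', name, '-y']
  ]
end

cleanup_commands('{"node":"web-01","action":"shutdown"}').each do |argv|
  puts argv.join(' ') # a real service would call system(*argv) instead
end
```

Keeping the commands as argv arrays rather than one shell string means the node name is never interpreted by a shell.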

  • Another (somewhat unrelated question) I had is how does Chef manage OS
    upgrades? Does it manage them at all? For instance, how would I say "go run
    aptitude upgrade on all my production servers"?

That is also up to you. If you leave chef-client running in daemon
mode, then if you have a cookbook/recipe that does that (upgrades all
packages), it will run. Personally, I have a cookbook that upgrades a
specific set of packages if they're available on each run. Since
you're running in EC2 and being elastic, it's probably not worth the
effort. The instances won't be long lived enough (generally speaking)
to matter. If you're concerned about package drift from any long-lived
instances and autoscaling ones, you could always have your bootstrap
update to the latest packages but that prolongs the time it takes the
instance to come up.

I'm still debating how much logic to bake into my AMIs myself. I'm a
big fan of "just enough operating system" but over time my bootstraps
start to take a bit longer catching up the AMI to where it needs to
be. I personally treat this as an impetus to not install so much crap
on my servers.

You can do both of these tasks with the knife ssh subcommand.

On Jul 18, 2011 6:32 AM, "Oliver Beattie" oliver@luckyvoice.com wrote:

  • As I originally mentioned, what is the procedure for managing these
    servers? Would I just be able to run commands via knife to all my servers?
    How does it keep track of nodes joining (or more importantly leaving) my
    "cluster"?

Knife uses the Chef server API to talk to the server. Since all nodes
register with the server (both a node object for the data and a client
object for authentication) knife node list produces a list of all nodes
registered with the server. Knife doesn't know about nodes itself. When you
use knife to create a new system, via ec2 server create or bootstrap, the
node still registers itself with the chef server, not knife.

  • Another (somewhat unrelated question) I had is how does Chef manage OS
    upgrades? Does it manage them at all? For instance, how would I say "go run
    aptitude upgrade on all my production servers"?

knife ssh "name:*" "sudo aptitude upgrade -y"

Or you can create a cookbook to do this if you trust upstream to produce
non-breaking changes.

Chef itself doesn't manage OS upgrades, but it certainly can. Remember that
Chef is a tool designed to help you automate your systems. A hammer doesn't
pound nails alone.

Bryan

Hi,
Please forgive me for directing you to my own blog but here is my post
on how I did it [1] (which Opscode kindly link to). This method
(provided to me on this list) uses Ubuntu's cloud-init to bootstrap
Chef onto the image and then gets Chef to do the rest.

Re: OS upgrades. If you mean package upgrades, then write a cookbook
that does it. There is an apt cookbook for Ubuntu that updates the
package list but doesn't run the upgrade for you.

If you want to actually upgrade the OS (e.g. Ubuntu Maverick to Natty),
then Chef doesn't do this directly. In EC2 these images are pre-baked,
so with Chef, instead of starting with the Maverick image you start
with the Natty image. Chef will then install everything else you need,
and you just need to test to make sure it worked.

[1] http://www.trailhunger.com/blog/technical/2011/05/28/keeping-an-amazon-elastic-compute-cloud-ec2-instance-up-with-chef-and-auto-scaling/

I've been using Chef with autoscaling quite often in the last two years.
I've found the most versatile approach is to have a minimal user-data
script take care of bootstrapping Chef and let Chef do the rest. The
validation.pem is included in the user-data.

Another, somewhat unrelated script is a cleanup daemon. Basically, it
scans the list of EC2 servers periodically and updates Chef nodes with
the status of the related EC2 instance (e.g. node[:ec2][:status] =
"running", matched on instance-id). This allows filtering search results
for servers that are dead/stopped/etc. The daemon also removes nodes and
clients after they have been dead for a while.
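The decision logic of such a daemon could look roughly like the Ruby sketch below. It is not the actual daemon: the data shapes, the `dead_since` bookkeeping field, the one-hour grace period and the name `sweep` are assumptions for illustration, and the EC2 and Chef server API calls that would feed it and act on its result are omitted.

```ruby
# EC2 instance states that count as "dead" for cleanup purposes.
DEAD_STATES = %w[terminated stopped shutting-down].freeze

# Decide which Chef nodes to delete, given the current EC2 instance states.
#
# nodes:           array of hashes { name:, instance_id:, dead_since: } where
#                  dead_since is a Time or nil (hypothetical bookkeeping field)
# instance_states: { "i-123" => "running", ... } as reported by the EC2 API
# grace:           seconds a node may stay dead before it is removed
def sweep(nodes, instance_states, now: Time.now, grace: 3600)
  to_delete = []
  nodes.each do |node|
    # An instance the API no longer reports is treated as terminated.
    status = instance_states.fetch(node[:instance_id], 'terminated')
    node[:status] = status
    if DEAD_STATES.include?(status)
      node[:dead_since] ||= now
      to_delete << node[:name] if now - node[:dead_since] >= grace
    else
      node[:dead_since] = nil
    end
  end
  to_delete
end

now = Time.at(10_000)
nodes = [
  { name: 'web-01', instance_id: 'i-aaa', dead_since: nil },
  { name: 'web-02', instance_id: 'i-bbb', dead_since: Time.at(1_000) }
]
p sweep(nodes, { 'i-aaa' => 'running' }, now: now)
```

Because the sweep works from the EC2 instance list rather than from anything the instance does on its way down, cleanup does not depend on a proper node shutdown.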

Pros of my method:

  • no ami maintenance, you can use any community ami

  • works very well with high rates of recycling nodes

  • simple, easy to extend and modify cluster functionality

  • cleanup doesn't depend on proper node shutdown

Cons (those I thought of, at least):

  • servers take longer to get to "production ready" status

  • chef server and recipes become major points of failure for autoscaling

  • a little more load on chef server

Regards,
Avishai

Avishai,

Passing in the validation.pem with the user-data becomes a security concern.
It will always be present as the instance runs. Once a client gets its
client.pem you should be removing the validation.pem. You won't be able to
do this when it's passed in.

You are almost correct. User data can be modified after the instance is
up using ec2-modify-instance-attribute; however, this is cumbersome and
requires the instance to be stopped first (and naturally only works with
EBS-based AMIs). Sigh.

The secure, convenient option would be a signed S3 link set to expire in
15 minutes. This, however, forces you to generate user-data using
templates (I use Erubis) and doesn't work with autoscaling.
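As a rough illustration of templated user-data, the Ruby sketch below renders a boot script with the standard library's ERB (used here in place of Erubis). The template contents, file paths, the placeholder URL and the name `render_user_data` are assumptions. Generating the signed, expiring link is not shown, and the script assumes the image already has chef-client and its client configuration in place.

```ruby
require 'erb'

# User-data script template. The URL is filled in at launch time; generating
# the signed, short-lived URL itself is out of scope here.
USER_DATA_TEMPLATE = <<~'SCRIPT'
  #!/bin/sh
  # Fetch the validation key through a short-lived link, then run Chef once.
  mkdir -p /etc/chef
  curl -fsS -o /etc/chef/validation.pem '<%= validation_url %>'
  echo '<%= first_boot %>' > /etc/chef/first-boot.json
  chef-client -j /etc/chef/first-boot.json
SCRIPT

# Render the template with the per-launch values.
def render_user_data(validation_url:, first_boot:)
  ERB.new(USER_DATA_TEMPLATE).result_with_hash(
    validation_url: validation_url,
    first_boot: first_boot
  )
end

puts render_user_data(
  validation_url: 'https://example.invalid/validation.pem?signature=PLACEHOLDER',
  first_boot: '{"run_list":["role[tracker]"]}'
)
```

Since the link has to be freshly signed for each launch, the rendered script differs every time, which is why this approach does not fit a launch configuration whose user-data is fixed in advance.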

In short, if you want a method that works with autoscaling and doesn't
require bundling an AMI, you're screwed security-wise. Unless someone on
this list figures out how to do it, of course... I've considered IP-bound
one-time tokens but decided against implementing yet another security
layer. I bake AMIs with validation.pem when I can and take my chances
when I don't have the time.

Regards,
Avishai

What about scp'ing the pem off a "static" instance's internal address that is locked down with a security group? Only your instances can access it... Any ssh key you use would be available in the user data, but it adds another layer of security.

On Jul 23, 2011, at 3:24 AM, Avishai Ish-Shalom avishai@fewbytes.com wrote:

You are almost correct. User data can be modified after the instance is up using ec2-modify-instance-attribute; however, this is cumbersome and requires the instance to be stopped first (and naturally only works with EBS-based AMIs). Sigh.
The secure, convenient option would be a signed S3 link set to expire in 15 minutes. This, however, forces you to generate user-data using templates (I use erubis) and doesn't work with autoscaling.

In short, if you want a method that works with autoscaling and doesn't require bundling an AMI, you're screwed security-wise. Unless someone on this list figures out how to do it, of course... I've considered IP-bound one-time tokens but decided against implementing yet another security layer. I bake AMIs with validation.pem when I can and take my chances when I don't have the time.
Regards,
Avishai
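
For readers following along, the templated user-data idea can be sketched roughly as below. This is only an illustration, not Avishai's actual script: it uses Ruby's standard ERB rather than erubis, and the template text, the `render_user_data` helper and the example URLs are all made up for the sketch. The signed S3 URL is assumed to be generated elsewhere just before launch. Because the link is rendered into the user-data at launch time, a launch configuration created once and reused by autoscaling would carry an expired link, which matches the limitation mentioned above.

```ruby
require "erb"

# Hypothetical user-data template: the instance downloads validation.pem from
# a short-lived signed URL instead of carrying the key in user-data itself.
USER_DATA_TEMPLATE = <<~'TEMPLATE'
  #!/bin/sh
  # Fetch the validation key from a link that expires shortly after launch.
  mkdir -p /etc/chef
  curl -sf -o /etc/chef/validation.pem '<%= signed_url %>'
  # Register with the Chef server and run the first converge.
  chef-client -S '<%= chef_server_url %>'
TEMPLATE

# Render the template for one launch. Both arguments are supplied by the
# caller; producing the signed URL is outside the scope of this sketch.
def render_user_data(signed_url:, chef_server_url:)
  ERB.new(USER_DATA_TEMPLATE).result_with_hash(
    signed_url: signed_url,
    chef_server_url: chef_server_url
  )
end
```

The rendered string would then be passed as user-data when the instance is launched by hand or by a script.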

On 23/07/11 03:38, Bryan Brandau wrote:

Avishai,

Passing in the validation.pem with the user-data becomes a security concern. It will always be present as the instance runs. Once a client gets its client.pem you should be removing the validation.pem. You won't be able to do this when it's passed in.

On Mon, Jul 18, 2011 at 8:25 AM, Avishai Ish-Shalom avishai@fewbytes.com wrote:
I've been using chef with autoscaling quite often in the last two years.
I've found the most versatile approach is to have a minimal user-data
script take care of bootstrapping chef and let chef do the rest. The
validation.pem is included in the user-data.

Another, somewhat unrelated script is a cleanup daemon. Basically, it
scans the list of EC2 servers periodically and updates chef nodes with
the status of the related ec2 instance (e.g. node[:ec2][:status] =
"running", matched on instance-id). This allows filtering search results
for servers that are dead/stopped/etc. The daemon also removes nodes and
clients after they have been dead for a while.

Pros of my method:

  • no ami maintenance, you can use any community ami

  • works very well with high rates of recycling nodes

  • simple, easy to extend and modify cluster functionality

  • cleanup doesn't depend on proper node shutdown

Cons (those I thought of, at least):

  • servers take longer to get to "production ready" status

  • chef server and recipes become major points of failure for autoscaling

  • a little more load on chef server

Regards,
Avishai
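
The cleanup daemon described in this message could be sketched along these lines. It is a rough illustration under stated assumptions, not the author's code: the `plan_cleanup` function, the one-hour `GRACE_PERIOD`, the hash-based stand-ins for Chef node objects and the choice to treat only "terminated" as dead are all hypothetical. Real code would fetch instance states from the EC2 API and apply the resulting plan through the Chef server API.

```ruby
# Seconds a node may stay dead before it is removed (assumed value).
GRACE_PERIOD = 3600
# EC2 states treated as "dead" for removal purposes (an assumption).
DEAD_STATES = %w[terminated].freeze

# nodes:      array of hashes standing in for Chef nodes, each with
#             :name, :instance_id and :dead_since (epoch seconds or nil)
# ec2_states: hash of instance-id => state string as reported by EC2
# now:        current time in epoch seconds
def plan_cleanup(nodes, ec2_states, now)
  nodes.map do |node|
    # Instances EC2 no longer reports at all are treated as terminated.
    status = ec2_states.fetch(node[:instance_id], "terminated")
    # Remember when the node was first seen dead; clear it otherwise.
    dead_since = DEAD_STATES.include?(status) ? (node[:dead_since] || now) : nil
    expired = !dead_since.nil? && (now - dead_since) >= GRACE_PERIOD
    {
      name: node[:name],
      status: status,          # value to store in node[:ec2][:status]
      dead_since: dead_since,
      action: expired ? :remove_node_and_client : :update_status
    }
  end
end
```

Keeping the decision logic separate from the API calls makes it easy to check the grace-period behaviour without touching EC2 or the Chef server.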

On 18/07/11 15:46, Edward Sargisson wrote:

Hi,
Please forgive me for directing you to my own blog but here is my post
on how I did it [1] (which Opscode kindly link to). This method
(provided to me on this list) uses Ubuntu's cloud-init to bootstrap
Chef onto the image and then gets Chef to do the rest.

Re: OS upgrades. If you mean package upgrades then write a cookbook
that does it. There is an apt cookbook for ubuntu that updates the
package list but doesn't run the upgrade for you.

If you want to actually upgrade the OS (e.g. Ubuntu Maverick to Natty)
then Chef doesn't do this directly. In EC2 these images are pre-baked
so, with Chef, instead of starting with the Maverick image you start
with the Natty image. Chef will then install everything else you need
and you just need to test to make sure it worked.

[1] http://www.trailhunger.com/blog/technical/2011/05/28/keeping-an-amazon-elastic-compute-cloud-ec2-instance-up-with-chef-and-auto-scaling/

On Mon, Jul 18, 2011 at 5:16 AM, Bryan McLellan btm@loftninjas.org wrote:

On Jul 18, 2011 6:32 AM, "Oliver Beattie" oliver@luckyvoice.com wrote:

  • As I originally mentioned, what is the procedure for managing these
    servers? Would I just be able to run commands via knife to all my servers?
    How does it keep track of nodes joining (or more importantly leaving) my
    "cluster"?

Knife uses the Chef server API to talk to the server. Since all nodes
register with the server (both a node object for the data and a client
object for authentication) knife node list produces a list of all nodes
registered with the server. Knife doesn't know about nodes itself. When you
use knife to create a new system, via ec2 server create or bootstrap, the
node still registers itself with the chef server, not knife.

  • Another (somewhat unrelated question) I had is how does Chef manage OS
    upgrades? Does it manage them at all? For instance, how would I say "go run
    aptitude upgrade on all my production servers"?

knife ssh name:* "sudo aptitude upgrade -y"

Or you can create a cookbook to do this if you trust upstream to produce
non-breaking changes.

Chef itself doesn't manage OS upgrades, but it certainly can. Remember that
Chef is a tool designed to help you automate your systems. A hammer doesn't
pound nails alone.

Bryan


A. It's a single point of failure, which is not good for most autoscaling scenarios.

B. This doesn't add much security: if I hack into the instance, I have
the ssh key (from user-data) and the scp address, and thus I can copy the
validation.pem from the "static" server again.

Regards,
Avishai

On 23/07/11 18:22, Aaron Abramson wrote:

What about scp'ing the pem off a "static" instance's internal address
that is locked down with a security group? Only your instances can
access it... Any ssh key you use would be available in the user data,
but it adds another layer of security.

