Using Chef to Configure Autoscaled EC2 Instances

Nathan_Pahucki · November 10, 2011, 5:44pm

Hi all, I’d like to get some feedback on how people are doing this now. It
seems that since 0.10, the preferred way of starting Amazon EC2 instances is via
the ‘knife ec2 server create’ command, while ‘knife ec2 instance data’ and the
Chef preinstalled AMIs are deprecated. That’s all fine and good for manually
launching instances, but what about autoscaling? When a node comes up under
autoscaling, I want it to update and configure itself according to its role(s)
and environment. With the instance data method deprecated, what’s the
alternative (aside from a custom baked AMI that I need to update manually, as
is normal when doing this without Chef)? If there is no alternative, I would
have to use the deprecated method, which means using ‘knife ec2 instance data’
to wire the result into the autoscaling group as UserData. With this method,
how would I also specify the environment that the server should run in? I
don’t see any way to do this using the instance data. Running some command
from outside after the server has started to set the environment is not an
option, as we won’t know when a new server has been spun up (that is
controlled by Amazon).

I’m interested in what other people have found to be best practice for this
scenario and any advice, tips or tricks that have been learned while trying to
work with chef and autoscaled EC2 instances.

Thanks for your thoughts!

Edward_Sargisson · November 10, 2011, 5:54pm

Hi,
Forgive me for directing you to my blog but I wrote an extensive post
on how to do this:
http://www.trailhunger.com/blog/technical/2011/05/28/keeping-an-amazon-elastic-compute-cloud-ec2-instance-up-with-chef-and-auto-scaling/

It works pretty well for me.

Cheers,
Edward

johnmartinez · November 10, 2011, 5:55pm

Hello,

I agree with you, using knife to provision EC2 instances is not the best way to do this in autoscaled environments. As a temporary solution, I've done exactly what you suggest. I pass along a lot of things into user-data, and specifically, a node's role is one of those. I read user-data and tell Chef on the node what its role is (amongst other things). It's a kludge. I'm going to be looking into integrating Chef with CloudFormation, which plays nice(r) with ASGs.
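To make the "read user-data and tell Chef its role" step concrete, here is a minimal Python sketch. It is not John's actual script: it assumes the user-data is plain `key=value` lines, and the key names `role` and `environment` are hypothetical. On an instance the raw text would be fetched from http://169.254.169.254/latest/user-data; here it is passed in as a string so the sketch runs anywhere.

```python
import json


def parse_user_data(text):
    """Parse 'key=value' lines into a dict, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def first_boot_attributes(settings):
    """Build JSON attributes whose run_list assigns the node its role."""
    return {"run_list": ["role[%s]" % settings["role"]]}


if __name__ == "__main__":
    # Hypothetical user-data, as it might be registered with the ASG.
    sample = "role=webserver\nenvironment=production\n"
    print(json.dumps(first_boot_attributes(parse_user_data(sample))))
```

The resulting JSON is the kind of attributes file that chef-client can be pointed at with its -j option.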

AWS CloudFormation Developer Resources: http://aws.amazon.com/cloudformation/aws-cloudformation-articles-and-tutorials/ (second link)
https://s3.amazonaws.com/cloudformation-examples/IntegratingAWSCloudFormationWithOpscodeChef.pdf

-john

Aaron_Abramson · November 10, 2011, 6:08pm

I've thought about this, but haven't put much work into implementing it yet. So this is more of a concept than anything.

Set up a script listening on a server. It could be a simple PHP script, e.g. http://someurl/script.php

Set up the user-data with a bash script that hits the listener (wget -O /dev/null "http://someurl/script.php?role=myrole").

The script receives the request, and it knows two things: the public hostname of the server making the request, and the "role" it is requesting. The script makes certain assumptions (username, ssh-key, etc.) and then executes a system command, calling "knife bootstrap server -r 'role[myrole]'".

So, if you auto-scale official Canonical Ubuntu AMIs, simply have the user-data "ping" your knife listener, and knife then bootstraps the server.
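A rough Python sketch of the command-building part of such a listener, since the concept above was never implemented. The SSH user, key path, and role-name pattern are assumptions for illustration. The -x, -i and --sudo flags are the knife bootstrap options as I remember them from that era, so check `knife bootstrap --help` for your version.

```python
import re

# Hypothetical whitelist: role names limited to letters, digits, '_' and '-'.
ROLE_PATTERN = re.compile(r"^[A-Za-z0-9_\-]+$")


def knife_bootstrap_command(hostname, role, ssh_user="ubuntu",
                            identity_file="/path/to/key.pem"):
    """Return the argv list for bootstrapping one host with one role."""
    # The role arrives in an HTTP request, so reject anything unexpected.
    if not ROLE_PATTERN.match(role):
        raise ValueError("unexpected role name: %r" % role)
    return [
        "knife", "bootstrap", hostname,
        "-r", "role[%s]" % role,
        "-x", ssh_user,        # assumed SSH login user
        "-i", identity_file,   # assumed path to the SSH private key
        "--sudo",
    ]


if __name__ == "__main__":
    print(knife_bootstrap_command("ec2-host.example.com", "myrole"))
```

Returning a list rather than a shell string means the listener can hand it to `subprocess.call` without a shell, so a crafted role value cannot inject extra commands.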

Perhaps such a bootstrap request feature could be built into chef-server?

Aaron

Edward_Sargisson · November 10, 2011, 6:11pm

You don't need such complexity. You can just include the role the node
wants in the user-data you register for auto-scaling.
I believe Ubuntu Oneiric has accepted a patch which makes the
user-data thing work with Chef better. There was an email announcement
about that a few months ago.
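As a sketch of what role-carrying user-data could look like, the Python below renders a small first-boot shell script. It is only an illustration: the file path /etc/chef/first-boot.json is hypothetical, and it assumes chef-client, client.rb and the validation key are already on the image. It passes the environment with chef-client's -E option, which is one way to cover the environment question from the first post.

```python
import json
import shlex

# Hypothetical first-boot script template; assumes chef-client is installed.
USER_DATA_TEMPLATE = """#!/bin/sh
mkdir -p /etc/chef
cat > /etc/chef/first-boot.json <<'EOF'
{json_attributes}
EOF
chef-client -j /etc/chef/first-boot.json -E {environment}
"""


def render_user_data(role, environment):
    """Return user-data text that runs chef-client with a role and environment."""
    attributes = json.dumps({"run_list": ["role[%s]" % role]})
    return USER_DATA_TEMPLATE.format(
        json_attributes=attributes,
        environment=shlex.quote(environment),
    )


if __name__ == "__main__":
    print(render_user_data("webserver", "production"))
```

The rendered text is what you would register as UserData on the launch configuration, one variant per role and environment.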

Cheers,
Edward

johnmartinez · November 10, 2011, 6:13pm

Yup, that's what I do.

-john

Bryan_Brandau · November 10, 2011, 6:27pm

We manage several environments that use lots of auto-scaling.
We've iterated on this a little.

At first, we were using user-data and this just ends up being bad. There
is no way of changing this unless you have the ability to stop the
instance. To me, this is unacceptable. We ended up getting trapped when
we wanted to change some of the user-data.

We then ran JEOS with chef-client and put the role in client-config.json.
When chef would run it would configure the server for the role that it is
supposed to be. This can be easily changed or managed with chef.
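For illustration only, here is a small Python sketch of keeping the role in a JSON file such as client-config.json. The exact contents of our file are not shown in this thread, so the sketch assumes the usual JSON attributes layout with a `run_list` key, as accepted by chef-client -j. The helper name `set_role` is made up.

```python
import json
import os
import tempfile


def set_role(path, role):
    """Rewrite run_list in a JSON attributes file, keeping any other keys."""
    attributes = {}
    if os.path.exists(path):
        with open(path) as handle:
            attributes = json.load(handle)
    attributes["run_list"] = ["role[%s]" % role]
    with open(path, "w") as handle:
        json.dump(attributes, handle, indent=2)
    return attributes


if __name__ == "__main__":
    # Write to a temporary directory so the demo touches nothing real.
    demo_path = os.path.join(tempfile.mkdtemp(), "client-config.json")
    set_role(demo_path, "webserver")
    with open(demo_path) as handle:
        print(handle.read())
```

Because the role lives in a file on the machine rather than in user-data, it can be changed on a running instance.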

The problem with all of this bootstrapping and configuration on the fly is
that when you auto-scale you're triggering off of an event. You want the
machine to become available as soon as it can boot. This led us to
baking/bundling AMIs with the full chef run. We still assign the role
with client-config.json. Chef was used to build/configure and will then
maintain state. If it's a minor change that we don't care about, we won't
bundle/bake a new AMI. If we care about it, we bake/bundle.

This way, scaling is never held up by waiting for a machine to configure
itself or by the chef server being unavailable.

With user-data or managing the role on the machine you can always just
change what roles/recipes are assigned to the role.

hedgehog · November 10, 2011, 10:35pm

The need to configure at boot arises if you use continuous deployment with
AutoScaling instances, i.e. Bryan's approach of baking in your Chef run won't
suffice on its own (we do bake an AMI with the most time-consuming
installation steps completed).
Essentially you need to move from pushing your code to machines to
having your instances pull your code.
Apart from employing userdata, another way is to:

1. place the Chef role and environment names in the security group
   name of an instance.
2. make the userdata a simple script that:
   - git clones the repository, then checks out master (the master branch
     has a Ruby script that queries (using Ohai) the security group name
     and parses the role and env names).
   - runs a script from the master branch to parse the environment and
     role details into shell env variables
   - checks out the $AWS_ENV branch (which contains the cookbooks)
   - runs chef-solo, where chef-solo.rb uses ENV['AWS_ENV'],
     ENV['AWS_ROLE'] to configure the instance.

Actually the last step is done by running a bluepill chef-solo process.
Of course you still cannot change the role of an instance when it is
running, since security groups are bound to an instance.
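The parsing step can be sketched as follows. Our real script is Ruby and gets the group name from Ohai. This Python version takes the name as a string and assumes a hypothetical naming convention of `<env>--<role>`, which is not stated above, so adjust the separator to your own scheme.

```python
import shlex


def parse_security_group(name, separator="--"):
    """Split '<env><separator><role>' into an (env, role) tuple."""
    env, found, role = name.partition(separator)
    if not found or not env or not role:
        raise ValueError("cannot parse security group name: %r" % name)
    return env, role


def shell_exports(name):
    """Return shell lines that export AWS_ENV and AWS_ROLE for later steps."""
    env, role = parse_security_group(name)
    return "export AWS_ENV=%s\nexport AWS_ROLE=%s" % (
        shlex.quote(env), shlex.quote(role))


if __name__ == "__main__":
    # Hypothetical group name following the assumed convention.
    print(shell_exports("production--webserver"))
```

The userdata script could `eval` the printed lines before checking out the $AWS_ENV branch and running chef-solo.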

HTH?

--
πόλλ' οἶδ ἀλώπηξ, ἀλλ' ἐχῖνος ἓν μέγα
[The fox knows many things, but the hedgehog knows one big thing.]
Archilochus, Greek poet (c. 680 BC – c. 645 BC)
http://hedgehogshiatus.com
