Starting out with the opscode platform

Hey all.

I went through the starting tutorial on the opscode web site using the
opscode platform. Everything worked as expected but I do have one
question.

Apparently the configuration for my node is kept only on the chef
server (not on my workstation). Is this right? I can see the run list
on the server, and I can get the run list with knife, but I don’t
see a file anywhere on my machine which lists the node, lists the
variables or the params for the node etc.

Does this mean I need to create a recipe for each node so I can
configure it? What is the best practice for setting up your nodes?

Cheers.

Normally nodes are configured either via the API (knife is a frontend
for the API; look at knife node run_list add in this case, and
knife node edit in general) or the web interface, unlike roles, which
are managed as files. If you would like to manage nodes in files, you
can do it in the same way as with roles: use
knife node show --format json to dump the current data, and
knife node from file to upload changes. Hope that helps!

--Noah
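
A minimal sketch of the file-based workflow Noah describes, in plain Ruby. It assumes the dumped node JSON has a top-level "run_list" array and a "normal" attribute hash; the node name, recipe, role and attribute below are hypothetical, and the result would still have to be saved to a file and uploaded with knife node from file.

```ruby
require 'json'

# Hypothetical node data, shaped like a dump from `knife node show --format json`.
node_json = <<~JSON
  {
    "name": "app01.example.com",
    "run_list": ["recipe[ntp]"],
    "normal": {}
  }
JSON

# Add an item to the run list unless it is already present.
def add_to_run_list(node, item)
  node['run_list'] << item unless node['run_list'].include?(item)
  node
end

node = JSON.parse(node_json)
add_to_run_list(node, 'role[app-server]')

# Hypothetical per-node attribute.
node['normal']['backup_dir'] = '/var/backups/app'

# This string is what you would write to a file before uploading it.
updated = JSON.pretty_generate(node)
puts updated
```

Keeping such files in version control gives nodes the same review history that role files already have.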

On May 29, 2011, at 7:08 PM, Tim Uckun wrote:

Does this mean I need to create a recipe for each node so I can
configure it? What is the best practice for setting up your nodes?
Ah no.
You define a Role for each kind of server you have in your
architecture. If you have a database server and two application
servers (Tomcat or PHP or something) then you would set up two roles,
db-server and app-server.
Then, in the run list for the node you specify the role.
When the chef-client reads the run list it finds the role, reads the
role and follows the recipes in the role.

If you have some custom setup (most people do) then you would create a
new recipe to do that and include it in the relevant role.

The run list can be set at the time of the first run (by including it
in the bootstrap file) or you can edit it through the web interface or
with knife.

Cheers,
Edward
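
To make the role idea concrete, here is a toy model in plain Ruby of how a run list that names a role resolves to recipes. It is only an illustration, not Chef's implementation; the recipe names are hypothetical, and only the db-server and app-server role names come from the message above.

```ruby
# Toy model of run list expansion; not Chef's actual implementation.
# Recipe names are hypothetical.
ROLES = {
  'db-server'  => ['recipe[postgresql]'],
  'app-server' => ['recipe[tomcat]', 'recipe[myapp]']
}.freeze

# Replace each role[...] entry with that role's run list and keep
# plain recipe entries as they are.
def expand_run_list(run_list, roles)
  run_list.flat_map do |item|
    if (m = item.match(/\Arole\[(.+)\]\z/))
      expand_run_list(roles.fetch(m[1]), roles)
    else
      [item]
    end
  end.uniq
end

node_run_list = ['role[app-server]']
expanded = expand_run_list(node_run_list, ROLES)
puts expanded
```

Adding a second application server then means giving the new node the same one-line run list, not writing anything new.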

If you have some custom setup (most people do) then you would create a
new recipe to do that and include it in the relevant role.

How about highly specific parameters like IP address etc.

I am sorry if this is a dumb question but I am coming from puppet and
I guess I still have the puppet way of doing things in my head.

Normally I set up my node and set some variables like

$ip_address = 'blah'
$host_name = 'blah'

Then I might have individual resources for this node alone like

file :some_file { ...}

This along with the inheritance of other common roles defines the
state of the node.

I am wondering where that configuration takes place in the chef node.
From what you guys are saying it sounds like I need to create a role
for the server right?

On May 29, 2011, at 7:30 PM, Tim Uckun wrote:

How about highly specific parameters like IP address etc.

Generally these kinds of things are already provided via DHCP, but
you could build recipes to set them more explicitly.

--Noah

No dumb questions - only unasked ones. Genuinely - chef is a different
model so ask questions.

I would ask why you need to set the ip address and host name and where
else could you put it?

If you've only got a couple of actual nodes then, by all means, set
the details in a recipe. But the point of chef is that you can manage
hundreds or thousands of nodes.

So I don't think your roles and recipes should care about the actual
IP address - just that there is one. However, presumably some other
server needs to know what the ip address is (the load balancer needs
to know about the app servers, say). This means you need to write
something to find it. You could have your nodes register with the DNS
server or you could call an API on your load balancer. Lastly, your
load balancer could query chef to find out what IP addresses it needs.
I've not used it but Chef has a search function so you could write
something that looks for all nodes with the app-server role. In the
load balancer chef recipe you iterate that list and write a config
file with it.
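
A rough sketch of that last idea in plain Ruby. The node list is hypothetical stand-in data; in a real recipe it would typically come from Chef's search (something like search(:node, 'role:app-server')) and the file would be rendered through a template. The config line format and port are made up for illustration.

```ruby
# Hypothetical node data standing in for the results of a Chef search.
NODES = [
  { 'name' => 'app01', 'ipaddress' => '10.0.0.11', 'roles' => ['app-server'] },
  { 'name' => 'app02', 'ipaddress' => '10.0.0.12', 'roles' => ['app-server'] },
  { 'name' => 'db01',  'ipaddress' => '10.0.0.21', 'roles' => ['db-server'] }
].freeze

# Stand-in for a search by role.
def nodes_with_role(nodes, role)
  nodes.select { |n| n['roles'].include?(role) }
end

# Emit one backend line per matching node (format and port are hypothetical).
def backend_config(nodes)
  nodes.map { |n| "server #{n['name']} #{n['ipaddress']}:8080" }.join("\n")
end

backends = nodes_with_role(NODES, 'app-server')
config = backend_config(backends)
puts config
```

No IP address is typed in by hand here: adding a third app server changes the generated file on the next run.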

I would ask why you need to set the ip address and host name and where
else could you put it?

I am just using that as an example. It could be any variable like the
location of the backup file, the name of the user that the
applications are deployed under, etc.

Basically every host has some unique requirements and settings.

If you've only got a couple of actual nodes then, by all means, set
the details in a recipe. But the point of chef is that you can manage
hundreds or thousands of nodes.

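However, presumably some other
server needs to know what the ip address is (the load balancer needs
to know about the app servers, say). This means you need to write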
something to find it. You could have your nodes register with the DNS
server or you could call an API on your load balancer. Lastly, your
load balancer could query chef to find out what IP addresses it needs.
I've not used it but Chef has a search function so you could write
something that looks for all nodes with the app-server role. In the
load balancer chef recipe you iterate that list and write a config
file with it.

I'll have to think about this a while and see how I can make it fit.

Like I said right now I am able to specify global configurations like
load_balancer_ip and load_balancer_traffic_subnet and then I can
reference those in the various templates that nodes use to manage
their resources.

I also have overrides for those variables and additional variables on
a per host basis.

For example in my case the load balancer is managed by rackspace.
Rackspace tells me what subnets to expect the traffic from and the IP
of the load balancer. Every host that is behind that load balancer has
to have an IPtables rule to accept traffic from the subnets for that
load balancer.

I am just using that as an example. It could be any variable like the
location of the backup file, the name of the user that the
applications are deployed under, etc.

Those sound like things that should be set per-Role or per-Recipe, not
per-Node, i.e. they don't sound like they have to be unique for two
nodes of the same Role.
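
As an illustration of the per-role approach, the load balancer example from earlier in the thread could be driven by settings kept once at the role level. This is a plain Ruby sketch, not a Chef recipe; the settings hash, the subnets (documentation ranges) and the port are hypothetical.

```ruby
# Hypothetical role-level settings shared by every host behind the load balancer.
lb_settings = {
  'load_balancer_traffic_subnets' => ['192.0.2.0/24', '198.51.100.0/24'],
  'port' => 80
}

# Build one iptables ACCEPT rule per subnet.
def iptables_rules(settings)
  settings['load_balancer_traffic_subnets'].map do |subnet|
    "-A INPUT -s #{subnet} -p tcp --dport #{settings['port']} -j ACCEPT"
  end
end

rules = iptables_rules(lb_settings)
puts rules
```

Every host with the role gets the same rules, so a subnet change from the provider is made in one place.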

On May 29, 2011, at 8:08 PM, Tim Uckun wrote:

I would ask why you need to set the ip address and host name and where
else could you put it?

I am just using that as an example. It could be any variable like the
location of the backup file, the name of the user that the
applications are deployed under, etc.

Basically every host has some unique requirements and settings.

We try to encourage a workflow in which these values live in
roles (or in data bags referred to from roles somehow). You can store
this information on a per-node basis, but that runs counter to the
philosophy that a network should have no special snowflake machines;
a machine might simply be the only one running a given role. That said,
Chef itself is fairly unopinionated on the matter, and using
knife node edit and knife node from file will address this use case.

--Noah
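
A toy model, in plain Ruby, of role-level values with an occasional per-node override. Chef's real attribute precedence has more levels than this; the sketch only shows the layering idea, and the attribute names and paths are hypothetical.

```ruby
# Toy model of layered attributes; Chef's real precedence rules have more levels.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Hypothetical role-level defaults shared by every node in the role.
role_defaults = {
  'app' => { 'deploy_user' => 'deploy', 'backup_dir' => '/var/backups/app' }
}

# Hypothetical per-node values, e.g. set with `knife node edit`.
node_attributes = {
  'app' => { 'backup_dir' => '/mnt/bigdisk/backups' }
}

effective = deep_merge(role_defaults, node_attributes)
puts effective
```

The node only records what differs from the role, which keeps the number of snowflake settings visible and small.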

Thank you. I guess I have to adapt my way of thinking but I am
willing to give it a go and see what chef is all about.

On Sun, May 29, 2011 at 8:08 PM, Tim Uckun timuckun@gmail.com wrote:

Basically every host has some unique requirements and settings.

These instances are fantastic opportunities to ask yourself why. Can
you use another service to automate these, like using DHCP for IP
addresses? Could Chef calculate the correct figure, such as Java VM
size from the available physical memory?

One of the goals of automation is removing these manual steps so not
only can you deploy systems faster, but it is easier for someone
lacking the tribal knowledge to do it as well.

Bryan
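
The Java VM example can be sketched in plain Ruby. This assumes the total memory arrives as a string of kilobytes such as "8388608kB" (the form Ohai is assumed to report on Linux); the helper name and the 50% default fraction are hypothetical choices.

```ruby
# Parse a memory total such as "8388608kB" and return a heap size in MB.
# The fraction of physical memory to hand to the JVM is a hypothetical default.
def jvm_heap_mb(total_memory, fraction: 0.5)
  kb = Integer(total_memory.sub(/kB\z/, ''))
  ((kb / 1024) * fraction).floor
end

heap = jvm_heap_mb('8388608kB')
puts "-Xmx#{heap}m"
```

A recipe using a calculation like this needs no per-host heap setting at all; a bigger machine simply gets a bigger heap.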

On Sun, May 29, 2011 at 8:08 PM, Tim Uckun timuckun@gmail.com wrote:

Basically every host has some unique requirements and settings.

For host-specific configuration settings I tend to use attributes. See
this blog post for more info:

HTH

Grig

For host-specific configuration settings I tend to use attributes. See
this blog post for more info:

So it sounds like you have a cookbook that holds global attributes
that you override in your JSON file.

Why not create a cookbook for that individual host and override the values there?

On Tue, May 31, 2011 at 4:07 PM, Tim Uckun timuckun@gmail.com wrote:

Why not create a cookbook for that individual host and override the values there?

Systems get treated as resources that can be thrown out and replaced.
Your database server isn't "romulus" and it may not even be "db01".
For example, at Opscode all of our systems have a unique id such as
EC2 instance ID or a partial GUID as their hostname (we also have a
more descriptive alias that is autogenerated based on roles) because
they're simply resources upon which to run services.

Imagine you have a bread company and ten bread trucks with the same
options and paint job. You can use any of them to make your delivery.
One of them gets a flat tire and can't run its route but you have
another that you could use. Do you delay the delivery so you can still
use that truck, or do you use another and fix the truck? You don't
want your trucks to be unique because you're not in the business of
driving trucks, but rather of baking and delivering bread.

We're not in the business of running servers, but whatever business we
are in requires that we run servers so we aim to turn it into a
commodity where reasonable. Chef is designed around this model.

If you have a host-specific cookbook, you're focusing on running that
specific host. Whenever you make another host, you have to create and
upload another cookbook. We want to reduce the number of manual
configuration steps with Chef, not increase them. Whenever possible,
it should only take a couple commands to build another server of the
same type (or role) as one you already have.

Most of us are trying to run a particular service on our servers, so
we make service and application specific cookbooks, and then tie them
together into a package with roles. Then when we register the servers
as nodes, we can apply one or more roles to that server based on what
services we want it to provide.

Depending on what these settings are, I'd personally set them as
attributes on the node after it is registered.

In summary, you don't want any host specific settings. When you have
to have them, you should avoid changing the model so that it is built
around these settings.

Bryan