Best Practices for Retrieving Generated Passwords

Hi,

I work for a small ISP with about 23,000 servers and I’m trying to get
some configuration management in the mix to help deploy/support a new
product we’re about to roll out.

I’m currently in the process of biting off more than I can chew with
Chef, so some of this may be Chef 101. I apologise if I’m asking stupid
questions, but I’ve not been able to find a solid answer elsewhere (and
I would consider my Google-fu fairly tuned).

We have an in-house application that helps us to manage our inventory,
assets, passwords, etc. for all the servers we host. I need to get
Chef to configure a server with users and passwords, along with
generating some other information to go into various configuration files
on the server. These must all be retrieved and placed into our in-house
system so we have them all on record.

I’ve had a look at the Users cookbook and I see this can generate
passwords and such, and then (from what I can ascertain) those items
become available through knife node show nodename -m. Which is fine
for the odd server here and there, but I intend to use this to deploy a
few hundred servers, so automation is a must.

Two questions:

  • (Likely Chef 101 but I’ve not seen how to do this yet) Is there a way
    I can store arbitrary data for the local node somewhere? For example, if
    I generate a username and password for a haproxy statistics page, where
    can I then retrieve these from? Use of an encrypted databag? This is
    probably me just not RTFM to be fair - links appreciated.

  • How can I gather this username/password information in a more
    automated way? Is there an API of some kind that can be called to
    retrieve this information from the Chef server? Unfortunately the
    in-house system is developed by a separate team, so I don’t have many
    options for integration beyond “here’s an API, implement this”. I’m more
    than happy to write glue code for this if necessary.

I hope my requirements make sense, and I apologise again for being
clueless. :)

Dane

Hi Dane,

Chef provides a place where you can store this kind of information: data
bags[1]. For added security you can use encrypted data bags[2]. Data bags
also provide a convenient way to retrieve the information in Chef
recipes.

Chef exposes a full REST API[3], so you can get information out of Chef
using curl or any HTTP library you want. If you use Ruby, you can also
use Chef itself to consume the Chef server API[4].

[1] http://wiki.opscode.com/display/chef/Data+Bags
[2] http://wiki.opscode.com/display/chef/Encrypted+Data+Bags
[3] http://wiki.opscode.com/display/chef/Server+API
[4] http://wiki.opscode.com/display/chef/Making+Authenticated+API+Requests
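
A minimal sketch of what a data bag item for the haproxy example might
look like, written in plain Ruby. The bag name, item id, and field names
are hypothetical; the only key Chef itself requires is "id".

```ruby
require 'json'

# Build a data bag item as a plain Hash. Chef requires the 'id' key;
# every other field is free-form. All names below are hypothetical.
def data_bag_item(id, fields)
  { 'id' => id }.merge(fields)
end

# Path of an item on the Chef server REST API: /data/BAG/ITEM
def data_bag_item_path(bag, id)
  "/data/#{bag}/#{id}"
end

item = data_bag_item('haproxy_stats_web01',
                     'username' => 'stats',
                     'password' => 'example-only')

puts JSON.pretty_generate(item)
puts data_bag_item_path('credentials', item['id'])
```

Saved to a file, an item like this can be uploaded with
`knife data bag from file`, and adding `--secret-file` encrypts the
values. Requests made directly against the REST API have to be signed
with an API client's private key, which is what [4] covers.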

Hope this helps,

Jacobo García López de Araujo
http://thebourbaki.com | http://robotplaysguitar.com |
http://twitter.com/therobot

What you’ve described is not an uncommon issue, and you have a number of options for storing data on your nodes or retrieving it dynamically. I assume you have a means of bootstrapping the machines and that you want to create a user and password during the bootstrap and pull them from your API. The simplest way would be to create a cookbook, run as part of your initial bootstrap, that pulls this data from your API and creates the user. The cookbook could store the user data on the node, where it can then be accessed from the Chef server via search (available through the API, knife, or within recipes). The users cookbook pulls its user data from a data bag; you would call your in-house API to do essentially the same thing.

If encrypting the data is more of a concern, you may want to use encrypted data bags to store the users and passwords, but then you will need to distribute the decryption key for use by the cookbook.
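
A rough sketch of the retrieval side, assuming the cookbook saved the
credentials as a normal attribute on the node. Node JSON from the Chef
server (for example from GET /nodes/NAME) keeps those under the "normal"
key. The attribute path haproxy/stats, the node name, and the helper name
are hypothetical.

```ruby
require 'json'

# Pull a value out of the normal attributes of a node's JSON document.
# Returns nil when the attribute path does not exist.
def stored_credentials(node_json, *path)
  node = JSON.parse(node_json)
  node.fetch('normal', {}).dig(*path)
end

# Hypothetical, trimmed-down node document for illustration only.
sample_node = <<~NODE
  {
    "name": "web01.example.com",
    "normal": {
      "haproxy": {
        "stats": { "username": "stats", "password": "example-only" }
      }
    }
  }
NODE

creds = stored_credentials(sample_node, 'haproxy', 'stats')
puts "#{creds['username']} / #{creds['password']}"
```

Glue code along these lines could loop over the nodes and push each
result to the in-house system's API. Attributes stored this way are
plain text on the Chef server.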

Thanks,
Matt Ray
Senior Technical Evangelist | Opscode Inc.
matt@opscode.com | (512) 731-2218
Twitter, IRC, GitHub: mattray


From: Dane Elwell [mlist@xiol.co.uk]
Sent: Monday, December 03, 2012 10:12 AM
To: chef@lists.opscode.com
Subject: [chef] Best Practices for Retrieving Generated Passwords

Another option I’ve used (if you have some means of authenticating the
client, e.g. own the network and can ensure IP addresses can’t possibly be
spoofed):

Have the “in-house provisioning application” generate passwords and serve a
JSON (or other) file.
It’s a rather simple matter to replace data bag usage with a method that
consumes JSON.
This usually ends up reducing duplication, and makes it easy to enforce any
kind of additional constraints you may have.

I will clean up and publish my consume_json cookbook if you are interested.
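
For illustration only (this is not the consume_json cookbook mentioned
above), a minimal sketch of a method that consumes such a JSON document.
The document shape, the required keys, and the method names are all
assumptions.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed document shape:
#   {"users": [{"name": "...", "password": "..."}]}
REQUIRED_KEYS = %w[name password].freeze

# Parse the document and enforce a simple constraint on every entry.
def parse_users(json_text)
  users = JSON.parse(json_text).fetch('users')
  users.each do |user|
    missing = REQUIRED_KEYS - user.keys
    raise ArgumentError, "user entry missing #{missing.join(', ')}" unless missing.empty?
  end
  users
end

# Fetch the document over HTTP from the (hypothetical) provisioning
# application and parse it. Not called here, as it needs a live endpoint.
def fetch_users(url)
  parse_users(Net::HTTP.get(URI(url)))
end

sample = '{"users": [{"name": "haproxy_stats", "password": "example-only"}]}'
parse_users(sample).each { |user| puts user['name'] }
```

A recipe could iterate over the returned list the same way it would
iterate over data bag items, and the validation step is one place to
enforce additional constraints.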

Andrea

Dane,

Chef has data bags, which you can use like your own custom database tables
in the chef-server's database however you like. Typically you would use
them in single-writer/many-reader fashion, with the human sysadmin usually
being the single writer.

But if you need additional capabilities like transactions, you can of
course set up your own database that offers those capabilities (e.g. your
own postgres db on a central server somewhere) and use that database from
your recipes. Or if you already have a database, you can just use that
database from your recipes. There are ruby drivers for many popular
database systems, and because your recipes are just ruby, you can usually
use them from your recipes.

Cheers,
Jay

On 12/3/12 8:12 AM, "Dane Elwell" mlist@xiol.co.uk wrote:

Two questions:

  • (Likely Chef 101 but I've not seen how to do this yet) Is there a way
    I can store arbitrary data for the local node somewhere? For example, if
    I generate a username and password for a haproxy statistics page, where
    can I then retrieve these from? Use of an encrypted databag? This is
    probably me just not RTFM to be fair - links appreciated.

So is the flow you are looking for here:

  • Configure a bunch of services on a server, auto-generating secure
    passwords
  • Store those passwords in your arbitrary database someplace
  • Profit

Yes?

The node itself stores its attributes, so that would be the logical place
for the auto-generated data... but do you really want to store them in plain
text?
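
One way to avoid plain text, sketched with Ruby's standard library only:
generate the password, then encrypt it before it is stored anywhere. This
is not Chef's encrypted data bag format; the helper names, the record
layout, and the use of the node name as associated data are assumptions,
and the key still has to be distributed securely.

```ruby
require 'securerandom'
require 'openssl'

# Generate a random password; 18 random bytes encode to 24 URL-safe chars.
def generate_password(bytes = 18)
  SecureRandom.urlsafe_base64(bytes)
end

# Encrypt with AES-256-GCM, binding the ciphertext to the node name.
# Returns hex-encoded fields that are safe to store as JSON strings.
def encrypt_secret(plaintext, key, node_name)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  cipher.auth_data = node_name
  data = cipher.update(plaintext) + cipher.final
  { 'iv' => iv.unpack1('H*'),
    'tag' => cipher.auth_tag.unpack1('H*'),
    'data' => data.unpack1('H*') }
end

# Reverse of encrypt_secret; raises OpenSSL::Cipher::CipherError if the
# key, node name, or stored record does not match.
def decrypt_secret(record, key, node_name)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = key
  cipher.iv = [record['iv']].pack('H*')
  cipher.auth_tag = [record['tag']].pack('H*')
  cipher.auth_data = node_name
  cipher.update([record['data']].pack('H*')) + cipher.final
end

key = SecureRandom.random_bytes(32) # 256-bit key, hypothetical handling
password = generate_password
record = encrypt_secret(password, key, 'web01.example.com')
puts record['data']
```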

  • How can I gather this username/password information in a more
    automated way? Is there an API of some kind that can be called to
    retrieve this information from the Chef server? Unfortunately the
    in-house system is developed by a separate team, so I don't have many
    options for integration beyond "here's an API, implement this". I'm more
    than happy to write glue code for this if necessary.

The Chef Server itself is a REST API, and you can absolutely use it here.
Answer my question above re: flow, and I'll reply again. :)

I hope my requirements make sense, and I apologise again for being
clueless. :)

No worries, dude - everyone starts somewhere. :)

Best,
Adam

On 2012-12-03 17:47, Adam Jacob wrote:

On 12/3/12 8:12 AM, "Dane Elwell" mlist@xiol.co.uk wrote:

Two questions:

  • (Likely Chef 101 but I've not seen how to do this yet) Is there a way
    I can store arbitrary data for the local node somewhere? For example, if
    I generate a username and password for a haproxy statistics page, where
    can I then retrieve these from? Use of an encrypted databag? This is
    probably me just not RTFM to be fair - links appreciated.

So is the flow you are looking for here:

  • Configure a bunch of services on a server, auto-generating secure
    passwords
  • Store those passwords in your arbitrary database someplace
  • Profit

Yes?

The node itself stores its attributes, so that would be the logical place
for the auto-generated data... but do you really want to store them in
plain text?

Indeed, that's kinda the flow I'm looking for, as strange as the second
requirement may seem.

I don't have much insight as to the storage of those passwords on our
other system as I have no control or view into the internal workings of
that system. (I consider it to be a black box that consumes REST and
JSON (luckily), maybe some SOAP, and spits out lots of information about
our servers). FWIW, the security of that system is fairly robust.

  • How can I gather this username/password information in a more
    automated way? Is there an API of some kind that can be called to
    retrieve this information from the Chef server? Unfortunately the
    in-house system is developed by a separate team, so I don't have many
    options for integration beyond "here's an API, implement this". I'm more
    than happy to write glue code for this if necessary.

The Chef Server itself is a REST API, and you can absolutely use it here.
Answer my question above re: flow, and I'll reply again. :)

The Chef server API does seem to be the way to go here. The developers
of the other system are attempting to standardize on REST and JSON APIs
so getting them up and running with that shouldn't be too difficult!

Thanks

Dane

On 2012-12-03 16:49, Andrea Campi wrote:

Another option I've used (if you have some means of authenticating the
client, e.g. own the network and can ensure IP addresses can't possibly
be spoofed):

Have the "in-house provisioning application" generate passwords and serve
a JSON (or other) file.
It's a rather simple matter to replace data bags uses with a method that
consumes JSON.
This usually ends up reducing duplication, and makes it easy to enforce
any kind of additional constraints you may have.

I will clean up and publish my consume_json cookbook if you are
interested.

Andrea

That would be quite interesting to see if you don't mind sharing?
Replacing databags with that means I can invoke my requests directly to
the other system rather than have that pull from the Chef server. I
think this would be the best way of going about things for my
intentions.

Thanks

Dane

Dane,

I just pushed this to Github:

It contains some basic documentation, I trust you'll be able to figure it
out (it's really trivial).

One more idea: a neat trick is to change the JSON document returned by that
endpoint on every request.
Does that sound like violating the "idempotent" tenet? Read on :)

We run several dozen (and growing) instances of pretty much the same
application; they are usually pinned to the same release.
When we want to roll out a new build, we'll just tell the Rails app that
emits the JSON document, and wait for the next Chef run.
On each request the controller app will randomly pick a few instances to
upgrade; it will also remember not to pick those instances again for some
amount of time.
Once an instance comes up after the upgrade, it will broadcast its
current version info on AMQP; the controller will notice and mark that
instance as up to date.

Ignoring details specific to us, what we accomplish in this way is:

  • rate limiting: each Chef client will only process x work items every y
    minutes;
  • horizontal scaling: if we need to process more per time unit, we can just
    add more Chef clients;
  • robust retry on error with throttling;
  • out of band feedback loop on the outcome of the deployment (we don't need
    to act on notifications; if anything goes wrong we'll just do it again
    after the quiet period).
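
The picking logic described above could be sketched roughly as follows.
This is an illustration, not the Rails controller itself: the class name,
method names, and parameters are hypothetical, and the AMQP feedback is
reduced to a plain method call.

```ruby
# Picks a few out-of-date instances per request and leaves each picked
# instance alone until a quiet period has passed.
class RolloutController
  def initialize(versions, target:, batch_size:, quiet_period:)
    @versions = versions.dup       # instance name => last reported version
    @target = target               # version being rolled out
    @batch_size = batch_size       # max instances handed out per request
    @quiet_period = quiet_period   # seconds before an instance is retried
    @picked_at = {}                # instance name => time of last pick
  end

  # Called once per request for the JSON document.
  def pick(now: Time.now)
    candidates = @versions.keys.select do |name|
      @versions[name] != @target &&
        (@picked_at[name].nil? || now - @picked_at[name] >= @quiet_period)
    end
    chosen = candidates.sample(@batch_size)
    chosen.each { |name| @picked_at[name] = now }
    chosen
  end

  # Called when an instance announces the version it is running.
  def report(name, version)
    @versions[name] = version
  end
end

controller = RolloutController.new(
  { 'app-a' => '1.0', 'app-b' => '1.0', 'app-c' => '1.0' },
  target: '1.1', batch_size: 2, quiet_period: 600
)
puts controller.pick.inspect
```

An instance that fails to upgrade never reports the new version, so it
becomes eligible again once the quiet period has passed, which gives the
retry-with-throttling behaviour without any explicit error handling.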

Hope somebody else can find this interesting!

Andrea
