Using data generated in a LWRP


#1

Hi all,

I’ve only been using chef for a couple of months, so bear with me if I
mangle the terminology :slight_smile:

I’ve got a LWRP that’s creating an AWS RDS instance. Within the LWRP I
wait till the RDS instance becomes available (while loop + sleep) and
get its details, of which I’m particularly interested in the endpoint
address. I can log this by doing something along the lines of:

Chef::Log.info("RDS endpoint address is #{rds_instance[:endpoint_address]}")
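
The wait described above (while loop + sleep) can be sketched in plain Ruby as follows. This is only an illustration, not the code from the LWRP: `wait_until`, its parameters, and the fake status list are hypothetical names, and the cap on the number of attempts is an added assumption so that the loop cannot spin forever.

```ruby
# Poll until the block returns a truthy value or the attempts run out.
# (wait_until is a hypothetical helper, not part of Chef.)
def wait_until(max_attempts:, interval:)
  max_attempts.times do
    result = yield
    return result if result
    sleep interval
  end
  raise "gave up after #{max_attempts} attempts"
end

# Fake status source for illustration: reports "available" on the third check.
statuses = ["creating", "creating", "available"]

status = wait_until(max_attempts: 5, interval: 0) do
  current = statuses.shift
  current if current == "available"
end

puts status
```

In a real provider the block would query RDS for the instance status, and the interval would be several seconds rather than zero.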

What I now want to do is subscribe to the resource that’s using this
provider to create the rds_instance and for the subscribing resource to
have the rds_instance object available, particularly the endpoint
address. The subscribing resource is then going to connect to the RDS
instance, create a database, set up some permissions, etc.

Is there a way to pass this information, e.g. something along the lines
of making the rds_instance object available? If not, my other thought is
to store it on the node or in a data bag, but this seems like a hack to
get round a limitation that’s only there due to my lack of knowledge.

I’m reasonably new to both chef and ruby, so hopefully the solution is
obvious to more experienced chef/ruby devs. Any advice greatly appreciated.

Thanks
Nick


#2

On Mon, Apr 2, 2012 at 7:21 PM, Nick Peirson nickpeirson@gmail.com wrote:

Hi all,

I’ve only been using chef for a couple of months, so bear with me if I
mangle the terminology :slight_smile:

I’ve got a LWRP that’s creating an AWS RDS instance. Within the LWRP I wait
till the RDS instance becomes available (while loop + sleep) and get its
details, of which I’m particularly interested in the endpoint address. I can
log this by doing something along the lines of:

Chef::Log.info("RDS endpoint address is #{rds_instance[:endpoint_address]}")

What I now want to do is subscribe to the resource that’s using this
provider to create the rds_instance and for the subscribing resource to have
the rds_instance object available, particularly the endpoint address. The
subscribing resource is then going to connect to the RDS instance, create a
database, set up some permissions, etc.

Is there a way to pass this information, e.g. something along the lines of
making the rds_instance object available? If not, my other thought is to
store it on the node or in a data bag, but this seems like a hack to get
round a limitation that’s only there due to my lack of knowledge.

The interesting thing is that the Notification object (which is used
to keep track of queued notifications) holds a reference to the resource
that triggered the notification.
You can see that in the logs:

    Chef::Log.info("#{notification.notifying_resource} sending #{notification.action}" \
                   " action to #{notification.resource} (delayed)")

However when it’s time to call run_action, that reference is dropped.

We could store a reference to either the Notification or
notification.notifying_resource in the run_context, so that the
notified resource could access it.
I don’t see a real downside here, and it.

Thoughts?

Andrea


#3

I’d suggest storing the data you are receiving from AWS in the node.
You could (with a bit of fiddling) probably store it in a data-bag
too, although it would have to be pre-created so machines can update
it.

The elastic block device LWRP has an example of the former.

Cheers,

–AJ


#4

I was thinking further about this and thought I’d mention that the
node has a ‘run_state’ hash that you can chuck stuff into and access
throughout compile time. It will not make it into the node data, but
is fine for temporary (or calculated) data that exists for the length of
the run.

node.run_state[:foo] = 'bar'
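
A minimal sketch of that pattern applied to the RDS case, written as plain Ruby so it runs without Chef. `FakeNode` is a hypothetical stand-in that models only the `run_state` hash; the `:rds_instance` key, the hash layout, and the hostname are assumptions for illustration.

```ruby
# Stand-in for Chef's node object: only the run_state hash is modelled here.
# (FakeNode is hypothetical; in a real recipe you would use `node` directly.)
class FakeNode
  attr_reader :run_state

  def initialize
    @run_state = {}
  end
end

node = FakeNode.new

# Consumer side, defined first: the read is wrapped in a lambda so the value
# is looked up when the consuming step runs, not when this line is evaluated.
connect = lambda do
  details = node.run_state.fetch(:rds_instance)
  "connecting to #{details[:endpoint_address]}"
end

# Producer side, e.g. inside the provider once RDS reports the instance as
# available. The key and hash layout are assumptions.
node.run_state[:rds_instance] = { endpoint_address: "db.example.invalid" }

message = connect.call
puts message
```

The point of deferring the read is ordering: the consumer is declared before the value exists and still sees it, because the lookup happens only when the consumer actually runs.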

HTH

–AJ


#5

I second fujin, except u have two options on where to store it

I have run into this same issue w/ my tomcat cookbook

You can store it on the node or you can store it on the run_state object

the node object will be persisted to the main chef repository. If you add
too much to the node object, you could end up w/ a lot of crap in your
couchdb.

the run_state object is not persisted to couchdb, this makes it ideal for
storing passwords or anything you don’t want or need persisted to the chef
repository


#6

On 03/04/2012 07:53, Andrea Campi wrote:

We could store a reference to either the Notification or
notification.notifying_resource in the run_context, so that the
notified resource could access it.
I don’t see a real downside here, and it.

This sounds like the ideal solution. From the replies it sounds like a few people have this issue and have found ways to deal with it, however this would seem like a better implementation as it makes it clearer that this data has been made available by the LWRP and gives a clear indication of when the data will be available.

I think node.run_state will do what I want, so I’m going to take a look at it, but I feel that it may not be as clear for other people looking at my code how/when something was pushed into the node.run_state. Andrea’s suggested solution feels much more explicit, which I prefer.

If there is, as you say, no downside, then providing this as an optional way of approaching the problem leaves the choice down to the developer of the LWRP/resource as to how they wish to implement it.

Either way, I’m very impressed with the number of constructive responses, what a great community! Your help is much appreciated.

Thanks
Nick


#7

Steven Danna had some pretty strong opinions about where to store data
returned from an lwrp iirc

SSD, care to chime in?


#8

Hi,

On Tue, Apr 3, 2012 at 12:49 AM, Bryan Berry bryan.berry@gmail.com wrote:

Steven Danna had some pretty strong opinions about where to store data
returned from an lwrp iirc

I think that ultimately my strong opinion about this is to avoid using
the node object to pass state around resources if it is possible.
However, it is occasionally necessary, and in those cases I don’t have
a strong opinion about whether to store the information in a node
attribute or the run_state.

AJ used the ebs resource as an example. One point worth mentioning
is that if you end up storing this information in a node attribute,
you should almost certainly follow the ebs resource’s example and make
sure that the data in the node attribute serves as a default for a
resource attribute that can be explicitly set if desired.
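
As a rough illustration of that advice, here is a plain-Ruby sketch in which a stored node attribute is only the fallback for a resource attribute. `DatabaseSetup`, the `endpoint` accessor, and the `:rds` attribute layout are hypothetical names chosen for this example; this is not the ebs resource's actual code.

```ruby
# Hypothetical stand-in for a resource with an `endpoint` attribute.
class DatabaseSetup
  def initialize(node_attributes)
    @node_attributes = node_attributes
    @endpoint = nil
  end

  # Chef-style accessor: called with an argument it sets the value,
  # called without one it reads it. The node attribute is only a default.
  def endpoint(value = nil)
    @endpoint = value unless value.nil?
    @endpoint || @node_attributes.dig(:rds, :endpoint_address)
  end
end

node_attributes = { rds: { endpoint_address: "from-node.example.invalid" } }

implicit = DatabaseSetup.new(node_attributes)
explicit = DatabaseSetup.new(node_attributes)
explicit.endpoint("explicit.example.invalid")

puts implicit.endpoint
puts explicit.endpoint
```

With this shape, anyone reading a recipe can see where the value comes from when it is set explicitly, and the node attribute remains a convenience rather than a hidden dependency.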

Cheers,

Steven


#9

On 03/04/2012 15:40, Steven Danna wrote:

I think that ultimately my strong opinion about this is to avoid using
the node object to pass state around resources if it is possible.
However, it is occasionally necessary, and in those cases I don’t have
a strong opinion about whether to store the information in a node
attribute or the run_state.

I agree. I’ve ended up using node.run_state to pass state, but this is implicit and not a convention I’ve seen documented. Anyone using the LWRP I’ve written will have to understand this before being able to use it to find the state information.

Andrea’s suggestion seemed to provide a good solution, what are everyone’s thoughts on it? If my ruby was stronger I’d try to hack together a pull request to get the ball rolling but, as it’s not, would raising a bug report be a good starting point?

Cheers
Nick


#10

On Thu, Apr 5, 2012 at 10:32 AM, Nick Peirson nickpeirson@gmail.com wrote:

Andrea’s suggestion seemed to provide a good solution, what are everyone’s
thoughts on it? If my ruby was stronger I’d try to hack together a pull
request to get the ball rolling but, as it’s not, would raising a bug report
be a good starting point?

Indeed it would. Also, pestering^Wpinging people for feedback either
here or on IRC would help.
Ideally, it would be nice to have other use cases; for example,
existing cookbooks that could be simplified by this functionality.

If there is enough interest, I may sit down and implement it the next
time I get a time slot for working on Chef “core” functionality.
Or I may do it regardless, but no promise about that :wink:

Andrea


#11

I’ve done a spike on my idea and it looks doable. However, that made
me think harder and I’m not sure it’s such a great idea.

Say you have resources A and B that both notify resource C.
At the risk of over-generalizing, I’m assuming C may then respond
differently based on who notified it.
The consequence then is that C will likely not be idempotent.

As another way to look at it, imagine the run fails between the moment
A triggers it and when C runs.
On the next run, C will not be notified and will likely do nothing.
There are a bunch of assumptions in here, but in general, you don’t want this.

In your case, I would create a node attribute with your RDS endpoint.
C could then be written in an idempotent way; for example, it would
somehow notice that the address has changed on one or more resource,
and do its thing.
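
A small plain-Ruby sketch of what "written in an idempotent way" could look like for C. `EndpointConsumer` and its methods are hypothetical, and keeping the last applied endpoint in memory is a simplification: a real resource would have to determine the current state from somewhere that survives between runs (for example, by inspecting the database itself).

```ruby
# Hypothetical consumer "C": it converges on the desired endpoint by comparing
# it with what it last applied, rather than relying on being notified.
class EndpointConsumer
  attr_reader :applied_endpoint, :runs

  def initialize
    @applied_endpoint = nil
    @runs = 0
  end

  # Returns true when it had to do work, false when already converged.
  def converge(desired_endpoint)
    return false if desired_endpoint == @applied_endpoint

    @runs += 1 # stands in for "connect, create database, set permissions"
    @applied_endpoint = desired_endpoint
    true
  end
end

consumer = EndpointConsumer.new
first  = consumer.converge("db.example.invalid")  # first run: does the work
second = consumer.converge("db.example.invalid")  # later run: nothing to do
third  = consumer.converge("db2.example.invalid") # address changed: redo

puts [first, second, third].inspect
```

Because the decision depends only on desired versus current state, it does not matter whether A, B, or nobody sent a notification, and a run that failed halfway is simply picked up on the next one.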

Hand-waving aside, would this make sense for you?

Andrea


#12

So, I’m not really aware of a way of passing around references to
objects in chef. The lifecycle of objects is a bit funky.

An alternative would be to use something like definitions [1], which
basically reverses the flow. Rather than the “subscriber” getting an
instance… the definition “collects” all the actions that need
to be applied (during the compile phase), and performs them during the
convergence phase. Make sure to grok [2]…

[1] http://wiki.opscode.com/display/chef/Definitions
[2] http://wiki.opscode.com/display/chef/Anatomy+of+a+Chef+Run


#13

On 06/04/2012 07:36, Andrea Campi wrote:

I’ve done a spike on my idea and it looks doable. However, that made
me think harder and I’m not sure it’s such a great idea.

Say you have resources A and B that both notify resource C.
At the risk of over-generalizing, I’m assuming C may then respond
differently based on who notified it.
The consequence then is that C will likely not be idempotent.

As another way to look at it, imagine the run fails between the moment
A triggers it and when C runs.
On the next run, C will not be notified and will likely do nothing.
There are a bunch of assumptions in here, but in general, you don’t want this.

In your case, I would create a node attribute with your RDS endpoint.
C could then be written in an idempotent way; for example, it would
somehow notice that the address has changed on one or more resource,
and do its thing.

Hand-waving aside, would this make sense for you?

Andrea

I’m more interested in subscribing to my RDS action than notifying it in my
current use case. Let me try and clarify my use case.

I’ve got an RDS LWRP. It has a create action which tests for the existence of
the RDS instance before trying to create it.

Whether it’s created it or not, I’m currently setting an attribute(?) on
node.run_state to contain a hash containing data about the RDS instance and
setting updated_by_last_action to true. If I’m unable to create or find the
instance, I raise an exception (or let it bubble up from the underlying RDS
class); however, I suppose I could change this to just not set
updated_by_last_action to true.
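
The create action described above might be sketched roughly as follows, in plain Ruby so it runs without Chef or AWS. `FakeRdsClient`, `create_action`, and the `:rds_instance` key are hypothetical names, not the actual LWRP code or the AWS SDK. Note one deliberate difference from the description: this sketch reports a change only when the instance was actually created, instead of always flagging the resource as updated.

```ruby
# Hypothetical stand-in for an RDS client; this is not the AWS SDK.
class FakeRdsClient
  def initialize
    @instances = {}
  end

  def find(name)
    @instances[name]
  end

  def create(name)
    @instances[name] = { endpoint_address: "#{name}.example.invalid" }
  end
end

# Find the instance, creating it only if it is missing, then publish its
# details into run_state. Raises if no details can be obtained.
def create_action(client, run_state, name)
  existing = client.find(name)
  instance = existing || client.create(name)
  raise "could not find or create RDS instance #{name}" if instance.nil?

  run_state[:rds_instance] = instance
  existing.nil? # true only when this call actually created the instance
end

client = FakeRdsClient.new
run_state = {}
created_first = create_action(client, run_state, "mydb")
created_again = create_action(client, run_state, "mydb")

puts run_state[:rds_instance][:endpoint_address]
```

In a provider, the return value would feed updated_by_last_action, so that subscribers fire only on a real change while the details are still published on every run.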

I don’t want to persist the RDS data on the node as RDS doesn’t run on a node
like, for example, MySQL does. It’s not specific to a node and it’s feasible
that the recipe could be run from multiple nodes. If I could run chef on an
RDS instance, life would be much simpler :slight_smile: Maybe looking at other use cases
where chef is controlling resources that aren’t on the node itself would help
determine the best way to handle this situation?

As it stands, I instead use the information on node.run_state to update a
data bag with the current details from the RDS LWRP create action. Using the
solution you described I could subscribe to the notification from the
creation of the RDS instance and use the information directly from the
notification object, which seems more explicit.

In practice I could make my RDS LWRP write immediately to a data bag, however
I’m aiming for nicely decoupled, re-usable components, which I’ve achieved to
some extent. Thanks to the help from this list, I’ve got a workaround by
(ab)using node.run_state. My opinion is if your solution was implemented then
it gives the potential of producing cleaner code, but if it doesn’t get
implemented I still have a working solution.

Thanks
Nick