AWS Cookbook + Snapshot Tracking

Hector_Castro · January 27, 2012, 3:46am

I'm currently in the middle of assembling a process that takes a Chef-bootstrapped AMI and yields a fully functional server, data included. My goal is to have the EBS snapshot IDs of primary EBS volumes available to the bootstrapped node in some way (data bags, node attributes) so that the data that becomes available is not stale.

The AWS cookbook has a nice LWRP for EBS that handles snapshots — but doesn’t give you a way to save its ID for later use (unlike the create action that saves volume IDs to node attributes).

Would there be any value in adding an array of snapshot IDs to the corresponding volume ID node attribute on snapshot creation? I mentioned it in #chef today and received some initial concern over having to maintain an accurate representation of available snapshots in the node attributes, which is a valid concern. Any other feedback or suggestions for alternative approaches are welcome.

Thanks,

–
Hector

AJ_Christensen · January 27, 2012, 3:52am

I'll reiterate my points from #chef:

My concerns mostly lie around using this data to make decisions on the
infrastructure. It's not the API, so it's not the canonical source of
information. Sure, you could update it when you perform a snapshot,
but what if that snapshot errors out? It's a non-blocking attribute
write, so that just gets lost somewhere. If something else tried to
search for that snapshot ID, and it had errored after the Chef run had
completed, then you would obviously have to deal with that state.

Perhaps an Ohai plugin could be developed to present node attributes
on the available snapshots for all volumes attached to the current
instance. At least it would be accurate at the time the plugin
actually runs, instead of having this delayed view of the API.
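As a rough illustration of the plugin idea above, the core of it is a filter: from whatever the EC2 API reports, keep only completed snapshots for the volumes attached to this instance, newest first. The sketch below is plain Ruby rather than a working Ohai plugin, and it is not taken from any plugin in this thread. The record shape (snapshot_id, volume_id, status, start_time) and the helper name snapshots_by_volume are assumptions made for the example; a real plugin would fetch the records from the API at run time instead of using a hard-coded list.

```ruby
require "time"

# Assumption: each snapshot record is a hash with :snapshot_id, :volume_id,
# :status and :start_time keys. This mirrors the kind of fields the EC2 API
# reports for a snapshot, but the exact shape here is hypothetical.
#
# Returns a hash of volume ID => array of completed snapshot IDs, newest first.
# Snapshots that are pending or errored are skipped, as are snapshots of
# volumes that are not attached to this instance.
def snapshots_by_volume(snapshots, attached_volume_ids)
  usable = snapshots.select do |snap|
    snap[:status] == "completed" && attached_volume_ids.include?(snap[:volume_id])
  end

  usable.group_by { |snap| snap[:volume_id] }.transform_values do |group|
    group.sort_by { |snap| Time.parse(snap[:start_time]) }
         .reverse
         .map { |snap| snap[:snapshot_id] }
  end
end

# Hypothetical sample data standing in for an API response.
records = [
  { snapshot_id: "snap-aaa", volume_id: "vol-1", status: "completed", start_time: "2012-01-25T03:00:00Z" },
  { snapshot_id: "snap-bbb", volume_id: "vol-1", status: "error",     start_time: "2012-01-26T03:00:00Z" },
  { snapshot_id: "snap-ccc", volume_id: "vol-1", status: "completed", start_time: "2012-01-27T03:00:00Z" },
  { snapshot_id: "snap-ddd", volume_id: "vol-2", status: "completed", start_time: "2012-01-26T03:00:00Z" }
]

puts snapshots_by_volume(records, ["vol-1"]).inspect
```

Because the list is rebuilt from the API response on every run, an errored snapshot simply never appears, which is the accuracy argument made above.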

There are dragons here. Be super careful about any automatic recovery
system designed around EBS snapshots.

--AJ

Edward_Sargisson · January 27, 2012, 3:57am

In your snapshot backup script, grab the snapshot ID and store it in a
data bag.
Then your Chef script grabs the ID and loads that snapshot.

AJ wrote: "It's a non-blocking attribute write, so that just gets lost somewhere."
However, this comment concerns me. I've built my infrastructure on the
assumption that if I get a snapshot ID returned then I'm good to go.

AJ_Christensen · January 27, 2012, 4:00am

If you get a snapshot ID returned from the API and the snapshot status
is good and you store that snapshot ID in a data bag, then apart from any
lifetime concerns with that particular snapshot ID (e.g. something out
of your control), no obvious problem springs to mind.

I hope that clarifies what I meant.
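A minimal sketch of that rule, assuming the backup script already has a snapshot record in hand: only build a data bag item when the snapshot status is good. The helper name snapshot_data_bag_item, the record keys, and the item layout are all hypothetical; the one fixed point is that a data bag item is a hash with an "id" key. Uploading the item to the Chef server is left out here.

```ruby
require "json"

# Assumption: `snapshot` is a hash with :snapshot_id, :status and :start_time
# keys (a hypothetical shape, not a real SDK object).
#
# Returns a data bag item hash for the volume, or nil when the snapshot is
# not in the "completed" state, so a pending or errored snapshot is never
# recorded as restorable.
def snapshot_data_bag_item(volume_id, snapshot)
  return nil unless snapshot[:status] == "completed"

  {
    "id"          => volume_id,               # data bag items are keyed by "id"
    "snapshot_id" => snapshot[:snapshot_id],
    "started_at"  => snapshot[:start_time]
  }
end

# Hypothetical usage: print the item as JSON, ready to be written to a file.
item = snapshot_data_bag_item(
  "vol-1",
  { snapshot_id: "snap-ccc", status: "completed", start_time: "2012-01-27T03:00:00Z" }
)
puts JSON.pretty_generate(item) if item
```

This only guards the moment of writing. The lifetime concern mentioned above still applies: a snapshot that is deleted later leaves a stale ID in the data bag, so whatever restores from it should check the ID against the API first.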

Hector_Castro · January 27, 2012, 2:45pm

AJ, thanks for adding your feedback here.

On Jan 26, 2012, at 10:52 PM, AJ Christensen wrote:

Perhaps an ohai plugin could be developed to present node attributes
on the available snapshots for all volumes attached to the current
instance -- at least it would be accurate at the time of plugin
actually running, instead of having this delayed view of the API.

I actually like your Ohai plugin approach better – thanks.

--
Hector

Hector_Castro · February 3, 2012, 9:46pm

I took some time today to assemble AJ's suggestion of handling this via an Ohai plugin:

https://github.com/hectcastro/ohai-plugins/blob/master/ebs_snapshots.rb

Any feedback is welcome.

--
Hector

AJ_Christensen · February 3, 2012, 9:57pm

this is fucking awesome
