AWS Cookbook + Snapshot Tracking

I’m currently assembling a process that takes a Chef-bootstrapped AMI and yields a fully functional server, data included. My goal is to make the EBS snapshot IDs of the primary EBS volumes available to the bootstrapped node in some way (data bags, node attributes) so that the data the node comes up with is not stale.

The AWS cookbook has a nice LWRP for EBS that handles snapshots, but it doesn’t give you a way to save the snapshot ID for later use (unlike the create action, which saves volume IDs to node attributes).

Would there be any value in adding an array of snapshot IDs to the corresponding volume ID node attribute on snapshot creation? I mentioned it in #chef today and received some initial concern over having to maintain an accurate representation of available snapshots in the node attributes, which is a valid concern. Any other feedback or suggestions for alternative approaches are welcome.

Thanks,


Hector

I'll reiterate my points from #chef.

On 27 January 2012 16:46, Hector Castro hectcastro@gmail.com wrote:

Would there be any value in adding an array of snapshot IDs to the corresponding volume ID node attribute on snapshot creation? I mentioned it in #chef today and received some initial concern over having to maintain an accurate representation of available snapshots in the node attributes, which is a valid concern. Any other feedback or suggestions for alternative approaches are welcome.

My concerns mostly lie around using this data to make decisions on the
infrastructure. It's not the API; it's not the canonical source of
information. Sure, you could update it when you perform a snapshot,
but what if that snapshot errors out? It's a non-blocking attribute
write, so that just gets lost somewhere. If something else tried to
search for that snapshot ID, and the snapshot had errored after the
Chef run had completed, then you would obviously have to deal with
that state.

Perhaps an Ohai plugin could be developed to present node attributes
on the available snapshots for all volumes attached to the current
instance. At least it would be accurate at the time the plugin
actually runs, instead of being a delayed view of the API.

There are dragons here. Be super careful about any automatic recovery
system designed around EBS snapshots.

--AJ
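
A rough sketch of the data-shaping half of such a plugin, in plain Ruby. The EC2 lookup and the Ohai wiring are deliberately left out. The helper name `snapshots_by_volume` and the record keys (`:snapshot_id`, `:volume_id`, `:state`, `:start_time`) are hypothetical, loosely modelled on the fields EC2 reports when listing snapshots, and all IDs are made up.

```ruby
require 'time'

# Hypothetical helper: keep only completed snapshots that belong to the
# volumes attached to this instance, grouped by volume, newest first.
def snapshots_by_volume(records, attached_volume_ids)
  usable = records.select do |r|
    attached_volume_ids.include?(r[:volume_id]) && r[:state] == 'completed'
  end

  usable
    .group_by { |r| r[:volume_id] }
    .transform_values do |snaps|
      snaps
        .sort_by { |s| Time.parse(s[:start_time]) }
        .reverse
        .map { |s| s[:snapshot_id] }
    end
end

# Stand-in for an API response (assumed shape, made-up IDs).
records = [
  { snapshot_id: 'snap-aaa', volume_id: 'vol-1', state: 'completed', start_time: '2012-01-25T03:00:00Z' },
  { snapshot_id: 'snap-bbb', volume_id: 'vol-1', state: 'completed', start_time: '2012-01-26T03:00:00Z' },
  { snapshot_id: 'snap-ccc', volume_id: 'vol-1', state: 'error',     start_time: '2012-01-27T03:00:00Z' },
  { snapshot_id: 'snap-ddd', volume_id: 'vol-2', state: 'completed', start_time: '2012-01-26T03:00:00Z' }
]

# Only vol-1 is attached here, so vol-2 and the errored snapshot are dropped.
p snapshots_by_volume(records, ['vol-1'])
```

Because the result is rebuilt from the API every time the plugin runs, an errored snapshot simply never shows up, which is the point of preferring this over attributes written at snapshot time.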

In your snapshot backup script, grab the snapshot ID and store it in a
data bag.
Then your Chef script grabs the ID and loads it.

"It's a non-blocking attribute write, so that just gets lost somewhere."
This comment concerns me, however. I've built my infrastructure on the
assumption that if I get a snapshot ID returned then I'm good to go.
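
One way to picture that hand-off, as a minimal sketch: the backup script writes a data bag item keyed by volume ID, and the Chef side reads it back. The item id `ebs_snapshots`, the bag name `aws`, and both helper names are assumptions made for illustration. The upload step and the recipe-side `data_bag_item` lookup appear only in comments, since they need a Chef server.

```ruby
require 'json'

# Hypothetical helper for the backup script: merge a freshly taken
# snapshot ID into the data bag item, keyed by volume ID.
def record_snapshot(item, volume_id, snapshot_id)
  item.merge('id' => 'ebs_snapshots', volume_id => snapshot_id)
end

# Hypothetical helper for the Chef side: look up the snapshot to restore.
# In a recipe, `item` would come from data_bag_item('aws', 'ebs_snapshots').
def snapshot_for(item, volume_id)
  item.fetch(volume_id) { raise KeyError, "no snapshot recorded for #{volume_id}" }
end

item = record_snapshot({}, 'vol-1', 'snap-bbb')

# Write this JSON to a file and upload it, for example with
# `knife data bag from file aws ebs_snapshots.json`.
puts JSON.pretty_generate(item)
```

Keeping one entry per volume means the item always points at the most recently recorded snapshot; older IDs are overwritten rather than accumulated.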

If you get a snapshot ID returned from the API, the snapshot status
is good, and you store that snapshot ID in a data bag, then apart from
any lifetime concerns with that particular snapshot ID (e.g. something
out of your control), no obvious problem springs to mind.

I hope that clarifies what I meant.
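
That condition (an ID was returned and the status is good) can be checked explicitly before anything is written to the data bag. A minimal sketch, assuming a caller-supplied `fetch_state` lambda that returns the snapshot's current state as a string (`pending`, `completed`, or `error`); the helper name and its keyword arguments are hypothetical, and the lambda below is a stand-in for a real API call.

```ruby
# Hypothetical helper: poll until the snapshot is completed.
# Returns true on success, false on error or when attempts run out.
def snapshot_ready?(fetch_state, attempts: 30, delay: 10)
  attempts.times do |i|
    case fetch_state.call
    when 'completed' then return true
    when 'error'     then return false
    end
    # Wait between polls, but not after the final attempt.
    sleep(delay) if i < attempts - 1
  end
  false
end

# Stand-in for real API calls: pending twice, then completed.
states = %w[pending pending completed]
fake_fetch = -> { states.shift }

if snapshot_ready?(fake_fetch, attempts: 5, delay: 0)
  puts 'safe to record the snapshot ID'
else
  puts 'do not record the snapshot ID'
end
```

Recording the ID only after this check passes covers the case where a snapshot errors out after the ID was handed back.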

On 27 January 2012 16:57, Edward Sargisson esarge@pobox.com wrote:

In your snapshot backup script grab the snapshot id and store that in a data
bag.
Then your chef script grabs the id and loads that.

It's a non blocking attribute write, so.. that just gets lost somewhere.
However, this comment concerns me. I've built my infrastructure on the
assumption that if I get a snapshot id returned then I'm good to go.

AJ, thanks for adding your feedback here.

On Jan 26, 2012, at 10:52 PM, AJ Christensen wrote:

Perhaps an ohai plugin could be developed to present node attributes
on the available snapshots for all volumes attached to the current
instance -- at least it would be accurate at the time of plugin
actually running, instead of having this delayed view of the API.

I actually like your Ohai plugin approach better – thanks.

--
Hector

I took some time today to assemble AJ's suggestion of handling this via an Ohai plugin:

https://github.com/hectcastro/ohai-plugins/blob/master/ebs_snapshots.rb

Any feedback is welcome.

--
Hector

On Friday, January 27, 2012 at 9:45 AM, Hector Castro wrote:

AJ, thanks for adding your feedback here.

this is fucking awesome

On 4 February 2012 10:46, Hector Castro hectcastro@gmail.com wrote:

I took some time today to assemble AJ's suggestion of handling this via an
Ohai plugin:

https://github.com/hectcastro/ohai-plugins/blob/master/ebs_snapshots.rb

Any feedback is welcome.

--
Hector
