The future of the database and application cookbooks


#1

As some people have noticed from my musings on IRC and elsewhere, I have embarked on a (probably long overdue) overhaul of the application and database cookbooks. This is largely orthogonal (for now) to the recent LWRPs added to the database cookbook, though in the end they would be removed. What follows is a set of syntax ideas for where I think this should head. Not all of this is implemented, but you can see what is in my cookbooks branch: https://github.com/coderanger/cookbooks/tree/COOK-634.

Database clusters

In the new LWRP hierarchy the top-level resource is a database_cluster. This defines an abstract model of a single-master, multi-slave cluster setup. Within a cluster are one or more database_servers, each of which defines a single type of database service (MySQL, Postgres, Redis, Cassandra, etc.) mapped onto the containing cluster. In general I would expect most clusters to contain only a single database_server block, but this isn't required: if you have a beefy primary box for, say, both Postgres and Redis and want both to use the same backup machine, you could put them both in the same cluster definition. Within a database_server you can define databases and users, though this isn't needed for all types of servers, and the exact implementation of what those two constructs do is left up to the backend plugin implementing the specified server type.

With all that laying out the model, let's look at some syntax:

database_cluster "prod" do
  master_role "prod_database_master"
  slave_role "prod_database_slave"

  database_server do
    type :mysql
    database "prod" do
      engine "innodb"
    end
    user "myuser" do
      password "xxx"
      grant do
        select "prod"
        insert "prod"
        update "prod"
      end
    end
  end

  database_server do
    type :redis
  end
end

This will create a cluster running a MySQL server with one database named prod, plus a Redis server (in the case of Redis there are no specific tuning params we need to worry about so far). This example shows a very verbose form of the desired syntax; below is an equivalent version using some more sugar and allowing for sane defaults:

database_cluster "prod" do
  mysql do
    database do
      engine "innodb"
    end
    user "myuser" do
      password "xxx"
      grant "prod" do
        select
        insert
        update
      end
    end
  end

  redis
end

You can also pass Hashes to database and user instead of blocks if you want to feed them from an external data source (such as a data bag):

database_cluster "prod" do
  mysql do
    database "prod", {:engine => "innodb"}
    user "myuser", {:password => "xxx", :grant => {:select => "prod", …}}
  end

  redis
end
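The hash form makes it straightforward to drive the DSL from an external data source. As a rough standalone sketch (the collect_* helpers and the hash layout are my own inventions, not part of the proposed cookbook), a recipe could reshape a data-bag-style item into the parameter hashes the DSL expects:

```ruby
# Hypothetical helpers that flatten a data-bag-style hash into the
# parameter hashes the DSL would hand to a backend. The hash layout and
# helper names are illustrative only.

def collect_databases(config)
  config.fetch("databases", {}).map do |name, opts|
    { name: name, engine: opts["engine"] }
  end
end

def collect_users(config)
  config.fetch("users", {}).map do |name, opts|
    { name: name, password: opts["password"], grant: opts["grant"] }
  end
end

# Shaped like a data bag item a recipe might load.
bag_item = {
  "databases" => { "prod" => { "engine" => "innodb" } },
  "users"     => { "myuser" => { "password" => "xxx",
                                 "grant"    => { "select" => "prod" } } }
}

dbs   = collect_databases(bag_item)
users = collect_users(bag_item)
```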

As mentioned before, the actual implementation can choose how to handle databases and users. In the above examples, if you added any to the redis section it would simply be a no-op (or maybe a runtime error?). Also, as a structural piece, database and user definitions will only ever be processed on the database master (are there databases where this isn't the case and slaves should create users and such too?). The individual backends for this would live in the cookbook for that program, so the database cookbook would only hold the core infrastructure and plumbing to run things.

The idea is that this resource block would be placed somewhere central and assigned to all servers so they can use it for reference (see below in the app cookbook section). Only nodes with the master or slave role would actually do anything. As a special case, if the system detects that the master role doesn't actually exist, it will run in a degraded mode assuming a single-node cluster (so obviously the current node is the master).
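One way to picture this backend split is a small registry that dispatches database/user creation to per-type backends, with Redis as a no-op. This is purely an illustrative sketch in plain Ruby; none of the class or method names come from the actual cookbook:

```ruby
# Toy registry mapping a server type to the backend that implements it.
module DatabaseBackends
  BACKENDS = {}

  def self.register(type, backend)
    BACKENDS[type] = backend
  end

  def self.for(type)
    BACKENDS.fetch(type)
  end
end

# A SQL-style backend actually implements database/user creation.
class MysqlBackend
  def create_database(params)
    "CREATE DATABASE #{params[:name]} (engine hint: #{params[:engine]})"
  end

  def create_user(params)
    "CREATE USER #{params[:name]}"
  end
end

# Redis has no notion of named databases or users, so these are no-ops
# (the alternative discussed above would be raising a runtime error).
class RedisBackend
  def create_database(_params); nil; end
  def create_user(_params); nil; end
end

DatabaseBackends.register(:mysql, MysqlBackend.new)
DatabaseBackends.register(:redis, RedisBackend.new)

sql  = DatabaseBackends.for(:mysql).create_database(name: "prod", engine: "innodb")
noop = DatabaseBackends.for(:redis).create_database(name: "prod")
```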

The database cookbook would also grow an associated knife-database plugin to handle replication setup/teardown, since for some databases you have to perform an out-of-band data sync before attaching the slave.

Seen in isolation, does this seem suitably flexible to cover most needs? I am mostly familiar with SQL databases and a few smaller NoSQL products, so for the greater NoSQL world, is this still a viable model?

Application deployment

The model for the application LWRPs is generally simpler in terms of data, but is also more callback-driven. The top-level application resource just contains the central data used for all applications, and then any framework or platform specific bits are contained in sub-resources similar to the database_server arrangement. Below is an example of deploying the Radiant CMS:

application "radiant" do
  path "/srv/radiant"
  owner "nobody"
  group "nogroup"
  repository "git://github.com/radiant/radiant.git"
  revision "master"
  packages ["libxml2-dev", "libxslt1-dev", "libsqlite3-dev"]
  migrate true

  rails do
    gems "bundler" => nil, "RedCloth" => "4.2.3"
    database :mysql => "radiant_production" do
      user "radiant"
      reconnect true
      encoding "utf8"
    end
    migration_command "bundle exec rake db:migrate"
  end

  unicorn do
    port 8000
  end
end

The global data maps very directly to fields in the existing application data bags, except that they are not Hashes indexed by environment name. In the rails sub-resource, things again mostly map to the data bag. The biggest exception is the database section, which doesn't actually state any information about the database. Instead, things are looked up by reference to the information in the database resource. It would still be possible to specify all information manually if you aren't using the database cookbook.
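As a hedged illustration of this lookup-by-reference idea (the CLUSTERS hash, the resolve_database helper, and the hostname below are all hypothetical, not cookbook API), resolution could amount to merging cluster-level data with per-application overrides:

```ruby
# Invented data: a cluster definition holding the connection details an
# application would look up. In a real run the host would come from the
# node carrying the master role.
CLUSTERS = {
  "prod" => {
    mysql: {
      host: "db-master.example.com",
      databases: { "radiant_production" => { encoding: "utf8" } }
    }
  }
}

# Resolve a connection hash from the cluster plus app-level overrides.
def resolve_database(cluster_name, type, db_name, overrides = {})
  server = CLUSTERS.fetch(cluster_name).fetch(type)
  db     = server.fetch(:databases).fetch(db_name)
  { adapter: type.to_s, host: server[:host], database: db_name }
    .merge(db)
    .merge(overrides)
end

conn = resolve_database("prod", :mysql, "radiant_production",
                        user: "radiant", reconnect: true)
```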

Application backends define actions more or less as callbacks, to provide code to execute at different points during the deploy process. The current callbacks are :before_compile, :before_deploy, :before_migrate, :before_symlink, :before_restart, and :after_restart. The last four map directly to the existing callbacks in the deploy resource system. before_compile takes place as the first thing when the application resource executes (that is, when the provider is compiled), while before_deploy takes place during the execute phase in the provider, but after the main application folder structure is created.
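The ordering can be modeled with a toy lifecycle runner; only the six callback names come from the proposal, and the class itself is invented for illustration:

```ruby
# Toy model of the callback ordering: :before_compile fires during the
# compile phase, the rest during the execute phase, in the order listed.
class ApplicationLifecycle
  CALLBACKS = [:before_compile, :before_deploy, :before_migrate,
               :before_symlink, :before_restart, :after_restart]

  def initialize
    @hooks = Hash.new { |h, k| h[k] = [] }
  end

  # Register a hook, as an application backend would.
  def on(name, &block)
    raise ArgumentError, "unknown callback #{name}" unless CALLBACKS.include?(name)
    @hooks[name] << block
  end

  # Returns the callbacks in the order they fired.
  def run
    fired = []
    fire(:before_compile, fired)                     # compile phase
    CALLBACKS.drop(1).each { |cb| fire(cb, fired) }  # execute phase
    fired
  end

  private

  def fire(name, fired)
    @hooks[name].each(&:call)
    fired << name
  end
end

app = ApplicationLifecycle.new
log = []
app.on(:before_migrate) { log << "prepare migrations" }
order = app.run
```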

The plan is to convert all the existing application cookbook recipes into a small wrapper just piping data bags into this LWRP structure, so they should continue to work as they always have.

So, what do people think? Is this a step in the right direction for application deployment? Do you like the idea but want a different syntax for it? Are there big, glaring use cases I've missed?

–Noah Kantrowitz

tl;dr Application deployment with Chef is going to be awesome!


#2

Thanks for the opportunity to discuss, always a good thing to hear different opinions.

As an engineer I totally understand the desire to DRY up these common patterns, but IMO these multi-stack/generic cookbooks add complexity to the cookbook and pull in lots of dependencies I don’t use, cluttering up my cookbook repo.

I see this being similar to the 'Rails' database agnosticism of the migrations DSL. I've only used it once, and that was due to engineering curiosity about how well it worked using SQLite locally and a 'real' db in production. In practice, there has always been something that forced me to go outside the DSL and run db-specific DDL. So in my experience you really don't utilize this particular db agnosticism feature once you pick a stack, but it adds lots of engineering effort for the feature implementor to cover all the variations in platforms.

With databases in particular, each vendor has its peculiarities, so I see lots of rat holes in a single cookbook that covers different databases and deployment patterns. I understand that's the reason for providers, but I think the engineering effort is better spent on a kick-ass db-specific cookbook versus a superficial cookbook that kinda/sorta works across DBs.

Alex

On Aug 30, 2011, at 7:39 PM, Noah Kantrowitz wrote:



#3

I would agree with Alex. Trying to force MySQL, Redis, and Riak (as examples) into the same generic DB cookbook abstraction will turn into a cookbook that in theory works universally but in practice works well for no one, it seems to me, for the same reasons that trying to force a MySQL data model onto Riak will fail. These are fundamentally different systems, with only superficial similarities.

-J

Sent via iPhone


On Aug 30, 2011, at 22:26, Alex Soto apsoto@gmail.com wrote:



#4

Can you point out anything specific that is incompatible with what I've said? The general structure of having callbacks for "create server", "attach slave", "create database", "create user", etc. seems suitably generic to me, as no assumption about the internals of those is made and most of the database DSL is just collecting the params into hashes to send to the backend. If the end point of this is that it can only work for MySQL, Postgres, and SQL Server, that still seems pretty useful to me at least.
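To make the param-collection point concrete, here is a toy sketch of how instance_eval can reduce a user block to a plain Hash to hand to a backend; none of these classes exist in the real cookbook, and the attribute names simply mirror the examples earlier in the thread:

```ruby
# Collects the attributes of a `user` block into a Hash.
class UserParams
  def initialize(name)
    @params = { name: name }
  end

  def password(value)
    @params[:password] = value
  end

  def grant(db, &block)
    grants = GrantParams.new(db)
    grants.instance_eval(&block)
    @params[:grant] = grants.to_h
  end

  def to_h
    @params
  end
end

# Collects bare privilege calls (select/insert/update) for one database.
class GrantParams
  def initialize(db)
    @db = db
    @privileges = []
  end

  [:select, :insert, :update].each do |priv|
    define_method(priv) { @privileges << priv }
  end

  def to_h
    { @db => @privileges }
  end
end

def user(name, &block)
  u = UserParams.new(name)
  u.instance_eval(&block)
  u.to_h  # the resulting Hash is what would be sent to the backend
end

params = user "myuser" do
  password "xxx"
  grant "prod" do
    select
    insert
    update
  end
end
```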

–Noah

On Aug 30, 2011, at 10:12 PM, Jason J. W. Williams wrote:


#5

Hey Noah,

As you mentioned, at least restricting it to relational (SQL) servers simplifies things.

There's nothing 'wrong' with your DSL and its abstractions. There definitely are patterns there that you've documented. However, there are other use cases not listed where other implementation difficulties may lie.

  • adding/removing slave nodes
  • making sure that if a slave node is promoted to a master it doesn't cause problems when Chef next runs (and vice versa, when a previous master recovers)
  • creating a db from an existing backup (and how many variations of what a 'backup' is that entails)

Those are just a few that come to mind. That's where making an all-encompassing DSL sounds great and looks cool, but is LOTS of work: unless you can realistically account for every db server beforehand, every time you add a new platform it's going to have its issues that probably require more refactoring of the existing DSL, causing extra work for users as the DSL's interface changes.

The engineer in me is not trying to discourage the discovery of these patterns; my instincts just tell me it’s a lot of work. I’m simply warning of the size of the effort, and your time (of which there is a finite amount) might produce better results with a reduced scope.

My experience as a Chef user tells me that I’ve preferred using software/tool-specific cookbooks over the generic ones, because there are fewer abstractions to wrap my head around and I can get things done faster.

Alex

On Aug 30, 2011, at 10:24 PM, Noah Kantrowitz wrote:

Can you point out anything specific that is incompatible with what I’ve said? The general structure of having callbacks for “create server”, “attach slave”, “create database”, “create user”, etc. seems suitably generic to me, as no assumption about the internals of those is made and most of the database DSL is just collecting the params into hashes to send to the backend. If the end point of this is that it only works for MySQL, Postgres, and SQL Server, that still seems pretty useful to me at least.
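That params-into-hashes pattern can be sketched in a few lines of plain Ruby (a toy illustration with invented names, not the branch’s actual code):

```ruby
# Hypothetical sketch: a tiny builder that gathers DSL calls into a plain
# Hash, which is all the backend plugin would ever see.
class UserParams
  attr_reader :params

  def initialize(name, &block)
    @params = { name: name, grant: {} }
    instance_eval(&block) if block
  end

  def password(value)
    @params[:password] = value
  end

  # Record privileges per database; taken as a simple list for brevity.
  def grant(database, *privileges)
    @params[:grant][database] = privileges
  end
end

user = UserParams.new("myuser") do
  password "xxx"
  grant "prod", :select, :insert, :update
end
# user.params is now a plain Hash ready to hand off to a backend.
```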

–Noah

On Aug 30, 2011, at 10:12 PM, Jason J. W. Williams wrote:

I would agree with Alex. Trying to force MySQL, Redis, and Riak (as examples) into the same generic DB cookbook abstraction will turn into a cookbook that in theory works universally but in practice works well for no one, it seems to me… for the same reasons that trying to force a MySQL data model onto Riak will fail. These are fundamentally different systems, with only superficial similarities.

-J

Sent via iPhone

On Aug 30, 2011, at 22:26, Alex Soto apsoto@gmail.com wrote:

Thanks for the opportunity to discuss, always a good thing to hear different opinions.

As an engineer I totally understand the desire to DRY up these common patterns, but IMO these multi-stack/generic cookbooks add complexity and pull in lots of dependencies I don’t use, cluttering up my cookbook repo.

I see this as similar to the database agnosticism of the Rails migrations DSL. I’ve only used it once, and that was out of engineering curiosity about how well it worked using SQLite locally and a ‘real’ db in production. In practice, there has always been something that forced me to go outside the DSL and run db-specific DDL. So in my experience you really don’t use this particular db-agnosticism feature once you pick a stack, but it adds lots of engineering effort for the feature implementor to cover all the variations in platforms.

With databases in particular, each vendor has its peculiarities, so I see lots of rat holes in a single cookbook that covers different databases and deployment patterns. I understand that’s the reason for providers, but I think the engineering effort is better spent on a kick-ass db-specific cookbook versus a superficial cookbook that kinda/sorta works across DBs.

Alex



#6

Hi Noah,


I’m less qualified to speak on the overlap between MySQL and Postgres
for concepts like users and replication, just because I don’t have
much experience with Postgres, but my gut tells me there’s probably
enough difference in things like “master replication log name” and
“master replication log position” that the DSL will quickly become non-trivial
to cover all the bases. With systems like Riak, as Denis noted,
the concept of a slave doesn’t really exist except in the enterprise
edition, and the concept of a user is pretty much non-existent.
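For MySQL specifically, attaching a slave means feeding those binlog coordinates into a CHANGE MASTER TO statement; a sketch of the kind of per-backend detail a generic DSL would have to carry (the helper name is invented):

```ruby
# Hypothetical sketch: build a MySQL CHANGE MASTER TO statement from the
# per-backend hash of replication coordinates (master log file/position)
# that a generic database DSL would need to pass through.
def change_master_sql(opts)
  format(
    "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', " \
    "MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d",
    opts.fetch(:host), opts.fetch(:user),
    opts.fetch(:log_file), opts.fetch(:log_pos)
  )
end
```

Postgres streaming replication needs an entirely different set of knobs for the same "attach slave" step, which is exactly where the abstraction gets thin.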

We’ve tended to address “clusters” of MySQL pairs with data bags that
determine who is the master or not, rather than roles for masters and
slaves. Mostly, because for us those two types of MySQL servers have
very similar configurations. The differences are in what the cluster
is responsible for and who is the master, and we prefer to have that
all in one place in a single data bag.
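That data-bag approach might look roughly like this (the item structure below is made up for illustration, not their actual bag):

```ruby
# Hypothetical data bag item for a MySQL pair; in a real recipe this would
# come from something like data_bag_item("mysql_clusters", "prod").
cluster = {
  "id"      => "prod",
  "master"  => "db1.example.com",
  "members" => ["db1.example.com", "db2.example.com"]
}

# The recipe then branches on bag membership rather than on Chef roles.
def mysql_role_for(node_name, cluster)
  return :none unless cluster["members"].include?(node_name)
  cluster["master"] == node_name ? :master : :slave
end
```

Promoting a slave is then a single data bag edit rather than a role reshuffle, which is part of the appeal when both node types share most of their configuration.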

-J


#7

I agree with Jason too.
Noah, Riak simply has a different architecture: it’s a distributed database.
As for masters and slaves, again, Riak doesn’t have them, and it’s impossible
to create a user or a database there. NoSQL databases really are a different
kind of system than SQL ones. I think the various databases have too many
differences, and it’s hardly possible to put all the functionality into one
general recipe. The idea might suit PostgreSQL, MySQL, and perhaps other SQL
dbs, but I don’t know if it’s practically needed.

Denis



#8

Howdy, Noah –

I have a local, unreleased cookbook for building PostgreSQL 9.0 clusters
using synchronous replication (with HAProxy balancing between only those
slaves which are currently in sync with the master, with the help of
2ndQuadrant’s repmgr and a small locally-developed web service
querying/reporting on status as a layer-7 check for HAProxy). This cookbook
tracks configuration options which can be updated live in a separate config
file from those which require a service restart, such that it knows when a
service reload is adequate to apply changes. It also creates and distributes
SSH keys between database slaves in a cluster to allow repmgr to do its work
– with this cookbook correctly deployed, bringing up a new database slave
and bringing it fully into sync is completely automated.
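The reload-versus-restart tracking mentioned above boils down to partitioning the changed settings; a minimal sketch, with an illustrative (not exhaustive) subset of postgresql.conf parameters:

```ruby
# Hypothetical sketch: decide whether changed PostgreSQL settings can be
# applied with a reload (SIGHUP) or need a full restart. These lists are a
# small illustrative subset of the real postgresql.conf options.
RELOAD_OK   = %w[log_min_duration_statement work_mem authentication_timeout]
RESTART_REQ = %w[shared_buffers listen_addresses max_connections wal_level]

def action_for(changed_settings)
  return :nothing if changed_settings.empty?
  changed_settings.any? { |s| RESTART_REQ.include?(s) } ? :restart : :reload
end
```

Keeping the reload-safe options in a separate config file, as the cookbook described above does, makes it easy to notify only the cheaper action when possible.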

I store the relevant configuration in attributes – this makes it easier for
other code to refer to the database configuration. As presently written, it
also leverages search to find eligible clients (for both firewall and
database-level access control setup); perhaps in the context of something
built for wider use, support for running in solo environments would trump
the convenience of relying on search. Also, it presently relies on my
(published) shorewall cookbook for firewall setup – a dependency which may
or may not be desired.

I haven’t been tracking the database cookbook you maintain. That said, would
any part of what I described above be useful to your work? Cleaning this
code up for release has been on my TODO list for a while; if I knew of
someone likely to make actual use of it in the near future, it would provide
that much more impetus to get it out the door.

Changes I might make to the proposed DSL, as a power user: In addition to
declaring “user”, declaring a “client” would permit access control setup.
For users and roles, it’s unclear to me how you distinguish between
configuring a database cluster and configuring databases within that
cluster (where in PostgreSQL terms a “database cluster” is a group of
databases stored under a single top-level directory and run by a single
instance of the daemon). In PostgreSQL, for instance, non-superuser
privileges are generally scoped to a single database, not cluster-wide
(though they’re replicated out across failover clusters, they don’t span
the whole of a local cluster)… whereas SR (like archive-shipping
replication) configuration is global to the complete database cluster [in,
again, the local/one-daemon sense]. If one is doing database-level
configuration in addition to cluster-level configuration (which my current
cookbook does), it would also be nice to have support for some
PostgreSQL-specific extensions – in particular, enabling languages [e.g.
PL/pgSQL] and extensions [e.g. HSTORE] requires superuser privileges which one
doesn’t necessarily want to grant to users… and of course performance
tuning parameters need to be accessible. Some parameters extend beyond
configuring the database itself, as well – how far behind the master the
slaves are allowed to get matters not only to the slaves’ configuration, but
also to whatever layer you’re using for your clients to decide which slave
to connect to (in my case, the little HTTP service running layer-7 checks
for HAProxy). [There’s also local policy – how far behind the master’s
current state you want it to store archive logs for limits how far
out-of-sync a slave is able to get and still be able to catch up via SR as
opposed to being resynchronized out-of-band, as with rsync]
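For PostgreSQL, the proposed “client” declaration could reduce to generating pg_hba.conf entries; a hedged sketch (the input format here is invented for illustration):

```ruby
# Hypothetical sketch: turn declared clients into pg_hba.conf lines.
# The declaration format is made up; a real cookbook would likely derive
# the client list from search or node attributes.
def pg_hba_lines(clients)
  clients.map do |c|
    # host  <database>  <user>  <address>  <auth-method>
    format("host  %-10s  %-10s  %-18s  %s",
           c[:database], c[:user], c[:address], c.fetch(:method, "md5"))
  end
end

clients = [
  { database: "prod", user: "radiant",   address: "10.0.0.0/24" },
  { database: "prod", user: "reporting", address: "10.0.1.5/32", method: "reject" }
]
# pg_hba_lines(clients) would then be rendered into pg_hba.conf by a template.
```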

I’m not sure that the proposed DSL would be sufficient to cover the kind of
deployments I do, even then – I don’t see a way to indicate which node the
load balancer would live on, for instance; putting it also on the master
means that when you lose your master, you also lose your read-only slaves
– not great if you’re trying to write your application to gracefully
degrade. (That load balancer, being a single point of failure for all the
slaves [if you have clients using them for read-only queries, of course,
which is the case where it matters], ideally belongs deployed on multiple
machines sharing a common VIP – but IP failover is a different cookbook
which also needs a fair bit of cleanup before I can think about releasing
it. sigh)

…and then there’s all the policy decisions about how you want to decide
when to promote a slave to master. (Short form: Unless you have the ability
to create consistent filesystem snapshots which are immediately accessible
on your other nodes, “failing back” an old master into the cluster, even as
a slave, can be an expensive operation – making automated failovers for
large databases risky in environments without the relevant storage
capabilities, as they leave you in a reduced-redundancy configuration until
failback is completed). That’s a whole 'nother kettle of fish.

(Oh – and while we’re talking about database replication functionality
involving policy decisions hard to make on a one-size-fits-all basis,
there’s PITR – point-in-time recovery; supporting this properly on
PostgreSQL means using the “hot backup” mechanism and storing replay logs
from any backup you want to be able to run forward from. Managing backup and
archive storage in a PITR environment is fraught with peril and dangerous
edge conditions [if you let your log archives take up enough space that you
can’t run another backup successfully, including storage for the archives
generated during that new backup, you lose; if you have a temporary failure
in the archival system and the un-archived logs eat available space in the
partition holding your $PGDATA, you lose; etc]. If you’re ever thinking of
supporting PITR [and, as it’s one of the few things that will help you
recover from someone leaving a WHERE clause off their DELETE command without
losing everything that happened since the last backup prior, implementing it
tends to be worthwhile], ping me on IRC some time and I’ll try to dredge up
some experiences, horror stories and/or legends out of my memory. For
PostgreSQL versions prior to 9.0, SR isn’t available – so the hot backup +
log archival combination used for PITR was what folks used for running
standby nodes unless they were leveraging an external product such as
Slony).

I’d really like to see a public database cookbook that’s good enough for
power users, and am thrilled to see serious thought being given to the
subject. Let me know if I can help.
