Hey, Chefs –
As we’ve rolled out Chef to larger and larger sites, I’ve discovered that
CouchDB is a significant scaling problem. The time to compact the database
and views has risen (and risen, and risen), and search performance has
declined. I’ve heard rumblings that Hosted Chef is no longer using Couch
as its backend, which suggests that Opscode might have hit this pain as
well.
There’s obviously some tuning that we could do to Couch which would make it
behave better, but I need to scale it to approx. 40x what it’s currently
doing in order to roll out Chef to the remaining sites in my environment,
and I’m extremely skeptical that anything – even the most aggressive
tuning – will make that work.
I’m wondering a few things:
- Are there existing branches of Chef (within Opscode or otherwise) that
  aim to replace CouchDB with another database (be it a document-based store
  or something more relational)?
- Are any of those branches public, or are there plans to roll those
  changes into the open source version?
- If those branches don’t exist, how much interest is there on the list
  in removing the dependency on CouchDB? Would other people be interested in
  participating in a project like that?
- Ian
--
Ian Marlier | Senior Systems Engineer
Brightcove, Inc.
One Cambridge Center, 12th Floor, Cambridge, MA 02142
imarlier@brightcove.com
On Feb 11, 2012, at 8:48 AM, Ian Marlier wrote:
> Hey, Chefs --
> As we've rolled out Chef to larger and larger sites, I've discovered that CouchDB is a significant scaling problem. The time to compact the database and views has risen (and risen, and risen), and search performance has declined. I've heard rumblings that Hosted Chef is no longer using Couch as its backend, which suggests that Opscode might have hit this pain as well.
> There's obviously some tuning that we could do to Couch which would make it behave better, but I need to scale it to approx. 40x what it's currently doing in order to roll out Chef to the remaining sites in my environment, and I'm extremely skeptical that anything -- even the most aggressive tuning -- will make that work.
> I'm wondering a few things:
> - Are there existing branches of Chef (within Opscode or otherwise) that aim to replace CouchDB with another database (be it a document-based store or something more relational)?
> - Are any of those branches public, or are there plans to roll those changes into the open source version?
> - If those branches don't exist, how much interest is there on the list in removing the dependency on CouchDB? Would other people be interested in participating in a project like that?
The short version: yes, we are moving the backend from Sinatra (Ruby) + CouchDB to WebMachine (Erlang) + SQL (MySQL and Postgres specifically). No, this code isn't public yet, because the migration is ongoing and it isn't really the kind of thing that we want to support others running right now. We are switching over API endpoints as they are written, usually moving the data into SQL as each endpoint is rewritten in Erlang, so right now it requires a lot of magic in the load balancers to keep everything flowing smoothly. The general goal is to open-source it once it is fully converted and in a state where other people could run it without going insane, but there aren't specific plans or timelines yet. Wish I could give you something firmer, but hopefully that sheds some light on the situation.
--Noah
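To make the endpoint-by-endpoint cutover above a bit more concrete, here is a minimal Ruby sketch of the kind of routing decision a load balancer would have to make during such a migration. It is only an illustration: the path prefixes and backend labels are hypothetical and are not taken from Opscode's actual setup.

```ruby
# Hypothetical sketch: path prefixes whose endpoints have already been
# ported to the new service. These names are illustrative assumptions.
MIGRATED_PREFIXES = ["/nodes", "/roles"].freeze

# Decide which backend should serve a given request path.
# A prefix matches the exact path or anything below it ("/nodes/web01"),
# but not a path that merely starts with the same characters ("/nodesX").
def backend_for(path)
  migrated = MIGRATED_PREFIXES.any? do |prefix|
    path == prefix || path.start_with?("#{prefix}/")
  end
  migrated ? :erlang_sql : :ruby_couchdb
end
```

With a table like this, porting another endpoint only means adding its prefix to the list once its data lives in SQL; everything else keeps going to the old stack.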
+------------------------------------------------------------------------------
| On 2012-02-11 10:33:21, Noah Kantrowitz wrote:
|
| The short version: yes, we are moving the backend from Sinatra (Ruby) + CouchDB to WebMachine (Erlang) + SQL (MySQL and Postgres specifically). No, this code isn’t public yet, because the migration is ongoing and it isn’t really the kind of thing that we want to support others running right now. We are switching over API endpoints as they are written, usually moving the data into SQL as each endpoint is rewritten in Erlang, so right now it requires a lot of magic in the load balancers to keep everything flowing smoothly. The general goal is to open-source it once it is fully converted and in a state where other people could run it without going insane, but there aren’t specific plans or timelines yet. Wish I could give you something firmer, but hopefully that sheds some light on the situation.
After losing multiple days of my life to CouchDB eating itself in the last few
weeks, I am avidly waiting for this to drop, whenever it does.
Cheers.
bdha
cyberpunk is dead. long live cyberpunk.
This discussion came up a couple of times at the summit. One thing that
was offered as a short-term solution was looking into BigCouch. I had
a test environment up on BigCouch, but I never got the chance to
really put it through its paces. It might be useful for you to give it a
shot.
On Sat, Feb 11, 2012 at 7:12 PM, Bryan Horstmann-Allen
bdha@mirrorshades.net wrote:
> +------------------------------------------------------------------------------
> | On 2012-02-11 10:33:21, Noah Kantrowitz wrote:
> |
> | The short version: yes, we are moving the backend from Sinatra (Ruby) + CouchDB to WebMachine (Erlang) + SQL (MySQL and Postgres specifically). No, this code isn't public yet, because the migration is ongoing and it isn't really the kind of thing that we want to support others running right now. We are switching over API endpoints as they are written, usually moving the data into SQL as each endpoint is rewritten in Erlang, so right now it requires a lot of magic in the load balancers to keep everything flowing smoothly. The general goal is to open-source it once it is fully converted and in a state where other people could run it without going insane, but there aren't specific plans or timelines yet. Wish I could give you something firmer, but hopefully that sheds some light on the situation.
> After losing multiple days of my life to CouchDB eating itself in the last few
> weeks, I am avidly waiting for this to drop, whenever it does.
> Cheers.
> bdha
> cyberpunk is dead. long live cyberpunk.
A couple of thoughts.
- You may want to leverage ZooKeeper.
- You may want to consider TorqueBox.
I know there are downsides and upsides to everything, but both of these
projects have tackled the scalability problem and are quite mature.
TorqueBox especially seems very well suited for a Chef server. It
comes with a robust message queue, background processing, caching, etc., as
well as clustering with JBoss. As a bonus, you can keep all your
Sinatra code, as it runs any Rack application.
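For anyone unfamiliar with why "any Rack application" matters here: a Rack application is just an object that responds to `call(env)` and returns a status, a headers hash, and a body that yields strings. The sketch below is a hypothetical stand-in (the `HelloApp` name and response text are made up for illustration); a Sinatra application satisfies the same contract, which is what lets it be moved between Rack-compatible servers.

```ruby
# Minimal Rack-style application. HelloApp is a hypothetical example,
# not part of Chef; it only demonstrates the call(env) contract.
class HelloApp
  # env is the Rack environment hash; PATH_INFO holds the request path.
  def call(env)
    body = "Hello from #{env['PATH_INFO']}\n"
    headers = {
      "content-type"   => "text/plain",
      "content-length" => body.bytesize.to_s
    }
    # Rack response triplet: status, headers, body (an enumerable of strings).
    [200, headers, [body]]
  end
end
```

In a `config.ru` file this would be started with `run HelloApp.new`, and a Sinatra application class can be passed to `run` in the same way.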