How to split up chef stack?

Ohai Chefs,

I’m wondering if anyone can point me to documentation or any write-ups on
their Life Experience™ with regard to splitting up the Chef server stack?

More specifically, I already have figured out how to split couch from one
web/api application server, which I found to be rather easy.

However, I’m wondering whether or not I need to centralize any other
components as I start to add more api servers. Can
rabbitmq/solr/chef-expander all run on each application server, or do any
of them need to be centralized on the couch server?

If it helps, I plan to round-robin the requests without stickiness.

Any advice would be most helpful!

Thank you,
Brian

On Fri, Oct 5, 2012 at 12:30 PM, Brian Hatfield
bhatfield@brightcove.com wrote:

> However, I'm wondering whether or not I need to centralize any other
> components as I start to add more api servers. Can
> rabbitmq/solr/chef-expander all run on each application server, or do any of
> them need to be centralized on the couch server?

Well, you can't just trivially split rabbit/solr/expander, because you
won't ever get coherent search results, and even if you did,
round-robin requests would be likely to return inconsistent search
results. If chef-expander is a bottleneck for you, you could configure
rabbit so multiple servers participate in the queue and run multiple
expanders across multiple servers. You'd still need a single solr
master, though, and I sorta doubt that chef-expander is your bottleneck
anyway. Splitting off solr by itself likely makes sense if you're
search-heavy, as solr can eat a lot of memory. If that isn't enough, you
could consider solr replication to read-slaves, the usual pattern
for scaling solr. Again, be aware of replication delay if you're
orchestrating via search.
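
To make the consistency point concrete, here is a minimal toy sketch in Python. It is not Chef code: the SearchIndex and ApiServer classes and the node names are made up for illustration, and SearchIndex merely stands in for one rabbit -> expander -> solr pipeline. It compares giving each api server a private index against pointing both at one shared index, with every request sent round-robin.

```python
from itertools import cycle


class SearchIndex:
    """Toy stand-in for one rabbit -> expander -> solr pipeline (hypothetical)."""

    def __init__(self):
        self.docs = set()

    def add(self, name):
        self.docs.add(name)

    def search(self):
        return sorted(self.docs)


class ApiServer:
    """Toy api server that writes to, and searches, exactly one index."""

    def __init__(self, name, index):
        self.name = name
        self.index = index

    def save_node(self, node_name):
        self.index.add(node_name)

    def search_nodes(self):
        return self.index.search()


def run(servers, nodes, searches=2):
    """Save every node, then search, sending each request round-robin."""
    rr = cycle(servers)  # no stickiness: each request goes to the next server
    for node in nodes:
        next(rr).save_node(node)
    return [next(rr).search_nodes() for _ in range(searches)]


nodes = ["web1", "web2", "db1"]  # made-up node names

# Layout A: each api server has its own private index.
split = [ApiServer("chef-api-01", SearchIndex()),
         ApiServer("chef-api-02", SearchIndex())]
split_results = run(split, nodes)

# Layout B: both api servers share one index on a backend host.
shared_index = SearchIndex()
shared = [ApiServer("chef-api-01", shared_index),
          ApiServer("chef-api-02", shared_index)]
shared_results = run(shared, nodes)

print(split_results)   # [['web2'], ['db1', 'web1']]
print(shared_results)  # [['db1', 'web1', 'web2'], ['db1', 'web1', 'web2']]
```

In layout A two back-to-back searches disagree, and neither sees all three nodes. In layout B every search sees the same complete set, which is why the search side wants a single home.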

From what I've heard, people's bottlenecks tend to be:
- cpu for couchdb -> you already split this off
- memory for the api server -> split off solr so it doesn't have to
  share memory, and if that isn't enough, start horizontally scaling
  api servers on their own

Also be aware that Chef 11 will be a big change to the stack with
large performance improvements.

KC

Thanks for your response. I apologize for being unclear, but I guess what I
was asking was:

Assuming my architecture looks like:

chef-api-01 + chef-api-02

both pointing to

chef-couchdb-01

Does rabbit/solr/expander need to be on the couch server, so things are
consistent, or should they be on each api server? Or should there be some
other division?

Thank you again for your response!

Brian

On Fri, Oct 5, 2012 at 4:48 PM, KC Braunschweig kcbraunschweig@gmail.com wrote:

On Fri, Oct 5, 2012 at 1:51 PM, Brian Hatfield bhatfield@brightcove.com wrote:

> Assuming my architecture looks like:
>
> chef-api-01 + chef-api-02
>
> both pointing to
>
> chef-couchdb-01
>
> Does rabbit/solr/expander need to be on the couch server, so things are
> consistent, or should they be on each api server? Or should there be some
> other division?

You haven't specified what your goals for this are (redundancy, scale,
a specific bottleneck, etc.), but I'd say keep it simple: put
everything except the api layer on a single backend. If you still have
bottlenecks after that, you can do more.
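
As a rough sketch of that layout, using the host names from the thread: the Python below is plain data plus a sanity check, not anything Chef reads. The LAYOUT and SINGLETONS names and the short service labels are assumptions for illustration. Treating chef-expander as a singleton is only the keep-it-simple choice, since multiple expanders can share one rabbit queue.

```python
# Hypothetical description of which service runs where; not a Chef config file.
LAYOUT = {
    "chef-api-01": {"api"},
    "chef-api-02": {"api"},
    "chef-couchdb-01": {"couchdb", "rabbitmq", "chef-expander", "solr"},
}

# Services assumed to live on exactly one host in the simple layout.
SINGLETONS = {"couchdb", "rabbitmq", "chef-expander", "solr"}


def hosts_running(layout, service):
    """Return the sorted list of hosts that run the given service."""
    return sorted(host for host, services in layout.items() if service in services)


def misplaced(layout, singletons=SINGLETONS):
    """Return {service: hosts} for singletons not running on exactly one host."""
    problems = {}
    for service in sorted(singletons):
        hosts = hosts_running(layout, service)
        if len(hosts) != 1:
            problems[service] = hosts
    return problems


print(misplaced(LAYOUT))  # {}

# Counter-example: solr on each api server instead of the backend.
per_api_solr = {
    "chef-api-01": {"api", "solr"},
    "chef-api-02": {"api", "solr"},
    "chef-couchdb-01": {"couchdb", "rabbitmq", "chef-expander"},
}
print(misplaced(per_api_solr))  # {'solr': ['chef-api-01', 'chef-api-02']}
```

Only the api layer appears on more than one host here. If the backend later becomes a bottleneck, solr is the first service to move to a host of its own.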