Chef Replication vs HA - why would you choose to use replication?

I am trying to get chef replication working and have everything but nodes/clients on my second chef server.

Is this by design or am I missing something?

Thanks!
Jenn

Never mind. I see that replication isn’t what I thought it was. What is the point of having replicas? We want to have a chef server in the cloud and one in our DC. I assume HA is the only option?

Hi Jennifer,

> Never mind. I see that replication isn't what I thought it was. What is the point of having replicas? We want to have a chef server in the cloud and one in our DC. I assume HA is the only option?

First, apologies for the frustration Chef Replication may have caused you.

"Replication" is definitely a poor name for what Chef Replication currently offers. Chef Replication is mostly for people looking to distribute policy content to multiple data centers. That is, for people who want to be able to upload a cookbook or role to a single server and have it distributed to other data centers. It is similar in spirit to Facebook's grocery-delivery (facebook/grocery-delivery on GitHub, a utility for managing cookbook uploads to distributed Chef backends), but for people who want to interact with the API rather than with Git.

That said, we've definitely heard the same frustration from others who are looking for something more than content distribution. We are currently working on improving our HA story to make it easier to set up a high availability Chef Server.

> We want to have a chef server in the cloud and one in our DC. I assume HA is the only option?

Your options depend a bit on what your availability concerns are and what you want to be able to do. Mind sharing a few details about the architecture you are trying to put together and your goals?

Cheers,

Steven Danna
Software Engineer, Chef

I don’t mind at all.

We currently have a hybrid configuration (DC and AWS). However, we may eventually be completely in the cloud (within the next 1-2 years). We would like to have a chef server in each location/region/AZ. Each chef server should contain the same information as the other chef servers (cookbooks, nodes, clients, data bags, environments and roles). Clients will register with the closest chef server (and that information will be replicated to the other servers). If the closest server is unavailable, new instances will register, or existing instances will run their jobs, against another chef server in the infrastructure. This is what is desired.

What are my options to get as close as possible to this configuration?

Thank you!!

Hi,

> Each chef server should contain the same information as the other chef servers (cookbooks, nodes, clients, data bags, environments and roles).

As you discovered with chef-sync, most of the available options (CI pipelines pushing policy changes to multiple servers, chef-sync, grocery-delivery, etc.) make what you want to do possible, with the exception of nodes and clients.
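To make the CI pipeline option concrete, here is a minimal sketch of a job step that pushes the same policy content to every server. It assumes one knife config file per Chef server (the labels and file paths below are hypothetical) and a standard chef-repo layout; it relies only on `knife upload` with its `--chef-repo-path` and `--config` options. Treat it as an illustration, not a recommended implementation.

```python
import subprocess
from typing import Callable, Dict, List

# Hypothetical mapping: server label -> knife config file for that server.
KNIFE_CONFIGS: Dict[str, str] = {
    "dc": "/etc/chef/knife-dc.rb",
    "aws": "/etc/chef/knife-aws.rb",
}

# Policy content only; nodes and clients are deliberately left out.
POLICY_PATHS: List[str] = ["/cookbooks", "/roles", "/environments", "/data_bags"]


def build_upload_commands(configs: Dict[str, str], repo_path: str) -> Dict[str, List[str]]:
    """Build one `knife upload` command line per Chef server."""
    commands = {}
    for label, config in configs.items():
        commands[label] = [
            "knife", "upload", *POLICY_PATHS,
            "--chef-repo-path", repo_path,
            "--config", config,
        ]
    return commands


def push_everywhere(commands: Dict[str, List[str]],
                    run: Callable = subprocess.run) -> Dict[str, bool]:
    """Run every upload and report per-server success.

    A failure on one server does not stop the uploads to the others,
    so the pipeline can report exactly which servers are out of date.
    """
    results = {}
    for label, cmd in commands.items():
        completed = run(cmd, check=False)
        results[label] = completed.returncode == 0
    return results
```

In a pipeline you would call `push_everywhere(build_upload_commands(KNIFE_CONFIGS, "/path/to/chef-repo"))` and fail the job if any value in the result is `False`.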

The key question to answer is whether you need/want two nodes talking to different chef-servers to find each other via search. If you don't, then one of the existing content distribution methods can work for most of what you need and the gap that you would need to fill would be a way for the chef-client to register itself with the correct run list even when the server it is talking to hasn't seen that client before. The best way to do this depends a bit on how you provision machines and the workflows you want to support.
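As a rough sketch of filling that gap, a provisioning script could pick the first reachable server and start chef-client with a first-boot JSON file that carries the run list, so the node gets the right run list even on a server that has never seen it. The server URLs, the reachability check, and the file path are all hypothetical here, and the sketch assumes client.rb already holds credentials (for example a validation key) that are accepted by whichever server is chosen. It uses only chef-client's `--server` and `--json-attributes` options.

```python
import json
from typing import Callable, List, Optional


def pick_server(candidates: List[str],
                is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable server URL, in order of preference.

    `is_reachable` is supplied by the caller (hypothetical: an HTTPS
    probe, a DNS check, etc.); returns None if no server responds.
    """
    for url in candidates:
        if is_reachable(url):
            return url
    return None


def first_boot_json(run_list: List[str]) -> str:
    """Render the JSON attributes file that sets the node's run list."""
    return json.dumps({"run_list": run_list}, indent=2)


def chef_client_command(server_url: str, json_path: str) -> List[str]:
    """Build the chef-client invocation for the first run."""
    return [
        "chef-client",
        "--server", server_url,
        "--json-attributes", json_path,
    ]
```

For example, with candidates listed closest-first, `pick_server` gives the "closest available" behaviour described earlier in the thread; the provisioning tool would write the output of `first_boot_json(["role[web]"])` to disk and then run the command from `chef_client_command`.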

Cheers,

Steven

Great, we can work out the client registration ourselves as long as the CI pipeline changes are replicated.

Thanks!

Also, if you ever need a beta user or test user… :)