Chef server: disaster recovery

Hello,

Let's say I have a Chef server set up on an EC2 instance and Postgres in RDS. The Chef server manages around 10 application nodes.

One day the EC2 instance with the Chef server gets terminated and there is no way to restore it. If I install a new Chef server and point it at the old Postgres in RDS, which still has all the data from the old Chef server, will I be able to recover Chef to the state it was in prior to the termination, or not?

What would my options be?

Thank you.

Is my question too trivial, or does no one know?

Are you storing your Chef Bookshelf on S3? I can't answer authoritatively,
but I think most state exists in Postgres and the Bookshelf. There's data
in Redis and RabbitMQ as well, but I think that data is meant to be
short-lived and may be okay to lose if you're not worried about a
squeaky-clean cutover.

I'll have to ping some of the server maintainers to find the exact current state of things, but what I know is this:

In the past, there were two sources of persistent data: the Postgres DB and the filesystem store for cookbook files. In addition, there is a search index based on Solr, which can be rebuilt from data in the database. If you're going all-in on AWS, you can use RDS for the database and S3 for the file store. If you do that, you can create a new server and at least all of your data will be there.
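If it helps, a rough sketch of the relevant chef-server.rb settings for that layout might look something like the following. The RDS endpoint, credentials, and bucket name are placeholders, so double-check the exact option names against the docs for your Chef Server version:

# External PostgreSQL (e.g. RDS); hostname and credentials are placeholders
postgresql['external'] = true
postgresql['vip'] = 'mychef.xxxxxxxx.us-east-1.rds.amazonaws.com'
postgresql['port'] = 5432
postgresql['db_superuser'] = 'chef_pgsql'
postgresql['db_superuser_password'] = 'CHANGE_ME'

# Cookbook files in S3 instead of the local Bookshelf; bucket name is a placeholder
bookshelf['enable'] = false
bookshelf['vip'] = 's3.amazonaws.com'
bookshelf['external_url'] = 'https://s3.amazonaws.com'
bookshelf['access_key_id'] = 'AWS_ACCESS_KEY'
bookshelf['secret_access_key'] = 'AWS_SECRET_KEY'
opscode_erchef['s3_bucket'] = 'my-cookbook-bucket'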

I know there was also some work done to let you use AWS ElasticSearch (I forget the exact brand name) instead of Solr, which would mean that you wouldn't need to regenerate any data after recovering from a disaster and the Chef Server would be totally stateless. But I don't recall if that work has made it into a release yet.

There's been a lot of work on this kind of stuff over the past few Chef Server releases, so you may find more specific answers by reading the release notes for the past few Chef Server versions.

Thank you for your reply, it did help clear things up, and I did make it work using S3 and PostgreSQL on AWS. I also found in the docs that I can point a VIP for RabbitMQ as well (https://docs.chef.io/config_rb_server.html#rabbitmq). Does that mean I can separate out queueing as well?

@kallistec Have you found out anything about ElasticSearch instead of Solr? That would help as well, as I am already using ElasticSearch for logs and can dedicate some of its power to Chef.

Also, a couple of questions on calculating CCRs (I am using this as a reference: https://docs.chef.io/server_components.html#ccrs-min):

  1. If I put things onto RDS and S3 and keep the rest on the server, how will CCRs be calculated?
  2. The reference I provided above shows figures for a RAM-intensive node; what about a node with 4 CPUs and 4 GB of RAM, what CCR will it support?
  3. Is there a formula to calculate CCR?

Thank you.

Hi,

Have you found out anything about ElasticSearch instead of Solr? That would help as well, as I am using ElasticSearch for logs and can dedicate some of its power to Chef.

To use ElasticSearch as the search backend, you can put the following in chef-server.rb:

opscode_solr4['external'] = true
opscode_solr4['external_url'] = 'http://IP_FOR_ES:PORT_FOR_ES'
opscode_erchef['search_provider'] = 'elasticsearch'
opscode_erchef['search_queue_mode'] = 'batch'

chef-server-ctl reconfigure should create the required index in elasticsearch.

Also I found in docs, that I can point VIP for rabbitmq as-well (chef-server.rb Settings). Does it mean that I can separate Queueing as-well?

While I believe it is possible to run RabbitMQ externally, we don't test this configuration, and thus it likely isn't well supported by chef-server-ctl reconfigure and chef-server-ctl upgrade. If you are using ElasticSearch as your search backend and aren't using Chef Analytics, then it is possible to turn off RabbitMQ completely with the following configuration items:

rabbitmq['enable'] = false
rabbitmq['management_enabled'] = false
rabbitmq['queue_length_monitor_enabled'] = false
opscode_expander['enable'] = false
dark_launch['actions'] = false

  1. If I put things onto RDS and S3 and keep the rest on the server, how will CCRs be calculated?
  2. The reference I provided above shows figures for a RAM-intensive node; what about a node with 4 CPUs and 4 GB of RAM, what CCR will it support?
  3. Is there a formula to calculate CCR?

If you are using S3 for cookbook storage, RDS for PostgreSQL, and ElasticSearch for the search index, I would expect that fewer resources would be needed on the Chef server VM itself. This is especially true if you offload search to ElasticSearch, as a good deal of RAM is used by Solr. Unfortunately, I can't give you hard numbers for what you can expect to see.

If you do use ElasticSearch for your search backend, please do not hesitate to report issues either on GitHub or here. The ElasticSearch support is rather new and we welcome feedback on it.

I hope this helps,

Cheers,

Steven


Thank you for your help, it indeed helped. I will try to go in this direction and see how it works.

Meanwhile, I put PostgreSQL onto RDS and everything else onto the virtual instance. When I destroyed the instance and tried to re-create it, I got the following errors:

  1. I think it was about the opscode_chef database already existing, and that I had to get rid of it.
  2. After I got rid of the databases, it started complaining about users that had to be removed.

Is that expected, or is it because I am not using external Solr/Elasticsearch that I have to drop and create the database again?

I can go through the process one more time if necessary and provide the precise errors.

Hi Vasilij,

Were you able to get this figured out? I am looking to try the same thing
in the near future, and was curious to see where you landed.

Thanks,
Ameir

@ameir I was able to plug external PostgreSQL into Chef, but I failed with S3. I opened tickets and am awaiting a response:

I also opened an issue on GitHub (which I hope will not be closed):

Also, if you are using Ubuntu 15.10, you should probably be aware of this:

And as mentioned above, if your instance/node goes down, you will not be able to re-create it, as it will ask you to clean the DB.

Hope it helps.

Sincerely,
Vasilij

I apologise for the spam, but I have been waiting for two weeks for somebody to just give me a tip about where I am wrong; even in the chef-server repo issues I cannot get a clue as to what might be wrong.

Hi Vsyc,

I had similar problems; here's the config I used that seems to work:

bookshelf['access_key_id'] = 'access_key'
bookshelf['secret_access_key'] = 'secret'
opscode_erchef['s3_bucket'] = 'my-bucket-name'
bookshelf['external_url'] = 'https://s3-eu-west-1.amazonaws.com'
bookshelf['vip'] = 's3-eu-west-1.amazonaws.com'
bookshelf['enable'] = false

I think the key was having https on the bookshelf['external_url'] attribute, but not on bookshelf['vip'].

Cheers
Kieran

Thank you @kdoonan, your way did work for me as well. The reason I had trouble before is that I had added the bucket name to vip and external_url; the reason for that is that my bucket name contained periods, and it was complaining that I had to include the bucket name in the hostname as well, or something like that.

https can be in the vip as well; at least that worked for Chef server 12.4.1.
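To be concrete, I mean something roughly like this also worked for me (the region is just the placeholder from Kieran's snippet above, and the rest of the bookshelf settings stay the same):

# Same scheme in both settings; worked for me on 12.4.1, values are placeholders
bookshelf['external_url'] = 'https://s3-eu-west-1.amazonaws.com'
bookshelf['vip'] = 'https://s3-eu-west-1.amazonaws.com'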