Hi,
I have an environment of 10 servers and needed a way to send a command from a main server to all the others without going through SSH or having some service listening on a port. After a lot of research I came across Chef and its push-jobs server/client, which I decided to try.
I used the following to set everything up:
https://docs.chef.io/push_jobs.html
https://docs.chef.io/install_push_jobs.html
My Chef server and push-jobs server are running and configured properly, and the push-jobs client is running on 2 nodes. However, I've run into an issue I can't find a fix for, and I'm hoping to get some help from the forum.
The issue is stability related. If I leave everything running for several hours and then check the node status, the node is still reported as available:
[root@monitor chef]# knife node status
test1 available
But if I send a job from the push-jobs server, it fails with something like:
............................Quorum failed!
When I check the logs of the push-jobs client on node test1, I can see it constantly writing messages to the current log about reconnecting to the push-jobs server, along with this error:
Jul 8 17:52:48 localhost pushy-client: ERROR: [test1] No messages being received on command port in 4s. Possible encryption problem?
Only a restart fixes it, but after several hours it starts happening again. I have observed this on both test nodes, and the behavior is similar on each. Both run the same latest push-jobs client, on CentOS 7 and CentOS 6 (64-bit).
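As a stopgap, the error could be detected automatically so that a restart can be triggered. The following is only a rough sketch in Python: the regular expression is based solely on the single log line quoted above (other variants may exist), the function names and the repeat threshold are my own hypothetical choices, and the restart itself is left out because the service name depends on the install.

```python
import re

# Pattern derived from the one log line quoted in this post; an assumption,
# since the client may emit other variants of this message.
ERROR_RE = re.compile(
    r"pushy-client: ERROR: \[(?P<node>[^\]]+)\] "
    r"No messages being received on command port in (?P<secs>\d+)s"
)


def find_command_port_errors(lines):
    """Return a (node, seconds) tuple for every line matching the error."""
    hits = []
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            hits.append((match.group("node"), int(match.group("secs"))))
    return hits


def needs_restart(lines, threshold=3):
    """Hypothetical policy: flag a restart once the error repeats
    at least `threshold` times in the lines examined."""
    return len(find_command_port_errors(lines)) >= threshold


if __name__ == "__main__":
    sample = [
        "Jul 8 17:52:48 localhost pushy-client: ERROR: [test1] No messages "
        "being received on command port in 4s. Possible encryption problem?",
    ]
    print(find_command_port_errors(sample))
    print(needs_restart(sample))
```

This would only hide the symptom; it does not explain why the client stops receiving messages on the command port in the first place.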
push-jobs server:
opscode-push-jobs-server-1.1.6-1.x86_64
CentOS 6
push-jobs client:
push-jobs-client-2.1.0-1.el6.x86_64
CentOS 6 and 7
chef-12.12.15-1.el6.x86_64
Please let me know how the above can be fixed.
Thanks in advance