Push-jobs client unstable


#1

Hi,

I have an environment of 10 servers and needed a way to send a command from a main server to all the others without going through SSH or having some service LISTENing on a port. After a lot of research I came across Chef and the push-jobs server/client, which I decided to try.

I used the following to set everything up:
https://docs.chef.io/push_jobs.html
https://docs.chef.io/install_push_jobs.html

My Chef server and push-jobs server are running and configured properly, and I have the push-jobs client running on 2 nodes. However, I ran into an issue I can’t find a fix for, and I’m hoping to get some help from the forum.

The issue is stability related. If I leave everything running for several hours and then check, the node is still reported as available:

[root@monitor chef]# knife node status
test1 available

If I send a job from the push-jobs server, it fails with something like:

…Quorum failed!

When I check the logs of the push-jobs client running on node test1, I can see it constantly writing messages to the current log about reconnecting to the push-jobs server, along with this error:

Jul 8 17:52:48 localhost pushy-client: ERROR: [test1] No messages being received on command port in 4s. Possible encryption problem?
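
In case it helps anyone compare timings, here is a minimal Python sketch for pulling these errors out of the client log, so the moment they start can be lined up against restarts. The regular expression is inferred from the single log line above, so treat the format as an assumption; `find_command_port_errors` is only an illustrative helper name, not part of push-jobs.

```python
import re

# Pattern inferred from the one sample line in this post (assumption):
# "<Mon> <day> <HH:MM:SS> <host> pushy-client: ERROR: [<node>] No messages ..."
ERROR_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+pushy-client:\s+"
    r"ERROR:\s+\[(?P<node>[^\]]+)\]\s+No messages being received on command port"
)


def find_command_port_errors(lines):
    """Return a list of (timestamp, node) tuples, one per matching error line."""
    hits = []
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            hits.append((match.group("stamp"), match.group("node")))
    return hits


# The sample is the exact line quoted above.
sample = [
    "Jul 8 17:52:48 localhost pushy-client: ERROR: [test1] No messages being "
    "received on command port in 4s. Possible encryption problem?",
]
print(find_command_port_errors(sample))
```

To run it against a real log, pass an open file object instead of `sample`; the log location depends on how syslog is set up on the node, so I have left the path out.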

Only a restart fixes it, but after several hours it starts happening again. I observed this on both test nodes I set this up on, and the behavior is similar on each. Both run the same latest push-jobs client; the nodes run CentOS 7 and CentOS 6 (64-bit).

push-jobs server:
opscode-push-jobs-server-1.1.6-1.x86_64
CentOS 6

push-jobs client:
push-jobs-client-2.1.0-1.el6.x86_64
CentOS 6 and 7

chef-12.12.15-1.el6.x86_64

Please let me know how this can be fixed.
Thanks in advance.