Solr dying every few days


#1

Hi Everyone,

I was wondering if anybody else has come across this behavior. Every few
days Solr seems to die, and I have to kill the processes and restart
chef-solr and chef-solr-indexer. The following appears in the logs at the
time Solr becomes unusable (CentOS 5.6, Chef 0.9.16):

solr-indexer.log:
FATAL: POST to Solr 'http://localhost:8983' failed.
Chef::Exceptions::SolrConnectionError exception: Errno::ECONNREFUSED:
Connection refused - connect(2) attempting to contact http://localhost:8983

solr.log:
FATAL: Search Query to Solr
'http://localhost:8983/solr/select?q=domain%3Afal&start=0&rows=1000&sort=X_CHEF_id_CHEF_X+asc&wt=ruby&indent=off&fq=%2BX_CHEF_database_CHEF_X%3Achef+%2BX_CHEF_type_CHEF_X%3Anode'
failed. Chef::Exceptions::SolrConnectionError exception:
Errno::ECONNREFUSED: Connection refused - connect(2) attempting to contact
http://localhost:8983
or
FATAL: Search Query to Solr
'http://localhost:8983/solr/select?q=domain%3Afal&start=0&rows=1000&sort=X_CHEF_id_CHEF_X+asc&wt=ruby&indent=off&fq=%2BX_CHEF_database_CHEF_X%3Achef+%2BX_CHEF_type_CHEF_X%3Anode'
failed. Chef::Exceptions::SolrConnectionError exception: Timeout::Error:
Timeout::Error attempting to contact http://localhost:8983
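
The two errors point at different states: "Connection refused" means nothing
is listening on port 8983, while a timeout means something is there but not
answering. A quick probe along these lines may help tell them apart when it
happens. This is only a sketch: the function name `probe` and the 5-second
default are assumptions, not anything from Chef, and a plain TCP connect only
shows whether the port accepts connections, not whether Solr is answering
queries.

```python
import socket


def probe(host="localhost", port=8983, timeout=5.0):
    """Classify a TCP connection attempt as 'up', 'refused', or 'timeout'."""
    try:
        # create_connection tries each address the host name resolves to.
        with socket.create_connection((host, port), timeout=timeout):
            return "up"
    except ConnectionRefusedError:
        # Nothing is listening: the Solr/Jetty process is probably gone.
        return "refused"
    except socket.timeout:
        # No answer within `timeout` seconds.
        return "timeout"
    except OSError as exc:
        # Any other network-level failure, reported as-is.
        return "error: %s" % exc


if __name__ == "__main__":
    print(probe())
```

Running it from cron and logging the result would at least show roughly when
the port stops answering.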

I've set the Jetty timeout to an hour to see if that gets rid of the
timeout errors.
Appreciate any help.

Thanks,
Mark


#2

Check your logs for OOM killer entries. I had this happen on a test
Chef server on an Amazon EC2 t1.micro instance.

If this is your problem, unfortunately I suspect the answer is to get
more memory.
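
A rough way to look for those entries, as a sketch only: the pattern below
covers the phrases the kernel typically logs when the OOM killer fires, and
/var/log/messages is assumed to be where kernel messages land on your box.
The names `find_oom_lines` and `scan_file` are made up for illustration.

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer fires.
OOM_PATTERN = re.compile(r"invoked oom-killer|out of memory", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_file(path="/var/log/messages"):
    """Scan a syslog-style file (path is an assumption) for OOM entries."""
    # errors="replace" keeps stray bytes in the log from aborting the scan.
    with open(path, errors="replace") as fh:
        return find_oom_lines(fh)
```

If `scan_file()` returns lines that mention java around the time Solr went
away, memory is the likely culprit. Reading the log usually requires root.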

-Peter


#3

Hi Peter,

Thank you for the reply. I was not seeing any OOM messages, but the timeout
errors led me to raise the Jetty timeout to an hour and increase the Solr
heap size to 2 GB. Everything seems to be running smoothly now.
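
For anyone wanting to confirm that a heap change actually took effect, one
option is to read the -Xmx flag off the running java command line. A minimal
sketch, assuming the heap is passed to the JVM as -Xmx (the helper name
`parse_xmx` and the sample command lines are hypothetical):

```python
import re

# Multipliers for the size suffixes the JVM accepts on -Xmx.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_xmx(cmdline):
    """Return the -Xmx value in bytes, or None if the flag is absent."""
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)", cmdline)
    if not matches:
        return None
    # If the flag is repeated, take the last occurrence.
    number, unit = matches[-1]
    return int(number) * _UNITS[unit.lower()]


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    print(parse_xmx("java -Xmx2g -Xms2g -jar start.jar"))
```

Feeding it the output of something like `ps -ww -o args= -C java` shows
whether the Solr JVM really came up with the larger heap.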

Thanks,
Mark
