RE: RE: Re: os_process_error in CouchDB

Ian / Adam,
On my current chef-server, CouchDB is version 0.11.2.

Randy

-----Original Message-----
From: IDROSSI@jw.org [mailto:IDROSSI@jw.org]
Sent: Thursday, March 15, 2012 2:06 PM
To: chef@lists.opscode.com
Cc: EMENDOZA@jw.org; KAKiner@jw.org; GJZULAUF@jw.org
Subject: [chef] RE: Re: os_process_error in CouchDB

Hi Adam,

couch.log is 125000 lines long, so I'll include the beginning
(http://pastie.org/3602674) and the end (http://pastie.org/3602677).
I'll post to CouchDB's mailing list too.

I should add to my description below that after I rebuilt the
chef-server, I reloaded all of our cookbooks, roles and databags.

I do believe that Randy has the same issue, although I'm not sure what
version of CouchDB he is using.

Ian D. Rossi


From: Adam Jacob [adam@opscode.com]
Sent: Thursday, March 15, 2012 1:02 PM
To: chef@lists.opscode.com
Subject: [chef] Re: os_process_error in CouchDB

Can you show us the full stack trace from CouchDB?

Have you shown it to the folks on the CouchDB list?

Adam

On Wed, Mar 14, 2012 at 6:51 AM, IDROSSI@jw.org wrote:

We are seeing a strange error in CouchDB that causes Chef to become
unusable and unrecoverable. The knife command ceases to respond, and
the chef webui ceases to respond. /var/log/couch.log shows an
os_process_error with exit status 0.

This is the second time this has happened. The first time, it happened
to a chef-server that had been running properly for several weeks. On
Monday, at about 11 AM EST, the error occurred and our chef-server
became unrecoverable. We spent about a day researching the issue and
trying to recover from it.

We then rebuilt the chef-server this morning. During the
setup/installation, we encountered this issue
(http://tickets.opscode.com/browse/CHEF-2346)
which we had encountered in the past. We then applied the fix by
increasing maxFieldLength in the mainIndex section of the chef solr
config file.
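
For readers who want to script that change, here is a minimal sketch. It assumes the Solr config is an XML file with a <mainIndex> element containing <maxFieldLength>; the sample XML and the value 100000 are illustrative assumptions, not the settings used in this thread. Note that ElementTree drops XML comments when it rewrites a file, so editing by hand may be preferable on a real server.

```python
import xml.etree.ElementTree as ET


def set_max_field_length(xml_text, new_value):
    """Return the config XML with mainIndex/maxFieldLength set to new_value."""
    root = ET.fromstring(xml_text)
    main_index = root.find("mainIndex")
    if main_index is None:
        raise ValueError("no <mainIndex> section found")
    field = main_index.find("maxFieldLength")
    if field is None:
        # Create the element if the section does not define it yet.
        field = ET.SubElement(main_index, "maxFieldLength")
    field.text = str(new_value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for a real solrconfig.xml.
SAMPLE = (
    "<config><mainIndex>"
    "<maxFieldLength>10000</maxFieldLength>"
    "</mainIndex></config>"
)

updated = set_max_field_length(SAMPLE, 100000)
print(updated)
```
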

Very shortly after that, while doing a chef run on a lab node, running a
knife command and trying to access the web UI all at the same time,
the os_process_error occurred again and the chef-server became
unusable.

Our chef-server is running on a vSphere VM with 2 cores (2 cores in 1
socket), 2GB of RAM. It's running Ubuntu 10.04 LTS, Chef 0.10.8 and
CouchDB 0.10. The VM was generated from a pre-existing VM that
originally had only 1 core.

Another detail about our environment that may be important is that we
use Centrify on our Linux server for Active Directory integration.
This is why we were affected by CHEF-2346. Since chef pulls in all
authorized users on a node as an automatic attribute, there can be
thousands of users in a list that gets gathered by chef.

Is CouchDB perhaps dying because of the size of the node data that we
are asking chef to gather? Has anyone else encountered this error?
Many thanks for any help. Let me know if I can provide any more
information.
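
One way to test the node-size theory is to list the largest documents in the database. The sketch below is only an illustration: it assumes CouchDB is listening on localhost:5984 and that the Chef data lives in a database named "chef" (both are assumptions, adjust for your install). It uses CouchDB's _all_docs view with include_docs=true, and the demo at the bottom runs on made-up data rather than a live server.

```python
import json
from urllib.request import urlopen


def largest_documents(docs, top=5):
    """Return (doc_id, size_in_bytes) pairs for the largest JSON documents."""
    sizes = [
        (doc.get("_id", "?"), len(json.dumps(doc).encode("utf-8")))
        for doc in docs
    ]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)[:top]


def fetch_all_docs(base_url="http://localhost:5984", db="chef"):
    """Fetch every document from one database (URL and db name are assumptions)."""
    with urlopen(f"{base_url}/{db}/_all_docs?include_docs=true") as resp:
        rows = json.load(resp)["rows"]
    return [row["doc"] for row in rows]


# Made-up documents standing in for node data with a long user list.
sample_docs = [
    {"_id": "node-a", "users": ["user%d" % i for i in range(1000)]},
    {"_id": "node-b", "users": []},
]
report = largest_documents(sample_docs)
for doc_id, size in report:
    print(doc_id, size)
```

Against a real server, largest_documents(fetch_all_docs()) would produce the same kind of report; on a large database this pulls every document over HTTP, so it is best run off-hours.
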

Ian D. Rossi

--
Opscode, Inc.
Adam Jacob, Chief Customer Officer
T: (206) 619-7151 E: adam@opscode.com

I did read somewhere that CouchDB < 1.x.x can be problematic. I should also mention that we have since rebuilt our chef-server on Ubuntu 11.10, which includes CouchDB 1.0.1. We have had no issues since, but we are still nervous and very interested in getting to the root cause of this problem.

Ian D. Rossi


From: Van Fossan,Randy [vanfossr@oclc.org]
Sent: Thursday, March 15, 2012 2:10 PM
To: chef@lists.opscode.com
Cc: EMENDOZA@jw.org; Kiner, Kari; Zulauf, Graham
Subject: [chef] RE: RE: Re: os_process_error in CouchDB

Ian / Adam,
On my current chef-server, CouchDB is version 0.11.2.

Randy

Randy,

Is it possible to upgrade to 1.0.1 or greater? What OS are you running? CouchDB 1.0.1 is included in Ubuntu 11.10.
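
For anyone unsure which version they are running: CouchDB reports it in the JSON welcome document served at its root URL. The sketch below parses that document and compares it against 1.0.1. The localhost:5984 address is an assumption, and the parser only handles plain dotted numeric versions, so a suffixed version string may compare incorrectly.

```python
import json
from urllib.request import urlopen


def parse_version(text):
    """Turn '0.11.2' into (0, 11, 2); non-numeric parts are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def is_at_least(welcome_json, minimum=(1, 0, 1)):
    """Check the 'version' field of CouchDB's welcome document."""
    version = json.loads(welcome_json)["version"]
    return parse_version(version) >= minimum


def fetch_welcome(base_url="http://localhost:5984"):
    """Fetch the welcome document (the URL is an assumption)."""
    with urlopen(base_url + "/") as resp:
        return resp.read().decode("utf-8")


# Offline examples using the versions mentioned in this thread.
print(is_at_least('{"couchdb": "Welcome", "version": "0.11.2"}'))
print(is_at_least('{"couchdb": "Welcome", "version": "1.0.1"}'))
```
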

Ian D. Rossi


There's an RPM available for 1.0.1; I don't think it's in CentOS or EPEL. You should be able to get it via rpmfind. I agree with Ian, you should upgrade.

Sent from a phone

On Mar 15, 2012, at 11:39 AM, IDROSSI@jw.org wrote:

Randy,

Is it possible to upgrade to 1.0.1 or greater? What OS are you running? CouchDB 1.0.1 is included in Ubuntu 11.10.

Ian D. Rossi


Yes,

However, I could not find a couchdb 1.0.x from a trusted source for
CentOS 5.x. We have built a new server (CentOS 6 and couchdb 1.0.2)
but do not have time to migrate right now. If anyone knows where I
can find couchdb 1.0.x for CentOS 5.x that is from a reputable source, I
am all for it.

A few weeks ago, I cloned the server (VM) and downloaded a couchdb rpm
and tested that I could in fact just upgrade couchdb and it worked. But
again, I need a reputable source.

Thanks
Randy

-----Original Message-----
From: Chris [mailto:grocerylist@gmail.com]
Sent: Thursday, March 15, 2012 2:49 PM
To: chef@lists.opscode.com
Subject: [chef] Re: RE: RE: RE: Re: os_process_error in CouchDB

There's an RPM available for 1.0.1; I don't think it's in CentOS or EPEL.
You should be able to get it via rpmfind. I agree with Ian, you should
upgrade.

Sent from a phone

This is where I got mine:

http://pkgs.org/centos-5-rhel-5/rpmforge-x86_64/couchdb-1.0.1-1.el5.rf.x86_64.rpm.html

On Thu, Mar 15, 2012 at 1:03 PM, Van Fossan,Randy vanfossr@oclc.org wrote:

Yes,

However, I could not find a couchdb 1.0.x from a trusted source for
CentOS 5.x. We have built a new server (CentOS 6 and couchdb 1.0.2)
but do not have time to migrate right now. If anyone knows where I
can find couchdb 1.0.x for CentOS 5.x that is from a reputable source, I
am all for it.

A few weeks ago, I cloned the server (VM) and downloaded a couchdb rpm
and tested that I could in fact just upgrade couchdb and it worked. But
again, I need a reputable source.

Thanks
Randy

--
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

Randy,

Your other option is that you could install it from source. There is a guide at the bottom of this page:
http://wiki.apache.org/couchdb/Installing_on_RHEL5

Ian D. Rossi

