Hi Joe,
Do you have any error logs from erchef? If not error logs, do you have
request logs that show the response time of this request? The error code
110 from nginx doesn't always mean that the request timed out during the
connection phase. The request may have failed the read timeout instead.
You could also try configuring chef-shell to point directly at erchef (port
8000) instead of going through nginx. Is this an abnormally large data bag
item?
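One way to tell a connect-phase stall from a slow response is to time the
two phases separately against erchef itself. The Python sketch below is only
an illustration, not something shipped with Chef: the probe function, its
return labels, and the timeout values are made up for this example, and the
host, port, and path in the usage comment come from the nginx log line quoted
in this thread. The request is unsigned, so erchef will most likely reject it,
but the timing of that rejection still shows whether connections are being
accepted and answered.

```python
import socket
import time


def probe(host, port, path="/", connect_timeout=5.0, read_timeout=30.0):
    """Time the TCP connect and the wait for the first response byte.

    Returns (phase, seconds). phase is "connect_timeout", "connect_error",
    "read_timeout", "read_error", or "ok"; seconds is the time spent in the
    phase that ended the probe.
    """
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except socket.timeout:
        # The TCP handshake itself did not complete in time.
        return ("connect_timeout", time.monotonic() - start)
    except OSError:
        # For example, connection refused because nothing is listening.
        return ("connect_error", time.monotonic() - start)

    with sock:
        sock.settimeout(read_timeout)
        request = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host)
        start = time.monotonic()
        try:
            sock.sendall(request.encode("ascii"))
            sock.recv(1)  # returns on the first response byte or on close
        except socket.timeout:
            # Connected fine, but no response arrived in time.
            return ("read_timeout", time.monotonic() - start)
        except OSError:
            return ("read_error", time.monotonic() - start)
        return ("ok", time.monotonic() - start)


# Hypothetical usage, run on the Chef server itself:
#   print(probe("127.0.0.1", 8000, "/data/bag/item"))
```

Running this in a loop while the nginx errors are occurring should show
whether the time is being lost before or after the connection is established.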
On Fri, May 9, 2014 at 9:40 AM, Joe Nuspl nuspl@nvwls.com wrote:
My interpretation of
upstream timed out (110: Connection timed out) while connecting to upstream
upstream: "http://127.0.0.1:8000/data/bag/item"
is that nginx tries to do a connect() but erchef does not do an accept()
in a reasonable amount of time. Is this correct?
Joe
On May 8, 2014, at 7:47 PM, Mark Mzyk mmzyk@getchef.com wrote:
Hey Joe,
Based on the load you're describing I wouldn't expect the Chef server to be
having difficulty, especially if the 10.x version, which was much less
efficient, handled it. It's hard for me to tell from what you posted what
the issue might be. It sounds like the server works sometimes, but fails
other times under load? If you check the erchef logs, do they provide any
more info? It's also possible you could be hitting something like a
postgres connection limit, so I'd suggest checking the postgres logs as
well.
As far as docs on tuning, I don't believe we have anything specific to
open source. We lay out many of the options you can tweak here:
http://docs.opscode.com/config_rb_chef_server.html
Note there is a link at the bottom of that page to even more options.
Enterprise Chef has a tuning guide that might be of some help:
Server Tuning
While open source and Enterprise Chef share the same core, there isn't a
one-to-one equivalence between options, so you might need to do some inference
to determine what applies and what doesn't. Also note that enterprise is
typically run in a tiered and HA setup, whereas open source is typically
run on a single host (which I infer is what you're doing, based on the
localhost url for erchef).
If that doesn't help, reply back with any questions you have and we'll get
it sorted out.
Mark Mzyk
Joe Nuspl nuspl@nvwls.com
May 8, 2014 at 10:18 PM
I know 11.1.0 is right around the corner but I need something sooner…
Running open source chef-server 11.0.12 on CentOS-6.
We’re seeing a bunch of nginx timeouts accessing data bags. For example:
If I’m understanding this correctly, nginx cannot create a connection to
erchef.
I’ve found very little on tuning chef-server. There is
erchef['ibrowse_max_sessions'] but that would be for outbound connections,
i.e. erchef->solr. Is there a parameter for the number of incoming
connections to erchef?
I have 1500 clients with a 15 minute splay. So roughly 100 servers/minute
with an average end-to-end chef-client run time of 43 seconds.
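As a rough sanity check on those numbers, the arrival rate and the average
number of overlapping chef-client runs can be estimated with Little's law
(average concurrency = arrival rate x average duration). This sketch assumes
runs are spread evenly across the splay window, which real traffic will not
do exactly; the function name is made up for illustration.

```python
def estimate_load(clients, splay_minutes, run_seconds):
    """Return (runs_per_minute, avg_concurrent_runs) for evenly spread runs."""
    runs_per_minute = clients / splay_minutes
    runs_per_second = runs_per_minute / 60.0
    # Little's law: average number in flight = arrival rate * time in system.
    avg_concurrent_runs = runs_per_second * run_seconds
    return runs_per_minute, avg_concurrent_runs


rate, concurrent = estimate_load(clients=1500, splay_minutes=15, run_seconds=43)
print("{:.0f} runs/minute, about {:.1f} runs in flight".format(rate, concurrent))
# prints "100 runs/minute, about 71.7 runs in flight"
```

Each run makes many short requests, so this is the average number of runs in
flight, not the number of open connections to erchef at any instant.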
The same server running chef-10 with 10 merbs was able to keep up without
issue. 11.0.8 was an improvement but it seems like 11.0.12 has regressed.
On this server we are not running into the depsolver issue.
Any help would be greatly appreciated.
Thanks.
Joe
--
Stephen Delano
Software Development Engineer
Opscode, Inc.
1008 Western Avenue
Suite 601
Seattle, WA 98104