Invalid signature/intermittent 401s on open source Chef Server

Hi all - I could use some guidance tracking down a frustrating intermittent
issue we’ve been having with open source Chef Server. This issue started
when we were running version 11.0.8 on CentOS 6.4 and has continued after
upgrading to 11.6.1. We interact with Chef Server frequently using chef-api
gem v0.5.0.

Here is an example of the error from the user’s view:

/usr/local/var/rbenv/versions/2.1.1/lib/ruby/gems/2.1.0/gems/chef-api-0.5.0/lib/chef-api/connection.rb:413:in
`error': The Chef Server requires authorization. Please ensure you have
specified the correct client name and private key. If this error continues,
please verify the given client has the proper permissions on the Chef
Server. (ChefAPI::Error::HTTPUnauthorizedRequest)

{"error":["Invalid signature for user or client 'bethany'"]}

Corresponding logs on Chef server:

==> /var/log/chef-server/erchef/requests.log.2 <==
2014-10-16T21:19:42Z erchef@127.0.0.1 method=GET;
path=/cookbooks/pp-chef-server?num_versions=1; status=401; user=bethany;
req_id=8tlCJk/Z9R+mPVS/ztvVzw==; msg=bad_sig
(https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L469);
req_time=3; rdbms_time=0; rdbms_count=2;

==> /var/log/chef-server/erchef/crash.log <==
2014-10-16 21:19:42 =ERROR REPORT====
{<<"method=GET; path=/cookbooks/pp-chef-server; status=401;
">>,"Unauthorized"}

==> /var/log/chef-server/erchef/erchef.log <==
2014-10-16 21:19:42.950 [error] {<<"method=GET;
path=/cookbooks/pp-chef-server; status=401; ">>,"Unauthorized"}
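
The requests.log entries above are semicolon-separated key=value pairs, so the bad_sig failures can be tallied to see whether they cluster by user or HTTP method. This is only a rough sketch: the helper names are hypothetical, and it assumes every entry follows the layout of the sample line shown here.

```ruby
# Parse one erchef requests.log entry into a Hash of its key=value fields.
# Assumes fields look like "key=value;" as in the sample entry above.
def parse_request_line(line)
  line.scan(/(\w+)=([^;]*);/).to_h
end

# Count 401/bad_sig entries, grouped by [user, HTTP method].
def count_bad_signatures(lines)
  counts = Hash.new(0)
  lines.each do |line|
    fields = parse_request_line(line)
    next unless fields["status"] == "401" && fields["msg"] == "bad_sig"
    counts[[fields["user"], fields["method"]]] += 1
  end
  counts
end
```

Feeding it the lines of each rotated requests.log file would show whether particular users, methods, or times of day stand out.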

It happens for GET and PUT requests for nodes, cookbooks, and searches, and
for many different users in our organization. Retrying the request always
works. I’ve yet to see a 401/bad_sig when using knife, but we also rarely
use knife. I’m currently running knife commands in a loop to see if I can
trigger a 401, but so far I have had no errors.
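
Since retrying always works, a small retry wrapper can serve as a stopgap while also recording how often the failure occurs. The sketch below is an assumption-laden illustration, not part of chef-api: `UnauthorizedRequest` is a local stand-in for whatever error class the client raises (ChefAPI::Error::HTTPUnauthorizedRequest in the trace above), and `with_retries` is a hypothetical helper.

```ruby
# Stand-in for the client's 401 error class (hypothetical; the real one in
# the trace above is ChefAPI::Error::HTTPUnauthorizedRequest).
class UnauthorizedRequest < StandardError; end

# Run the block, retrying up to `attempts` times when `error_class` is raised.
# Each failure is appended to `log` so the frequency can be reviewed later.
def with_retries(error_class, attempts: 3, delay: 0, log: [])
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class => e
    log << "attempt #{tries} failed: #{e.message}"
    raise if tries >= attempts   # give up and surface the original error
    sleep(delay) if delay > 0
    retry
  end
end
```

This only hides the symptom; the logged failures are the useful part for narrowing down the cause.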

System load on the server is always low, with plenty of available memory
and no iowait or other signs of disk-related performance problems
on /var/opt/chef-server/, which is a DRBD disk on SSD. Requests come in via
a Heartbeat-managed virtual IP, but there is no additional load balancing
or proxying.

Any ideas what might be causing the client to only occasionally present an
invalid signature? Should I be looking more closely at the chef-api gem
source rather than the chef server itself?
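
One way to compare client and server behaviour is to rebuild the string that gets signed for a failing request. The sketch below reflects my understanding of the version 1.0 signing scheme used by Chef 11 era clients (canonical headers with SHA1 hashes, Base64-encoded); treat the exact layout as an assumption to verify against the client source. The function names are hypothetical, and `path` is assumed to be the request path without its query string.

```ruby
require "digest/sha1"
require "base64"

# Base64-encoded SHA1 digest of a string.
def hashed(value)
  Base64.strict_encode64(Digest::SHA1.digest(value))
end

# Build the canonical string that would be signed with the client's key.
# Assumed layout; verify against the signing code in the client you use.
def canonical_request(http_method:, path:, body:, timestamp:, user_id:)
  [
    "Method:#{http_method.to_s.upcase}",
    "Hashed Path:#{hashed(path)}",
    "X-Ops-Content-Hash:#{hashed(body)}",
    "X-Ops-Timestamp:#{timestamp}",
    "X-Ops-UserId:#{user_id}"
  ].join("\n")
end
```

If the client ever builds this string from a different path, body, or timestamp than the ones it actually sends, the server would see a signature that does not match.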

Bethany


Bethany Erskine
Senior Technical Operations Engineer
http://www.paperlesspost.com

I had this kind of issue on a system whose clock was drifting. It was running a little less than 15 minutes behind the Chef server, and sometimes latency pushed a request more than 15 minutes late.

Could that be your case? Is there a time difference between the user's machine and the Chef server?
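
The clock-skew theory comes down to simple arithmetic on the request timestamp. Here is a minimal sketch, assuming the 15-minute tolerance mentioned above and an ISO 8601 timestamp on each request; the constant and function names are hypothetical.

```ruby
require "time"

# Tolerance assumed from the 15-minute figure mentioned above.
MAX_SKEW_SECONDS = 15 * 60

# Absolute difference, in seconds, between the request timestamp and the
# server's current time.
def skew_seconds(request_timestamp, server_now)
  (server_now - Time.iso8601(request_timestamp)).abs
end

# True when the request timestamp falls inside the allowed window.
def within_allowed_skew?(request_timestamp, server_now, max_skew = MAX_SKEW_SECONDS)
  skew_seconds(request_timestamp, server_now) <= max_skew
end
```

A client that is 14 minutes 42 seconds behind still passes, while one 15 minutes 42 seconds behind does not, which is how a small extra delay could tip a drifting system over the limit.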

I considered time drift, but all of our servers are synced via NTP
(including our development environments), and re-trying the request seconds
after the initial request always succeeds.

On Sat, Oct 18, 2014 at 4:57 AM, Tensibai Zhaoying <tensibai@iabis.net>
wrote:

I had this kind of issue on a drifting system, just a little less than 15
min late from chef server, sometimes a latency made the request late for
more than 15 min.

Could it be your case? Time difference between user and chef server?

It doesn’t appear that you’re doing anything wrong, so it may be a tricky bug.

Can you file a bug report in the chef/chef-server issue tracker on GitHub, please?

Thanks,

--
Daniel DeLeo

On Monday, October 20, 2014 at 7:27 AM, Bethany Erskine wrote:

I considered time drift, but all of our servers are synced via NTP (including our development environments), and re-trying the request seconds after the initial request always succeeds.

On Sat, Oct 18, 2014 at 4:57 AM, Tensibai Zhaoying <tensibai@iabis.net> wrote:

I had this kind of issue on a drifting system, just a little less than 15 min late from chef server, sometimes a latency made the request late for more than 15 min.
Could it be your case? Time difference between user and chef server?
