Knife client delete leads to 403 "Forbidden" on Chef Server 12


#1

I recently upgraded from the open source Chef Server 11 to Chef Server 12.

We rebuild many of our hosts regularly.

In the past we simply ran a 'knife client delete' and let the
validator recreate the client upon bootstrap. We left the node intact and
it converged as per its saved run_list.

Under Chef 12, when we try to do the same thing, we get:

[2015-01-04T13:51:00-08:00] ERROR: 403 "Forbidden"
[2015-01-04T13:51:00-08:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

It appears that the server does not like the new client key in the sense
that it is not associated with the old node.

Any ideas on how to get around this issue, or suggestions for further debugging?


#2

On Jan 4, 2015, at 2:57 PM, Mark Selby mselby@thenextbigsound.com wrote:

> I have recently upgraded from Chef open source server 11 to Chef Server 12
>
> We rebuild many of our hosts regularly.
>
> In the past we simply ran a 'knife client delete' and let the validator recreate the client upon bootstrap. We left the node intact and it converged as per its saved run_list.
>
> Under Chef 12, when we try to do the same thing, we get:
>
> [2015-01-04T13:51:00-08:00] ERROR: 403 "Forbidden"
> [2015-01-04T13:51:00-08:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
>
> It appears that the server does not like the new client key in the sense that it is not associated with the old node.
>
> Any ideas on how to get around this issue, or suggestions for further debugging?

This gets to some details about how the Chef ACL system works. Basically when a new object is created, the client creating it gets special permissions so the creator can write even if the default ACL would be read-only. This is how a client can write to its own node object but not any others. When you leave the existing node object, the new client only kind of gets those new permissions. Parts of it should work since it should have the same name as the old one, but there have been persistent issues like this due to old and inconsistent code in places.
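To make the creator-permission idea concrete, here is a toy model in Python. It is only a sketch, not how Chef Server is implemented: it assumes that write permission on a node is tied to an internal identity handed out when the client is created, rather than to the client's name. The class `ToyServer` and all of its methods are hypothetical names invented for this illustration.

```python
import itertools


class ToyServer:
    """Toy ACL model (hypothetical, NOT Chef Server's real code)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.clients = {}   # client name -> internal identity
        self.node_acl = {}  # node name -> identities allowed to update it

    def create_client(self, name):
        # Assumption: every (re)created client gets a fresh identity.
        self.clients[name] = next(self._ids)

    def delete_client(self, name):
        del self.clients[name]

    def create_node(self, client_name, node_name):
        # The creator is granted write access to the object it creates.
        self.node_acl[node_name] = {self.clients[client_name]}

    def delete_node(self, node_name):
        del self.node_acl[node_name]

    def update_node(self, client_name, node_name):
        # Return an HTTP-like status code: 200 if allowed, 403 otherwise.
        allowed = self.clients[client_name] in self.node_acl[node_name]
        return 200 if allowed else 403


if __name__ == "__main__":
    server = ToyServer()
    server.create_client("web01")
    server.create_node("web01", "web01")
    print(server.update_node("web01", "web01"))  # original client

    # Rebuild the host: delete only the client, keep the node.
    server.delete_client("web01")
    server.create_client("web01")
    print(server.update_node("web01", "web01"))  # recreated client
```

Under this simplified assumption, the recreated client shares a name with the old one but is not the identity recorded on the existing node, so its update is refused. Deleting the node as well lets the new client create it again and become its creator.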

Short version: delete both the client and the node and use the -j option when you re-bootstrap to pass in the old data.

–Noah


#3

Noah - Thanks for your response.

This does add some complexity to my environment that did not exist before.
The complexity arises because, until now, a node's run_list did NOT have to
be re-created when we rebuilt our hosts. We have automated the 'knife client
delete' as part of the kickstart/preseed process so that everything can
happen unattended.

In this new situation I believe that I have to create a process that
establishes a correct initial JSON file which includes the proper
environment and run_list for a node that is being rebuilt.
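One possible shape for that process, sketched in Python: save the node's JSON before deleting it, then derive the bootstrap arguments from the saved copy. This is a sketch under assumptions, not a tested procedure. It assumes the saved JSON has the keys `name`, `chef_environment`, `run_list` and `normal` (roughly what `knife node show NAME -F json` prints), and that the environment is passed with `-E` while the run_list and normal attributes travel in the `-j` JSON. The function names and the example host are hypothetical.

```python
import json


def first_boot_from_saved_node(saved_node):
    """Return (first_boot_attributes, environment) from a saved node dict."""
    # Start from the node's normal attributes, then add the saved run_list.
    first_boot = dict(saved_node.get("normal", {}))
    first_boot["run_list"] = list(saved_node.get("run_list", []))
    environment = saved_node.get("chef_environment", "_default")
    return first_boot, environment


def bootstrap_command(host, saved_node):
    """Build the argument list for a re-bootstrap; nothing is executed here."""
    first_boot, environment = first_boot_from_saved_node(saved_node)
    return [
        "knife", "bootstrap", host,
        "-N", saved_node["name"],                      # keep the old node name
        "-E", environment,                             # restore the environment
        "-j", json.dumps(first_boot, sort_keys=True),  # restore the run_list
    ]


if __name__ == "__main__":
    # Example data standing in for a file saved before the node was deleted.
    saved = {
        "name": "web01",
        "chef_environment": "production",
        "run_list": ["role[base]", "recipe[nginx]"],
        "normal": {"tags": []},
    }
    print(bootstrap_command("web01.example.com", saved))
```

The function only builds the argument list, so a kickstart/preseed hook can log it or hand it to whatever runs the bootstrap. Saving the node JSON has to happen before the node is deleted.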

I was hoping to get additional info in two areas:

(1) Is the inability to reset client keys and maintain permissions a known
Chef Server 12 bug and is there any plan to fix this?

(2) How do other people who rebuild hosts frequently in an automated
fashion re-establish environments and run_lists for nodes that get
refreshed?

Thanks!

On Sun, Jan 4, 2015 at 3:02 PM, Noah Kantrowitz noah@coderanger.net wrote:
