Chef 12 Upgrade - Net::HTTPServerException: 403 "Forbidden"


#1

I upgraded two chef 11 servers to chef 12. After the upgrade, when running
chef-client on new nodes (existing ones are ok), the chef-client run fails
at the end of the run with:

Net::HTTPServerException: 403 “Forbidden”

There is more information on this issue here:
https://github.com/chef/chef-server/issues/63

This forbidden issue seems to be somewhat documented here:
https://docs.chef.io/errors.html#forbidden

The documentation says that I can enable permissions globally on the nodes.
However, I don’t see any way to do this in the nodes section. There is no
permissions tab under nodes. The permissions tab is ONLY available under
each node.

When I manually enable update permissions on clients, the chef-client runs
ok. In the issue above people are proposing solving this by running knife
acl against the nodes after the fact. That’s not a scalable solution since
our node names are based on instance IDs.

I need to know how to set update permissions globally on clients. Why isn’t
this part of the upgrade process? It seems reasonable to me that clients
would continue to need to send their data (i.e. an update) back to the chef
server at the end of the run.

Also, this isn’t documented as part of the upgrade process. The fact this
has happened with upgrades on two chef servers would tend to indicate to me
that this is typical behavior.

Thanks,
Doug


#2

Hi,

I’m sorry to hear your Chef Server upgrade is causing some problems.

> After the upgrade, when running
> chef-client on new nodes (existing ones are ok), the chef-client run fails
> at the end of the run with:
>
> Net::HTTPServerException: 403 “Forbidden”

Can you describe a bit about what your bootstrap process for new nodes looks like? Changes in the permissions model from Chef 11 to Chef 12 are problematic for some bootstrap methods. In Chef 12, clients (by default) only have update permissions on nodes that they create. If you are using tooling that pre-creates the node objects, then the API client being used for the chef-client run will not be the creator of the node object and will have no permissions on the node.
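One way to check whether this is what is happening is to look at the ACL of a failing node and see whether its client is listed for update. A minimal sketch, assuming the knife-acl plugin is installed and that the node and its client share a name; `i-0abc123` is a hypothetical instance ID, and the `KNIFE` variable is my own dry-run switch (it defaults to `echo knife`, so the script only prints the command until you set `KNIFE=knife`):

```shell
#!/bin/sh
# Dry run by default: prints the knife command instead of executing it.
# Set KNIFE=knife to run it against your Chef server.
KNIFE="${KNIFE:-echo knife}"

# Print (or run) the knife-acl command that shows the ACL of one node.
show_node_acl() {
  node_name="$1"
  $KNIFE acl show nodes "$node_name"
}

show_node_acl "i-0abc123"   # hypothetical node name
```

If the client is missing from the update entry of that node's ACL, that would be consistent with the node having been created by something other than the client itself.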

While it is possible to change the permissions on the nodes container, allowing all clients to update all new nodes, this is not recommended [0]. Rather, it would be better if we worked through why clients are not getting permissions on their corresponding nodes in your case.

> Why isn’t this part of the upgrade process? It seems reasonable to me that clients
> would continue to need to send their data (i.e. an update) back to the chef
> server at the end of the run.
>
> Also, this isn’t documented as part of the upgrade process. The fact this
> has happened with upgrades on two chef servers would tend to indicate to me
> that this is typical behavior.

I agree that the changes in the permissions model from Chef 11 to Chef 12 should be more clearly documented. We do migrate permissions as part of the upgrade process and for many users this migration works.

From what I have seen from bug reports, Chef 11 to Chef 12 upgrades that run into permission problems usually fall into 3 buckets:

  1. The upgrade procedure fails to set permissions correctly. So far this appears rare, but it is hard to know for sure since many of those who hit problems quickly work past them using the various workarounds in that GitHub issue thread.

  2. A bootstrap process that worked fine in Chef 11’s permissions model doesn’t work with Chef 12’s permissions model. For instance, if a bootstrap process uses knife node from file to populate node data before the first chef-client run but still allows the API client to be created on the first run, it will hit a 403 when attempting to update the node.

  3. In-recipe operations that worked in Chef 11 no longer work in Chef 12, including operations like updating data bags, updating other node objects, etc.

Unfortunately (2) and (3) can often look like (1) without deeper investigation. For cases of (2) and (3), we are disappointed that we broke existing workflows but have decided that the path forward is to help people using those workflows adapt them to Chef Server 12 rather than add in special cases to the Chef Server 12 permission model.
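For case (2), one possible adaptation is to stop pre-creating the node and instead hand the initial attributes to the first chef-client run, so that the newly registered client creates its own node. A minimal sketch under that assumption, not a confirmed fix for any particular setup: the file path and the `recipe[base]` run list are placeholders, and `CHEF_CLIENT` is my own dry-run switch (it defaults to `echo chef-client`, so nothing is executed until you set `CHEF_CLIENT=chef-client`):

```shell
#!/bin/sh
# Dry run by default: prints the chef-client command instead of executing it.
CHEF_CLIENT="${CHEF_CLIENT:-echo chef-client}"

# Write first-run attributes to a JSON file instead of pre-creating the node
# with `knife node from file`. The run list below is a placeholder.
write_first_boot() {
  cat > "$1" <<'EOF'
{
  "run_list": ["recipe[base]"]
}
EOF
}

first_boot="${TMPDIR:-/tmp}/first-boot.json"   # hypothetical path
write_first_boot "$first_boot"

# -j passes the JSON attributes to the run; the client then creates the node.
$CHEF_CLIENT -j "$first_boot"
```

Because the node object is then created by the same API client that later updates it, the default Chef 12 permissions should already cover the update at the end of the run.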

If you are affected by a failed upgrade, I welcome the opportunity to take a look. However, based on the information we have so far, this looks like a case of (2).

Sincerely,

Steven

Notes:

[0] If you would like to allow any node to update any other node, you have to (1) update the permissions on all existing nodes and (2) update the permissions on the nodes container so that all future nodes get the correct permissions. Updating the permissions on the nodes container ensures this is a one-time fix. The easiest way to do this is likely with the knife-acl gem:

knife acl bulk add group clients nodes '.*' update
knife acl add group clients containers nodes update

However, as I mentioned above, I would recommend against doing that until we understand your bootstrap process and why it is failing under Chef 12. In most cases it is possible to make a minor modification to your bootstrap procedure and avoid opening up node permissions so widely.
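If a grant does turn out to be necessary, a narrower option than the bulk commands may be to give each client update permission on its own node only, for example from whatever tooling pre-creates the node. A minimal sketch, assuming knife-acl accepts `client` as a member type, that the client already exists on the server, and that node and client share a name; `i-0abc123` is a hypothetical instance ID and `KNIFE` is my own dry-run switch (it defaults to `echo knife`):

```shell
#!/bin/sh
# Dry run by default: prints the knife command instead of executing it.
# Set KNIFE=knife to run it against your Chef server.
KNIFE="${KNIFE:-echo knife}"

# Grant one client update permission on the node of the same name only.
grant_own_node_update() {
  name="$1"
  $KNIFE acl add client "$name" nodes "$name" update
}

grant_own_node_update "i-0abc123"   # hypothetical node and client name
```

This keeps every other node closed to that client, unlike the change to the nodes container.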