Chef Infra Client certificate trust requirement

Hello everyone,
I am new to Chef, even though I have many years of experience in systems management. I hope that somebody from development will find this useful.
This is in no way a rant, even though it might sound like one to some. I mean no disrespect; my intention is to provide feedback that will hopefully make the product better.

I'd like to talk about the Client <-> Server SSL trust requirement. This kind of requirement is very common nowadays: lots of services, products, and tools require that communication happens over HTTPS and that the server certificate is trusted by the client. It makes sense in some scenarios, but not everywhere. I understand why the HTTPS requirement for data-in-transit protection is so common, but trust... Think about it.
First, the server certificate. In an ideal world the server has a publicly trusted certificate that is valid, rotated properly, and so on. In real life the certificate is likely to be issued by an internal CA or to remain self-signed, its validity period may lapse (we are all human, we make mistakes and forget things), and I can think of a bunch of other reasons why a certificate is present but not trusted by clients. A suddenly expired certificate should not prevent clients from continuing to work with the server. A service has to be resilient, i.e. it should keep working even when some errors occur. It is OK to log warnings, or to find some other way to bring the administrator's attention to the fact that certificate trust is generally a good thing, but to stop working altogether is way too severe.
Second, Chef Infra Client is a configuration management agent. Ideally, it is the first and only thing that needs to be installed on a new server to onboard it. Managing trusted certificates is part of configuration management and should be handled by Chef Infra Client as well. True, "knife bootstrap" has some workarounds for this, but the issue becomes especially important in various unattended installation scenarios where using "knife bootstrap" is not an option.
Besides, an SSL certificate serves two purposes: data-in-transit protection (the encryption part) and identity validation (the trust chain part). What does certificate trust validation protect against? It protects against a rogue Chef Server. Is that a likely scenario? That's debatable, but I'd say it is highly unlikely. Not impossible, but unlikely: much less likely than certificate mishandling, and difficult to exploit. How important is it to protect against such a scenario? Well, it really depends on the risk model, which is unique to every organization. If an organization decides that this is something it wants, it should be able to enable and enforce it, accepting all the operational risks, but having it by default is too much in my opinion. Yes, it is possible to set "ssl_verify_mode :verify_none", but that is not the default behavior, which means that many environments are liable to suddenly stop working because a human failed to manage certificates properly. This topic is about the balance between resilience and security, and I am convinced that in the configuration management area resilience is more important unless the organization specifically decides otherwise.
Don't you agree? Won't it make everyone's life better?

All sorts of configuration secrets pass from chef-server to chef-clients. Reducing the security around that communication by default doesn't seem like a great idea, IMO. Also, :verify_none isn't the only available option: you can explicitly add certs (or private CA certs) to the local trusted_certs folder, which chef-client then honors.
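
For illustration, a minimal client.rb sketch of that approach might look like the fragment below. The server URL and organization name are placeholders I made up, and the explicit `trusted_certs_dir` line is only there to show where the certs go (as far as I know it matches the default when client.rb lives in /etc/chef).

```ruby
# /etc/chef/client.rb (fragment, not a complete config)

# Hypothetical server URL and organization name; replace with your own.
chef_server_url "https://chef.example.internal/organizations/example"

# Keep verification on instead of falling back to :verify_none.
ssl_verify_mode :verify_peer

# Directory whose certificate files chef-client adds to its trust store.
# Drop the self-signed server cert or the private CA cert here.
trusted_certs_dir "/etc/chef/trusted_certs"
```

With this in place, a server certificate issued by the internal CA validates normally, and nothing has to be relaxed globally.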

There are definitely a few ways of handling this, but I'd strongly urge you not to use verify_none in a production environment.

The most common pattern for managing an internal CA without ever breaking trust is the following process:

  1. Install the current CA cert during image build, the provisioning process, or the bootstrap script (add it to the system store and to /etc/chef/trusted_certs)
  2. Use Chef to keep the CA certs up to date (same locations as above)
  3. Use the audit cookbook or another process to run InSpec regularly on the host. Your profile should have an InSpec control that alerts if your CA cert is expiring in fewer than 30 days
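
To make the expiry check in step 3 concrete, here is a rough sketch in plain Ruby using only the standard `openssl` library. It is not an InSpec control, just the underlying logic such a control would express; the helper names `days_until_expiry` and `expiring_soon?` and the file name in the usage comment are made up for this example.

```ruby
require "openssl"

SECONDS_PER_DAY = 86_400

# Whole days remaining before the certificate's notAfter date.
# Returns a negative number if the certificate has already expired.
def days_until_expiry(pem, now = Time.now)
  cert = OpenSSL::X509::Certificate.new(pem)
  ((cert.not_after - now) / SECONDS_PER_DAY).floor
end

# True when the certificate expires in fewer than `threshold_days` days.
def expiring_soon?(pem, threshold_days: 30, now: Time.now)
  days_until_expiry(pem, now) < threshold_days
end

# Example usage (hypothetical file name):
#   pem = File.read("/etc/chef/trusted_certs/internal-ca.pem")
#   warn "CA cert expires in #{days_until_expiry(pem)} days" if expiring_soon?(pem)
```

Run from cron or wrapped in whatever alerting you already have, this gives the 30-day warning window the process relies on.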

@rvaughn, but whether or not the client trusts the server certificate does not affect the security of secrets passing from server to client in any way. This is a common misconception. Data transfers are still encrypted and the SSL/TLS channel is still established. Certificate trust has no relation to data encryption.

@rvaughn and @jvogt, bootstrapping is only one side of the problem I am trying to highlight here. Yes, there are multiple ways to make a certificate trusted during bootstrap, but they don't address the other issue: mishandling the server certificate should not stop clients from communicating with the server during the software life cycle. Sooner or later there will be a point where the certificate is no longer valid. One can build an arbitrarily elaborate and intelligent system around it to update certificates in time, but the fact is that it always ends up depending on an administrator. Sooner or later the certificate will expire. Or the server name will change. Or an update will be released that tightens security requirements, and certificates with a certain hash algorithm or key length or something else will no longer be deemed trusted. Or the administrator will forget about it. Or will be on vacation, on a business trip, or in the hospital. Or the IT staff will change and the replacement won't have gotten to this system yet. Yes, in an ideal world this should not occur; there should be documentation, deputies, and so on, but the reality is that such problems do happen.
And all this for the sake of what? So that clients won't talk to a rogue Chef server if the network has already been compromised...
Please don't misunderstand me: I am not saying that a rogue Chef server is an impossible scenario, but compared to certificate mishandling it is an unlikely one. And the most likely scenarios should be addressed first.