Hello everyone,
I am new to Chef, even though I have many years of experience in systems management. I hope that somebody from development will find this useful.
This is in no way a rant, even though it might sound like one to some. I mean no disrespect; my intention is to provide feedback that will hopefully make the product better.
I'd like to talk about the Client <-> Server SSL trust requirement. It is very common nowadays: lots of services, products and tools require that communication happens over https and that the server certificate is trusted by the client. That makes sense in some scenarios, but not everywhere. I understand why the https requirement for data-in-transit protection is so common, but trust... Think about it.
First, the server certificate. In an ideal world the server has a publicly trusted certificate that is valid, rotated properly, and so on. In real life the certificate is likely to be issued by an internal CA or to remain self-signed, its validity period may lapse (we are all human, we make mistakes and forget things), and I can think of a bunch of other reasons why a certificate is present but not trusted by clients. A certificate that suddenly expires should not prevent clients from continuing to work with the server. A service has to be resilient, i.e. it should keep working even with some errors around. It is OK to log warnings, or to bring the administrator's attention in some other way to the fact that certificate trust is generally a good thing, but to stop working altogether is far too severe.
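To make the "warn, don't stop" idea concrete, here is a minimal Ruby sketch of the kind of check I have in mind. It is not how Chef Infra Client behaves today; the helper names (days_until_expiry, expiry_warning) and the 30-day threshold are my own assumptions, and only Ruby's standard OpenSSL library is used.

```ruby
require "openssl"

# Hypothetical helper: whole days left before the certificate's notAfter date.
# A negative result means the certificate has already expired.
def days_until_expiry(cert, now = Time.now)
  ((cert.not_after - now) / 86_400).floor
end

# Hypothetical helper: returns a warning string instead of raising,
# so the caller can log it and carry on. Returns nil when nothing is wrong.
def expiry_warning(cert, threshold_days: 30, now: Time.now)
  days = days_until_expiry(cert, now)
  return "certificate expired #{-days} day(s) ago" if days < 0
  return "certificate expires in #{days} day(s)" if days <= threshold_days

  nil
end
```

A caller would load the certificate with OpenSSL::X509::Certificate.new(File.read(path)), log whatever expiry_warning returns, and then continue with the run.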
Second, Chef Infra Client is a configuration management agent. Ideally, it is the first and only thing that needs to be installed on a new server to onboard it. Managing trusted certificates is part of configuration management and should be handled by Chef Infra Client as well. True, "knife bootstrap" has some workarounds for this, but the problem becomes especially important in the various unattended installation scenarios where using "knife bootstrap" is not an option.
Besides, an SSL certificate serves two purposes: data-in-transit protection (the encryption part) and identity validation (the trust chain part). What does certificate trust validation protect against? It protects against a rogue Chef Server. Is that a likely scenario? That's debatable, but I'd say highly unlikely. Not impossible, but unlikely, much less likely than certificate mishandling, and difficult to exploit. How important is it to protect against such a scenario? Well, it really depends on the risk model, which is unique to every organization. If an organization decides that this is something it wants, it should be able to enable and enforce it, accepting all the operational risks, but having it on by default is too much in my opinion. Yes, it is possible to set "ssl_verify_mode :verify_none", but that is not the default behavior, which means many environments can suddenly stop working because a human failed to manage certificates properly. This topic is about the balance between resilience and security, and I am convinced that in the configuration management area resilience is more important unless an organization specifically decides otherwise.
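For anyone who has not seen the setting, this is roughly what the choice looks like in client.rb. Treat it as a sketch rather than a recommended configuration: the ssl_verify_mode values are the ones discussed above, while the trusted_certs_dir path is just the location I assume on a typical Linux install.

```ruby
# client.rb (fragment)

# Current behavior as I understand it: the server certificate must be
# trusted, otherwise the client refuses to talk to the server.
ssl_verify_mode :verify_peer

# The opt-out: traffic is still encrypted, but the server's identity
# is not verified.
# ssl_verify_mode :verify_none

# Directory with additional certificates the client should trust
# (assumed path, adjust for your platform).
# trusted_certs_dir "/etc/chef/trusted_certs"
```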
Don't you agree? Wouldn't it make everyone's life better?