Hi!
I’m using the Chef server managed by chef.io. Once, and only once, a recipe used the cookbook’s default attributes instead of the node attributes. I can’t explain why, and since this caused downtime, I want to understand why it happened and how to prevent it.
The default attributes are defined like this in /attributes/default.rb:
...
And the template that uses these attributes, /templates/default/percona.cnf.erb, looks like this:
<% node['asd_mysql']['settings'].sort.each do |key, value| %>
<% next unless value -%>
<%= key %><%=
case value
when TrueClass then ''
else " = #{value}"
end
%>
<% end %>
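For reference, here is a minimal sketch of how I understand this template to behave, rendered outside of Chef with Ruby's standard-library ERB. Chef itself renders templates with its own engine, so this is only an approximation. The `node` hash and all setting names and values in it are hypothetical, made up for illustration; they are not my real attributes.

```ruby
require 'erb'

# Same body as percona.cnf.erb; quoted heredoc so "#{value}" is not
# interpolated by Ruby before ERB sees it.
TEMPLATE_SRC = <<~'EOT'
  <% node['asd_mysql']['settings'].sort.each do |key, value| %>
  <% next unless value -%>
  <%= key %><%=
  case value
  when TrueClass then ''
  else " = #{value}"
  end
  %>
  <% end %>
EOT

# Hypothetical stand-in for the Chef node object; a plain Hash is enough here.
node = {
  'asd_mysql' => {
    'settings' => {
      'skip-name-resolve'     => true,  # true  -> rendered as a bare flag
      'max_connections'       => 500,   # other -> rendered as "key = value"
      'log-bin'               => false, # false -> skipped by `next unless value`
      'innodb_file_per_table' => nil    # nil   -> skipped as well
    }
  }
}

# trim_mode '-' lets stdlib ERB accept the `-%>` tag used in the template.
rendered = ERB.new(TEMPLATE_SRC, trim_mode: '-').result_with_hash(node: node)

# Drop the blank lines left behind by the control tags.
config_lines = rendered.lines.map(&:strip).reject(&:empty?)
puts config_lines
```

With these made-up values, only `max_connections = 500` and `skip-name-resolve` are emitted, so whatever is in `node['asd_mysql']['settings']` at render time goes straight into the MySQL config.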
Before the problematic Chef run, this template was rendered with the node’s attribute values. That had worked fine for years, and no change was made to the recipe. Then, on a single Chef run, the node’s attributes were changed to their default values.
This happened with Chef client 11.14.2.
After this problematic Chef run, the node’s attributes were set to the default values defined in the cookbook. Before it, the node had its own configured attribute values, which differed from those defaults.
I looked through the Chef code and found commit e9f303b9f288c03baee9d8b40cca58838ff3c3a4, merged in a version newer than the one I’m running, that might be related. But I’m not sure that code is involved when looking up the node attributes. If it is, one possibility is that the Chef server returned a 5xx (or there was some network error), the client did not handle the error correctly, and it ended up using the default attributes. If that is the case, updating the Chef client to that version might help prevent it.
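To make the hypothesis concrete, here is a simplified model in plain Ruby. It is not Chef's actual code: real Chef merges more precedence levels (default, force_default, normal, override, force_override, automatic), and `deep_merge`, the hash names, and the values are all hypothetical. The point is only that if the node-level attributes came back empty for any reason, the merged view would silently fall back to the cookbook defaults.

```ruby
# Recursively merge two hashes; values from `override` win over `base`.
def deep_merge(base, override)
  base.merge(override) do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      deep_merge(old_value, new_value)
    else
      new_value
    end
  end
end

# Hypothetical cookbook defaults (stand-in for attributes/default.rb).
cookbook_default = {
  'asd_mysql' => { 'settings' => { 'max_connections' => 100 } }
}

# Hypothetical node-level attributes as stored on the Chef server.
node_normal = {
  'asd_mysql' => { 'settings' => { 'max_connections' => 500 } }
}

# Normal run: the node-level value overrides the cookbook default.
healthy = deep_merge(cookbook_default, node_normal)

# Suspected failure mode (an assumption): node-level attributes were
# never loaded, so only the cookbook defaults remain.
degraded = deep_merge(cookbook_default, {})

puts healthy.dig('asd_mysql', 'settings', 'max_connections')
puts degraded.dig('asd_mysql', 'settings', 'max_connections')
```

In this model the first case yields the node value and the second yields the cookbook default, which matches what I saw on the machine. Whether the real client can get into the second state is exactly what I am asking.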
Does anyone know whether that commit is in the call chain for looking up node attributes, or have any other clue on where to look or how to debug an issue that happened only once? (We have stopped Chef on the machine for now.)
Thanks a lot!
Rodrigo