What compelling reason is there for an application cookbook to use a databag vs. a library cookbook or other artifact repo for that data?
I am hoping for actual use cases where a databag was required or significantly better suited.
Here are my thoughts. Please do correct me where I am wrong so I may learn. Or give me a pointer to the book I need to read to educate myself.
When I look at encrypted databags, there is still the issue of a secret written to the node that decrypts everything.
This feels like installing a combination lock on your door, and a lock box hanging off the knob which contains that combination, yet still placing the key to the box under the mat.
If we use an off-node key management service, there must be some other validation that authorizes handing over my key without a key already being on disk. With the key in RAM, any other secret, like a MySQL password, can be delivered as a simple encrypted attribute. It can even be managed as a node attribute, stored after convergence as a uniquely encrypted string with the help of that node’s unique key (from the key management system). Looking in the archives, I see the mechanism for this is node.run_state, which keeps it out of Chef policy files.
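To make the idea concrete, here is a minimal sketch in plain Ruby of a secret delivered as an encrypted string and decrypted with a key that only ever lives in memory. It is not Chef's own encrypted databag code. The names fetch_node_key, encrypt_attribute and decrypt_attribute are hypothetical, and fetch_node_key merely generates random bytes where a real setup would call the key management service.

```ruby
require 'openssl'

# Hypothetical stand-in for an off-node key management call.
# Here it just returns 32 random bytes; nothing is written to disk.
def fetch_node_key
  OpenSSL::Random.random_bytes(32)
end

# Encrypt a secret with AES-256-GCM and return a hash of hex strings
# that could be stored as an ordinary attribute value.
def encrypt_attribute(plaintext, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  data = cipher.update(plaintext) + cipher.final
  {
    'iv'   => iv.unpack1('H*'),
    'tag'  => cipher.auth_tag.unpack1('H*'),
    'data' => data.unpack1('H*')
  }
end

# Reverse of encrypt_attribute. Raises OpenSSL::Cipher::CipherError
# if the key is wrong or the stored value was tampered with.
def decrypt_attribute(envelope, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = key
  cipher.iv = [envelope['iv']].pack('H*')
  cipher.auth_tag = [envelope['tag']].pack('H*')
  plain = cipher.update([envelope['data']].pack('H*')) + cipher.final
  plain.force_encoding('UTF-8')
end

node_key = fetch_node_key                      # held in RAM only
envelope = encrypt_attribute('example-mysql-password', node_key)
puts envelope.keys.inspect                     # what would be stored
puts decrypt_attribute(envelope, node_key)     # recovered during the run
```

The stored value is useless without the node's key, so the open question remains how the node proves who it is to the key service in the first place.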
The databag is not versioned. At convergence a node uses whatever data is there, and if there is a problem there is no way to mark that data as bad and use a different version for all future convergence runs. There is only the one current version.
If a new cookbook version depends on the databag having different content, that databag must stay backward compatible with the old cookbook version already in play. A breaking change in a databag cannot be pinned to a new cookbook version, as there is only one “current” version.
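Here is a small sketch of that constraint in plain Ruby. The hash ITEM stands in for the single current databag item, and password_v1 and password_v2 are hypothetical readers representing an old and a new cookbook version. The key layout is invented purely for illustration.

```ruby
# Stand-in for the one current databag item. Because both cookbook
# versions read the same item, it has to carry both layouts at once.
ITEM = {
  'id' => 'mysql',
  'password' => 'example-secret',                                # old layout
  'credentials' => { 'app' => { 'password' => 'example-secret' } } # new layout
}.freeze

# How a hypothetical cookbook 1.x reads the password.
def password_v1(item)
  item.fetch('password')
end

# How a hypothetical cookbook 2.x reads the password.
def password_v2(item)
  item.fetch('credentials').fetch('app').fetch('password')
end

# An additive change keeps both readers working.
puts password_v1(ITEM) == password_v2(ITEM)

# Dropping the old key is a breaking change: every node still pinned
# to cookbook 1.x fails on its next run, and there is no older
# databag version to pin it to.
breaking = ITEM.reject { |key, _| key == 'password' }
begin
  password_v1(breaking)
rescue KeyError => e
  puts "cookbook 1.x breaks: #{e.message}"
end
```

So the only safe databag changes are additive ones, and the old keys cannot be removed until no node runs the old cookbook version.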
Chef is not a database. If we need to pass real data artifacts, like some MySQL table structure or other data, should they not be handled by our build process and placed in the artifact repo for consumption?
If we want to manage account data, would that not also need to be versioned outside Chef? Ideally through an LDAP system? (Best case, there are no login accounts on your nodes beyond the initial baseline OS root. Everything else would be some sort of daemon role account like smtp, oracle, and so on, using LDAP and sudo to enable acting manually [but who wants to act manually or even log in to one node, never mind thousands of nodes].)