We are exploring options for using Chef as a deployment tool for infrastructure and applications. The challenge right now is segregation of duties. We have different teams working at different levels, for example:
Base team - OS related stuff
Security team - Security related stuff
Platform team - For management of the existing stack
Application team - For application deployments
All four teams mentioned above will be using Chef, and we want to make sure that the roles they perform do not conflict while they use the service. Is there a recommendation or approach from Chef that we can follow? We plan to go with a single organization, though.
Generally this kind of thing is handled using a Berkshelf-based workflow, with each of those teams controlling a set of non-overlapping role(oid) cookbooks. There is some complexity because you need to ensure that any shared cookbooks or community code pulled in as a dependency has cross-compatible versions for all four teams. Good CI goes a very long way here, as you’ll catch potential dependency conflicts early on.
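To make the dependency point concrete, here is a minimal sketch in plain Ruby of the kind of check a CI job could run. It is not how Berkshelf resolves dependencies; it only illustrates the idea using RubyGems' `Gem::Requirement`, whose constraint operators (`~>`, `>=`, `<`) look the same as the ones used in cookbook metadata. The team names, cookbook names, constraints, and version lists are all made up for illustration.

```ruby
require 'rubygems' # Gem::Version and Gem::Requirement ship with Ruby

# Hypothetical constraints each team places on shared cookbooks.
TEAM_CONSTRAINTS = {
  'base'     => { 'apt' => '~> 7.0', 'sudo' => '>= 5.0' },
  'security' => { 'sudo' => '~> 5.4' },
  'platform' => { 'apt' => '>= 7.2', 'java' => '~> 8.0' },
  'app'      => { 'java' => '< 8.0' }
}.freeze

# Hypothetical versions available on the Chef server or Supermarket.
AVAILABLE = {
  'apt'  => %w[7.1.0 7.3.0 7.4.0],
  'sudo' => %w[5.3.0 5.4.6],
  'java' => %w[7.0.0 8.0.1 8.6.0]
}.freeze

# Versions of one cookbook that satisfy every team's constraint on it.
def compatible_versions(cookbook, constraints_by_team, available)
  requirements = constraints_by_team.values
                                    .map { |deps| deps[cookbook] }
                                    .compact
                                    .map { |c| Gem::Requirement.new(c) }
  available.fetch(cookbook, []).select do |version|
    requirements.all? { |req| req.satisfied_by?(Gem::Version.new(version)) }
  end
end

# Cookbooks for which no single version works for all teams.
def dependency_conflicts(constraints_by_team, available)
  cookbooks = constraints_by_team.values.flat_map(&:keys).uniq
  cookbooks.select { |cb| compatible_versions(cb, constraints_by_team, available).empty? }
end

puts dependency_conflicts(TEAM_CONSTRAINTS, AVAILABLE).inspect
```

With the sample data above, `java` is reported as a conflict because the platform team requires 8.x while the app team requires something older than 8.0. That is exactly the kind of disagreement you want CI to surface before anyone uploads a cookbook.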
Another option to consider, if you really want all four teams to be separate, is to use four different Chef configurations on each machine. This is a bit funky to manage, but with some tooling it can work. It also makes the workflow easier, since the teams are fully isolated from one another, except that their code ultimately runs on the same machines.
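As a rough sketch of what one of those four configurations might look like, assuming each team gets its own directory under `/etc/chef` and its own node identity within the single organization (the paths, server URL, and node name below are placeholders, not a recommended layout):

```ruby
# /etc/chef/security/client.rb  (hypothetical path; one such file per team)
chef_server_url 'https://chef.example.com/organizations/example' # placeholder URL
node_name       'web01-security'                # hypothetical per-team node name
client_key      '/etc/chef/security/client.pem' # per-team client key
file_cache_path '/var/chef/security/cache'      # keep each team's cache separate
```

Each team's run would then be started with its own config file, for example `chef-client -c /etc/chef/security/client.rb`. The tooling mentioned above is mostly about generating these files and scheduling the four runs.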
In either case, make sure you watch out for control conflicts, where two different recipes are trying to “own” the same resource, usually resulting in it toggling back and forth between two states.
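One way to look for such ownership conflicts is to compare the resources each team's cookbooks declare. The following is a minimal plain-Ruby sketch, assuming you have already collected, by whatever means suits your setup, a list of resource names per team; the `DECLARED` data and the `ownership_conflicts` helper are hypothetical and not part of Chef.

```ruby
# Hypothetical inventory: resources declared by each team's cookbooks,
# written in Chef's usual "type[name]" notation.
DECLARED = {
  'base'     => ['package[ntp]', 'file[/etc/ssh/sshd_config]'],
  'security' => ['file[/etc/ssh/sshd_config]', 'file[/etc/shadow]'],
  'platform' => ['service[nginx]'],
  'app'      => ['service[nginx]', 'directory[/opt/app]']
}.freeze

# Returns a hash of resource => [teams] for every resource that more
# than one team declares.
def ownership_conflicts(declared_by_team)
  owners = Hash.new { |hash, key| hash[key] = [] }
  declared_by_team.each do |team, resources|
    resources.uniq.each { |resource| owners[resource] << team }
  end
  owners.select { |_resource, teams| teams.size > 1 }
end

ownership_conflicts(DECLARED).each do |resource, teams|
  puts "#{resource} is declared by: #{teams.join(', ')}"
end
```

A shared declaration is not always a bug, since two teams may agree on the same state, but each hit is worth a conversation about who owns the resource.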
By way of metaphor, consider building a house. Laying the foundation, electrical, plumbing, etc. are all different disciplines and have a lot of non-shared responsibilities. But the foundation has to match the shape of the house, and you don't want to put an electrical outlet in the bathtub.
A shared CI system will pull all of your changes together and test them to catch the metaphorical equivalents of these issues before you roll things out to production. For a more technology-oriented example, suppose an app team uses a library that wants to read the shadow file, but the security team sets the permissions on that file to 000. You want to catch that before you deploy the app to prod, and when you catch it, the app team and security team need to meet to figure out a plan to fix the issue. The more you try to avoid this, the less value you can get from automation.
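The shadow-file scenario can be sketched as a tiny check that a shared CI job might run on a test node after converging both teams' cookbooks. This is plain Ruby rather than a real Chef testing tool, the helper names are hypothetical, and it only looks at the owner read bit as the simplest stand-in for "can the app read this file". The demo uses a temporary file in place of the real shadow file.

```ruby
require 'tempfile'

# True if the file's mode grants read permission to its owner.
def owner_readable?(path)
  (File.stat(path).mode & 0o400) != 0
end

# Returns the subset of paths the app needs but could not read.
def unreadable_paths(paths)
  paths.reject { |path| owner_readable?(path) }
end

# Demo: a stand-in file locked down to mode 000, as the security
# team's recipe would do in the scenario above.
Tempfile.create('shadow-like') do |file|
  File.chmod(0o000, file.path)
  failures = unreadable_paths([file.path])
  puts "unreadable: #{failures.size} file(s)" unless failures.empty?
end
```

In a real pipeline a non-empty result would fail the build, which is the cue for the app and security teams to talk before anything reaches prod.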