Like you said, the greatest barrier to anything new is FUD. I think in your
case the best way forward is through education, but to answer your question,
here are a few workarounds I have seen in my day.
I appreciate everyone’s response to this thread. It has been a pretty
good discussion and I have gathered some great information from it.
You both are correct. I, however, was most interested in the broader
discussion of how others are handling it in their environments.
I wasn’t necessarily looking for “this is how you should do it”, “this is
how you should handle your coworker” or anything like that, just a broader
discussion of how each of you is using it in your environment. It helps me
spark some fresh ideas for furthering our implementation here, and helps
increase not only my maturity, but that of my team as a whole.
Here, we have a young team that is in the process of migrating from being
“SysOps” or the “Operations Team” to “Development Operations”, or DevOps.
There is a bit of a culture clash. Our engineering team has a ton of talent,
but it is older talent, with older, more proven processes for how they do
things, processes that many of us would consider antiquated. They are very
frightened by the idea of continuous integration, maybe even threatened. Our
environment is evolving, and when I joined the team 6 months ago, I came
with a deep desire to help them go from using an ad hoc deployment perl
script to using an automated workflow and true Infrastructure as Code.
Doing so means teaching people who have never written ruby code or worked
with chef how to do so. It also means teaching people who have never been
around continuous integration to understand continuous integration and
test-driven infrastructure.
I get looked at like I have a third eye when I say, “write a test before
you write any other code.”
It’s a steep learning curve; I know because I have been through it. I am
working to increase the team’s knowledge and maturity, but there are going
to be bumps along the way.
The suggestion of not running chef-client automatically on our nodes
actually infuriated me at first. Instead of popping off a half-thought-out
angry email to our CTO, I decided to take some time to think about it and
ask you guys for help in thinking it through. Again, I really appreciate
everyone’s involvement in this thread.
Phillip Roberts | Sr. Linux Systems Administrator
San Mateo | Ann Arbor | New York | London
O 734.922.7014 | C 614.423.9871 | www.MyBuys.com <http://www.mybuys.com/>
From: Greg Zapp [mailto:greg.zapp@gmail.com]
Sent: Tuesday, January 14, 2014 2:20 AM
To: chef@lists.opscode.com
Subject: [chef] Re: Re: Re: Re: Re: RE: Re: Re: Automated check-ins or
not...
Well, Phillip did say "I am being slightly vague on purpose, because I
am looking for full case examples from others using chef and how they are
using it."
-Greg
On Tue, Jan 14, 2014 at 8:01 PM, Lamont Granquist lamont@opscode.com
wrote:
Yeah, but he's talking about a more fundamental problem: his
management/co-workers are not okay with the basic idea of an
automated job running which might change system config.
You're off on a completely different planet where you've accepted the
basic premise of "DevOps" (for lack of a better term) and it's a question
not of "should we do it?" but "how aggressive?", and that's influenced by how
far along the road to continuous integration / continuous deployment you
are. Explaining that to them would be like trying to explain quantum
mechanics to a caveman.
On 1/13/14 5:28 PM, Greg Zapp wrote:
My cookbooks hook into our orchestration server via REST calls to pull
down information about which sites should be configured, etc. During the POC
build-out I had Chef run every minute, but most of my machines are Windows
servers and Chef is very CPU hungry there. We have modified our
orchestration server to set the updated time for the "pool" when any
resource contained in the "pool" is modified. I wrapped Chef in a .NET
app/service that first checks whether the pool has changed since the
last successful Chef run. This is how we chose to mitigate Chef's CPU
hunger and allow for faster converge times.
-Greg
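Greg's wrapper is a .NET service and its code is not shown in the thread. As
a rough illustration of the gating idea only, here is a minimal Python
sketch. The function name `should_run_chef` and the way the two timestamps
are passed in are assumptions made for this example, not his implementation
and not any Chef API.

```python
from datetime import datetime, timezone
from typing import Optional


def should_run_chef(pool_updated_at: datetime,
                    last_successful_run: Optional[datetime]) -> bool:
    """Return True when a chef-client run is worth starting.

    Hypothetical helper: run if there has never been a successful run,
    or if the pool was modified after the last successful run.
    """
    if last_successful_run is None:
        return True
    return pool_updated_at > last_successful_run


if __name__ == "__main__":
    last_ok = datetime(2014, 1, 13, 12, 0, tzinfo=timezone.utc)
    changed = datetime(2014, 1, 13, 12, 5, tzinfo=timezone.utc)
    print(should_run_chef(changed, last_ok))  # pool changed after the last run
    print(should_run_chef(last_ok, changed))  # nothing new since the last run
```

The check itself is cheap, so a wrapper along these lines could poll often
and only pay the cost of a full run when something has actually changed.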
On Tue, Jan 14, 2014 at 1:56 PM, Lamont Granquist lamont@opscode.com
wrote:
Yeah, at places I've been where managers were afraid of config
management (CFengine at the time) running on a schedule, the result was an
accretion of changes over time. Once enough changes got queued up
that we had to run it on a server, and the change window was scheduled and
it was approved by our CRB and appropriate offerings were burned to
the gods of ITIL, the changes would often wind up causing outages because
so many changes hit the server at once and it was hard to determine the
impact ahead of time. But the outages were all contained to change windows
and were approved, so I guess that makes it okay.
A tactic that I've used in the past has been to run CM only once per day,
with a 12-hour random splay, timed for 8pm-8am. Changes can be committed
during the business day without taking effect immediately, so they can get
tested or pushed out manually first. And if anything
goes wrong, it'll start hitting servers at 8pm and you have a longer window
before it hits your entire infrastructure, and more time for you to get
monitoring alerts and stop the changes rolling out. If you just run Chef
every 30 minutes with a 5 minute random splay, then it's likely that by the
time your monitoring alerts you and you start taking action, the change
has hit your entire infrastructure. By only doing the "scheduled" runs
once per day you still keep the deltas between runs small, you allow
yourself some time to stop your CM tool before it all rolls out, and you
also reduce the load on your chef server infrastructure (or on our HEC
infrastructure).
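To put rough numbers on that comparison, here is a back-of-the-envelope
Python sketch. It assumes node run times are spread uniformly over the
window, which is a simplification, and that it takes 30 minutes before
anyone acts on an alert. `fraction_converged` is a made-up helper name, not
part of Chef.

```python
def fraction_converged(minutes_elapsed: float, window_minutes: float) -> float:
    """Share of nodes that have picked up a change after `minutes_elapsed`,
    assuming run start times are spread uniformly over `window_minutes`."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return max(0.0, min(minutes_elapsed / window_minutes, 1.0))


if __name__ == "__main__":
    reaction_time = 30  # assumed minutes until someone acts on an alert

    # 30-minute interval plus up to 5 minutes of splay: at most 35 minutes
    # pass between two runs on any node.
    print(fraction_converged(reaction_time, 30 + 5))

    # One nightly run spread over a 12-hour splay (8pm-8am).
    print(fraction_converged(reaction_time, 12 * 60))
```

Under these assumptions roughly 86% of nodes have already converged after
30 minutes in the first case, versus about 4% in the second.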
The other thing is that if you only run Chef once a week or once a month
on-demand, then you're not getting the "self-repairing" and SOX/PCI-DSS
"prevent control" features of configuration management. If you're running
it nightly then any junior SA or malicious attacker that logs into the
server and manually changes the state of critical files will have those
changes rolled back at the next run. That produces prevent controls that
auditors really like. It also trains your junior SAs to not make with
the typey-typey on the keyboard and to use the CM program. Otherwise they
tend to fall back to old behaviors of making changes on the console, and
then it's not their fault they did that; it's going to be Chef's fault
when it's eventually run, reverts those changes, and the service crashes.
On 1/13/14 1:32 PM, David Petzel wrote:
We had quite a few discussions about this as well and at the end of the
day we opted for the ability to do both on-demand as well as scheduled.
There were concerns that without a scheduled check-in the amount of drift
in systems could become large over time on servers that don't routinely get
deployments done. With that drift comes a slew of unknown issues. By
enforcing a scheduled run we could be sure that hand-modified configurations
didn't stick around very long.
We've set up a report to notify us if a node has not checked in within the
last day. This helps us catch cases where the scheduled run might be failing
and other notification mechanisms might not be catching it (e.g. some nasty
compile error super early in the run).
From there we extended an existing in-house tool that lets anyone with
access request a chef run without needing access to the servers.
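A report like the one David describes could be sketched as follows. This is
only an illustration in Python: it assumes you already have each node's last
check-in time from somewhere (for example, exported from your Chef server),
and `stale_nodes` is a hypothetical helper, not his tool.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def stale_nodes(last_checkins: Dict[str, datetime],
                now: datetime,
                max_age: timedelta = timedelta(days=1)) -> List[str]:
    """Return the names of nodes whose last check-in is older than max_age."""
    cutoff = now - max_age
    return sorted(name for name, seen in last_checkins.items() if seen < cutoff)


if __name__ == "__main__":
    now = datetime(2014, 1, 14, 12, 0, tzinfo=timezone.utc)
    checkins = {
        "web01": now - timedelta(minutes=20),  # checked in recently
        "db01": now - timedelta(days=3),       # scheduled run is failing
    }
    print(stale_nodes(checkins, now))
```

Because the report looks only at the time of the last successful check-in,
it also catches runs that die before any other alerting gets a chance to
fire.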
On Mon, Jan 13, 2014 at 4:16 PM, Phillip Roberts proberts@mybuys.com
wrote:
The problem isn’t my coworker; the problem is a lack of understanding
of the tool.
Chef is my baby, and I am perfectly fine with automated check-ins.
However, just like in any business, there are politics at play. There are
fears due to a lack of understanding as well.
I am purposely asking for others’ use cases because I am interested in them
to help me form my arguments as to why chef nodes should be checking in
(running chef-client) automatically.
I am not asking for anyone to tell me whether we should be using chef, or
how we should be using chef; I am interested in how it is being used in
other environments. I have worked in plenty of other environments, but in
each of them I was the one who implemented chef and the policies that
surround it, and this question, or this argument, never came up.
I appreciate the responses thus far.
Thanks,
Phillip Roberts | Sr. Linux Systems Administrator
From: Christopher Armstrong [mailto:chris@chrisarmstrong.me]
Sent: Monday, January 13, 2014 4:09 PM
To: chef@lists.opscode.com
Subject: [chef] Re: Re: Automated check-ins or not...
Chef as a tool is used for orchestration, converging nodes to a desired
state. If your coworker doesn't want nodes checking in automatically, then
perhaps Chef isn't the ideal tool for you. What does your use case look
like?
On Mon, Jan 13, 2014 at 1:05 PM, Ranjib Dey dey.ranjib@gmail.com wrote:
by check in do you mean chef runs or chef registrations? I am aware of 3
different ways:
- on demand: use rundeck, mco or capistrano-like tools to invoke the chef
  run. pros: on demand :-), which helps if you deploy your application via
  chef; you can also eliminate the need for a validation certificate. cons:
  requires additional tooling, special security considerations, etc.
- as a service: specify a splay time, and use the standard init scripts to
  run chef client as a service. pros: no additional configuration required,
  no dependency on any other tools. cons: memory leaks and stale processes
  used to be a pain.
- as a scheduled job: use cron or a rufus-like system to run chef at a
  periodic interval. pros: simple, less prone to memory leaks. cons: infra
  has to be designed as eventually consistent, on-demand application
  deployment cannot be done, and additional consideration is needed when
  deciding cron times on individual servers, else you'll storm the chef
  server.
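One common way to handle that last point, picking cron times per server so
the runs do not all land at once, is to derive an offset from the hostname.
The Python sketch below assumes a 30-minute interval and uses a SHA-256 hash
purely to spread hosts out. `cron_minutes` and the example hostnames are
illustrative, not something from Chef.

```python
import hashlib
from typing import List


def cron_minutes(hostname: str, interval: int = 30) -> List[int]:
    """Minutes of the hour at which this host should run chef-client.

    The offset is derived from a hash of the hostname, so it stays the
    same for a given host and usually differs between hosts.
    """
    if interval <= 0 or 60 % interval != 0:
        raise ValueError("interval must be a positive divisor of 60")
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:4], "big") % interval
    return list(range(offset, 60, interval))


if __name__ == "__main__":
    for host in ("web01.example.com", "web02.example.com"):
        minutes = ",".join(str(m) for m in cron_minutes(host))
        # Prints a crontab-style line for each (hypothetical) host.
        print(f"{minutes} * * * * chef-client  # {host}")
```

A hash-based offset needs no coordination between servers, at the cost of
occasional collisions where two hosts end up on the same minute.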
i have used pretty much all three of these, and i think all of them have
merits. choose any one depending upon what you do, how you are doing it and
how comfortable you are with chef and those tools. most of the issues with
running chef as service are now sorted (or workarounds are known).
best
ranjib
On Mon, Jan 13, 2014 at 12:52 PM, Phillip Roberts proberts@mybuys.com
wrote:
I am interested in hearing what others are doing in terms of allowing
nodes to automatically check in with chef or not. It has recently come up
as a concern with a party in our company, who would prefer not to see nodes
check in automatically with chef (I currently have a cron job that runs
chef-client every X number of minutes).
I am just interested in hearing how others manage this. I am not convinced
that manually running chef-client is a good solution.
I am being slightly vague on purpose, because I am looking for full case
examples from others using chef and how they are using it.
Thanks,
Phillip Roberts | Sr. Linux Systems Administrator