Knife performance on Mac OS X

Does anyone know why knife commands take >20 seconds on Mac OS X?

It’s not my machine alone – my colleague’s Mac has the same issue. I’m running El Capitan and ChefDK 0.8.0. He’s on Yosemite and some indeterminate release of the ChefDK. Both are current MacBook Pros with core i7s and flash storage.

Knife commands entered on our server are lightning fast. We doubt it’s a network issue between our Macs and our server.

Maybe a path incorrectly set?

Suggestions appreciated – there’s lots I prefer to do on the client but the lag time makes it inconvenient.

Howdy @yobyot! It’s most likely because knife scans all gems looking for plugins. One option would be to run gem cleanup, but that only puts the problem off for a while. The ultimate solution is this issue, which was merged to Chef master on August 25th, 2015. Once this change is released, knife will no longer scan all gems; instead it “stores the paths to knife plugins in a specially formatted plugin_manifest entry. This lowers the overhead of subsequent knife invocations.”

So in the interim, I’d run gem cleanup, and look for the long term improvement in an upcoming Chef release. Cheers!

Hi,

Knife commands entered on our server are lightning fast. We doubt it's a network issue between our Macs and our server.

Running your knife command with -VV can often help you determine if
this is network slowness. However, I think that the
most common causes for knife slowness are:

  • Poor ruby/rubygems performance in the face of many installed gems
  • Large JSON payloads being returned from the API that take time to process.

If you are seeing this slowness on simple commands that only return a
small amount of data, the latter likely isn't the case. If you install
ChefDK 0.9.0, you can try out two new features of knife to help
determine the cause:

  • knife rehash: This caches the paths of plugins on disk and thus
    avoids much of the rubygems slowness.
  • knife null: A knife plugin that does nothing, useful for testing
    speed issues such as this.
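
To illustrate the idea behind caching plugin paths, here is a minimal Ruby sketch. It is not knife's actual implementation: the `plugin_paths` helper, the cache file name, and its flat JSON format are all hypothetical, and `Gem.find_files` with the `chef/knife/*.rb` glob is only assumed to approximate how plugins are discovered.

```ruby
require "json"
require "tmpdir"

# Hypothetical helper: return plugin paths from a JSON cache if one exists,
# otherwise scan the installed gems and write the cache for next time.
# The cache format here is made up and is NOT knife's plugin_manifest.json.
def plugin_paths(cache_file, glob = "chef/knife/*.rb")
  return JSON.parse(File.read(cache_file)) if File.exist?(cache_file)

  paths = Gem.find_files(glob) # walks the load paths of installed gems
  File.write(cache_file, JSON.generate(paths))
  paths
end

cache = File.join(Dir.tmpdir, "plugin_paths_sketch.json")
File.delete(cache) if File.exist?(cache) # start from a cold cache

# The first pass scans the gems, the second pass only reads the cache file.
%w[scan cached].each do |label|
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  paths = plugin_paths(cache)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  puts format("%-6s %d paths in %.4fs", label, paths.size, elapsed)
end
```

The more gems are installed, the larger the gap between the two passes should be. As with knife rehash, a cache like this goes stale whenever a new plugin is installed.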

In my case, knife rehash provides a healthy speedup:

> time knife null
WARNING: No knife configuration file found

real    0m3.488s
user    0m3.256s
sys     0m0.227s

 > knife rehash
WARNING: No knife configuration file found
Using knife-rehash will speed up knife's load time by caching the
location of subcommands on disk.
However, you will need to update the cache by running `knife rehash`
anytime you install a new knife plugin.
Knife subcommands are cached in
/Users/sdanna/.chef/plugin_manifest.json. Delete this file to disable
the caching.

 > time knife null
WARNING: No knife configuration file found

real    0m0.524s
user    0m0.431s
sys     0m0.083s

0.5 seconds is still too slow in my book, but I have a fairly
pathological local setup currently.

I hope this helps!

Cheers,

Steven


Apologies for what looks like a duplicate response with the same info Martin sent. It appears discourse’s email ingestion is rather slow.

What I do to combat this is use bundler in deployment mode. I run knife via an alias for “bundle exec knife” and then it only sees (and scans) the gems in vendor/bundle in my chef repo - which is just the ones I need for knife plugins.

In fact I believe this was included in Chef Client 12.5.x / ChefDK 0.9.0.


It looks like there is some kind of regression in performance in knife though between 12.2.1 and 12.4.1:

https://github.com/chef/chef/issues/3940

I’ve also noticed it being substantially slower when using it via rvm gemsets outside of chef-dk. It’d probably still be good to track this down and fix the non-knife-rehash/non-chefdk case, since it seems to have gotten dramatically worse and there probably is something we can do to make it better. It’s also possible that a fix would make the other cases faster as well, or at least we should find what it is and validate that knife rehash and appbundler/chef-dk aren’t affected by the same bug.

Hi,

It'd probably still be good to track this down and fix the non-knife-rehash/non-chefdk case since it seems to have gotten dramatically worse and there probably is something we can do to make it better

Totally agree. It seems likely to me that there are a few issues in play.

Some quick instrumentation of require revealed that on my machine, old versions of knife-ec2 and knife-openstack (0.10.0) were still installed in my chefdk gem installation. These were accounting for the majority of the load time. In fact, if I remove both and then install knife-openstack 0.10.0, it creates a particularly bad pathological case:

Require on ["fog"] took 25.385403 from /Users/sdanna/.chefdk/gem/ruby/2.1.0/gems/knife-openstack-0.10.0/lib/chef/knife/openstack_base.rb:20:in `<top (required)>'
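
For anyone who wants to try similar instrumentation, here is a minimal sketch of one way to time require calls in Ruby. It is not the instrumentation used for the output above: the `RequireTimer` module and its `TIMINGS` array are hypothetical names, and `set` and `json` merely stand in for whatever libraries are suspected of being slow.

```ruby
# Hypothetical module: records how long each `require` call takes and which
# file and line asked for it.
module RequireTimer
  TIMINGS = [] # entries are [feature name, seconds, caller location]

  private

  def require(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    super # falls through to the normal (RubyGems-aware) Kernel#require
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    TIMINGS << [name, elapsed, caller(1, 1).first]
  end
end

# Prepending to Object intercepts ordinary `require "x"` calls.
# Explicit `Kernel.require` and `require_relative` calls are not intercepted.
Object.prepend(RequireTimer)

require "set"
require "json"

# Print the slowest requires first.
RequireTimer::TIMINGS.sort_by { |_, secs, _| -secs }.first(10).each do |name, secs, from|
  puts format("require %p took %.6fs from %s", name, secs, from)
end
```

Nested requires are recorded too, so the time reported for a top-level library includes the time of everything it pulls in.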

At first I thought this was an issue with requiring an older version of fog. However, I can also create pathological cases with other gem combinations.

I wonder if a recent commit has created a case where it becomes substantially more difficult to determine the correct dependency solution somewhere.

This doesn't explain what iroller is seeing in that ticket unfortunately.

Cheers,

Steven

Well, whatever the problem is it’s truly painful and I wish Opscode would fix it immediately.

Consider developing a cookbook, a step at a time for a newbie (that’s me). I upload a recipe, it fails, then I make changes and I have to wait 30-45 seconds between iterations just for knife to do something simple.

This, combined with all the stuff that just doesn’t work in Windows (like knife ec2 winrm bootstrapping) is making my life with Chef like being in a kitchen on fire in which I’ve been tied to the oven.

Hi Yobyot,

If you’re regularly uploading your cookbook and testing it on a server, you
might want to consider using Test Kitchen (http://kitchen.ci/) for
quicker iteration on your cookbook development.

The Chef Docs also have a handy guide on getting started:
https://docs.chef.io/kitchen.html

Hope that helps!

Hi,

Well, whatever the problem is it's truly painful and I wish Opscode would fix it immediately.

Is trying ChefDK 0.9 and knife rehash an option for you? If your experience is similar to other users who have tried it, it should provide some substantial relief for the pain.

Cheers,

Steven


While I appreciate the suggestion, this is just more setup, more stuff to digest and more (probable) issues. I don’t think 30 seconds for knife to do something – anything – should mean that I have to digest still more to work around it.

I think it should be fixed. :smile:

Thanks. But after installing ChefDK 0.9, knife rehash is, apparently, an invalid subcommand on Mac OS X.

AFAIK ChefDK 0.9 isn't a workaround -- it contains the version of the Chef gem with the permanent fix. You could probably also achieve the same goal by using newer Ruby gems, if you match versions to what ChefDK 0.9 ships with.

Also, whilst I appreciate that using Test Kitchen may seem like a “work
around” at this point, and is certainly more tech to learn, it’s a pretty good
workflow once you get into the swing of things.

Any questions don’t hesitate to ask!


Hi,

I think it should be fixed. :smile:

Yes, we agree. For instance, in my second to last message to this thread I dedicated some time to starting an investigation into why knife execution times have slowed down.

knife rehash will likely always lead to some speed improvement, but we still want the default case to be not-pathologically-slow.

But after installing ChefDK 0.9, knife rehash is, apparently, an invalid subcommand on Mac OS X.

If you are interested in pursuing this path further, the first thing I'd recommend is confirm that you have Chef 12.5.1 installed and that it is what you are executing when you run knife:

knife -v
chef -v
ls -al $(which knife)

Cheers,

Steven

Thanks for your help.

After the previous suggestions, I meticulously followed the Mac OS X uninstall instructions and installed the latest .dmg.

Here’s the output:

$ knife -v
Chef: 12.5.1
$ chef -v
Chef Development Kit Version: 0.9.0
chef-client version: 12.5.1
berks version: 4.0.1
kitchen version: 1.4.2
$ ls -al $(which knife)
-rwxr-xr-x  1 root  wheel  1529 Oct  8 00:26 /opt/chefdk/bin/knife

I thought it might be interesting to see exactly how long a simple knife node list command can take on Mac OS X.

Here’s a link to a YouTube video, in which I’ve blanked out the returned node names for security reasons, but which is otherwise un-retouched.

As you can see, this is the definition of “pathologically” slow. :grin:

Hi,

Chef: 12.5.1

Knife rehash should definitely be available with this version of knife. Could you run:

knife rehash -VV

and copy over the full error message you are getting with the stack trace?

Cheers,

Steven

@ssd @yobyot I had the same problem. It took me ~32 seconds to run a simple knife null.

But… 30 seconds is the DNS resolution timeout! Since this is OS X and not Linux (forget about easily debugging network calls), the easiest way to debug is puts, preferably with timestamps and the caller’s filename/lineno (OK, I could have hooked require too, but that was too clever for me). That’s what I did.

You’ll be in tears: the culprit was a Socket::gethostbyname hidden behind a Utils::getservername call in the WEBrick::Config::General constant setup.

Long story short: if you add your hostname to /etc/hosts, you’ll be fine.
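
A quick way to check whether a machine is affected is to time how long it takes to resolve its own hostname. The sketch below is only an approximation of what WEBrick does: `time_lookup` is a hypothetical helper, and it uses `Addrinfo.getaddrinfo` rather than `Socket::gethostbyname`, which newer Ruby versions mark as deprecated.

```ruby
require "socket"

# Hypothetical helper: time a name lookup for the given host.
# Returns [seconds elapsed, whether the name resolved].
def time_lookup(host)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  resolved =
    begin
      Addrinfo.getaddrinfo(host, nil)
      true
    rescue SocketError
      false
    end
  [Process.clock_gettime(Process::CLOCK_MONOTONIC) - started, resolved]
end

if $PROGRAM_NAME == __FILE__
  host = Socket.gethostname
  elapsed, resolved = time_lookup(host)
  puts format("%s: %s in %.3fs", host, resolved ? "resolved" : "did not resolve", elapsed)
end
```

If the lookup takes tens of seconds, an /etc/hosts line mapping 127.0.0.1 to the printed hostname should bring it down to milliseconds.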