How to diagnose issue with chef search returning wrong results?

IgnatZapolsky · March 28, 2017, 10:41am

Hi,

We are running Chef 11 community edition at the moment, and some time ago search started to return weird results - more nodes are returned from search than are actually there.

Example:

 knife search node "chef_environment:environment_foo AND roles:service_foo"

This search works OK for most environments, but for 2 envs it returns more nodes than expected. In one of these 2 envs it’s almost every node within the environment, and in the other it’s just a few extra nodes.

I wonder if there is a way to diagnose what’s wrong with search?

I’ve already tried regenerating the Solr index, and that did not fix the issue.

With regards,
Ignat Zapolsky

majormoses · March 31, 2017, 7:51pm

Does the same happen if you use AND between your search terms?

knife search node "chef_environment:environment_foo AND roles:service_foo"

Are all of these “extra” nodes alive? If not, it sounds like you need something to ensure that when nodes are decommissioned they are also removed from the Chef server. This is the concept: https://github.com/eheydrick/aws-cleaner. The implementation is specific to AWS, but the idea could be applied to any environment.

IgnatZapolsky · April 3, 2017, 10:35am

Hi,

Thanks for the response - somehow the AND condition went missing from my original post. We do have that condition in the query; for some environments it works correctly (or as expected), but for others it does not.

Also most of these “extra” nodes are alive.
And thanks for pointing out clean-up code for chef - that’s gold!

majormoses · April 3, 2017, 4:00pm

Glad that was helpful. I have not run Chef 11 in a while, and it’s possible that there is a bug that was fixed in 12. Is it possible to see if you can replicate this with Chef 12 in a test environment? Can you also try running the same search within a recipe or using chef-shell to see if it’s something specific to knife?

IgnatZapolsky · April 3, 2017, 4:13pm

Well, we first encountered that from within a recipe - when suddenly search started to return a lot more nodes than it used to do.

So right now there is no difference between knife search node and recipe-based search.

Not sure if we have Chef 12 up and running, and it also seems that the problem is subtle - it did not happen for the first year or so, then it happened at a small scale, and now it’s bigger.

We are currently migrating to 12 since it has so much on offer, so we’ll see if that will be reproducible.

majormoses · April 3, 2017, 5:02pm

OK, other than that what can you tell me about these extra nodes? Are they always the same extra nodes or does it change each chef-client converge?

IgnatZapolsky · April 3, 2017, 5:04pm

During chef-client converge search results don’t change unless there were changes to nodes (i.e. we added / removed nodes within affected environment).

I’ve also tried to export node objects from chef server into json files and they did not have extra roles or info that could be attributed to bad search results.

Not sure if I checked out everything, though.
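
One way to make that check repeatable is to test each exported node object against the query on the client side. This is only a rough sketch: the helper names and sample data are made up for illustration, and it assumes the expanded role list sits under `automatic["roles"]` (or at the top level) of the exported JSON, which may not match every export format.

```ruby
require "json"

# Hypothetical helper: true when an exported node object really satisfies
# "chef_environment:ENV AND roles:ROLE".
# Assumption: expanded roles are stored under automatic["roles"] or at the
# top-level "roles" key of the exported JSON.
def matches_query?(node, environment, role)
  roles = Array(node.dig("automatic", "roles")) + Array(node["roles"])
  node["chef_environment"] == environment && roles.include?(role)
end

# Splits exported nodes into [should match, should not match].
def partition_nodes(nodes, environment, role)
  nodes.partition { |n| matches_query?(n, environment, role) }
end

# Inline sample data, invented for this example.
sample = JSON.parse(<<~DATA)
  [
    {"name": "node-good", "chef_environment": "environment_foo",
     "automatic": {"roles": ["service_foo"]}},
    {"name": "node-extra", "chef_environment": "environment_foo",
     "automatic": {"roles": ["service_bar"]}}
  ]
DATA

expected, unexpected = partition_nodes(sample, "environment_foo", "service_foo")
puts "should match:     #{expected.map { |n| n["name"] }.join(", ")}"
puts "should not match: #{unexpected.map { |n| n["name"] }.join(", ")}"
```

Any node that the server returns for the query but that lands in the "should not match" group is a candidate for closer inspection.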

majormoses · April 3, 2017, 5:08pm

Are any of the environments sharing a chef server?

IgnatZapolsky · April 3, 2017, 5:09pm

In fact, they do - quite a number of envs are sharing the same chef server.

With regards,
Ignat Zapolsky

majormoses · April 3, 2017, 5:14pm

Can you share a gist with an example extra node and an expected node (redact anything sensitive), plus the role you are searching on? It would also be helpful to see any roles that include that role in their run lists.

IgnatZapolsky · April 4, 2017, 4:07pm

Hi,

I’ve created a gist:

Wonder if that’s enough?

<Node name 1> is good node and <Node name 2> is bad node.

majormoses · April 4, 2017, 4:22pm

Hmm, that looks good to me. Can you post the contents of the role identity_server, just to be sure we didn’t miss anything? At this point (given the efforts you have already made and that Chef 11 is old) I would suggest either reaching out to Chef support for in-depth debugging or, even better, looking at upgrading to Chef 12. Given the problems you have, rather than migrating Chef 11 to 12 it might be best to set up Chef 12 from scratch, upload your Chef artifacts, and work out a client migration strategy.

IgnatZapolsky · April 5, 2017, 10:03am

I’ve added role to the gist: https://gist.github.com/iggyzap/f507cf90f8800e45f2913a62f5f92b67#file-identity_server-json

ssd · April 5, 2017, 12:44pm

Hi,

Is it possible that there are attributes somewhere else in the node data that have the name chef_environment or roles? That is, if you have something like node["attr_a"]["attr_b"]["roles"], that could result in some false positives, because all 'leaf nodes' are indexed and thus can conflict with top-level attributes.
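
As a rough sketch of how one might look for such clashes in an exported node object: the helper name, the sample data, and the list of "expected" paths below are all assumptions made for illustration, not anything Chef itself provides.

```ruby
require "json"

# Hypothetical helper: walks a parsed node object and returns the dotted
# path of every key whose name appears in `names`, at any depth.
def key_paths(value, names, path = [])
  case value
  when Hash
    value.flat_map do |key, child|
      here = path + [key]
      (names.include?(key) ? [here.join(".")] : []) + key_paths(child, names, here)
    end
  when Array
    # Array elements share the path of the array itself.
    value.flat_map { |child| key_paths(child, names, path) }
  else
    []
  end
end

# Inline sample node, invented for this example.
node = JSON.parse(<<~DATA)
  {
    "name": "node-extra",
    "chef_environment": "environment_foo",
    "automatic": {"roles": ["service_bar"]},
    "normal": {"attr_a": {"attr_b": {"roles": "service_foo"}}}
  }
DATA

# Assumption: these are the locations where the keys legitimately live.
expected = ["chef_environment", "automatic.roles"]
suspicious = key_paths(node, %w[chef_environment roles]) - expected
puts suspicious
```

Anything left in `suspicious` is a nested attribute that shares its name with one of the search fields and is worth checking against the unexpected search hits.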

Sincerely,

Steven

IgnatZapolsky · April 5, 2017, 12:47pm

Hi Steven,

Thank you for your response,

I looked into that hypothesis earlier - no, we don’t have an attribute name clash. Also, if we had an attribute name clash, search would misbehave consistently across all environments, but that’s not the case.

With regards,
Ignat Zapolsky
