Search returns for nodes sometimes reduced or empty

Hi

I use search quite heavily in my Chef recipes, mainly to return a list
of nodes based on a provided environment and role. I then assign the
results of these searches to attributes, which are then referenced
in configuration files on other nodes. For example, a web host
might need to know which app hosts it can pass work off to, and the
number of app hosts might shift occasionally. So, great: dynamic
assignment of values in the config based on search.
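
As a rough illustration of the pattern described above, the sketch below builds an environment-plus-role query and reduces the results to a sorted list of host names. The helper name `app_host_names`, the `fqdn` key, the sample hosts and the injected `searcher` are all assumptions made for illustration; in a real recipe the lambda would wrap Chef's own `search` call rather than the in-memory stand-in used here.

```ruby
# Hypothetical helper: returns the sorted host names of nodes matching an
# environment and role. `searcher` is anything that responds to #call with
# (index, query); in a recipe it could be ->(i, q) { search(i, q) }.
def app_host_names(environment, role, searcher)
  query = "chef_environment:#{environment} AND role:#{role}"
  searcher.call(:node, query).map { |n| n['fqdn'] }.compact.sort
end

# Made-up node data standing in for the Chef server, so the sketch runs
# without one.
FAKE_NODES = [
  { 'fqdn' => 'app2.example.com', 'chef_environment' => 'staging', 'role' => 'app' },
  { 'fqdn' => 'app1.example.com', 'chef_environment' => 'staging', 'role' => 'app' },
  { 'fqdn' => 'web1.example.com', 'chef_environment' => 'staging', 'role' => 'web' }
].freeze

# Very small fake of the search call: it only understands the two terms
# used in the query built above.
fake_search = lambda do |_index, query|
  env  = query[/chef_environment:(\S+)/, 1]
  role = query[/role:(\S+)/, 1]
  FAKE_NODES.select { |n| n['chef_environment'] == env && n['role'] == role }
end

puts app_host_names('staging', 'app', fake_search).inspect
```

If the search comes back short, the list rendered into the config is short too, which is the failure mode the rest of this message is about.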

However, I have hit two areas where search really isn't working for
us as intended/expected, and I don't know if this is down to my faulty
expectations, my faulty implementation, or something wrong in Chef.

  1. Scenario 1 - Nodes vanish from search when I update chef via knife.
    I make updates to files in the git repo and then import them to chef
    couch/solr using:
    "knife cookbook upload", "knife environment from file", "knife node
    from file", "knife role from file"
    It appears as though running at least some of these commands
    re-creates the nodes as new objects with no roles (rather than updating
    the existing objects), so that the nodes have no roles until
    chef-client is successfully run on the client host.

  2. Scenario 2 - Nodes vanish from search results intermittently.
    It appears as though intermittently search will not return a node in
    the search results for a role that I know it has. I can manually
    re-populate the roles by running the chef-client executable on the host
    in question.

Both these together mean that at times, a node will not be returned as
the result of a search, meaning the config value used elsewhere is null,
which can cause massive problems for our environments.

Can someone advise if I am doing something wrong in my use of search,
if the above can be explained by any known bugs, or how I can resolve
these issues please. I am unclear on what determines whether a host has
a role - in my mind it should be whether a host has a role assigned in
the config, but sometimes nodes are not returned in the search when they
are assigned in the config, and appear in search only after a
successful client run is performed manually.

Cheers

Dan

Are you using Hosted Chef, or a local installation? Which version?

If you're running your own Chef installation, try editing
/var/lib/chef/solr/conf/solrconfig.xml and changing both maxFieldLength
parameters from 10000 to something much larger (100000 or even higher).
Then restart chef-solr, and do knife index rebuild. That worked for me,
when I was seeing similar behavior (see chef mailing list thread from ~1
week back).
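
For anyone who prefers to script the edit, here is a minimal sketch of the substitution in Ruby. The function name `raise_max_field_length` and the sample XML layout are assumptions for illustration only; check your own solrconfig.xml before relying on the pattern, and try it on a copy first.

```ruby
# Replaces the value of every <maxFieldLength> element in a solrconfig.xml
# string. Kept as a pure function so it can be tried on a copy of the file
# before the real one is touched.
def raise_max_field_length(xml, new_value)
  xml.gsub(%r{<maxFieldLength>\s*\d+\s*</maxFieldLength>},
           "<maxFieldLength>#{new_value}</maxFieldLength>")
end

# Assumed sample layout with two occurrences of the setting; the real file
# contains much more than this.
sample = <<~XML
  <indexDefaults>
    <maxFieldLength>10000</maxFieldLength>
  </indexDefaults>
  <mainIndex>
    <maxFieldLength>10000</maxFieldLength>
  </mainIndex>
XML

puts raise_max_field_length(sample, 100_000)
```

To apply it to the file named above, you would read the file, pass its contents through the function, and write the result back, then restart chef-solr and run knife index rebuild as described.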

--
Ian Marlier | Senior Systems Engineer
Brightcove, Inc.
290 Congress Street, 4th Floor, Boston, MA 02110
imarlier@brightcove.com

Hi Dan,

On Wed, Jun 6, 2012 at 12:34 PM, Dan Adams dan@wesuckatcomputers.com wrote:

  1. Scenario 1 - Nodes vanish from search when I update chef via knife.
    I make updates to files in the git repo and then import them to chef
    couch/solr using:
    "knife cookbook upload", "knife environment from file", "knife node from
    file", "knife role from file"
    It appears as though running at least some of these commands re-creates
    the nodes as new objects with no roles (rather than updates the existing
    objects) so that the nodes have no roles until the chef-client is
    successfully run on the client host.

On this point, are you referring to the "roles" and "recipes" attributes?
These are updated during a chef-client run, in the process of taking
the run_list on the node and expanding it into the full set of recipes
and roles that will be applied.

So it's expected that the contents of those attributes will only
reflect a changed run_list after the next successful chef-client run.

I'm not sure whether it is considered correct for those attributes to
be blanked during "knife node from file" (if that is what's
happening).

Zac

Hi Ian

Thank you, this is Chef server local install, from gems, version 0.10.8

I will try making that adjustment as recommended and see if this fixes
the issue for me

Many thanks

Dan

Hi Zac

Thanks for the response - no, I set a new, manually-set attribute
(nothing from ohai) that contained an array of the nodes returned via
searching on a combination of environment and role. I.e. I search for all
app servers in the staging environment, and get back an array of hashes
containing the node information for 1-4 hosts. The problem is that if I
have, say, 4 app servers that have that role assigned in that
environment, sometimes search returns 4, sometimes 3, and sometimes 2.

I am hoping that the fix suggested by Ian works, since he says it
worked for him. However, I don't really need search as such: I don't
care which nodes are currently reporting as having the role, just which
hosts are assigned that role in their definitions file. Is there a
better way to access this information from recipes than via search?

Cheers

Dan

Hi Dan,

On Wed, Jun 6, 2012 at 4:00 PM, Dan Adams dan@wesuckatcomputers.com wrote:

Thanks for the response - no, I set a new, manually-set attribute (nothing
from ohai) that contained an array of the nodes returned via searching on a
combination of environment and role. I.e. I search for all app servers in the
staging environment, and get back an array of hashes containing the node
information for 1-4 hosts. The problem is that if I have, say, 4 app servers
that have that role assigned in that environment, sometimes search returns 4
and sometimes 3, sometimes 2.

I may have been unclear in my question - sorry if below I repeat
something you already know.
When you are searching, what is your search query? Are you searching
for "roles:yourrole" or something else?

I am hoping that the fix suggested by Ian works, since he says it worked for
him. However, given that I don't really need search as such - I don't care
what nodes are currently reporting as having the role, just what hosts are
assigned that role in their definitions file - is there another better way
that I can access this information from recipes other than via search?

Each node has a run_list, which contains both recipes and roles.
Roles can provide more run_list items - including other roles, which
may add still more. So, when chef-client runs, it looks at the
run_list and expands the items on it as far as it can. This process
produces the "expanded run list", and chef-client puts these expanded
lists into the "roles" and "recipes" attributes.

You can see this by comparing the following (substitute the name of one
of your nodes for NODENAME):

knife node show NODENAME -a run_list
knife node show NODENAME -a roles
knife node show NODENAME -a recipes

Say I have a role called "physical-server". This search will only
find nodes that have role[physical-server] on their run_list, and have
successfully run and saved their state back to the server:

knife search node "roles:physical-server"

On the other hand, this search will find all nodes with that item on
their run_list, whether they have had a successful run or not:

knife search node "run_list:role[physical-server]"
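
The difference between the two searches can be sketched with a toy model in plain Ruby. This is not how Chef is implemented; the role names other than physical-server, the node names, and the helpers `expanded_roles` and `converge!` are made up for illustration. It only shows why a node that has never completed a run is visible through its run_list but not through its expanded roles.

```ruby
# Toy model only: real Chef expands run lists inside chef-client and indexes
# nodes in Solr. Role contents and node data here are invented.
ROLES = {
  'physical-server' => ['role[base]', 'recipe[ipmi]'],
  'base'            => ['recipe[ntp]']
}.freeze

# Recursively expands run_list items into the full list of role names.
def expanded_roles(run_list, seen = [])
  run_list.each do |item|
    name = item[/\Arole\[(.+)\]\z/, 1]
    next if name.nil? || seen.include?(name)

    seen << name
    expanded_roles(ROLES.fetch(name, []), seen)
  end
  seen
end

# In miniature, what a successful chef-client run does to the roles attribute.
def converge!(node)
  node['roles'] = expanded_roles(node['run_list'])
  node
end

nodes = [
  converge!({ 'name' => 'db1', 'run_list' => ['role[physical-server]'], 'roles' => [] }),
  # db2 has the role on its run_list but has never completed a run.
  { 'name' => 'db2', 'run_list' => ['role[physical-server]'], 'roles' => [] }
]

# Analogue of searching on roles: only converged nodes match.
by_roles = nodes.select { |n| n['roles'].include?('physical-server') }
# Analogue of searching on run_list: every node with the item matches.
by_run_list = nodes.select { |n| n['run_list'].include?('role[physical-server]') }

puts by_roles.map { |n| n['name'] }.inspect
puts by_run_list.map { |n| n['name'] }.inspect
```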

I don't use "knife node from file", and can't readily confirm that
that empties data out of the node in the way you describe - if it
does, that's going to cause you problems if you're searching the
contents of a node's "roles". I've tried "knife node edit", and that
appears to leave the "roles" list unchanged.

Zac

Hi Zac

Thanks for your reply, you've been very helpful.

On 06.06.2012 16:30, Zac Stevens wrote:

I may have been unclear in my question - sorry if below I repeat
something you already know.
When you are searching, what is your search query? Are you searching
for "roles:yourrole" or something else?

I have been using something along the lines of this:
search(:node, "chef_environment:#{node['chef_environment']} AND role:xyz")

as given here in the manual:
http://wiki.opscode.com/display/chef/Search#Search-FindNodeswithaRoleintheRunList

in my recipes to return a list of nodes with the given environment and
role. However, I'm guessing that you are telling me that there are two
ways to search for role, one of which supplies the roles assigned to the
node in its config as imported, and the other that is dynamically
updated based on the outcome of the last run? I am very much looking to
return the former via search - a permanent value that is correct
regardless of the outcome of the last client run, and that is never null
or empty.

Say I have a role called "physical-server". This search will only
find nodes that have role[physical-server] on their run_list, and have
successfully run and saved their state back to the server:

knife search node "roles:physical-server"

On the other hand, this search will find all nodes with that item on
their run_list, whether they have had a successful run or not:

knife search node "run_list:role[physical-server]"

Aha! So I should simply change my search from the current form listed
above to the one below? Brilliant!
search(:node, "chef_environment:#{node['chef_environment']} AND run_list:role[xyz]")

It definitely sounds like you and Ian have both solved one part each of
the problems I was experiencing - that I was looking at the dynamic
value when I wanted the permanent one (independent of last run status),
and that the reason the dynamic one was letting me down so often was due
to a bug in the shipped value for maxFieldLength in solr.

Thanks very much indeed, I'm pretty confident that the combination of
Ian's suggested fix, and this config change above will solve the issues
I was seeing!

Cheers

Dan

Do note that the node's environment is accessed with the "node.chef_environment" method, which is not a node attribute. "node['chef_environment']" is an attribute that won't exist unless you create it.
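
A small stand-in class makes the difference concrete. `FakeNode` is hypothetical and only mimics the two access paths described above (a dedicated accessor versus attribute lookup); it is not Chef's node class, and the host name is invented.

```ruby
# Minimal stand-in for a node: the environment lives in a dedicated
# accessor, while [] only looks up attributes.
class FakeNode
  attr_reader :chef_environment

  def initialize(chef_environment, attributes = {})
    @chef_environment = chef_environment
    @attributes = attributes
  end

  # Attribute lookup; returns nil for keys that were never set.
  def [](key)
    @attributes[key]
  end
end

node = FakeNode.new('staging', { 'fqdn' => 'web1.example.com' })

# Uses the accessor, so the environment name ends up in the query.
good_query = "chef_environment:#{node.chef_environment} AND role:xyz"
# Uses an attribute that was never created; nil interpolates as an empty
# string, leaving the environment term blank.
bad_query = "chef_environment:#{node['chef_environment']} AND role:xyz"

puts good_query
puts bad_query
```

In this model the second query silently loses its environment term, which is the kind of mistake the note above is warning about.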

Hey Joshua (or someone else from Opscode), are there downsides to setting
maxFieldLength high as described? I'm having an issue with 10.10 where
a search for an attribute comes up with 1 of the 2 nodes that have it, and
it's causing deployment problems.

What's even stranger, if I search using the GUI on the chef server I
get 1 node and if I search with knife I get the other node. But both
searches only return 1 node.

Anyway, I don't want to make the suggested change until I understand what
effect it will have. I'm not even sure if my problem is the same but it
sounds like it may be.

MG

On Sun, Jun 10, 2012 at 8:56 PM, Joshua Timberman joshua@opscode.com wrote:

Do note that the node's environment is accessed with the
"node.chef_environment" method, which is not a node attribute.
"node['chef_environment']" is an attribute that won't exist unless you
create it.
