New Chef nodes not showing up in chef-server webui nodes list

Hi all,

I have set up a new Chef environment and am running into some problems. The first is that when I add new nodes to chef-server, they do not show up in the nodes list in the chef-server web UI.

OS: RHEL7.2 (Server and clients)
Client Node software versions:

  • chef-12.15.19-1.el7.x86_64

Server software versions:

  • chef-server-core-12.9.1-1.el7.x86_64
  • chef-manage-2.4.3-1.el7.x86_64
  • chefdk-0.18.26-1.el7.x86_64

This was a fresh installation of chef-server and chef-manage. When I first installed it and added a node (node1), it showed up in the nodes page in chef-manage:
https://chefserver/organizations/myorganisation/nodes/

However, when I added more nodes (node2), they did not show up in the list. If I access a node's page directly, as follows, I can see the node:
https://chefserver/organizations/myorganisation/nodes/node2

At some point after discovering this problem, I had to remove and re-add node1. It too now suffers the same problem and is not visible in the list.

When I bootstrapped the nodes, I had to do it manually, as politics dictate that we may not SSH into a server as a user with root privileges. I did this by installing chef-client and populating /etc/chef with the validation cert, the trusted certs and a client.rb.
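
For readers following the same manual bootstrap, here is a minimal sketch of what populating /etc/chef with a client.rb might look like. The values (server URL, organisation name, node name, validator file name) are assumptions for illustration and are not taken from the poster's real configuration; only the setting names are standard client.rb settings.

```shell
#!/bin/sh
# Sketch: write a minimal client.rb for a manually bootstrapped node.
# Every value passed in is a hypothetical example, not the poster's config.
write_client_rb() {
  conf_dir=$1      # e.g. /etc/chef
  server_url=$2    # e.g. https://chefserver/organizations/myorganisation
  org=$3           # organisation short name, e.g. myorganisation
  node=$4          # node name to register, e.g. node2

  # The trusted certs directory holds the server's SSL certificate(s).
  mkdir -p "$conf_dir/trusted_certs" || return 1

  cat > "$conf_dir/client.rb" <<EOF
chef_server_url        "$server_url"
node_name              "$node"
validation_client_name "${org}-validator"
validation_key         "$conf_dir/${org}-validator.pem"
trusted_certs_dir      "$conf_dir/trusted_certs"
EOF
}
```

Calling write_client_rb /etc/chef https://chefserver/organizations/myorganisation myorganisation node2 would write the file; the validator key and the server certificate still have to be copied into place by hand before the first chef-client run.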

When I first run the chef-client, it connects to my chef-server successfully and it is clear that the node is added to the server for the following reasons:

  • You can see the node when you access the URL directly as stated above
  • You can see all added nodes with knife node list
  • You can see individual node info with knife node show node1/node2
  • When accessing the node from the webui, I am able to configure the node (Edit run_list etc)

Any ideas? I have asked this on Stack Overflow and in IRC but got nothing so far. I’m kinda stumped.

Thanks,

Tom…

Do they show up if you do knife search "*:*"? Chef Manage uses the search API to list most objects, so if they show up in search, they should show up in Manage.

If the nodes are missing with knife search, it’s probably a problem with Solr indexing. If the nodes are there with knife search, it could be a bug in Chef Manage, or perhaps it could be something permissions related.

Does running chef-server-ctl tail or chef-manage-ctl tail give you any clues?

Not sure how to solve your problem but hopefully this can get you closer to correctly diagnosing it.
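
The check described above can be sketched as a small shell helper that compares how many nodes the server stores with how many the search index returns. The function names are made up for this example, and it assumes knife prints a summary line of the form "N items found", as in the output quoted later in this thread.

```shell
#!/bin/sh
# Sketch: decide whether missing nodes point at indexing or at Chef Manage.
# search_count and diagnose_index are hypothetical helper names.

# Extract N from the "N items found" summary line of knife search output.
search_count() {
  sed -n 's/^\([0-9][0-9]*\) items found.*/\1/p' | head -n 1
}

# Compare the stored node count with the indexed node count.
diagnose_index() {
  stored=$1
  indexed=$2
  if [ "$indexed" -lt "$stored" ]; then
    echo "index is missing $((stored - indexed)) node(s): suspect Solr indexing"
  else
    echo "search sees all stored nodes: look at Chef Manage or permissions"
  fi
}
```

A possible way to use it: stored=$(knife node list | grep -c .), then indexed=$(knife search node "*:*" 2>&1 | search_count), then diagnose_index "$stored" "$indexed". The 2>&1 is there because the summary line may be written to stderr rather than stdout, depending on the knife version.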

Nope, they do not show up when I run the knife search command:

[chef-repo]$ knife search "*:*"
0 items found

So this suggests it’s a problem with the Solr indexing.

Upon someone else’s suggestion, I did try chef-server-ctl tail but I did not see anything helpful. I am assuming that the problem is with indexing the node when it’s added (As I was previously able to see the initial node that was created). The logs when adding a node show the following: (Shortened for brevity, but I can add the full thing if needed)

==> /var/log/opscode/nginx/access.log <==
"POST /organizations/myorganisation/clients HTTP/1.1" 201 - Client validation?
"GET /organizations/myorganisation/nodes/node1 HTTP/1.1" 404 - It’s not in the server yet so 404 makes sense to me

The below would make sense to me as the node is not there yet.
==> /var/log/opscode/opscode-erchef/crash.log <==
=ERROR REPORT====
{<<"method=GET; path=/organizations/myorganisation/nodes/node1; status=404; ">>,"Not Found"}
==> /var/log/opscode/opscode-erchef/erchef.log <==
[error] {<<"method=GET; path=/organizations/myorganisation/nodes/node1; status=404; ">>,"Not Found"}
==> /var/log/opscode/opscode-erchef/current <==
[error] {<<"method=GET; path=/organizations/myorganisation/nodes/node1; status=404; ">>,"Not Found"}

Trying to add the node?
==> /var/log/opscode/nginx/access.log <==
"POST /organizations/myorganisation/nodes HTTP/1.1" 201 node1 was in the post data
==> /var/log/opscode/nginx/internal-chef.access.log <==
"GET /organizations/myorganisation/principals/node1 HTTP/1.1" 200 "0.012" 591 "-" "opscode-reporting-server reporting pubkey"
"GET /organizations/myorganisation/nodes/node1/_identifiers HTTP/1.1" 200 "0.009" 131 "-" "opscode-reporting-server reporting pubkey"
==> /var/log/opscode/nginx/access.log <==
"POST /organizations/myorganisation/reports/nodes/node1/runs HTTP/1.1" 201
"POST /organizations/myorganisation/environments/_default/cookbook_versions HTTP/1.1" 200
"PUT /organizations/myorganisation/nodes/node1 HTTP/1.1" 200
"POST /organizations/myorganisation/reports/nodes/node1/runs/af33770b-938d-46db-a525-bfd360203b1e HTTP/1.1" 200

==> /var/log/opscode/oc_bifrost/requests.log.1 <==
A whole bunch of GETs PUTs and POSTs for actors, objects and containers. All with status=20x

Looked for the node, didn’t find the node, added the node, looked for the node, found the node…
==> /var/log/opscode/opscode-erchef/requests.log.1 <==
erchef@127.0.0.1 method=POST; path=/organizations/myorganisation/clients; status=201
erchef@127.0.0.1 method=GET; path=/organizations/myorganisation/nodes/node1; status=404
erchef@127.0.0.1 method=POST; path=/organizations/myorganisation/nodes; status=201
erchef@127.0.0.1 method=GET; path=/organizations/myorganisation/principals/node1
erchef@127.0.0.1 method=GET; path=/organizations/myorganisation/nodes/node1/_identifiers
erchef@127.0.0.1 method=POST; path=/organizations/myorganisation/environments/_default/cookbook_versions; status=200
erchef@127.0.0.1 method=PUT; path=/organizations/myorganisation/nodes/node1; status=200

==> /var/log/opscode/opscode-reporting/reporting.log.1 <==
oc_reporting@127.0.0.1 method=POST; path=/organizations/myorganisation/reports/nodes/node1/runs; status=201;
oc_reporting@127.0.0.1 method=POST; path=/organizations/myorganisation/reports/nodes/node1/runs/af33770b-938c-46db-a525-bfd360203b1e; status=200;

So one thing of note is that I’m not seeing anything here from Solr, but the service is up…
[chef-repo]$ sudo chef-server-ctl status
run: bookshelf: (pid 11006) 72303s; run: log: (pid 5561) 1186370s
run: nginx: (pid 11049) 72302s; run: log: (pid 5772) 1186366s
run: oc_bifrost: (pid 11055) 72302s; run: log: (pid 5366) 1186375s
run: oc_id: (pid 11090) 72301s; run: log: (pid 5383) 1186374s
run: opscode-erchef: (pid 11123) 72300s; run: log: (pid 5629) 1186369s
run: opscode-expander: (pid 11156) 72299s; run: log: (pid 5502) 1186371s
run: opscode-reporting: (pid 11161) 72299s; run: log: (pid 20573) 950445s
run: opscode-solr4: (pid 11255) 72297s; run: log: (pid 5460) 1186372s
run: postgresql: (pid 11304) 72296s; run: log: (pid 5288) 1186376s
run: rabbitmq: (pid 11321) 72296s; run: log: (pid 5177) 1186378s
run: redis_lb: (pid 11367) 72295s; run: log: (pid 5768) 1186366s

So where do I look from here?
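
With this many services, a quick filter over the status output makes it harder to miss one that is not running. The sketch below assumes the line format shown in the status output above ("run: name: ..." for a running service) and uses a made-up function name. Note that it only reports process state: as this thread shows, every service can be "run:" while indexing still does not happen.

```shell
#!/bin/sh
# Sketch: print the name of every service whose state is not "run:".
# down_services is a hypothetical helper; it reads status output on stdin.
down_services() {
  awk '$1 != "run:" && $2 ~ /:$/ { sub(/:$/, "", $2); print $2 }'
}
```

For example, sudo chef-server-ctl status | down_services prints nothing when all services report "run:".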

Thanks,

Tom…

I wrote a long reply but Akismet blocked it. I hope someone can push it through.

Without writing it out again (I didn’t copy it first), I will summarise…

knife search returned no results. Nothing looked out of the ordinary in the logs, but I did see that there were no logs from Solr. Solr is running.

Any thoughts?

Thanks,

Tom…

So I created a new environment at home in virtual machines on CentOS 7. It all works as expected. I also saw Solr log entries showing the node being added and indexed.

Obviously this is not happening on my server at work. How best should I troubleshoot why Solr is apparently not getting called?
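
One way to compare the working and the broken server is to check which services in the indexing path logged anything for a given request. The sketch below is only an illustration: the function name is made up, the log layout (/var/log/opscode/SERVICE/current) is assumed from the paths quoted earlier in this thread, and whether the pattern you search for actually appears in the Solr log depends on what that service logs.

```shell
#!/bin/sh
# Sketch: report which indexing-related logs mention a pattern.
# logs_mentioning is a hypothetical helper; the log root defaults to the
# directory seen in the tail output above.
logs_mentioning() {
  pattern=$1
  log_root=${2:-/var/log/opscode}
  for svc in opscode-erchef opscode-expander opscode-solr4; do
    f="$log_root/$svc/current"
    if [ -f "$f" ] && grep -q -F -- "$pattern" "$f"; then
      echo "$svc: mentions $pattern"
    else
      echo "$svc: no mention of $pattern"
    fi
  done
}
```

Running logs_mentioning node1 on both servers right after adding a node shows where the chain stops: if erchef mentions the node but the expander and Solr logs stay silent, the request is not reaching the indexer.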

You could try doing a reindex: https://docs.chef.io/ctl_chef_server.html#reindex

I did that early on in my troubleshooting steps. It made no difference :confused:

Well, I have resolved the issue. I don’t know what the root cause of the problem was though. Here are some facts for anyone reading this in the future.

When I initially installed, I was panicked by the number of packages that were automatically installed by chef-server-ctl, so I deleted it to have a rethink (Politics).

Then I decided to go ahead and re-install. This is the install that I was having issues with.

After getting nowhere with this, I decided to delete and re-install. It was at this point that I realised my delete may not have been thorough. I deleted again but this time did a more thorough job, looking through the file system for any remnants.
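
For anyone repeating this cleanup, a small check for leftovers before re-installing might look like the sketch below. The directory list is an assumption based on where a Chef Server 12 package install usually keeps its files; the original post does not say which remnants were actually found, and the function name is made up.

```shell
#!/bin/sh
# Sketch: list Chef Server directories that still exist after an uninstall.
# leftover_paths is a hypothetical helper. The optional argument is a root
# prefix (empty means the real filesystem), which also makes it easy to test.
leftover_paths() {
  root=${1:-}
  for p in /opt/opscode /var/opt/opscode /var/log/opscode /etc/opscode; do
    if [ -e "$root$p" ]; then
      echo "$root$p"
    fi
  done
}
```

If leftover_paths prints anything after the package has been removed, those directories are candidates for manual cleanup. chef-server-ctl also has cleanse and uninstall subcommands that are meant to be run before removing the package.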

Upon the latest re-install, everything appears to be working correctly. New nodes are now added to Solr. Some other problems (the server would randomly start issuing 502 errors) have gone away.

Think I am good for now. Thanks for your assistance!

Tom…
“Have you tried turning it off and on again?” - sigh :slight_smile: