Chef client bootstrap problem

Hi All,

I am getting this error when I try to bootstrap a node from my Chef workstation:

[ec2-user@Splunk-Server chef-repo]$ sudo knife bootstrap 10.17.6.105 --ssh-user ec2-user --identity-file ./Krishna.pem -N Node --sudo
Creating new client for Node
Creating new node for Node
Connecting to 10.17.6.105
10.17.6.105 -----> Existing Chef installation detected
10.17.6.105 Starting the first Chef Client run…
10.17.6.105 Starting Chef Client, version 12.16.42
10.17.6.105 [2016-12-06T22:28:45+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 1/5
10.17.6.105 [2016-12-06T22:28:50+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 2/5
10.17.6.105 [2016-12-06T22:28:55+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 3/5
10.17.6.105 [2016-12-06T22:29:00+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 4/5
10.17.6.105 [2016-12-06T22:29:05+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 5/5
10.17.6.105
10.17.6.105 ================================================================================
10.17.6.105 Chef encountered an error attempting to load the node data for "Node"
10.17.6.105 ================================================================================
10.17.6.105
10.17.6.105 Networking Error:
10.17.6.105 -----------------
10.17.6.105 Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node - Failed to open TCP connection to ip-10-17-6-172.ops-dev.us-east-1.company.com:443 (getaddrinfo: Name or service not known)
10.17.6.105
10.17.6.105 Your chef_server_url may be misconfigured, or the network could be down.
10.17.6.105
10.17.6.105 Relevant Config Settings:
10.17.6.105 -------------------------
10.17.6.105 chef_server_url "https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna"
10.17.6.105
10.17.6.105 Platform:
10.17.6.105 ---------
10.17.6.105 x86_64-linux
10.17.6.105
10.17.6.105
10.17.6.105 Running handlers:
10.17.6.105 [2016-12-06T22:29:10+00:00] ERROR: Running exception handlers
10.17.6.105 Running handlers complete
10.17.6.105 [2016-12-06T22:29:10+00:00] ERROR: Exception handlers complete
10.17.6.105 Chef Client failed. 0 resources updated in 26 seconds
10.17.6.105 [2016-12-06T22:29:10+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
10.17.6.105 [2016-12-06T22:29:10+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
10.17.6.105 [2016-12-06T22:29:10+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node - Failed to open TCP connection to ip-10-17-6-172.ops-dev.us-east-1.company.com:443 (getaddrinfo: Name or service not known)
10.17.6.105 [2016-12-06T22:29:10+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

I have verified that I can successfully connect to port 443 of Chef server.

[ec2-user@Splunk-Server chef-repo]$ telnet ip-10-17-6-172.ops-dev.us-east-1.company.com 443
Trying 10.17.6.172...
Connected to ip-10-17-6-172.ops-dev.us-east-1.company.com.
Escape character is '^]'.
^C^C^CConnection closed by foreign host.
[ec2-user@Splunk-Server chef-repo]$

You have a space between https:// and ip-10-17-6... in your config’s chef_server_url.

Hi Thomay,

:slight_smile:

It's not a typo! This forum doesn’t allow me to paste more than 2 links in a post. So I edited https://ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node.

Otherwise the forum won’t allow me to post it!

Any help would be appreciated!

It may be because you are using SSL with knife. knife seems to have a knack for being strict about SSL. Did you try these?
knife ssl check
knife ssl fetch

And if your Chef server allows HTTP connections, try editing 'chef_server_url' in knife.rb to use 'http' instead of 'https' (note this is just for testing/debugging). Given the server URL, I would bet that the SSL cert is a self-signed one.

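This error is occurring on the host 10.17.6.105 (every line of the failing output is prefixed with that address).

The telnet test, on the other hand, was run on a different system, whatever "Splunk-Server" is (see the shell prompt).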

So the networking problem is that the host 10.17.6.105 doesn't seem to have working DNS. You need to SSH into that box to repro the issue.
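To confirm this, the lookup has to be tested on the node itself (10.17.6.105 in this thread), not on the workstation. Below is a minimal sketch of such a check. The function name `check_resolves` is made up for illustration, and it assumes a Linux node where glibc's `getent` is available.

```shell
#!/bin/sh
# check_resolves NAME
# Reports whether NAME resolves through the system's name service
# configuration (nsswitch.conf: typically /etc/hosts, then DNS).
check_resolves() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "resolves"
    else
        echo "does not resolve"
    fi
}

# When a hostname is passed on the command line, check it.
if [ $# -gt 0 ]; then
    echo "$1: $(check_resolves "$1")"
fi
```

Saved under a hypothetical name such as check_dns.sh and run on the node with the Chef server's hostname as its argument, it should report "does not resolve" as long as the node has the problem shown in the bootstrap output.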


The SSL Check is successful.

[ec2-user@Splunk-Server chef-repo]$ knife ssl check
Connecting to host ip-10-17-6-172.ops-dev.us-east-1.company.com:443
Successfully verified certificates from `ip-10-17-6-172.ops-dev.us-east-1.company.com

[ec2-user@Splunk-Server chef-repo]$ knife ssl fetch
WARNING: Certificates from ip-10-17-6-172.ops-dev.us-east-1.company.com will be fetched and placed in your trusted_cert
directory (/home/ec2-user/chef-repo/.chef/trusted_certs).

Knife has no means to verify these are the correct certificates. You should
verify the authenticity of these certificates after downloading.

Adding certificate for ip-10-17-6-172.ops-dev.us-east-1.company.com in /home/ec2-user/chef-repo/.chef/trusted_certs/ip-10-17-6-172_ops-dev_us-east-1_company_com.crt

[ec2-user@Splunk-Server chef-repo]$ sudo knife bootstrap 10.17.6.105 --ssh-user ec2-user --identity-file ./Krishna.pem -N Node --sudo
Node Node exists, overwrite it? (Y/N) Y
Client Node exists, overwrite it? (Y/N) Y
Creating new client for Node
Creating new node for Node
Connecting to 10.17.6.105
10.17.6.105 -----> Existing Chef installation detected
10.17.6.105 Starting the first Chef Client run...
10.17.6.105 Starting Chef Client, version 12.16.42
10.17.6.105 [2016-12-07T20:31:34+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 1/5
10.17.6.105 [2016-12-07T20:31:39+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 2/5
10.17.6.105 [2016-12-07T20:31:44+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 3/5
10.17.6.105 [2016-12-07T20:31:49+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 4/5
10.17.6.105 [2016-12-07T20:31:54+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node, retry 5/5
10.17.6.105
10.17.6.105 ================================================================================
10.17.6.105 Chef encountered an error attempting to load the node data for "Node"
10.17.6.105 ================================================================================
10.17.6.105
10.17.6.105 Networking Error:
10.17.6.105 -----------------
10.17.6.105 Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node - Failed to open TCP connection to ip-10-17-6-172.ops-dev.us-east-1.company.com:443 (getaddrinfo: Name or service not known)
10.17.6.105
10.17.6.105 Your chef_server_url may be misconfigured, or the network could be down.
10.17.6.105
10.17.6.105 Relevant Config Settings:
10.17.6.105 -------------------------
10.17.6.105 chef_server_url "https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna"
10.17.6.105
10.17.6.105 Platform:
10.17.6.105 ---------
10.17.6.105 x86_64-linux
10.17.6.105
10.17.6.105
10.17.6.105 Running handlers:
10.17.6.105 [2016-12-07T20:31:59+00:00] ERROR: Running exception handlers
10.17.6.105 Running handlers complete
10.17.6.105 [2016-12-07T20:31:59+00:00] ERROR: Exception handlers complete
10.17.6.105 Chef Client failed. 0 resources updated in 26 seconds
10.17.6.105 [2016-12-07T20:31:59+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
10.17.6.105 [2016-12-07T20:31:59+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
10.17.6.105 [2016-12-07T20:31:59+00:00] ERROR: Error connecting to https:// ip-10-17-6-172.ops-dev.us-east-1.company.com/organizations/krishna/nodes/Node - Failed to open TCP connection to ip-10-17-6-172.ops-dev.us-east-1.company.com:443 (getaddrinfo: Name or service not known)
10.17.6.105 [2016-12-07T20:31:59+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
[ec2-user@Splunk-Server chef-repo]$

Yes!!

That fixed the problem. The client was not able to resolve the Chef server's hostname. I added a hosts entry on the client and it worked like a charm!!!

@krishnar will you please explain in detail how you added the hosts entry on the client?
I have encountered the same TCP connection failure.

Not sure about your issue, but this seems to have been fixed by adding the Chef server’s hostname and IP address to the failing node’s /etc/hosts file (if it’s Linux). For Windows, the file is at C:\Windows\System32\drivers\etc\hosts. The node must be able to resolve the Chef server’s hostname.

Can you please give me a detailed account of how you solved this problem, as I am facing a similar issue?

Please log in to the node you are bootstrapping and add the Chef server's IP and hostname to /etc/hosts. Save the file, check whether the node can ping the Chef server, and then try bootstrapping again.

To fix this specific issue (where the client can’t resolve the Chef server), SSH into the box that is failing to bootstrap and edit its /etc/hosts file, adding an entry for your Chef server’s hostname. Each entry lists the IP address first, followed by the hostname. For example, if my Chef server had an IP of 10.9.8.7 and its hostname was chef-server.example.com, I would add the following entry to my hosts file:

10.9.8.7 chef-server.example.com

Then save and close the file.
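If you would rather script the edit than open an editor, something like the sketch below could work. It is only an illustration: `add_host_entry` is a made-up helper, the IP and hostname are the example values from this post, and writing to the real /etc/hosts needs root. Note that the hosts file format puts the IP address first and the hostname after it.

```shell
#!/bin/sh
# add_host_entry IP NAME [FILE]
# Appends "IP NAME" to FILE (default: /etc/hosts) unless a non-comment
# line in FILE already lists NAME as a hostname or alias.
add_host_entry() {
    ip="$1"
    name="$2"
    file="${3:-/etc/hosts}"
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file" 2>/dev/null; then
        echo "entry for $name already present in $file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$file"
        echo "added $ip $name to $file"
    fi
}

# Example (as root, against the real hosts file):
#   add_host_entry 10.9.8.7 chef-server.example.com
```

Running it twice with the same arguments leaves a single entry, so it should be safe to repeat before each bootstrap attempt.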

I tried doing what you said but unfortunately the same error still occurs.

I am also receiving the same error. I added the hosts entry on the client machine but still get the same error. Can you please help?

Check whether the target host has wget installed, and also check /etc/resolv.conf for correct entries (compare with a working node).
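One rough way to do that comparison is to print just the nameserver addresses on each node and compare the two lists by eye. This is a sketch with a made-up helper name, `list_nameservers`. It only reads the file and does not test whether the listed servers actually answer.

```shell
#!/bin/sh
# list_nameservers [FILE]
# Prints the address from every "nameserver" line in FILE
# (default: /etc/resolv.conf), one per line.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example: run on the failing node and on a working node, then compare.
#   list_nameservers
#   command -v wget || echo "wget is not installed"
```

If the failing node lists different nameservers than the working one, or none at all, that would point at the resolver configuration rather than at Chef.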