Chef bootstrap stops and completes after rendering chunk steps


#1

Chef bootstrap has been working for all our Windows servers so far except for one, where the bootstrap process stops and ends after the rendering chunk steps. There are no errors; it just exits without finishing the bootstrap process. Any ideas?

Bootstrapping Chef on 10.1.12.40
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 1
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 2
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 3
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 4
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 5
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 6
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 7
10.1.12.40 Rendering "C:\Users\BUILDE~1.COR\AppData\Local\Temp\bootstrap-17620-1473358455.bat" chunk 8


#2

What version of Windows is the node running?


#3

Windows Server 2008 Standard SP2


#4

I ran into that at $JOB-2; restarting the Windows Remote Management (WinRM) service helped.

Nathan Clemons


#5

I restarted the winrm service and unfortunately that did not help.

net stop winrm
net start winrm

#6

This is something we have seen on Windows Server 2008 but not on 2008 R2. However, we released knife-windows 1.6.0 earlier this week. It includes many fixes for various edge cases, including this break on 2008 (R1), and may help with your issue.

Do note that knife-windows 1.6.0 uses newer winrm v2 gems that may not be activated in a chef-dk environment. So you really have two options for using it today:

1: Use a chef-dk prerelease from our current channel which bundles knife-windows 1.6.0:

curl https://omnitruck.chef.io/install.sh | sudo bash -s -- -c current -P chefdk -v 0.18.13

or

. { iwr -useb https://omnitruck.chef.io/install.ps1 } | iex; install -channel current -project chefdk -version 0.18.13

2: Use bundle exec with a Gemfile including gem "knife-windows"

If you are not familiar with bundler workflows, I recommend the first option.
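
For anyone who does want to try the second option, here is a minimal sketch of what the Gemfile setup might look like. The directory name and the version constraint are illustrative assumptions, and the `chef` gem is listed on the assumption that `knife` itself comes from it; adjust to match your environment.

```shell
# Work in a scratch directory (hypothetical name) so the Gemfile stays isolated.
mkdir -p knife-windows-bundle && cd knife-windows-bundle

# Minimal Gemfile. The ">= 1.6.0" constraint is an assumption based on the
# release mentioned above; pin it however your team prefers.
cat > Gemfile <<'EOF'
source "https://rubygems.org"

gem "chef"
gem "knife-windows", ">= 1.6.0"
EOF

# With Bundler installed, the remaining steps would be roughly:
#   bundle install
#   bundle exec knife bootstrap windows winrm 10.1.12.40 -x administrator -P your_password
```

Running knife through `bundle exec` makes it load the gem versions resolved from this Gemfile rather than whatever the chef-dk install activates.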

Matt


#7

Hi Matt,

Thanks for the reply and the tip about the chef-dk pre-release. I think for now we’re going to work with the teams using Server 2008 Standard to upgrade their systems to a more supported platform. If that does not work then we’ll revisit the pre-release chef-dk idea.

Question: when will the pre-release code you mention be part of an official release version?

Keith


#8

Should be next week. It won’t be that exact version but likely the last successful build from this Thursday or Friday.


#9

Dumb question: does the latest ChefDK need to be installed on the Chef Linux server or the Windows PCs where we run the bootstrap commands?


#10

You want to install it on the PC where you bootstrap from.


#11

OK, I installed the latest ChefDK on our Server 2012 machine and am still getting the same issue when trying to bootstrap the Server 2008 Standard machine.


#12

If you run:

knife winrm -m 10.1.12.40 "dir %TEMP%" -x administrator -P your_password

Do you see the .bat file listed?

If so, what happens if you run:

knife winrm -m 10.1.12.40 "%TEMP%/that_bat_file.bat" -x administrator -P your_password

If nothing happens at all, then try:

knife winrm -m 10.1.12.40 "$env:TEMP/that_bat_file.bat" -x administrator -P your_password --winrm-shell powershell

I’m curious here if the PSRP protocol used for the powershell shell will get around this.

Note that if you are running this in a PowerShell console, make sure to precede the $env above with a backtick (`).

If that still does not work, try running the .bat file directly RDP’d to the box and let me know what you find. Thanks!


#14

I ran the first command on the Server 2012 machine and see 3 .bat files:

bootstrap-10632-1473384594.bat
bootstrap-132-1473385395.bat
bootstrap-2740-1473366715.bat


#15

I’m asking you to run them from the 2012 server where you are trying to bootstrap from. The three files are probably from three different attempts. I’d use the newest one.


#16

Hi Matt,

Sorry for the delay. I could only run the .bat file on the 2008 server via RDP. Here is the error output:

C:\Users\BUILDE~1.COR\AppData\Local\Temp>(echo.{"run_list":[]}) 1>C:\chef\first-boot.json
Starting chef to bootstrap the node...

C:\Users\BUILDE~1.COR\AppData\Local\Temp>SET     "PATH=C:\oracle\product\client\x86\11.2.0.3\bin;C:\orac
le\product\11.2.0\client_x86_1\bin;c:\oracle\product\11.2.0\client_x64_1\bin;C:\Windows\system32;C:\
Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\Win
dowsPowerShell\v1.0\;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Mi
crosoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;C:\ruby\bin;C
:\opscode\chef\bin;C:\opscode\chef\embedded\bin"

C:\Users\BUILDE~1.COR\AppData\Local\Temp>chef-client -c c:/chef/client.rb -j c:/chef/first-boot.json

Starting Chef Client, version 12.14.89
[2016-09-30T17:37:44-07:00] INFO: *** Chef 12.14.89 ***
[2016-09-30T17:37:44-07:00] INFO: Platform: x64-mingw32
[2016-09-30T17:37:44-07:00] INFO: Chef-client pid: 4920
[2016-09-30T17:37:53-07:00] ERROR: SSL Validation failure connecting to host: server.domain.com - 
SSL_connect returned=1 errno=0 state=error: certificate verify failed

================================================================================
Chef encountered an error attempting to load the node data for "10.1.12.40"
================================================================================

Unexpected Error:

OpenSSL::SSL::SSLError: SSL Error connecting to https://server.domain.com/organizations/wa
g-bellevue/nodes/10.1.12.40 - SSL_connect returned=1 errno=0 state=error: certificate verify failed

Platform:

x64-mingw32


Running handlers:
[2016-09-30T17:37:53-07:00] ERROR: Running exception handlers
Running handlers complete
[2016-09-30T17:37:53-07:00] ERROR: Exception handlers complete
Chef Client failed. 0 resources updated in 08 seconds
[2016-09-30T17:37:53-07:00] FATAL: Stacktrace dumped to c:/chef/cache/chef-stacktrace.out
[2016-09-30T17:37:53-07:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2016-09-30T17:37:53-07:00] FATAL: OpenSSL::SSL::SSLError: SSL Error connecting to https://server.domain.com/organizations/wag-bellevue/nodes/10.1.12.40- SSL_connect returned=1 errno=0
 state=error: certificate verify failed

#17

OK, I fired up a 2008 Standard VM and reproduced this. It’s a regression introduced in winrm v2 and only affects 2008 Standard. 2008 Standard does not work well with a UTF-8 code page, and we need to revert to 437. We will fix this in the next release.


#18

Thanks Matt. Fortunately, most of our servers are Server 2012 and 2008 R2, and we’ve already successfully bootstrapped 150+ nodes. We just have a small number that are running 2008 Standard and will try those again when the fix is out.


#19

Hi Matt, is there an ETA for when the fix for Server 2008 Standard will be released? We have a work item tracking this issue and were curious about a timeline. Thanks!


#20

There is a v1.7.0 coming up (https://github.com/chef/knife-windows/pull/401) that fixes this. For Windows 2008, you just need to add --winrm-codepage 437.
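
As a rough illustration of where the flag would go, the sketch below assembles a bootstrap command with the code page option appended and prints it for review. It assumes knife-windows 1.7.0 or later is installed and that the bootstrap is run with `knife bootstrap windows winrm`; the thread does not show the exact bootstrap command used, so the host, user, and password values are placeholders borrowed from the earlier `knife winrm` examples.

```shell
# Placeholder values from earlier in the thread; substitute your own.
NODE=10.1.12.40
WINRM_USER=administrator
CODEPAGE=437   # code page suggested above for Server 2008 Standard

# Assemble the bootstrap command with the code page flag at the end.
CMD="knife bootstrap windows winrm $NODE -x $WINRM_USER -P your_password --winrm-codepage $CODEPAGE"

# Print the command for review; run it by hand once the values look right.
echo "$CMD"
```

The flag sits alongside the other WinRM options on the bootstrap command line, so no change on the target node should be needed.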


#21

How/where do I add --winrm-codepage 437? Do I add this to the bootstrap process?