Kitchen times out waiting "to fetch windows admin password" on our EC2 Win 2022 AMI

Hi All,

I'm trying to diagnose a problem I'm having with kitchen using Win Server 2022 in EC2. My organization makes its own golden AMIs, and we've just started rolling our Win Server 2022 golden AMI using a process similar to the one we use for our Win Server 2016 and Win Server 2019 ones.

The problem comes when I try to use kitchen to test one of our cookbooks with our 2022 AMI. Kitchen is able to create an instance and wait for it to become ready, but then I get a bunch of the following messages:

Waited 875/3500s for instance <i-084acc55312e3b0e6> to fetch windows admin password.
Waited 880/3500s for instance <i-084acc55312e3b0e6> to fetch windows admin password.
Waited 885/3500s for instance <i-084acc55312e3b0e6> to fetch windows admin password.
...

Normally (with our 2016 or 2019 AMIs, or even with the basic 2022 AMI that AWS provides in their catalog), we see that line once and then Chef starts converging. It only happens with the 2022 one that we rolled starting from the AWS one. (Also note: I haven't tested any of our other cookbooks with our 2022 AMI. I'm just diagnosing this one cookbook.)

I'm not sure where to even start looking for a solution. Where is kitchen trying to obtain this password from (is it stored with the instance details in AWS? Is it stored on my workstation and AWS didn't set it on the instance?). Is it assured that this message means that kitchen can't even find what should be the password, or could this also mean that it's trying the password, but it's just not getting the instance to respond with it? Any other ideas?

I had the same problem, and I found a solution. I'm away right now, but when I get back home I will write it out in detail. It's a bit too complex to type on my phone.

Windows Ready Notification
Tuesday, September 20, 2022
7:32 PM
When running Test Kitchen against your custom AMI in AWS, you can see in your logs that Test Kitchen appears to be waiting for the Windows operating system to finish booting. However, it never seems to finish, and the process eventually times out.

If you turn off Kitchen's automatic destroy, note the ID of the instance that TK is waiting on, and connect to that system over RDP or through the AWS console, you will likely find that it has been ready and waiting the whole time.

What is happening is that in your custom AMI, as in mine, Windows is supposed to run a command that tells Amazon it is "Ready", and that command never runs.
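To see what EC2 has actually received from the instance, something like the following can be run from a workstation. This is only a sketch: it assumes the AWS CLI is installed and configured, and that the "fetch windows admin password" wait corresponds to EC2's password data for the instance, which is my understanding rather than something confirmed in this thread. The instance ID is the one from the log lines in the question.

```powershell
# Instance ID copied from the kitchen log in the question; substitute your own.
$InstanceId = 'i-084acc55312e3b0e6'

# Ask EC2 for the encrypted administrator password it has on record.
# An empty value suggests the instance has not reported one yet.
$passwordData = aws ec2 get-password-data --instance-id $InstanceId --query 'PasswordData' --output text

if ([string]::IsNullOrWhiteSpace($passwordData) -or $passwordData -eq 'None') {
    Write-Host "No password data available yet for $InstanceId"
} else {
    Write-Host "Password data is available for $InstanceId"
}

# Dump the instance console log to look for launch-agent messages.
aws ec2 get-console-output --instance-id $InstanceId --output text
```

If the password data stays empty while the machine is reachable over RDP, that points at the launch agent on the image rather than at Test Kitchen.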

Here is what I did to resolve the issue.

  1. Build your Windows machine you wish to use for your custom GOLD image AMI.
  2. Login and navigate to this file at this path
    "C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\SendWindowsIsReady.ps1"
  3. Make a note of the path and file name. You will need to create a scheduled task so that, once the machine is up, it runs that PowerShell script to tell Amazon it is ready for work.
  4. Attached is a screenshot of the scheduled task. It should be set to run at system startup.
  5. I tried to paste the XML in, but it lost all formatting, so I have attached a screenshot so you can see how to type out the XML in the format you need.
  6. Once you have created the scheduled task, you can log out.
  7. Now you can create your golden image. From now on, when the system starts up and completes the service startup cycle, it will run that script and tell Amazon it is ready for work, and your Test Kitchen run should execute fine.
    XML picture

screen shot
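Since the screenshots did not survive, here is a rough PowerShell equivalent of the scheduled task described in the steps above. It is a sketch, not a copy of the original XML: the task name and the choice to run it as SYSTEM are my assumptions, and the script path is the one from step 2. Run it in an elevated session on the machine you are about to image.

```powershell
# Path from step 2 (this folder only exists on images that use these launch scripts).
$script = 'C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\SendWindowsIsReady.ps1'

# Stop early if the script is missing, so the task is not created pointing at nothing.
if (-not (Test-Path -LiteralPath $script)) {
    throw "Not found: $script"
}

# Run the script with powershell.exe every time the system starts.
$action    = New-ScheduledTaskAction -Execute 'powershell.exe' `
                 -Argument "-NoProfile -ExecutionPolicy Bypass -File `"$script`""
$trigger   = New-ScheduledTaskTrigger -AtStartup

# Assumption: run as the local SYSTEM account with highest privileges.
$principal = New-ScheduledTaskPrincipal -UserId 'SYSTEM' -LogonType ServiceAccount -RunLevel Highest

# 'SendWindowsIsReady' is a hypothetical task name; pick whatever suits your image.
Register-ScheduledTask -TaskName 'SendWindowsIsReady' -Action $action `
    -Trigger $trigger -Principal $principal -Force
```

After registering it, `Get-ScheduledTask -TaskName 'SendWindowsIsReady'` should show the task before you capture the image.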

Secondly, regarding your password: in your golden image, set the admin password to something you can always remember. Then, in your kitchen YAML file, you can enter the password you want Test Kitchen to use, like this:
transport:
  name: 'winrm'
  username: Administrator
  password: "your-password-here"

Thank you!!!!

Googling for that command, eventually, led me to (what appears to be) a solution. When you google for this script, you will usually find pages which mention three scripts that are suggested to run at the end of your AMI building:

C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\SendWindowsIsReady.ps1 -Schedule
C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\InitializeInstance.ps1 -Schedule
C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\SysprepInstance.ps1 -NoShutdown

Interestingly, our Packer instructions had the last two, but not the first, so I added that. But that didn't solve the problem, so I connected to the instance to find that that folder didn't even exist. Checking the build logs also revealed that execution of those three scripts was failing (due to file not found). So, the AMI wasn't even being sysprepped.

Investigating that, I eventually learned that the scripts above are part of EC2Launch v1, a package that AWS uses to trigger user scripts and communicate with AWS when instances get started/stopped/restarted. Starting with Windows 2022, AWS uses EC2Launch v2, which is configured through a Windows service and a CLI tool instead of those PowerShell scripts. (As an aside, AWS has been making EC2Launch v2 available on earlier versions of Windows, if you search for AMIs labeled "EC2LaunchV2-Windows_Server-*".)
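A quick way to tell which of the two an image carries is to check for the paths mentioned in this thread. This is a minimal sketch that only tests for those two locations; other install layouts are not covered.

```powershell
# EC2Launch v1: the PowerShell scripts folder referenced earlier in this thread.
$v1Scripts = 'C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts'

# EC2Launch v2: the CLI executable, using the same path as the sysprep command in this post.
$v2Exe = 'C:\Progra~1\Amazon\EC2Launch\ec2launch.exe'

if (Test-Path -LiteralPath $v2Exe) {
    Write-Host 'EC2Launch v2 CLI found: use ec2launch.exe rather than the v1 scripts.'
} elseif (Test-Path -LiteralPath $v1Scripts) {
    Write-Host 'EC2Launch v1 scripts found: the *.ps1 invocations should work.'
} else {
    Write-Host 'Neither path exists: check which launch agent (if any) is installed.'
}
```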

Eventually, what fixed the problem was replacing the powershell script invocation in the Packer instructions with

C:\Progra~1\Amazon\EC2Launch\ec2launch.exe sysprep

as described in this post on HashiCorp's forum: Creating a Windows 2019 AMI using Packer being driven with CodeBuild. And now, our AMIs are able to converge with Test Kitchen.
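Because our original failure was a "file not found" that only showed up in the build logs, it may be worth making the provisioning step fail loudly. A sketch of how that could look, assuming the step runs as a PowerShell script and that the command returns control to the script afterwards:

```powershell
# Treat PowerShell errors as terminating so the build step fails instead of continuing.
$ErrorActionPreference = 'Stop'

# Same path as the command above.
$ec2launch = 'C:\Progra~1\Amazon\EC2Launch\ec2launch.exe'

# Fail the build if the executable is missing, rather than producing an un-sysprepped AMI.
if (-not (Test-Path -LiteralPath $ec2launch)) {
    throw "EC2Launch v2 not found at $ec2launch"
}

& $ec2launch sysprep

# Native executables do not raise PowerShell errors, so check the exit code explicitly.
if ($LASTEXITCODE -ne 0) {
    throw "ec2launch sysprep exited with code $LASTEXITCODE"
}
```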

I hope this helps some folks in the future, as I expect this change in EC2Launch with Win2022 is going to bite more people.