When deploying opsman to vSphere, the VM fails to boot 15% of the time.
The failure happens very early in the boot process, apparently even
before the kernel is loaded. On the opsman VM's console, the symptom is
a flashing cursor in the upper-left corner of the screen.
This commit fixes that failure by waiting 80 seconds for the opsman VM
to report its IP address to vCenter, and if it hasn't reported its IP
address by then, it sends a hardware reset to the VM. An opsman VM
typically reports its IP address to vCenter 43 seconds after being
powered on.
We verified this fix by successfully deploying & booting opsman 146
times in a row.
More about the boot failure:
- The boot failure only occurs the very first time an opsman is booted;
subsequent boots will always succeed. We tested 100 shutdown/boots to
confirm.
- The failure was seen both on vSphere 7 and vSphere 8.
- Sending a reset or a Ctrl-Alt-Del to the machine within the first few
seconds of being powered on reduced but did not eliminate the failure.
This fix should have negligible impact on the length of time to deploy
opsman.
Typical output when resetting a failed initial boot:
```
Executing: "govc vm.info -vm.ipath=/dc/vm/pcf_vms/om.tas.nono.io -waitip"
This could take a few moments...
VM hasn't acquired IP, is probably stuck, resetting VM to free it
Executing: "govc vm.power -vm.ipath=/dc/vm/pcf_vms/om.tas.nono.io -reset"
This could take a few moments...
govc[stdout]: Reset VirtualMachine:vm-42616... OK
```