
OPERATIONAL DEFECT DATABASE
LCM upgrade failed with error "VxRail Platform Service on the host xxxxxxxx is not installed or ready"

lcm-web.log:
2023-06-23 00:54:03,590 INFO [LCM] [lcm-node-0] c.v.c.c.v.PlatformServiceDOClient [PlatformServiceDOClient.java:224] Get DO platform service status for {NODE Service Tag}, try count is 60
2023-06-23 00:54:03,590 INFO [LCM] [lcm-node-0] c.d.v.l.d.p.r.d.HostDO [HostDO.java:240] Get platform status from configured host by sn via HostDO service
2023-06-23 00:54:04,250 INFO [LCM] [lcm-core-0] c.d.v.l.d.p.r.d.VxrailSystemDO [VxrailSystemDO.java:106] operation status info cache hit LcmNodeUpgrade-4d3137cb-3e9d-4b42-9a4e-b442ff3bde8b
2023-06-23 00:54:04,358 INFO [LCM] [lcm-node-0] c.d.v.l.d.p.r.d.HostDO [HostDO.java:251] Get platform status false

Unlike the KB article Dell VxRail: Upgrade failing on ESXi host with error "VxRail Platform Service on host is not installed/ready", in this case the command below returns 'true' for ServiceCacheReady.

curl --capath /var/lib/vmware-marvin/trust/lin -u root https://<esxi-hostname>:9090/rest/ps/private/v1/status

{"ServiceCacheReady": true, "BMCConnected": true}

The VxVerify output shows neither an iDRAC alert nor a platform service alert for this node.
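The check on the status response can be sketched as a small shell helper. This is only an illustration: the function name platform_status_ready is hypothetical, and it matches the two fields with grep rather than parsing JSON properly. It assumes the response body has the same shape as the one quoted above.

```shell
# Hypothetical helper: succeeds only if both flags in the status JSON are true.
# Reads the JSON body on stdin. It uses grep, so it tolerates flexible spacing
# around the colon, but it is not a general JSON parser.
platform_status_ready() {
    body=$(cat)
    printf '%s' "$body" | grep -Eq '"ServiceCacheReady"[[:space:]]*:[[:space:]]*true' &&
    printf '%s' "$body" | grep -Eq '"BMCConnected"[[:space:]]*:[[:space:]]*true'
}

# Example using the response quoted in this article.
if printf '%s' '{"ServiceCacheReady": true, "BMCConnected": true}' | platform_status_ready; then
    echo "platform service reports ready"
else
    echo "platform service not ready"
fi
```

In practice the body would come from the curl command above, piped into the helper. If the helper succeeds while the LCM still logs "Get platform status false", the slow-start cause described in this article is the more likely one.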
After a node reboots during an upgrade, the iDRAC Service Module (ISM) can sometimes take a long time to reach a running state. This can also occur when the customer has set DCISM to start manually. The LCM checks the ISM and platform service status after a node connects to vCenter. The default maximum number of attempts is 60, one every 10 seconds (roughly 10 minutes in total). This can be seen in the lcm-web.log above. If the service starts too slowly, the check may time out and fail to detect that the service has started.
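The size of the wait window follows from simple arithmetic, sketched below. The function name retry_window_seconds is hypothetical. The calculation assumes the 10-second interval stated above stays fixed, and that each attempt itself takes negligible time.

```shell
# Total wait window in seconds = attempts * interval.
# The 10-second interval comes from this article and is assumed to be fixed.
retry_window_seconds() {
    attempts=$1
    interval=$2
    echo $((attempts * interval))
}

default_window=$(retry_window_seconds 60 10)    # default retry count
extended_window=$(retry_window_seconds 150 10)  # retry count used in the resolution
echo "default:  ${default_window}s ($((default_window / 60)) min)"
echo "extended: ${extended_window}s ($((extended_window / 60)) min)"
```

Under these assumptions the default gives the ISM about 10 minutes to start, and a retry count of 150 extends that to about 25 minutes.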
To resolve this issue, increase the number of retries so that the LCM waits longer.

1. SSH to VxRail Manager and edit the file commons-application.properties.
   vi /usr/lib/vmware-marvin/marvind/webapps/lcm/WEB-INF/classes/commons-application.properties

2. Change the parameter lcmProperties.platformServiceClient.script.retry.times.get.status from 60 to 150.
   lcmProperties.platformServiceClient.script.retry.times.get.status=150
   This gives the ISM more time to reach a running state before the LCM stops checking.

3. Restart the services.
   systemctl restart vmware-marvin
   systemctl restart runjars

4. On each node, check whether sfcbd-watchdog is set to start automatically (Start and stop with host) or manually (Start and stop manually).
   chkconfig --list |grep sfcbd
   If the output shows "sfcbd-watchdog" as "off", enable "Start and stop with host" on the node.
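For those who prefer a scripted edit over vi, the property change could be sketched as follows. The helper set_property is hypothetical and not part of VxRail Manager. It assumes the property appears as an uncommented key=value line with no spaces around the equals sign, and it keeps a .bak copy of the file. The demonstration runs against a temporary file rather than the real commons-application.properties.

```shell
# Hypothetical helper: set key=value in a Java-style .properties file.
# Keeps a .bak copy, replaces an existing key, or appends the key if absent.
set_property() {
    file=$1
    key=$2
    value=$3
    cp -p "$file" "$file.bak" || return 1
    # Escape dots so the key is matched literally by grep and sed.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}=" "$file"; then
        # Rewrite from the backup so no in-place sed option is needed.
        sed "s|^${pattern}=.*|${key}=${value}|" "$file.bak" > "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Demonstration on a temporary file, not on the real configuration file.
demo=$(mktemp)
printf '%s\n' 'lcmProperties.platformServiceClient.script.retry.times.get.status=60' > "$demo"
set_property "$demo" lcmProperties.platformServiceClient.script.retry.times.get.status 150
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On VxRail Manager the first argument would be the path given in step 1. The services still need to be restarted as in step 3 for the new value to take effect.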