SDDC Manager update to VCF 4.5.0.0 fails at the "VMware Cloud Foundation Service and Platform Upgrades" step, and the error below is reported in the SDDC Manager UI.

Check /var/log/vmware/capengine/cap-update/workflow.log for a "Task validate failed" message caused by unexpected free space in a volume group.

(OR)

Check the below two log files for errors in reclaiming snapshot disks (example error messages below):

/var/log/vmware/capengine/cap-update-cleanup/workflow.log
/var/log/vmware/capengine/cap-update/workflow.log

Failed to reclaim snapshot disk <device_name> from VG <volume_group>. Error : exit status 5
Failed to reclaim snapshot disk <device_name> from VG <volume_group>. Error : exit status 126
Failed to reclaim snapshot disk <device_name> from VG <volume_group>. Error : exit status 127

Task Failed Error

2022/10/31 09:19:49.463490 validate.go:99: Debug: vgname:[data_vg] actualVFreeSize: [24996] vFreeSize:[26214] toleranceAllowed:[3932]
2022/10/31 09:19:49.527247 validate.go:99: Debug: vgname:[lcmmount_vg] actualVFreeSize: [124568] vFreeSize:[104857] toleranceAllowed:[15728]
2022/10/31 09:19:49.527298 progress.go:11: Validate failed. VFree size of the volume group lcmmount_vg mismtaches the expectation. Actual: [124568] Expected: [104857].
2022/10/31 09:19:49.527490 task_progress.go:24: Validate failed. VFree size of the volume group lcmmount_vg mismtaches the expectation. Actual: [124568] Expected: [104857].
2022/10/31 09:19:49.556785 workflow_manager.go:198: Task validate failed. Error: Validate failed. VFree size of the volume group lcmmount_vg mismtaches the expectation. Actual: [124568] Expected: [104857].
2022/10/31 09:19:49.556950 workflow_manager.go:138: Stopping workflow execution as task validate failed

reclaim-vfree error 1

2022/11/03 21:12:26.914537 reclaimvfree.go:242: Executing command: vgreduce data_vg /dev/sdg1
2022/11/03 21:12:27.014444 reclaimvfree.go:253: Executing command: pvremove -y -ff /dev/sdg1
2022/11/03 21:12:27.126447 reclaimvfree.go:264: Executing command: parted -s -a opt /dev/sdg rm 1
2022/11/03 21:12:27.167333 progress.go:11: Reclaimed snapshot /dev/sdg1
2022/11/03 21:12:27.167401 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sdg2
2022/11/03 21:12:27.167730 task_progress.go:24: Reclaimed snapshot /dev/sdg1
2022/11/03 21:12:27.286985 reclaimvfree.go:253: Executing command: pvremove -y -ff /dev/sdg2
2022/11/03 21:12:27.374610 reclaimvfree.go:264: Executing command: parted -s -a opt /dev/sdg rm 2
2022/11/03 21:12:27.400884 progress.go:11: Reclaimed snapshot /dev/sdg2
2022/11/03 21:12:27.401049 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sdg2
2022/11/03 21:12:27.401154 task_progress.go:24: Reclaimed snapshot /dev/sdg2
2022/11/03 21:12:27.478621 progress.go:11: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
2022/11/03 21:12:27.478859 task_progress.go:24: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
2022/11/03 21:12:27.491478 workflow_manager.go:198: Task reclaim-vfree failed. Error: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
2022/11/03 21:12:27.491630 workflow_manager.go:138: Stopping workflow execution as task reclaim-vfree failed

reclaim-vfree error 2

2022/11/03 20:40:06.100186 reclaimvfree.go:242: Executing command: vgreduce data_vg /dev/sdg1
2022/11/03 20:40:06.292377 reclaimvfree.go:253: Executing command: pvremove -y -ff /dev/sdg1
2022/11/03 20:40:06.444020 reclaimvfree.go:264: Executing command: parted -s -a opt /dev/sdg rm 1
2022/11/03 20:40:06.538938 progress.go:11: Reclaimed snapshot /dev/sdg1
2022/11/03 20:40:06.539027 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sde /dev/sdg2
2022/11/03 20:40:06.539239 task_progress.go:24: Reclaimed snapshot /dev/sdg1
2022/11/03 20:40:06.772812 progress.go:11: Failed to reclaim snapshot disk /dev/sde /dev/sdg2 from VG lcmmount_vg. Error : exit status 126
2022/11/03 20:40:06.773629 task_progress.go:24: Failed to reclaim snapshot disk /dev/sde /dev/sdg2 from VG lcmmount_vg. Error : exit status 126
2022/11/03 20:40:06.819900 workflow_manager.go:198: Task reclaim-vfree failed. Error: Failed to reclaim snapshot disk /dev/sde /dev/sdg2 from VG lcmmount_vg. Error : exit status 126
2022/11/03 20:40:06.819970 workflow_manager.go:138: Stopping workflow execution as task reclaim-vfree failed

reclaim-vfree error 3

2022/11/07 09:35:18.875054 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sdc /dev/sdg2
2022/11/07 09:35:18.875229 task_progress.go:24: Reclaimed snapshot /dev/sdg2
2022/11/07 09:35:18.941316 progress.go:11: Failed to reclaim snapshot disk /dev/sdc /dev/sdg2 from VG lcmmount_vg. Error : exit status 127
2022/11/07 09:35:18.941490 task_progress.go:24: Failed to reclaim snapshot disk /dev/sdc /dev/sdg2 from VG lcmmount_vg. Error : exit status 127
2022/11/07 09:35:18.959857 workflow_manager.go:198: Task reclaim-vfree failed. Error: Failed to reclaim snapshot disk /dev/sdc /dev/sdg2 from VG lcmmount_vg. Error : exit status 127
2022/11/07 09:35:18.959911 workflow_manager.go:138: Stopping workflow execution as task reclaim-vfree failed
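As a quick triage aid, the failure signatures above can be pulled out of a workflow log with a single grep filter. The sketch below runs against a simulated excerpt copied from the example log in this article; on a live SDDC Manager you would point grep at the real cap-update and cap-update-cleanup workflow.log paths instead.

```shell
# Build a small sample log from the excerpt above (illustration only).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
2022/11/03 21:12:27.167333 progress.go:11: Reclaimed snapshot /dev/sdg1
2022/11/03 21:12:27.478621 progress.go:11: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
EOF
# Filter for the failure signatures described in this KB.
matches=$(grep -E 'Task (validate|reclaim-vfree) failed|Failed to reclaim snapshot disk' "$sample_log")
echo "$matches"
rm -f "$sample_log"
```

Only the "Failed to reclaim snapshot disk" line survives the filter; informational "Reclaimed snapshot" lines are dropped.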
This KB provides instructions to work around update failures caused by:

- Unused free space in volume group(s)
- Multiple physical volumes (disks) in volume group(s)
This failure is caused by unused free space, or the presence of multiple physical volumes (PVs), in a volume group. To confirm:

- Log in to SDDC Manager as the root user.
- Run the "vgs" command to check the free space available in the volume group(s).
- Run the "vgs" command to check whether there are multiple PVs in a volume group, AND run "lsblk" to check whether the /storage/lvm_snapshot mount point is mounted.
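The multiple-PV check above can be scripted: the "vgs" report includes a #PV column (the second field in its default output), so any volume group whose count exceeds 1 is a candidate for this failure. The sketch below runs against captured sample output with illustrative sizes, not a live system.

```shell
# Sample `vgs` output (illustrative values; replace with `vgs_output=$(vgs)`
# on a live SDDC Manager).
vgs_output='  VG          #PV #LV #SN Attr   VSize   VFree
  data_vg       2   3   0 wz--n- 100.00g  24.41g
  lcmmount_vg   2   1   0 wz--n- 150.00g 121.65g'

# Print the name of every volume group backed by more than one PV
# (skip the header row; field 2 is the #PV count).
multi_pv_vgs=$(printf '%s\n' "$vgs_output" | awk 'NR>1 && $2 > 1 {print $1}')
echo "$multi_pv_vgs"
```

With the sample above, both data_vg and lcmmount_vg are flagged, matching the symptom pattern in the logs.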
Currently there is no resolution. A fix is being worked on.
Pre-requisite: Take a snapshot of the SDDC Manager VM before proceeding with the workaround.

Procedure:

1. Download and copy the attached script (update_failure_workaround.sh) to SDDC Manager at the /home/vcf location. (The script can be found in the KB attachments.)

2. Log in to SDDC Manager as the vcf user and switch to the root user.

3. Assign execute permission to the script using the following commands:

cd /home/vcf
chmod +x update_failure_workaround.sh

4. Run the below command to identify the snapshot device name:

grep "Configured" /var/log/vmware/capengine/cap-required-hardware-addition/workflow.log | grep "/storage/lvm_snapshot"

Example output:

Configured disk "/dev/sdg" in the appliance and mounted on /storage/lvm_snapshot

5. Perform the cleanup using the following command:

./update_failure_workaround.sh <Snapshot Device>

Example usage:

./update_failure_workaround.sh /dev/sdg

Example output (please check for "Success" at the end):

INFO Remove Snapshots if present
. . . .
INFO Mount all filesystems mentioned in fstab
INFO lvm_snapshot is mounted successfully
INFO Cleanup Done .
INFO altered cap update workflows
INFO Success

6. Retry the upgrade from the SDDC Manager UI.

7. Once the update finishes, remove the workaround script by running the below command:

rm /home/vcf/update_failure_workaround.sh
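The device-identification step above can also be scripted, extracting the quoted device name from the "Configured disk" log line rather than reading it by eye. This is a sketch only: it assumes the log line format shown in the example output, and it builds the cleanup command as a string (a dry run) instead of executing the workaround script.

```shell
# Simulate the workflow log with the example line from this KB; on a live
# system, set log=/var/log/vmware/capengine/cap-required-hardware-addition/workflow.log
log=$(mktemp)
printf '%s\n' 'Configured disk "/dev/sdg" in the appliance and mounted on /storage/lvm_snapshot' > "$log"

# Extract the quoted device name from the matching line.
device=$(grep 'Configured' "$log" | grep '/storage/lvm_snapshot' | sed -n 's/.*disk "\([^"]*\)".*/\1/p')

# Assemble (but do not run) the cleanup invocation for review.
cmd="/home/vcf/update_failure_workaround.sh $device"
echo "$cmd"
rm -f "$log"
```

Reviewing the printed command before running it manually keeps a human check in the loop, which is prudent given the script removes partitions from the snapshot device.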