...
The VM creation workflow fails with errors similar to the following:

[2019-08-27 15:38:40.409] [I] Disk 2 Datastore: Example-vSAN-Datastore-1
[2019-08-27 15:38:42.277] [I] Datastore: datastore-327516 is not compliant with policy 1e901cf8-3fb6-4401-a53a-f36a0bc2c806
[2019-08-27 15:38:46.520] [E] Error in (Workflow:Find Compliant Datastores / Find Compliant Datastores (item3)#47) Can't find compliant datastore for vmhome

During VM creation from the vCenter Server via the vSphere Client, the vSAN datastore is listed as "Incompatible".

The "On-disk format version" for the disk groups on the host is shown as empty.

The log file /var/run/log/vmkernel.log contains the following log pattern:

2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14642: uw.16298917 (117469717) extraMin/extraFromParent: 193/193, vsan-scripts (789) childEmin/eMinLimit: 51077/51200
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14635: Admission failure in path: vsan-scripts/python.16298917/uw.16298917
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14642: uw.16298917 (117469717) extraMin/extraFromParent: 193/193, vsan-scripts (789) childEmin/eMinLimit: 51077/51200
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14635: Admission failure in path: vsan-scripts/python.16298917/uw.16298917
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14642: uw.16298917 (117469717) extraMin/extraFromParent: 256/256, vsan-scripts (789) childEmin/eMinLimit: 51077/51200
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14635: Admission failure in path: vsan-scripts/python.16298917/uw.16298917
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14642: uw.16298917 (117469717) extraMin/extraFromParent: 193/193, vsan-scripts (789) childEmin/eMinLimit: 51077/51200
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14635: Admission failure in path: vsan-scripts/python.16298917/uw.16298917
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14642: uw.16298917 (117469717) extraMin/extraFromParent: 193/193, vsan-scripts (789) childEmin/eMinLimit: 51077/51200
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14635: Admission failure in path: vsan-scripts/python.16298917/uw.16298917
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14642: uw.16298917 (117469717) extraMin/extraFromParent: 256/256, vsan-scripts (789) childEmin/eMinLimit: 51077/51200
2019-08-27T20:19:06.141Z cpu30:16298917)MemSched: 14635: Admission failure in path: vsan-scripts/python.16298917/uw.16298917
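As a quick check on the affected host, the admission failures in the vsan-scripts path can be located and counted with plain grep. The log path is the one shown in the excerpt above; older entries may have rotated into compressed copies of vmkernel.log, which would need to be searched separately.

# grep "Admission failure in path: vsan-scripts" /var/run/log/vmkernel.log
# grep -c "Admission failure in path: vsan-scripts" /var/run/log/vmkernel.log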
The log /var/run/log/vobd.log reports multiple application crash dumps:

2019-08-27T01:19:52.934Z: [UserWorldCorrelator] 4892856737536us: [vob.uw.core.dumped] /bin/python3.5(16073983) /var/core/python-zdump.003
2019-08-27T01:19:54.805Z: [UserWorldCorrelator] 4892858608293us: [vob.uw.core.dumped] /bin/esxcfg-advcfg(16041217) /var/core/esxcfg-advcfg-zdump.003
2019-08-27T01:26:31.742Z: [UserWorldCorrelator] 4893255511330us: [vob.uw.core.dumped] /bin/python3.5(15747859) /var/core/python-zdump.000
2019-08-27T01:26:45.602Z: [UserWorldCorrelator] 4893269369854us: [vob.uw.core.dumped] /bin/esxcfg-advcfg(16075670) /var/core/esxcfg-advcfg-zdump.000
2019-08-27T01:32:57.460Z: [UserWorldCorrelator] 4893641196016us: [vob.uw.core.dumped] /bin/esxcfg-advcfg(16076867) /var/core/esxcfg-advcfg-zdump.001
2019-08-27T01:33:00.455Z: [UserWorldCorrelator] 4893644190338us: [vob.uw.core.dumped] /bin/python3.5(16076993) /var/core/python-zdump.001
2019-08-02T03:00:11.763Z: [UserWorldCorrelator] 2739061434437us: [vob.uw.core.dumped] /bin/python3.5(9008562) /var/core/python-zdump.000
2019-08-02T03:00:11.763Z: [UserWorldCorrelator] 2739296903918us: [esx.problem.application.core.dumped] An application (/bin/python3.5) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /var/core/python-zdump.000.
2019-08-02T03:00:12.848Z: [UserWorldCorrelator] 2739062519249us: [vob.uw.core.dumped] /bin/cmmds-tool(9008563) /var/core/cmmds-tool-zdump.000
2019-08-02T03:00:12.848Z: [UserWorldCorrelator] 2739297988884us: [esx.problem.application.core.dumped] An application (/bin/cmmds-tool) running on ESXi host has crashed (5 time(s) so far). A core file may have been created at /var/core/cmmds-tool-zdump.000.

The following crash dumps can be observed in /var/core on the ESXi host:

$ ls -al
total 194
drwxrwxr-x 2 root root 339 Aug 29 02:34 .
drwxrwxr-x 6 root root 85 Aug 29 02:34 ..
-rw-rw-r-- 1 root root 1617920 Aug 29 02:34 cmmds-tool-zdump.000
-rw-rw-r-- 1 root root 1617920 Aug 29 02:34 cmmds-tool-zdump.001
-rw-rw-r-- 1 root root 1630208 Aug 29 02:34 cmmds-tool-zdump.003
-rw-rw-r-- 1 root root 2453504 Aug 29 02:34 esxcfg-advcfg-zdump.000
-rw-rw-r-- 1 root root 2437120 Aug 29 02:34 esxcfg-advcfg-zdump.001
-rw-rw-r-- 1 root root 2424832 Aug 29 02:34 esxcfg-advcfg-zdump.003
-rw-rw-r-- 1 root root 1040384 Aug 29 02:34 python-zdump.000
-rw-rw-r-- 1 root root 1482752 Aug 29 02:34 python-zdump.001
-rw-rw-r-- 1 root root 1826816 Aug 29 02:34 python-zdump.003
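The crash events above can also be pulled directly from vobd.log on the affected host, and the resulting core files filtered out of /var/core. The log path and the zdump file names are taken from the excerpts above; rotated copies of vobd.log may be compressed and need to be searched separately.

# grep "core.dumped" /var/run/log/vobd.log
# ls -al /var/core | grep zdump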
The log /var/run/log/hostd.log reports out-of-memory events:

2019-08-27T17:33:57.357Z error hostd[54081B70] [Originator@6876 sub=SysCommandPosix opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Failed to ForkExec /usr/bin/sh: Cannot allocate memory
2019-08-27T17:33:57.357Z error hostd[54081B70] [Originator@6876 sub=SysCommandPosix opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Invocation of process: '/usr/bin/sh' failed: Resource error - Process Creation - Error: VmkuserCompat_ForkExec - Error Cannot allocate memory
2019-08-27T17:33:57.357Z error hostd[54081B70] [Originator@6876 sub=SOAP command stub adapter /usr/bin/sh opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Error Resource error - Process Creation - Error: VmkuserCompat_ForkExec - Error Cannot allocate memory starting command
2019-08-27T17:33:57.357Z error hostd[54081B70] [Originator@6876 sub=Default opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] AdapterServer caught exception: Error Resource error - Process Creation - Error: VmkuserCompat_ForkExec - Error Cannot allocate memory starting command
2019-08-27T17:33:57.357Z error hostd[54081B70] [Originator@6876 sub=Default opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Backtrace:
--> [backtrace begin] product: VMware ESX, version: 6.5.0, build: build-13004031, tag: hostd, cpu: x86, os: esx, buildType: release
--> backtrace[00] libvmacore.so[0x002DC73F]: Vmacore::System::Stacktrace::CaptureWork(unsigned int)
--> backtrace[01] libvmacore.so[0x00125DA2]: Vmacore::System::SystemFactoryImpl::CreateQuickBacktrace(Vmacore::Ref&)
--> backtrace[02] libvmacore.so[0x000DD98F]: Vmacore::Throwable::Throwable(std::string const&)
--> backtrace[03] libvmomi.so[0x0015B02F]
--> backtrace[04] libvmomi.so[0x000F9910]: Vmomi::StubImpl::_Invoke_Task(Vmomi::ManagedMethod*, std::vector, std::allocator > >&, Vmacore::Ref&)
--> backtrace[05] libvmomi.so[0x00162EF9]: Vmomi::StubBackedManagedObjectImpl::_Invoke(Vmomi::ManagedMethod*, std::vector, std::allocator > >&, Vmacore::Ref&)
--> backtrace[06] libvmomi.so[0x000C4FE5]
--> backtrace[07] hostd[0x008E423E]
--> backtrace[08] hostd[0x008E49DB]
--> backtrace[09] hostd[0x008E6BF6]
--> backtrace[10] hostd[0x008FBCEA]
--> backtrace[11] libvmacore.so[0x00201039]
--> backtrace[12] libvmacore.so[0x0012A5C2]: std::function::operator()() const
--> backtrace[13] libvmacore.so[0x001F9F19]
--> backtrace[14] libvmacore.so[0x001FB7E9]
--> backtrace[15] libvmacore.so[0x001FB9BB]
--> backtrace[16] libvmacore.so[0x00200AB6]
--> backtrace[17] libvmacore.so[0x002019CB]: std::_Function_handler (int)> >::_M_invoke(std::_Any_data const&)
--> backtrace[18] libvmacore.so[0x0012A5C2]: std::function::operator()() const
--> backtrace[19] libvmacore.so[0x001FEF16]
--> backtrace[20] libvmacore.so[0x0013724F]
--> backtrace[21] libvmacore.so[0x001FA79A]
--> backtrace[22] libvmacore.so[0x001FE535]
--> backtrace[23] libvmacore.so[0x002E4584]
--> backtrace[24] libpthread.so.0[0x00006CCA]
--> backtrace[25] libc.so.6[0x000D4C3E]
--> [backtrace end]

RVC commands fail to query objects from the vCenter Server command line:

C:\Program Files\VMware\vCenter Server\rvc>rvc
Try "--help" or "-h" to view more information about RVC usage.
No argument is specified, connecting to localhost as administrator@vsphere.local
0 /
1 localhost/
VMware virtual center <XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX>

*** command> vsan.support_information
************* BEGIN Support info for datacenter DC-NAME *************
************* BEGIN Support info for cluster CLUSTER-NAME *************
*** command>vsan.cluster_info CLUSTER-NAME
2018-01-01 00:01:00 +0900: Fetching host info from HOST1 (may take a moment) ...
2018-01-01 00:01:00 +0900: Fetching host info from HOST2 (may take a moment) ...
2018-01-01 00:01:00 +0900: Fetching host info from HOST3 (may take a moment) ...
2018-01-01 00:01:00 +0900: Fetching host info from HOST4 (may take a moment) ...
2018-01-01 00:01:00 +0900: Failed to gather from HOST3: RbVmomi::Fault: SystemError: A general system error occurred:
2018-01-01 00:01:00 +0900: Failed to gather from HOST1: RbVmomi::Fault: SystemError: A general system error occurred:
2018-01-01 00:01:00 +0900: Failed to gather from HOST4: RbVmomi::Fault: SystemError: A general system error occurred:
2018-01-01 00:01:00 +0900: Failed to gather from HOST2: RbVmomi::Fault: SystemError: A general system error occurred:
Host: HOST1
Host: HOST2
Host: HOST3
Host: HOST4
No Fault Domains configured in this cluster

*** command>vsan.host_info HOST4
2018-01-01 00:01:00 +0900: Fetching host info from HOST4 (may take a moment) ...
2018-01-01 00:01:00 +0900: Failed to gather from HOST4: RbVmomi::Fault: SystemError: A general system error occurred:
vsan.host_info HOST4 hit an error
undefined method `[]' for nil:NilClass

*** command>vsan.vm_object_info
2018-01-01 00:01:00 +0900: Fetching vSAN disk info from HOST1 (may take a moment) ...
2018-01-01 00:01:00 +0900: Fetching vSAN disk info from HOST2 (may take a moment) ...
2018-01-01 00:01:00 +0900: Fetching vSAN disk info from HOST3 (may take a moment) ...
2018-01-01 00:01:00 +0900: Fetching vSAN disk info from HOST4 (may take a moment) ...
2018-01-01 00:01:00 +0900: Done fetching vSAN disk infos
SystemError: A general system error occurred:
C:/Program Files/VMware/vCenter Server/rvc/lib/rvc/lib/vsangeneric.rb:820:in `parse_response'
At the end of the backtrace, "canProvisionObjects" reports the events below with an empty fault reason. SPBM invokes VsanInternalSystem.canProvisionObjects with a vSAN policy to determine whether the vSAN datastore is compatible with that policy; hostd.log on the ESXi host shows the following related traces:

2019-08-27T17:33:57.360Z info hostd[54081B70] [Originator@6876 sub=Solo.Vmomi opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Activation [N5Vmomi10ActivationE:0x54c06b10] : Invoke done [canProvisionObjects] on [vim.host.VsanInternalSystem:ha-vsan-internal-system]
2019-08-27T17:33:57.360Z verbose hostd[54081B70] [Originator@6876 sub=Solo.Vmomi opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Arg npbs:
--> (vim.host.VsanInternalSystem.NewPolicyBatch) [
--> (vim.host.VsanInternalSystem.NewPolicyBatch) {
--> size = (long) [
--> 1
--> ],
--> policy = "stripeWidthVSANstripeWidth1replicaPreferenceVSANreplicaPreferenceRAID-5/6 (Erasure Coding) - CapacityhostFailuresToTolerateVSANhostFailuresToTolerate2iopsLimitVSANiopsLimit2500vasa20 subprofileProfile-Driven Storage Management2019-08-27T17:33:57.318+00:000Profile-Driven Storage ManagementDummy Policy NameCREATE_SPEC"
--> }
--> ]
2019-08-27T17:33:57.360Z verbose hostd[54081B70] [Originator@6876 sub=Solo.Vmomi opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Arg ignoreSatisfiability:
--> false
2019-08-27T17:33:57.360Z info hostd[54081B70] [Originator@6876 sub=Solo.Vmomi opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Throw vmodl.fault.SystemError
2019-08-27T17:33:57.360Z info hostd[54081B70] [Originator@6876 sub=Solo.Vmomi opID=5192a947-7f-a84a user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] Result:
--> (vmodl.fault.SystemError) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = ,
--> reason = ""
--> msg = ""
--> }
2019-08-27T17:33:59.839Z verbose hostd[54081B70] [Originator@6876 sub=PropertyProvider opID=3937b1e6-8c-a84b user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] RecordOp ASSIGN: info, haTask--vim.host.VsanInternalSystem.canProvisionObjects-171239561. Applied change to temp map.
2019-08-27T17:33:59.840Z info hostd[54081B70] [Originator@6876 sub=SysCommandPosix opID=3937b1e6-8c-a84b user=vpxuser:VSPHERE.LOCAL\vpxd-extension-624a1e51-ab4a-4131-91f0-236023a6756e] ForkExec(/usr/bin/sh) 16266080
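To locate these traces quickly on the affected host, the failing canProvisionObjects invocations and the underlying fork failures can be searched for in hostd.log with plain grep. The log path and the messages are the ones shown in the excerpts above; rotated hostd logs may need to be searched as well.

# grep canProvisionObjects /var/run/log/hostd.log
# grep -c "Cannot allocate memory" /var/run/log/hostd.log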
New VM creation may fail because the vSAN datastore is reported as a "non-compatible" datastore.
In the 6.5 release, hostd forks a Python child process to run VsanInternalSystem.py, which calls the VsanInternalSystem API. This child process is expected to be destroyed after the API call completes. When some of these child processes are not destroyed, multiple VsanInternalSystem.py Python processes run concurrently and exhaust the memory of the vsan-scripts resource pool, which is why the Python core dumps are created by VsanInternalSystem.py. Multiple VsanInternalSystem.py processes run concurrently because SPBM calls the VsanInternalSystem CanProvisionObjects API whenever it needs to verify vSAN datastore compatibility during a VM provisioning operation.
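A quick way to confirm this buildup is to count the VsanInternalSystem.py children currently running on the host. The ps options are the same ones used in the workaround below; on an unaffected host the count should be 0.

# ps -cPTgjstz | grep VsanInternalSystem.py | grep -v grep | wc -l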
On ESXi 6.5.x, the VsanInternalSystem.py processes can be stopped instead of rebooting the affected host.

List the processes by running the following command on the affected host:

# ps -cPTgjstz | grep VsanInternalSystem.py

Then kill each process by using the process ID returned above:

# kill -9 <Process ID>

If no process ID is returned, reboot the host after placing it in maintenance mode. Unaffected hosts do not show such processes.
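If a large number of leaked processes is present, the listing and kill steps can be combined into a single loop. This is only a sketch: it assumes the process ID is the first column of the ps output, so verify the column position against the header of the ps output on your host before running it, and list the processes again afterwards to confirm that none remain.

# for PID in $(ps -cPTgjstz | grep VsanInternalSystem.py | grep -v grep | awk '{print $1}'); do kill -9 "$PID"; done
# ps -cPTgjstz | grep VsanInternalSystem.py | grep -v grep

Once the leaked processes are gone, memory pressure in the vsan-scripts resource pool should subside and hostd should be able to fork helper processes again. If the admission failures in vmkernel.log continue, fall back to rebooting the host from maintenance mode as described above.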