...
1. NetWorker workflows that belong to a group with Group Type: VMware and Dynamic Association enabled (tag-based association) fail with the error "Failed to load inventory."
2. The workflows do not list any VMs when an attempt is made to "Start individual clients."
3. The following errors appear in daemon.log:

nsrdisp_nwbg NSR critical Inventory session status returned with a failure. Error: 'Failed to load inventory from ''. Failed to load Category/Tagging metadata: Failed to login to CIS service at 'HTTPS:///rest/com/vmware/cis/session': Post https:///rest/com/vmware/cis/session: net/http: request canceled (Client.Timeout exceeded while awaiting headers)'.

nsrdisp_nwbg NSR critical Inventory session status returned with a failure. Error: 'Failed to load inventory from ''. Failed to load Category/Tagging metadata:invalid memory address or nil pointer dereference'

Inventory session status returned with a failure. Error: 'Failed to load inventory from ''. Failed to load Category/Tagging metadata: Failed to send REST request to vCenter: Post https:///rest/com/vmware/cis/tagging/tag-association/id:urn:vmomi:InventoryServiceTag:73f530b9-ea0b-48db-b657-c6cf308c2a53:GLOBAL?~action=list-attached-objects: net/http: request canceled (Client.Timeout exceeded while awaiting headers)'.
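When daemon.log is large, it can help to count how often each of the three error signatures above occurs. The following is a minimal sketch rather than a NetWorker tool: the function names and category labels are hypothetical, and the match strings are taken only from the log excerpts above.

```python
# Hypothetical triage helper. The substrings come from the daemon.log
# excerpts above; the labels are illustrative names, not NetWorker terms.
SIGNATURES = [
    ("cis-login-timeout",
     ("Failed to login to CIS service", "Client.Timeout exceeded")),
    ("nil-pointer",
     ("invalid memory address or nil pointer dereference",)),
    ("tagging-request-timeout",
     ("Failed to send REST request to vCenter", "Client.Timeout exceeded")),
]


def classify_inventory_error(line):
    """Return a label for a tagging inventory failure, or None for other lines."""
    if "Failed to load Category/Tagging metadata" not in line:
        return None
    for label, needles in SIGNATURES:
        if all(needle in line for needle in needles):
            return label
    return "other-tagging-failure"


def summarize(lines):
    """Count how many lines fall under each label."""
    counts = {}
    for line in lines:
        label = classify_inventory_error(line)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts
```

`summarize` accepts any iterable of lines, so an open file handle for daemon.log can be passed directly.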
1. vCenter failed to clear stale tags from cs.identity:
This can be identified by running the following commands on the vCenter CLI.

Connect to the vCenter postgres database:
# /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres

To count the stale tags associated with VMs:
# select count(*) from cis_kv_keyvalue where kv_provider like 'tagging:%' and kv_key like 'tag_association urn:vmomi:VirtualMachine:%' and regexp_replace(kv_key, 'tag_association urn:vmomi:VirtualMachine:vm-([0-9]+).*', '\1')::bigint not in (select id from vpx_vm);

To count the stale tags associated with hosts:
# select count(*) from cis_kv_keyvalue where kv_provider like 'tagging:%' and kv_key like 'tag_association urn:vmomi:HostSystem:%' and regexp_replace(kv_key, 'tag_association urn:vmomi:HostSystem:host-([0-9]+).*', '\1')::bigint not in (select id from vpx_host);

2. The VAPI endpoint is failing on vCenter because of memory pressure and is dumping heap memory as a result:
This can be identified by running the following command on the vCenter CLI:
# ls -ltrh *hprof* | awk {'print $9'}
java_pid62528.hprof
java_pid45649.hprof
java_pid36715.hprof
java_pid2514.hprof
java_pid43896.hprof
java_pid52081.hprof
Each of the above heap dumps corresponds to a time at which the errors were noticed in daemon.log on NetWorker.

3. Domain authentication issue:
The VMware vCenter environment has been integrated with an external authority (Active Directory/LDAP). An externally authenticated user account was created for integrating NetWorker with VMware; this account is the one used to add the VMware vCenter to NetWorker. An issue is occurring during the authentication of the external account in VMware.
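The stale-tag queries above work by pulling the numeric object ID out of each kv_key and checking whether that ID still exists in vpx_vm (or vpx_host). The sketch below mirrors that logic in Python for the VM case so the matching rule is easier to follow. The sample keys, the suffix after the VM ID, and the set of existing IDs are made up for illustration; real data should be inspected with the SQL queries themselves.

```python
import re

# Same capture as the regexp_replace() pattern in the VM query above.
VM_KEY = re.compile(r"tag_association urn:vmomi:VirtualMachine:vm-([0-9]+)")


def stale_vm_keys(kv_keys, existing_vm_ids):
    """Return the keys whose VM ID is not in existing_vm_ids.

    existing_vm_ids plays the role of "select id from vpx_vm".
    """
    stale = []
    for key in kv_keys:
        match = VM_KEY.match(key)
        if match and int(match.group(1)) not in existing_vm_ids:
            stale.append(key)
    return stale


# Illustrative data only: the text after the VM ID is an assumed shape.
sample_keys = [
    "tag_association urn:vmomi:VirtualMachine:vm-101:example-suffix",
    "tag_association urn:vmomi:VirtualMachine:vm-202:example-suffix",
    "tag_association urn:vmomi:HostSystem:host-9:example-suffix",
]
print(stale_vm_keys(sample_keys, existing_vm_ids={101}))
```

Here only the vm-202 key is reported: vm-101 still exists, and the HostSystem key is ignored because it does not match the VM pattern, just as the `like` filter excludes it in the SQL.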
Each of the above cases must be resolved by a VMware administrator by following the suggestions below:

1. Stale Tags: To clear stale tags:

A. Stop vpxd and the Content Library service:
service-control --stop vmware-vpxd
service-control --stop vmware-content-library

B. Connect to the vCenter postgres database:
/opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres

To delete stale tags:
delete from cis_kv_keyvalue where kv_provider like 'tagging:%' and kv_key like 'tag_association urn:vmomi:VirtualMachine:%' and regexp_replace(kv_key, 'tag_association urn:vmomi:VirtualMachine:vm-([0-9]+).*', '\1')::bigint not in (select id from vpx_vm) returning kv_key, kv_value;

delete from cis_kv_keyvalue where kv_provider like 'tagging:%' and kv_key like 'tag_association urn:vmomi:HostSystem:%' and regexp_replace(kv_key, 'tag_association urn:vmomi:HostSystem:host-([0-9]+).*', '\1')::bigint not in (select id from vpx_host) returning kv_key, kv_value;

delete from cis_kv_keyvalue where kv_provider like 'tagging:%' and kv_key like 'tag_association urn:vmomi:Datastore:%' and regexp_replace(kv_key, 'tag_association urn:vmomi:Datastore:datastore-([0-9]+).*', '\1')::bigint not in (select id from vpx_datastore) returning kv_key, kv_value;

C. Run the two select count(*) queries used to identify the stale tags again to confirm that the count is 0.

D. Start the services:
service-control --start vmware-vpxd
service-control --start vmware-content-library

E. Log out of the vCenter session, log back in, and validate the environment: tags, NSX, backup, provisioning, and so forth.
If vCenter does not look healthy, collect logs from vCenter with the command "vc-support" and contact VMware.

2. VAPI Crashes: Identify the memory allocated to VAPI and increase it if it is too low. VMware support can advise on the best-practice value. For example, the allocation is changed from:
# cloudvm-ram-size -l | grep -i vapi
vmware-vapi-endpoint = 256
To:
# cloudvm-ram-size -l | grep -i vapi
vmware-vapi-endpoint = 1120

3. Domain authentication issue: If the above two issues are not observed and the vCenter was added to NetWorker using an external (AD/LDAP) account, perform the following as a test.
Update the VMware vCenter resource in NetWorker to use a VMware SSO account. For example, this can be the administrator@vsphere.local account. It does not have to be the administrator account; another SSO account can be created. Be sure to assign the permissions detailed in the NetWorker VMware Integration Guide, available through: https://www.dell.com/support/home/product-support/product/networker/docs
You can update the VMware user account used by NetWorker using:
NetWorker Management Console (NMC): Go to Protection->VMware View->Right-Click the vCenter->Modify vCenter.
NetWorker Web User Interface (NWUI): Go to Protection->VMware vCenter->Select the vCenter->Click Edit.
If the issue is not observed when using an SSO account, this suggests a problem with external authentication in vCenter. You can continue to use the SSO account; otherwise, the domain authentication issue must be investigated by the VMware vCenter and domain administrators.
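For the domain authentication test, it can also be useful to time a login against the same CIS session endpoint that appears in the daemon.log errors, once with the external account and once with the SSO account. The following is a rough sketch under several assumptions: the helper names are hypothetical, the endpoint is assumed to accept HTTP Basic credentials on a POST, and the host running the script is assumed to trust the vCenter certificate. It is not how NetWorker itself authenticates, and it does not log the resulting session out.

```python
import base64
import time
import urllib.error
import urllib.request


def build_session_request(vcenter_host, username, password):
    """Build a POST to the CIS session endpoint with HTTP Basic credentials."""
    url = f"https://{vcenter_host}/rest/com/vmware/cis/session"
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


def time_login(vcenter_host, username, password, timeout=30.0):
    """Attempt one login and return (outcome, elapsed_seconds)."""
    request = build_session_request(vcenter_host, username, password)
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            outcome = f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # Reached the server but was rejected, for example bad credentials.
        outcome = f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, connection failures, and socket timeouts.
        outcome = f"failed: {exc}"
    return outcome, time.monotonic() - start
```

If the external account is consistently much slower than the SSO account, or times out while the SSO account succeeds, that is consistent with the authentication delay described above and is worth passing on to the vCenter and domain administrators.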