...
After upgrading to vCenter Server 6.5 Update 2, the vmware-vapi-endpoint service fails to start or crashes.

In the endpoint.log file, you see entries similar to:

# less /var/log/vmware/vapi/endpoint/endpoint.log

Caused by: javax.net.ssl.SSLHandshakeException: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint verification is not configured
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1964)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:328)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:322)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1614)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1052)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:987)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1072)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager$HostnameVerifier.verify(ThumbprintTrustManager.java:420)
    ... 45 more
Caused by: com.vmware.vim.vmomi.client.exception.VlsiCertificateException: Server certificate chain is not trusted and thumbprint verification is not configured
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:206)
    at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:985)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1596)
    ... 53 more
Caused by: com.vmware.identity.vecs.VecsGenericException: Native platform error [code: 87][Enum of entries on store 'TRUSTED_ROOT_CRLS' failed. [Server: __localhost__, User: __localuser__]]
    at com.vmware.identity.vecs.VecsEntryEnumeration.BAIL_ON_ERROR(VecsEntryEnumeration.java:108)
    at com.vmware.identity.vecs.VecsEntryEnumeration.enumEntries(VecsEntryEnumeration.java:139)
    at com.vmware.identity.vecs.VecsEntryEnumeration.fetchMoreEntries(VecsEntryEnumeration.java:122)
    at com.vmware.identity.vecs.VecsEntryEnumeration.<init>(VecsEntryEnumeration.java:36)
    at com.vmware.identity.vecs.VMwareEndpointCertificateStore.enumerateEntries(VMwareEndpointCertificateStore.java:369)
    at com.vmware.provider.VecsCertStoreEngine.engineGetCRLs(VecsCertStoreEngine.java:77)
    at java.security.cert.CertStore.getCRLs(CertStore.java:181)
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkForRevocation(ThumbprintTrustManager.java:246)
    at com.vmware.vim.vmomi.client.http.impl.ThumbprintTrustManager.checkServerTrusted(ThumbprintTrustManager.java:158)
    ... 55 more

2018-10-15T15:45:59.685+02:00 | INFO  | state-manager1 | HealthStatusCollectorImpl | HEALTH ORANGE Failed to retrieve SSO settings from component manager.
2018-10-15T15:45:59.685+02:00 | ERROR | state-manager1 | DefaultStateManager | Could not initialize endpoint runtime state.
com.vmware.vapi.endpoint.config.ConfigurationException: Failed to retrieve SSO settings.
    at com.vmware.vapi.endpoint.cis.SsoSettingsBuilder.buildInitial(SsoSettingsBuilder.java:63)
    at com.vmware.vapi.state.impl.DefaultStateManager.build(DefaultStateManager.java:354)
    at com.vmware.vapi.state.impl.DefaultStateManager$1.doInitialConfig(DefaultStateManager.java:168)
    at com.vmware.vapi.state.impl.DefaultStateManager$1.run(DefaultStateManager.java:151)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

In the vmafdd-syslog file, you see the same certificates being pushed to VECS over and over. You can verify this by running the following command:

# grep "Added cert to VECS DB" /var/log/vmware/vmafdd/vmafdd-syslog.log

18-10-13T11:27:24.090346+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:28:24.085596+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:29:24.089158+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:30:24.041227+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:31:24.084083+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:32:24.095645+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:33:24.087458+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: 7ec611450f6b70edd15c936358731ce2a1030038
18-10-13T11:34:24.318936+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:35:24.091393+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:36:24.108070+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:37:24.082253+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:38:24.098974+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:39:24.084759+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:40:24.086880+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:41:24.092401+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4
18-10-13T11:42:24.099424+02:00 notice vmafdd t@140015749916416: Added cert to VECS DB: e8366b69a8bf3724a3c2446a3e1cc8cb3eaf44e4

Note: The CRL store fills with spurious entries, and the number grows indefinitely over time. Run the following command to see the current count and to monitor growth:

# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | wc -l
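If you want to observe the growth over time, a minimal monitoring sketch along these lines can help. It simply wraps the vecs-cli command above in a loop; counting Alias lines gives one line per store entry, which is a tighter measure than counting every line of the --text output:

#!/bin/bash
# Print a timestamped count of TRUSTED_ROOT_CRLS entries once a minute.
while true; do
  count=$(/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | grep -c Alias)
  echo "$(date '+%Y-%m-%dT%H:%M:%S') TRUSTED_ROOT_CRLS entries: $count"
  sleep 60
done

On an affected system, the count climbs steadily; on a healthy system, it holds steady.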
This issue is caused by one or more corrupt CRL files in /etc/ssl/certs. To verify that you have corrupt entries, complete the following steps:

1. SSH to the vCenter Server Appliance.
2. Navigate to /etc/ssl/certs and run the following command to return the "Authority Key Identifier" for every CRL. If the command reports a failure, you may have a corrupt entry.

# for i in `grep -l "BEGIN X509 CRL" *`;do openssl crl -inform PEM -text -noout -in $i | grep -A 1 " Authority Key Identifier";done

Expected output example:

X509v3 Authority Key Identifier:
keyid:DF:35:D5:0F:B8:82:A3:5E:02:CA:CA:34:04:16:0F:90:92:EA:B6:5C
X509v3 Authority Key Identifier:
keyid:6F:F3:16:F6:4C:3D:D7:02:D9:EA:2B:C4:7A:3A:14:5F:D5:9F:A6:27
X509v3 Authority Key Identifier:
keyid:73:58:C4:3D:55:7C:1A:83:7C:5C:63:68:BA:B8:9D:1E:1E:DA:E0:80
X509v3 Authority Key Identifier:
keyid:0D:E7:CA:07:1C:CA:01:DC:57:EE:30:6E:FC:FB:55:86:39:96:D0:

3. Run the following command to check for any corruption relating to CA certificates. It should return the "Subject Key Identifier" for each certificate; if you see a failure, you may have a corrupt entry.

# for i in `grep -l "BEGIN CERTIFICATE" *`;do openssl x509 -in $i -noout -text | grep -A 1 "Subject Key Identifier";done

Expected output example:

X509v3 Subject Key Identifier:
6A:72:26:7A:D0:1E:EF:7D:E7:3B:69:51:D4:6C:8D:9F:90:12:66:AB
X509v3 Subject Key Identifier:
6A:72:26:7A:D0:1E:EF:7D:E7:3B:69:51:D4:6C:8D:9F:90:12:66:AB
X509v3 Subject Key Identifier:
C7:A0:49:75:16:61:84:DB:31:4B:84:D2:F1:37:40:90:EF:4E:DC:F7
X509v3 Subject Key Identifier:
C7:A0:49:75:16:61:84:DB:31:4B:84:D2:F1:37:40:90:EF:4E:DC:F7
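The loops above surface parse failures but do not say which file failed. The following variant, a small sketch built on the same openssl commands (not part of the original procedure), relies on openssl's exit status to print the name of each file it cannot parse, so you know exactly which entries are corrupt:

#!/bin/bash
# Name every CRL or certificate under /etc/ssl/certs that openssl fails to parse.
cd /etc/ssl/certs
for i in $(grep -l "BEGIN X509 CRL" *); do
  openssl crl -inform PEM -noout -in "$i" >/dev/null 2>&1 || echo "Corrupt CRL: $i"
done
for i in $(grep -l "BEGIN CERTIFICATE" *); do
  openssl x509 -noout -in "$i" >/dev/null 2>&1 || echo "Corrupt certificate: $i"
done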
To resolve this issue, delete any corrupt files in /etc/ssl/certs and remove all entries from the CRL store so that VMDIR pushes fresh certificates down to VECS. This in turn allows the vAPI service to start successfully.

Ensure you have a valid backup or snapshot of the vCenter Server before proceeding. See Overview of Backup and Restore options in vCenter Server 6.x (2149237).

A script has been written to automate this process:

1. SSH to the vCenter Server Appliance.
2. Change directory to /tmp.
3. Create a file for the script. For example:

# vi crl-fix.sh

4. Copy and paste the following into the file:

#!/bin/bash
# Move the PEM files aside, then quarantine everything else
# (including any corrupt CRL/CA files) for later review.
cd /etc/ssl/certs
mkdir -p /tmp/pems
mkdir -p /tmp/OLD-CRLS-CAs
mv *.pem /tmp/pems && mv *.* /tmp/OLD-CRLS-CAs
# Delete every entry in the TRUSTED_ROOT_CRLS store so that
# VMDIR can push down fresh copies.
h=$(/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | grep Alias | cut -d : -f 2)
for hh in $h; do
  echo "Y" | /usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store TRUSTED_ROOT_CRLS --alias "$hh"
done
# Restore the PEM files and recreate the .0 symlinks that point to them.
mv /tmp/pems/* .
for l in *.pem; do
  ln -s "$l" "${l%.pem}.0"
done
# Restart the VMware Authentication Framework daemon.
service-control --stop vmafdd && service-control --start vmafdd

5. Save the file and make it executable:

# chmod +x crl-fix.sh

6. Run the script:

# ./crl-fix.sh

7. Reboot the vCenter Server Appliance.
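After the appliance comes back up, a quick sanity check (suggested here as a follow-up, not part of the original procedure) is to confirm the endpoint service is running and that the CRL store has stopped growing:

# service-control --status vmware-vapi-endpoint
# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS --text | grep -c Alias

Run the second command a few minutes apart; the count should now hold steady. Re-running the grep against vmafdd-syslog.log from the Symptoms section should also no longer show the same thumbprint being added every minute.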