BugZero | VMware BugID 80020 - vPostgres service fails to start on vCenter Server...

VMware - Defect ID: 80020

vPostgres service fails to start on vCenter Server due to several entries in TRUSTED_ROOT_CRLS VECS Store

Last updated on 10/18/2023

Symptoms

Service vmware-vpostgres fails to start on vCenter Server.

Most of the other services also fail to start, such as vmware-vpxd-svcs and vmware-vpxd. For more information about vCenter services, see Stopping, Starting or Restarting VMware vCenter Server Appliance 6.x & above services.

Connections to the vCenter database (VCDB) fail with the error below. For more information about VCDB, see Interacting with the vCenter Server Appliance 6.5/6.7/7.0 embedded vPostgres Database.

    Failed to connect to database: ODBC error: (08001) - [unixODBC]could not connect to server: Connection refused
    --> Is the server running on host "localhost" (127.0.0.1) and accepting
    --> TCP/IP connections on port 5432?

The vPostgres logs are not updated with any events.

Note: The vPostgres logs are located at /var/log/vmware/vpostgres/postgresql-xx.log

In the vpxd.log, you may see entries similar to:

    2020-07-07T20:18:01.671Z error vpxd[35339] [Originator@6876 sub=vpxdVdb] [VpxdVdb::SetDBType] Failed to connect to database: ODBC error: (08001) - [unixODBC]could not connect to server: Connection refused
    --> Is the server running on host "localhost" (127.0.0.1) and accepting
    --> TCP/IP connections on port 5432?
    --> Retry attempt: 16305 ...

Note: The vpxd.log is located at /var/log/vmware/vpxd/vpxd.log

The vmon-syslog.log does not indicate why vmware-vpostgres is not starting:

    2020-07-07T20:31:03.805884+00:00 notice vmon Received start request for vmware-vpostgres
    2020-07-07T20:31:03.806089+00:00 notice vmon <vmware-vpostgres-prestart> Constructed command: /opt/vmware/vpostgres/current/scripts/pg_pre_start
    2020-07-07T20:33:03.040400+00:00 notice vmon Executing service batch op API_HEALTH. IgnoreFail=1, service count=10
    2020-07-07T20:33:03.040808+00:00 notice vmon <vapi-endpoint-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-vmon/vmonApiHealthCmd.py -n vapi-endpoint -u /vapiendpoint/health -t 30
    2020-07-07T20:33:03.041005+00:00 notice vmon <rhttpproxy-healthcmd> Constructed command: /usr/bin/python /usr/lib/vmware-rhttpproxy/rhttpproxy-vmon-apihealth.py
    2020-07-07T20:33:03.041184+00:00 notice vmon <vmware-vpostgres> Skip service health check. State STOPPED, Curr request 1
    2020-07-07T20:33:03.041356+00:00 notice vmon <vcha> Skip service health check. State STOPPED, Curr request 0
    2020-07-07T20:33:03.041535+00:00 notice vmon <vmware-postgres-archiver> Skip service health check. State STOPPED, Curr request 0
    2020-07-07T20:33:03.041711+00:00 notice vmon <vpxd-svcs> Skip service health check. State STOPPED, Curr request 0
    2020-07-07T20:33:03.041882+00:00 notice vmon <vpxd> Skip service health check. State STOPPING, Curr request 1
    2020-07-07T20:33:03.042051+00:00 notice vmon <sps> Skip service health check. State STOPPED, Curr request 0
    2020-07-07T20:33:03.042221+00:00 notice vmon <rbd> Skip service health check. State STOPPED, Curr request 0
    2020-07-07T20:33:03.042407+00:00 notice vmon <pschealth> Skip service health check. State STOPPED, Curr request 0
    2020-07-07T20:33:03.354545+00:00 notice vmon Successfully executed service batch operation API_HEALTH.

Note: The vmon-syslog.log is located at /var/log/vmware/vmon/vmon-syslog.log

In the vpxd-svcs.log, you may see the error below:

    SQL Error: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.)

Note: The vpxd-svcs.log is located at /var/log/vmware/vpxd-svcs/vpxd-svcs.log
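As a quick triage aid, the ODBC 08001 signature quoted in the symptoms can be counted in a log file. This is only a sketch: the function name count_db_refused is hypothetical, and a match confirms that the database refused connections, not why vPostgres failed to start.

```shell
# Count the lines in a log file that carry the ODBC 08001
# "connection refused" signature quoted in this article.
# count_db_refused is a hypothetical helper name, not a VMware tool.
count_db_refused() {
    # -F treats the pattern as a fixed string, so the parentheses are literal.
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep -c -F 'ODBC error: (08001)' "$1" || true
}

# Possible use on the appliance (log path taken from the article):
#   count_db_refused /var/log/vmware/vpxd/vpxd.log
```

A non-zero count matches the vpxd.log symptom described above. The check in the Cause section is still needed to confirm that the TRUSTED_ROOT_CRLS store is responsible.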

Cause

This issue is caused by corrupted certificates under /etc/ssl/certs, which produce an unexpectedly high number of certificate entries in the TRUSTED_ROOT_CRLS store.

To confirm the cause of the issue, run the command below on the VCSA. If you are using an external Platform Service Controller (PSC), run the command on both the vCenter Server and the PSC:

    # /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS | grep Number

The output should look like:

    Number of entries in store : 3165

Note: If the command reports a large number of entries (hundreds or thousands), proceed with the resolution in this article.
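The check above can be wrapped in a small helper that extracts the number and compares it to a threshold. This is a minimal sketch: the function names are hypothetical, and the default threshold of 100 is an assumption drawn from the article's "hundreds or thousands" guidance, not a vendor-defined limit.

```shell
# Read `vecs-cli entry list` output on stdin and print the entry count
# from a line such as "Number of entries in store : 3165".
# crl_entry_count is a hypothetical helper name.
crl_entry_count() {
    awk -F':' '/Number of entries in store/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the number
        print $2
        exit
    }'
}

# Succeed (exit status 0) when the count exceeds the threshold.
# The default of 100 is an illustrative assumption, not a VMware value.
crl_store_is_bloated() {
    [ "$1" -gt "${2:-100}" ]
}

# Possible use on the appliance (command taken from the article):
#   n=$(/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store TRUSTED_ROOT_CRLS | crl_entry_count)
#   crl_store_is_bloated "$n" && echo "TRUSTED_ROOT_CRLS has $n entries"
```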

Resolution

To resolve this issue, remove the extra entries in the TRUSTED_ROOT_CRLS store by following the steps below:

1. Take an offline snapshot of the VCSA virtual machine (and of the Platform Service Controller virtual machine in case of an external PSC).

   Caution: Do NOT skip this step.

2. Connect to the VCSA (and the external PSC, if you are using one) through SSH.

3. Download the "crl-fix.sh" script attached to this article and upload it to the /tmp directory of the impacted VCSA (or external Platform Service Controller) using WinSCP, or copy its contents to a text file on the appliance using the vi editor.

   Note: If you get one of the errors below while connecting to the appliance via WinSCP, run the chsh command that follows them. For more information, see Error when uploading files to vCenter Server Appliance using WinSCP (2107727).

       Host is not communicating for more than 15 seconds. If the problem repeats, try turning off 'Optimize connection buffer size'.

   or

       Cannot initialize SFTP protocol. Is the host running an SFTP server?

       # chsh -s /bin/bash root

4. Browse to the /tmp directory.

       # cd /tmp

5. Make the file executable.

       # chmod +x crl-fix.sh

6. Run the crl-fix.sh script.

       # ./crl-fix.sh

   Note: If you get the error below while running the script:

       bash: ./crl-fix.sh: /bin/bash^M: bad interpreter: No such file or directory

   This error is caused by DOS carriage returns added to the script when it was copied from a Windows-based text editor. To resolve this problem, run the command below and rerun the script:

       # sed -i -e 's/\r$//' crl-fix.sh

   Notes:
   The script may take some time before showing any progress, depending on the number of entries in the TRUSTED_ROOT_CRLS store.
   When the script completes, it should stop the vmafdd service and start it again.

7. Restart the services of the VCSA and/or the external PSC.

       # service-control --stop --all
       # service-control --start --all
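The "bad interpreter" error can be avoided by checking the uploaded script for DOS carriage returns before running it. The sketch below wraps the sed command from the article in two hypothetical helpers (has_crlf and strip_crlf). It assumes GNU sed, as the article's own command does, and it does not reproduce or replace crl-fix.sh itself.

```shell
# Succeed (exit status 0) when the file contains a carriage return.
# has_crlf and strip_crlf are hypothetical helper names.
has_crlf() {
    # printf produces a literal CR; command substitution only strips
    # trailing newlines, so the CR survives as the grep pattern.
    grep -q "$(printf '\r')" "$1"
}

# Remove trailing carriage returns in place, using the same sed
# expression given in the article (GNU sed syntax).
strip_crlf() {
    sed -i -e 's/\r$//' "$1"
}

# Possible use on the appliance before running the script:
#   cd /tmp
#   has_crlf crl-fix.sh && strip_crlf crl-fix.sh
#   chmod +x crl-fix.sh
```

Running the check first keeps the file untouched when it already has Unix line endings.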
