
- NSX Manager shows it is using 100% of CPU.
- A large number of firewall rules have recently been added to the environment.
- A large number of hosts are in the environment and configured to receive firewall rules.
- NSX Manager is unresponsive to web interface or API commands.
- The ESXi to NSX Manager communication channel appears down for several hosts.
- In the vsfwd.log file on a host showing the communication channel down, you see entries similar to:

    Re-read credentials to broker <IP Address>:5671: Logging in: Input/output error
    2018-04-18T16:00:04UTC rmqClient Closing, No Ack received for Client netClient index 7

- In the NSX Manager vsm.log file, you see entries similar to the following for the affected hosts:

    2018-04-18 12:34:50.894 MDT ERROR HeartbeatManagerHeartbeatTimer HeartbeatManager$HeartbeatTask:297 - Client has not responded to the heartbeat for longer than the alert threshold. Peer name = 'com.vmware.vshield.userworld', client token = 'host-71', client id = '<UUID>', last heartbeat response = '4', last published heartbeat = '74'

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
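To check a host log for the vsfwd.log symptoms described above, a small scan like the following may help. This is only a sketch: the marker strings are copied from the example entries in this article, the function name is made up for illustration, and real log lines in your environment may be worded differently.

```python
# Substrings taken from the example vsfwd.log entries in this article.
# Assumption: real entries contain the same wording; adjust as needed.
CHANNEL_DOWN_MARKERS = (
    "Logging in: Input/output error",
    "No Ack received for Client",
)


def find_channel_down_entries(lines):
    """Return (line_number, text) pairs for lines containing a known marker.

    `lines` can be any iterable of strings, such as an open file object.
    Line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in CHANNEL_DOWN_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

One way to use it is to open a copy of a host's vsfwd.log and pass the file object in, for example `find_channel_down_entries(open("vsfwd.log", errors="replace"))`. An empty result does not prove the channel is healthy; it only means these two example messages were not found.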
This article provides information on lowering CPU usage and ensuring firewall rules are getting published to all applicable hosts.
This issue occurs because the NSX Manager's communication channel to the ESXi hosts is down or unavailable. This leads to NSX Manager repeatedly trying to reconnect to the ESXi hosts and synchronize the firewall rules.
This issue is resolved in:
- VMware NSX for vSphere 6.2.8, available at VMware Downloads.
- VMware NSX for vSphere 6.3.6, available at VMware Downloads.
To work around this issue if you do not want to upgrade:

1. Stop the vsfwd service on all hosts, which should clear out the pending queues, by running this command:

    /etc/init.d/vShield-Stateful-Firewall stop

2. Restart the NSX Manager and wait a few minutes for the services to reach a ready state in the User Interface (UI) before proceeding to the next step.

    Note: No host syncs occur during this time because vsfwd is down.

3. Start the vsfwd service on a few hosts (5-8 hosts) at a time by running this command:

    /etc/init.d/vShield-Stateful-Firewall start

    Note: This spikes the NSX Manager CPU for a few minutes (about 10). Once the spike subsides, start vsfwd on the next batch of hosts.
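The batched start in the last step can be scripted. The following is a minimal sketch, not a supported tool. It assumes key-based root SSH access to each ESXi host, and it uses a fixed pause to approximate the roughly 10-minute CPU spike instead of measuring NSX Manager CPU. All function names and the host names in the usage note are hypothetical.

```python
import subprocess
import time

# Start command as given in the workaround steps.
START_COMMAND = "/etc/init.d/vShield-Stateful-Firewall start"


def make_batches(hosts, batch_size=5):
    """Split the host list into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


def start_over_ssh(host):
    """Run the start command on one host.

    Assumption: SSH is enabled on the host and root login works with keys.
    """
    subprocess.run(["ssh", f"root@{host}", START_COMMAND], check=True)


def start_vsfwd_in_batches(hosts, batch_size=5, wait_seconds=600,
                           start=start_over_ssh, sleep=time.sleep):
    """Start vsfwd one batch at a time, pausing between batches.

    No pause follows the final batch. `start` and `sleep` are injectable
    so the sequencing can be dry-run without touching any host.
    """
    batches = make_batches(list(hosts), batch_size)
    for index, batch in enumerate(batches):
        for host in batch:
            start(host)
        if index < len(batches) - 1:
            sleep(wait_seconds)
    return batches
```

For a dry run, pass `start=print` and `sleep=lambda seconds: None`, for example `start_vsfwd_in_batches(["esx01", "esx02", "esx03"], batch_size=2, start=print, sleep=lambda seconds: None)`. A fixed wait is the simplest choice here; confirming in the NSX Manager UI that CPU has settled before each new batch is closer to the intent of the step.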