...
NSX-T Data Center 3.2.0/3.2.0.1
An NSX-T Manager has been added or removed from the management cluster
Transport Nodes, Hosts or Edges, may show Controller Connectivity down on the NSX UI
On an ESX host, similar logging to this example may be seen in /var/run/log/nsx-syslog.log

    2022-03-09T13:56:40Z nsx-proxy: NSX 2100995 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" tid="2100995" level="ERROR" invalid="true"] VersionMastershipHandshakeClient: received MasterResponse UUID {<UUID4>} not in {<UUID1>, <UUID2>, <UUID3>}

The following behaviour may be observed

Prior to making any changes the Transport Node connects to 3 Managers and 1 Controller which is expected behaviour.
(from nsxcli shell)

    > get managers
    - 192.168.1.10 Connected (NSX-RPC) *
    - 192.168.1.11 Connected (NSX-RPC)
    - 192.168.1.12 Connected (NSX-RPC)

    > get controllers
    Controller IP    Port    SSL        Status       Is Physical Master    Session State    Controller FQDN
    192.168.1.10     1235    enabled    connected    true                  up               NA
    192.168.1.11     1235    enabled    not used     false                 null             NA
    192.168.1.12     1235    enabled    not used     false                 null             NA

In this example, a new Manager 192.168.1.13 is added to the cluster for the purpose of replacing an existing node.
The new Manager is reflected on the Transport Node connections

    > get managers
    - 192.168.1.10 Connected (NSX-RPC) *
    - 192.168.1.11 Connected (NSX-RPC)
    - 192.168.1.12 Connected (NSX-RPC)
    - 192.168.1.13 Connected (NSX-RPC) <<<< New Manager

However the new Manager IP is missing from Controller connections and Controller connection is now down

    > get controllers
    Thu Mar 10 2022 UTC 18:05:38.605
    Controller IP    Port    SSL        Status          Is Physical Master    Session State    Controller FQDN
    192.168.1.10     1235    enabled    disconnected    true                  down             NA
    192.168.1.11     1235    enabled    not used        false                 null             NA
    192.168.1.12     1235    enabled    not used        false                 null             NA

New Controller information has not been pushed to the Transport Node
(From root shell)

    # egrep "server|fqdn" /etc/vmware/nsx/controller-info.xml
    <server>192.168.1.10</server>
    <server>192.168.1.11</server>
    <server>192.168.1.12</server>
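
The manual comparison between the Manager list and controller-info.xml can be scripted. The following is a minimal sketch, not a VMware-provided tool: `missing_controllers` is a hypothetical helper name, and it assumes each Controller appears in the file as a `<server>IP</server>` element exactly as in the egrep output above, and that the host's grep supports the `-q` and `-F` options.

```shell
#!/bin/sh
# Sketch only: missing_controllers is a hypothetical helper, not part of NSX.
# Usage: missing_controllers FILE IP [IP ...]
# Prints each expected Controller IP that is NOT listed in FILE.
missing_controllers() {
    file="$1"
    shift
    for ip in "$@"; do
        # Fixed-string match on the whole element, so 192.168.1.1 does
        # not accidentally match 192.168.1.10.
        if ! grep -qF "<server>${ip}</server>" "$file"; then
            echo "$ip"
        fi
    done
}
```

For the example above, running `missing_controllers /etc/vmware/nsx/controller-info.xml 192.168.1.10 192.168.1.11 192.168.1.12 192.168.1.13` against the file contents shown would print 192.168.1.13, the Manager that was added but never pushed to the Transport Node.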
One Controller is responsible for handling the addition or deletion of a Controller to the cluster. This issue occurs because this Controller only sends the new list of Controllers to the Transport Nodes connected to it. Transport Nodes sharded to other Controllers do not get the updated list of Controllers and so lose their Controller connectivity.
This issue is resolved in NSX-T Data Center 3.2.1, available at VMware Downloads.
To work around this issue, each time a Manager is added to or removed from the Controller cluster, the nsx-proxy service must be restarted on the impacted Transport Nodes. The service restart will repopulate controller-info.xml and allow the Controller connection to come up.

From the Transport Node root shell:

    # /etc/init.d/nsx-proxy restart

Confirm the Controller file is populated with the correct Manager IPs:

    # egrep "server|fqdn" /etc/vmware/nsx/controller-info.xml

Refresh the UI and confirm the Transport Node (TN) status is healthy.
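
The confirmation step can be automated by polling the file after the restart. This is a rough sketch under stated assumptions: `wait_for_server` is a hypothetical helper, the retry count and interval are arbitrary defaults (the source does not say how long repopulation takes), and the file is assumed to list Controllers as `<server>IP</server>` elements.

```shell
#!/bin/sh
# Sketch only: wait_for_server is a hypothetical helper, not part of NSX.
# Usage: wait_for_server FILE IP [TRIES] [INTERVAL_SECONDS]
# Returns 0 once FILE lists IP as a <server> element, 1 if it never appears.
wait_for_server() {
    file="$1"
    ip="$2"
    tries="${3:-30}"      # assumed default, not taken from the source
    interval="${4:-2}"    # assumed default, not taken from the source
    while [ "$tries" -gt 0 ]; do
        # A missing or unreadable file counts as "not yet populated".
        if grep -qF "<server>${ip}</server>" "$file" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```

On an impacted Transport Node this might be used as `/etc/init.d/nsx-proxy restart && wait_for_server /etc/vmware/nsx/controller-info.xml 192.168.1.13`, checking the exit status before refreshing the UI. The function only reads the file; the restart itself remains the command given in the workaround.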