...
- Virtual machines running on the same ESXi host fail to communicate with each other.
- Virtual machines fail to communicate with the NSX Edge Gateway (ESG).
- Routing does not appear to be functioning despite a route being defined for the NSX Edge Gateway.
- Rebooting the NSX Edge does not resolve the issue.
- Running the esxcli network vswitch dvs vmware vxlan network list --vds-name=Name_VDS command on the ESXi host displays the VNIs as down.

  For example:

    ~ # esxcli network vswitch dvs vmware vxlan network list --vds-name=Compute_VDS
    VXLAN ID  Multicast IP               Control Plane                        Controller Connection  Port Count  MAC Entry Count  ARP Entry Count
    --------  -------------------------  -----------------------------------  ---------------------  ----------  ---------------  ---------------
        5001  N/A (headend replication)  Enabled (multicast proxy,ARP proxy)  192.168.110.203 (down)          1                1                0
        5000  N/A (headend replication)  Enabled (multicast proxy,ARP proxy)  192.168.110.201 (down)          1                0                0

- In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    2015-07-16T16:18:58.340Z [FFC97B70 info 'Default'] Vdrb: core app ready on 10.1.8.13:0
    2015-07-16T16:18:58.341Z [FFC97B70 info 'Default'] Core: Controller is ready: 10.1.8.13:0
    2015-07-16T16:19:27.112Z [FFDBBB70 error 'Default'] Async read callback failed, connection 10.1.8.13:0 was shutdown by peer.
    2015-07-16T16:19:27.113Z [FFD7AB70 info 'Default'] Vxlan: ctrl connection 10.1.8.13:0 down
    2015-07-16T16:19:27.113Z [FFD7AB70 info 'Default'] Vdrb: ctrl connection 10.1.8.13:0 down
    2015-07-16T16:19:28.350Z [FFD7AB70 info 'Default'] Core: Hello sent: 10.1.8.13:0
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Vxlan: received freqCtrlPeriod 1000 freqCtrlQuery 100 freqCtrlUpdate 20
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Vxlan: received bteAgeingTime 300
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Vxlan: received arpAgeingTime 300
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Core: Max Pkt Len of peer 10.1.8.13: 4096

- In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    2015-11-02T14:36:10.341Z [5BB13B70 info 'Default'] Core: Controller is ready: 10.222.254.182:0
    2015-11-02T14:36:40.443Z [5B9EFB70 info 'Default'] Core: Controller is ready: 10.222.254.182:0
    2015-11-02T14:37:10.364Z [5BB13B70 info 'Default'] Core: Controller is ready: 10.222.254.182:0
    2015-11-02T14:37:40.385Z [5BA91B70 info 'Default'] Core: Controller is ready: 10.222.254.182:0

- In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10119 controller 10.222.254.182
    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10122 controller 10.222.254.182
    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10124 controller 10.222.254.182
    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10127 controller 10.222.254.182

- In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    2015-11-02T14:38:09.152Z [5B8EBB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.
    2015-11-02T14:38:39.154Z [5B8EBB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.
    2015-11-02T14:39:09.157Z [5B8EBB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.
    2015-11-02T14:39:39.159Z [5B9EFB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
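The symptoms above can be checked mechanically. The following is a minimal sketch, assuming the command output and the log text have already been captured as strings and that they follow the layout of the excerpts shown here. The function names `down_vnis` and `peer_shutdowns` are hypothetical helpers, not part of any VMware tool.

```python
import re
from collections import Counter

# One data row of the VXLAN table: the VNI comes first, and the controller
# column is an IPv4 address followed by its state in parentheses.
# Assumption: the layout matches the example output in this article.
ROW = re.compile(r"^\s*(\d+)\s.*?(\d{1,3}(?:\.\d{1,3}){3})\s+\((\w+)\)")

# The netcpa.log error quoted in this article; captures the controller IP.
SHUTDOWN = re.compile(
    r"Async read callback failed, connection (\d{1,3}(?:\.\d{1,3}){3}):\d+ "
    r"was shutdown by peer"
)


def down_vnis(table_text):
    """Return {vni: controller_ip} for rows whose connection shows (down)."""
    result = {}
    for line in table_text.splitlines():
        match = ROW.match(line)
        if match and match.group(3) == "down":
            result[int(match.group(1))] = match.group(2)
    return result


def peer_shutdowns(log_text):
    """Count 'shutdown by peer' errors per controller IP."""
    return Counter(SHUTDOWN.findall(log_text))
```

A repeating count for the same controller at roughly regular intervals, together with VNIs reported as down, matches the pattern in the excerpts. Neither check is proof of this specific defect on its own, since a controller connection can be down for other reasons.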
This issue occurs because netcpa fails to reset a flag in msgScheduler when it tries to reconnect to the NSX Controller. The flag remains set to true, which indicates that a message is currently being sent, so netcpa does not send out any further messages.

An example from a live netcpa core dump:

    (gdb) p (MsgScheduler *)0x1f0339a8->_txInProgress
    Attempt to extract a component of a value that is not a structure pointer.
    (gdb) p ((MsgScheduler *)0x1f0339a8)->_txInProgress
    $5 = true =========================================> this should be false
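To illustrate how a stale "transmit in progress" flag can silence a sender, here is a toy model. It is not netcpa code: the class `ToyScheduler`, its methods, and the message names are all hypothetical, and the real msgScheduler internals are not described in this article. The sketch only assumes that a flag is set when a write starts and cleared when the write completes.

```python
class ToyScheduler:
    """Hypothetical stand-in for a message scheduler; not netcpa code."""

    def __init__(self, reset_on_reconnect):
        self.reset_on_reconnect = reset_on_reconnect
        self.tx_in_progress = False  # what the scheduler believes
        self.wire_busy = False       # whether a write is really pending
        self.connected = True
        self.queue = []
        self.sent = []

    def send(self, msg):
        self.queue.append(msg)
        self._pump()

    def _pump(self):
        # Start a write only if none is believed to be in flight.
        if self.tx_in_progress or not self.connected or not self.queue:
            return
        self.tx_in_progress = True
        self.wire_busy = True

    def on_write_complete(self):
        self.sent.append(self.queue.pop(0))
        self.tx_in_progress = False
        self.wire_busy = False
        self._pump()

    def on_disconnect(self):
        # The pending write is lost, so its completion never fires.
        self.connected = False
        self.wire_busy = False

    def on_reconnect(self):
        self.connected = True
        if self.reset_on_reconnect:
            self.tx_in_progress = False  # the reset that the buggy path skips
        self._pump()


def run_scenario(reset_on_reconnect):
    s = ToyScheduler(reset_on_reconnect)
    s.send("hello")      # write starts, flag becomes True
    s.on_disconnect()    # peer shuts the connection down mid-write
    s.on_reconnect()
    s.send("join-vni")   # queued after the reconnect
    while s.wire_busy:   # deliver completions for writes really in flight
        s.on_write_complete()
    return s.sent, s.tx_in_progress
```

With `reset_on_reconnect=False` the flag stays true after the reconnect and nothing is ever sent, which mirrors the `$5 = true` value in the core dump. With `reset_on_reconnect=True` both queued messages go out.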
This issue is resolved in:

- VMware NSX for vSphere 6.1.5, available at VMware Downloads. For more information, see the NSX for vSphere 6.1.5 Release Notes.
- VMware NSX for vSphere 6.2, available at VMware Downloads. For more information, see the NSX for vSphere 6.2.0 Release Notes.

If you are unable to upgrade, follow the workaround.

To work around the issue, restart the netcpa service on the affected ESXi host(s):

1. Log in as root to the ESXi host through SSH or through the console.
2. Run this command to restart the netcpa agent on the ESXi host:

    /etc/init.d/netcpad restart

VMware NSX for vSphere 6.2 introduces a proactive health check that periodically reports the central control plane to local control plane status to NSX Manager, where it is displayed in the NSX Manager UI. This report also serves as a heartbeat to detect the operational status of the NSX Manager to ESXi host netcpa channel.
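The workaround steps can be wrapped in a small script. This is a minimal sketch, assuming it runs as root on the affected ESXi host with a Python interpreter available. The function names `vxlan_list_cmd` and `restart_if_down` and the returned status strings are hypothetical. Only the two commands themselves come from this article.

```python
import subprocess

# Restart command taken from the workaround steps in this article.
RESTART_CMD = ["/etc/init.d/netcpad", "restart"]


def vxlan_list_cmd(vds_name):
    """Build the esxcli command from this article for the given VDS name."""
    return ["esxcli", "network", "vswitch", "dvs", "vmware", "vxlan",
            "network", "list", "--vds-name=" + vds_name]


def restart_if_down(vds_name,
                    check_output=subprocess.check_output,
                    call=subprocess.call):
    """Restart netcpa only if some controller connection shows (down).

    check_output and call are injectable so the logic can be dry-run
    without touching a real host.
    """
    output = check_output(vxlan_list_cmd(vds_name))
    if isinstance(output, bytes):
        output = output.decode("utf-8", "replace")
    if "(down)" not in output:
        return "healthy"
    return "restarted" if call(RESTART_CMD) == 0 else "restart-failed"
```

For example, `restart_if_down("Compute_VDS")` would check the VDS from the example output. A "(down)" state can have other causes, so a restart that does not bring the connections back up points to a different problem.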