BugZero | VMware BugID 2137005 - Netcpa issues in VMware NSX for vSphere 6.x

VMware - Defect ID: 2137005

Netcpa issues in VMware NSX for vSphere 6.x

Last updated on December 26th, 2015

BugZero Risk Score
5.3 Medium

Overall: N/A

Severity: N/A

Community: N/A

Lifecycle: N/A


Bug Details

Description

Symptoms

Virtual machines running on the same ESXi host fail to communicate with each other.
A virtual machine fails to communicate with the NSX Edge Gateway (ESG).
Routing does not appear to be functioning despite a defined route for the NSX Edge Gateway.
Rebooting the NSX Edge does not resolve the issue.
Running the esxcli network vswitch dvs vmware vxlan network list --vds-name=Name_VDS command on the ESXi host displays the VNIs as down. For example:

    ~ # esxcli network vswitch dvs vmware vxlan network list --vds-name=Compute_VDS
    VXLAN ID  Multicast IP               Control Plane                        Controller Connection   Port Count  MAC Entry Count  ARP Entry Count
    --------  -------------------------  -----------------------------------  ----------------------  ----------  ---------------  ---------------
    5001      N/A (headend replication)  Enabled (multicast proxy,ARP proxy)  192.168.110.203 (down)  1           1                0
    5000      N/A (headend replication)  Enabled (multicast proxy,ARP proxy)  192.168.110.201 (down)  1           0                0

In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    2015-07-16T16:18:58.340Z [FFC97B70 info 'Default'] Vdrb: core app ready on 10.1.8.13:0
    2015-07-16T16:18:58.341Z [FFC97B70 info 'Default'] Core: Controller is ready: 10.1.8.13:0
    2015-07-16T16:19:27.112Z [FFDBBB70 error 'Default'] Async read callback failed, connection 10.1.8.13:0 was shutdown by peer.
    2015-07-16T16:19:27.113Z [FFD7AB70 info 'Default'] Vxlan: ctrl connection 10.1.8.13:0 down
    2015-07-16T16:19:27.113Z [FFD7AB70 info 'Default'] Vdrb: ctrl connection 10.1.8.13:0 down
    2015-07-16T16:19:28.350Z [FFD7AB70 info 'Default'] Core: Hello sent: 10.1.8.13:0
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Vxlan: received freqCtrlPeriod 1000 freqCtrlQuery 100 freqCtrlUpdate 20
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Vxlan: received bteAgeingTime 300
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Vxlan: received arpAgeingTime 300
    2015-07-16T16:19:28.350Z [FFC97B70 info 'Default'] Core: Max Pkt Len of peer 10.1.8.13: 4096

In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    2015-11-02T14:36:10.341Z [5BB13B70 info 'Default'] Core: Controller is ready: 10.222.254.182:0
    2015-11-02T14:36:40.443Z [5B9EFB70 info 'Default'] Core: Controller is ready: 10.222.254.182:0
    2015-11-02T14:37:10.364Z [5BB13B70 info 'Default'] Core: Controller is ready: 10.222.254.182:0
    2015-11-02T14:37:40.385Z [5BA91B70 info 'Default'] Core: Controller is ready: 10.222.254.182:0

In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10119 controller 10.222.254.182
    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10122 controller 10.222.254.182
    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10124 controller 10.222.254.182
    netcpa.log:2015-11-02T14:39:40.380Z [5B96DB70 info 'Default'] Vxlan: send VNI Membership Update(Join) to the controller: VNI 10127 controller 10.222.254.182

In the /var/log/netcpa.log file on the ESXi host, you see entries similar to:

    2015-11-02T14:38:09.152Z [5B8EBB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.
    2015-11-02T14:38:39.154Z [5B8EBB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.
    2015-11-02T14:39:09.157Z [5B8EBB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.
    2015-11-02T14:39:39.159Z [5B9EFB70 error 'Default'] Async read callback failed, connection 10.222.254.182:0 was shutdown by peer.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
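The symptoms above can be checked mechanically once the command output and log have been copied off the host. The following is a minimal sketch, assuming the text formats match the examples in this article. The function names vnis_with_down_controller and count_peer_shutdowns are hypothetical helpers written for illustration; they are not part of any VMware tool.

```python
import re
from collections import Counter

# An IPv4-looking address such as 192.168.110.203 (no range validation).
_IP = r"\d{1,3}(?:\.\d{1,3}){3}"

# A row of the esxcli listing whose controller column reads "<ip> (down)".
# Group 1 is the VXLAN ID at the start of the row, group 2 the controller IP.
DOWN_ROW_RE = re.compile(r"^\s*(\d+)\s+.*?(" + _IP + r")\s+\(down\)", re.MULTILINE)

# The netcpa.log error quoted in this article; group 1 is the controller IP.
SHUTDOWN_RE = re.compile(
    r"Async read callback failed, connection (" + _IP + r"):\d+ was shutdown by peer"
)


def vnis_with_down_controller(esxcli_output):
    """Map each VXLAN ID to its controller IP when the connection shows (down)."""
    return {int(vni): ip for vni, ip in DOWN_ROW_RE.findall(esxcli_output)}


def count_peer_shutdowns(log_text):
    """Count 'shutdown by peer' errors per controller IP in netcpa.log text."""
    return Counter(SHUTDOWN_RE.findall(log_text))
```

A non-empty result from the first function together with a steadily growing count from the second is consistent with the symptoms listed here, but it does not by itself prove that this defect is the cause.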

Cause

This issue occurs because netcpa fails to reset the flag for the msgScheduler when it tries to reconnect to the NSX Controller. Netcpa does not send out any message because the flag in msgScheduler remains set to true, which indicates that a message is already being sent out at that time.

An example from a live netcpa core dump:

    (gdb) p (MsgScheduler *)0x1f0339a8->_txInProgress
    Attempt to extract a component of a value that is not a structure pointer.
    (gdb) p ((MsgScheduler *)0x1f0339a8)->_txInProgress
    $5 = true =========================================> this should be false
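To make the failure mode concrete, here is a toy model of a scheduler guarded by a single "transmit in progress" flag. It is a sketch for illustration only and is not netcpa's code: the class ToyMsgScheduler, its methods, and the message strings are all hypothetical, and the only detail taken from this article is that a flag left at true after a reconnect blocks further sends.

```python
class ToyMsgScheduler:
    """Hypothetical single-flight sender; not the real netcpa MsgScheduler."""

    def __init__(self):
        self.tx_in_progress = False  # plays the role of _txInProgress
        self.in_flight = None
        self.queue = []
        self.sent = []

    def enqueue(self, msg):
        self.queue.append(msg)
        self.pump()

    def pump(self):
        # Start a transmit only if none is marked as in flight.
        if self.tx_in_progress or not self.queue:
            return
        self.tx_in_progress = True
        self.in_flight = self.queue.pop(0)

    def on_tx_complete(self):
        # Normal path: the completion callback clears the flag.
        self.sent.append(self.in_flight)
        self.in_flight = None
        self.tx_in_progress = False
        self.pump()

    def on_reconnect(self, reset_flag):
        # The old connection is gone, so its completion callback never fires.
        self.in_flight = None
        if reset_flag:
            self.tx_in_progress = False
        self.pump()


def simulate(reset_flag):
    """Drop the connection mid-send, reconnect, then try to send again."""
    s = ToyMsgScheduler()
    s.enqueue("hello")          # transmit starts, flag becomes True
    s.on_reconnect(reset_flag)  # peer shut the connection before completion
    s.enqueue("vni-join")
    if s.in_flight is not None:
        s.on_tx_complete()      # completes only if a transmit really started
    return s.sent, s.queue
```

In this model, simulate(False) leaves "vni-join" queued forever because the stale flag blocks pump(), while simulate(True) delivers it. That mirrors the described behavior, in which the flag stays true and no message leaves the host.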

Resolution

This issue is resolved in:

VMware NSX for vSphere 6.1.5, available at VMware Downloads. For more information, see the NSX for vSphere 6.1.5 Release Notes.
VMware NSX for vSphere 6.2, available at VMware Downloads. For more information, see the NSX for vSphere 6.2.0 Release Notes.

If you are unable to upgrade, follow the workaround.

To work around the issue, restart the netcpa service on the affected ESXi host(s):

1. Log in as root to the ESXi host through SSH or through the console.
2. Run the following command to restart the netcpa agent on the ESXi host:

    /etc/init.d/netcpad restart

VMware NSX for vSphere 6.2 introduces a proactive health check that periodically reports the status of the connection between the central control plane and the local control plane to NSX Manager, where it is displayed in the NSX Manager UI. This report also serves as a heartbeat to detect the operational status of the channel between NSX Manager and netcpa on the ESXi host.
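When triaging many environments, the fixed releases named above can be encoded as a small check. This is a sketch under stated assumptions: the function has_netcpa_fix is hypothetical, version strings are assumed to be dotted integers, and it assumes that releases later than 6.1.5 within the 6.1 line, and any release from 6.2 onward, carry the fix. The article itself names only 6.1.5 and 6.2, so confirm against the release notes for your exact version.

```python
def has_netcpa_fix(version):
    """Return True if a dotted NSX for vSphere version is at or past a fixed release."""
    parts = tuple(int(p) for p in version.split("."))
    # Pad so that "6.2" compares as (6, 2, 0); ignore anything past three fields.
    major, minor, patch = (parts + (0, 0))[:3]
    if (major, minor) == (6, 1):
        # Fixed within the 6.1 line starting at 6.1.5 (assumed to hold for later 6.1.x).
        return patch >= 5
    # Otherwise fixed from 6.2 onward (assumption for releases after 6.2).
    return (major, minor, patch) >= (6, 2, 0)
```

For example, has_netcpa_fix("6.1.4") is False and has_netcpa_fix("6.2") is True. Hosts on a version for which this returns False are candidates for the netcpad restart workaround.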

Related Information

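Netcpa issues in VMware NSX for vSphere 6.x (Japanese: VMware NSX for vSphere 6.x における Netcpa の問題)
Netcpa issues in VMware NSX for vSphere 6.x (Chinese: VMware NSX for vSphere 6.x 中的 Netcpa 问题)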
