Poor performance for traffic going through an NSX Edge when using ESXi 6.5 or above

Hardware LRO is disabled/not supported:

#esxcli network nic queue loadbalancer list
NIC     RxQPair  RxQNoFeature  PreEmptibleQ  RxQLatency  RxDynamicLB  DynamicQPool  NumaIOAwareLB  RSS  LRO  GeneveOAM
------  -------  ------------  ------------  ----------  -----------  ------------  -------------  ---  ---  ---------
vmnic0  UA       ND            UA            UA          NA           UA            NA             UA   UA   UA
vmnic1  UA       ND            UA            UA          NA           UA            NA             UA   UA   UA
vmnic2  UA       ND            UA            UA          NA           UA            NA             UA   UA   UA
vmnic3  UA       ND            UA            UA          NA           UA            NA             UA   UA   UA

Where:
- U: Setting unsupported by device
- S: Setting supported by device

Software LRO is enabled:

#esxcli system settings advanced list -o /Net/NetpollSwLRO
   Path: /Net/NetpollSwLRO
   Type: integer
   Int Value: 1 <--- 1: enabled, 0: disabled
   Default Int Value: 1
   Min Value: 0
   Max Value: 1
   String Value:
   Default String Value:
   Valid Characters:
   Description: Whether to perform SW LRO on pkts in netpoll

The output of #net-stats -A -t WwQqihV shows pktsizeout higher than the NSX Edge vNic MTU:

{"name": "TEST-EDGE.eth1", "switch": "DvsPortset-0", "id": 67108882, "mac": "00:50:56:b7:06:b6", "rxmode": 0, "tunemode": 0, "uplink": "false",
  "txpps": 9061, "txmbps": 5.2, "txsize": 72, "txeps": 0.00, "rxpps": 11230, "rxmbps": 136.0, "rxsize": 1513, "rxeps": 0.00,
  "vnic": { "type": "vmxnet3", "ring1sz": 512, "ring2sz": 128, "tsopct": 0.0, "tsotputpct": 0.0, "txucastpct": 100.0, "txeps": 0.0, "lropct": 0.0, "lrotputpct": 0.0, "rxucastpct": 100.0, "rxeps": 0.0, "maxqueuelen": 0, "requeuecnt": 0.0, "agingdrpcnt": 0.0, "txdisc": 0.0, "qstop": 0.0, "txallocerr": 0.0, "txtsosplit": 0.0, "r1full": 0.0, "r2full": 0.0, "sgerr": 0.0},
  "rxqueue": { "count": 1, "details": [ {"intridx": 0, "pps": 11230, "mbps": 136.0, "errs": 0.0} ]},
  "txqueue": { "count": 1, "details": [ {"intridx": 0, "pps": 9061, "mbps": 5.2, "errs": 0.0} ]},
  "intr": { "count": 2, "details": [ 7471, 0] },
  "sys": [ "151055" ],
  "vcpu": [ "120137" ],
  "histos":[
    { "name": "pktsizein", "min": 60, "max": 102 ,"mean": 72, "count": 9061, "values":[[66, 46.6], [512, 53.4], [1024, 0.0], [1518, 0.0], [4096, 0.0], [9018, 0.0], [16402, 0.0], [32786, 0.0], [65554, 0.0], [131072, 0.0], [262144, 0.0], [262145, 0.0]] },
    { "name": "pktsizeout", "min": 60, "max": 4410 ,"mean": 1661, "count": 10191, "values":[[66, 0.0], [512, 0.0], [1024, 0.0], [1518, 89.8], [4096, 10.2], [9018, 0.0], [16402, 0.0], [32786, 0.0], [65554, 0.0], [131072, 0.0], [262144, 0.0], [262145, 0.0]] },
    { "name": "clusterin", "min": 1, "max": 2 ,"mean": 1, "count": 6958, "values":[[1, 69.8], [2, 30.2], [4, 0.0], [8, 0.0], [16, 0.0], [32, 0.0], [64, 0.0], [128, 0.0], [265, 0.0], [512, 0.0], [1024, 0.0], [2048, 0.0], [4096, 0.0], [8192, 0.0], [8193, 0.0]] },
    { "name": "clusterout", "min": 1, "max": 2 ,"mean": 1, "count": 6958, "values":[[1, 53.5], [2, 46.5], [4, 0.0], [8, 0.0], [16, 0.0], [32, 0.0], [64, 0.0], [128, 0.0], [265, 0.0], [512, 0.0], [1024, 0.0], [2048, 0.0], [4096, 0.0], [8192, 0.0], [8193, 0.0]] },
    { "name": "pktintervalin", "min": 0, "max": 34601 ,"mean": 110, "count": 9061, "values":[[0, 23.2], [10, 0.0], [25, 0.0], [50, 0.0], [100, 0.1], [250, 76.5], [500, 0.2], [750, 0.0], [1000, 0.0], [2000, 0.0], [5000, 0.0], [10000, 0.0], [20000, 0.0], [25000, 0.0], [50000, 0.0], [75000, 0.0], [100000, 0.0], [500000, 0.0], [500001, 0.0]] },
    { "name": "pktintervalout", "min": 0, "max": 34581 ,"mean": 98, "count": 10190, "values":[[0, 31.7], [10, 0.0], [25, 0.0], [50, 0.0], [100, 0.1], [250, 68.1], [500, 0.1], [750, 0.0], [1000, 0.0], [2000, 0.0], [5000, 0.0], [10000, 0.0], [20000, 0.0], [25000, 0.0], [50000, 0.0], [75000, 0.0], [100000, 0.0], [500000, 0.0], [500001, 0.0]] }
  ]},

A packet capture from the ESXi host done inbound on the Edge vNic connected to VXLAN (in most cases a transit VXLAN) reveals TCP segments with incorrect checksums:

#pktcap-uw --switchport EDGE_VXLAN_VNIC_PORT_ID --dir 1 --ng -o - | tcpdump -envr - | grep incorrect
11.11.11.11.17229 > 22.22.22.22.rsync: Flags [S], cksum 0x5626 (incorrect -> 0x3bdf), seq 4049220372, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 4], length 0
11.11.11.11.17229 > 22.22.22.22.rsync: Flags [.], cksum 0x561a (incorrect -> 0x3569), ack 2641433779, win 1825, length 0
11.11.11.11.17229 > 22.22.22.22.rsync: Flags [P.], cksum 0x5628 (incorrect -> 0x8dbe), seq 0:14, ack 1, win 1825, length 14
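The pktsizeout check can be scripted rather than read by eye. The following is a minimal sketch, not a supported tool: it assumes the per-port entry has already been extracted from the net-stats output as a JSON object, that each pair in "values" is a bucket upper bound followed by the percentage of packets in that bucket, and that the largest legitimate frame is the vNic MTU plus 18 bytes of Ethernet and VLAN overhead. The function name oversized_out and the trimmed sample are hypothetical.

```python
import json

# Assumption: 14-byte Ethernet header plus a 4-byte VLAN tag on top of the MTU.
ETH_OVERHEAD = 18


def oversized_out(port, mtu=1500):
    """Return (largest outbound packet, percent of packets in buckets above the frame limit)."""
    limit = mtu + ETH_OVERHEAD
    histo = next(h for h in port["histos"] if h["name"] == "pktsizeout")
    # Assumption: each entry is [bucket upper bound, percent of packets].
    pct = sum(share for bound, share in histo["values"] if bound > limit)
    return histo["max"], round(pct, 1)


# Trimmed copy of the port entry shown above (only the fields used here).
sample = json.loads("""
{"name": "TEST-EDGE.eth1",
 "histos": [
   {"name": "pktsizeout", "min": 60, "max": 4410, "mean": 1661, "count": 10191,
    "values": [[66, 0.0], [512, 0.0], [1024, 0.0], [1518, 89.8], [4096, 10.2],
               [9018, 0.0], [16402, 0.0], [32786, 0.0], [65554, 0.0],
               [131072, 0.0], [262144, 0.0], [262145, 0.0]]}
 ]}
""")

print(oversized_out(sample))  # (4410, 10.2)
```

A non-zero percentage together with a maximum above the frame limit matches the symptom described above: packets larger than the Edge vNic MTU are being delivered toward the Edge.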
The issue occurs on ESXi 6.5 and above when the pNIC or its driver does not support Hardware LRO. In that case "Uplink Software LRO" (a feature introduced in ESXi 6.5) performs the LRO operation and aggregates multiple segments into larger segments for delivery.

When those larger segments reach the Edge's vNIC, the vmxnet3 backend finds that the guest OS did not request LRO (i.e. #ethtool -K DEVNAME lro off for Linux) and re-segments each large segment to match the Edge vNIC MTU. However, the resulting segments are marked as checksum verified in the vmxnet3 backend receive descriptors even though no correct checksum was actually inserted. As a result, the segments carry an invalid TCP checksum and are forwarded by the Edge with that incorrect checksum, so the destination drops them. The segments are eventually retransmitted, which degrades performance.
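To illustrate why a re-segmented packet needs a freshly computed checksum, the sketch below implements the generic 16-bit one's-complement Internet checksum (RFC 1071) over raw bytes. It is a simplification: it ignores the TCP pseudo-header and header fields, and the stale value used here (the checksum of the aggregated data) is only an assumption for demonstration, since the article does not say which value the affected segments actually carry.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (RFC 1071 arithmetic)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verifies(data: bytes, checksum: int) -> bool:
    """Receiver-side check: data plus its checksum must sum to all ones."""
    if len(data) % 2:
        data += b"\x00"
    return internet_checksum(data + checksum.to_bytes(2, "big")) == 0


segment = bytes([0, 1]) * 730   # one 1460-byte payload
aggregated = segment * 3        # what LRO would hand over as one large segment

stale = internet_checksum(aggregated)  # value computed for the large segment
fresh = internet_checksum(segment)     # value recomputed after re-segmentation

print(verifies(segment, fresh))  # True
print(verifies(segment, stale))  # False
```

The second result is the situation the destination sees: the segment is intact, but the checksum it carries was not computed for it, so the segment is dropped.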
The issue is fixed in NSX for vSphere 6.3.6 and NSX for vSphere 6.4.2.
To work around the issue, disable Software LRO on the ESXi hosts where the Edge VMs are running and reboot the ESXi host for the changes to apply.

To enable Software LRO:
#esxcli system settings advanced set -o /Net/NetpollSwLRO -i 1

To disable Software LRO:
#esxcli system settings advanced set -o /Net/NetpollSwLRO -i 0

To verify Software LRO:
#esxcli system settings advanced list -o /Net/NetpollSwLRO
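When many hosts need checking, the verify step can be automated by parsing the text that the list command prints. This is a minimal sketch under the assumption that the output uses the "Key: value" layout shown earlier in this article; the helper names parse_advanced_setting and sw_lro_enabled are hypothetical, and the sample text is copied from the article rather than captured live.

```python
def parse_advanced_setting(text: str) -> dict:
    """Turn 'Key: value' lines from the esxcli advanced-setting listing into a dict."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        fields[key.strip()] = value.strip()
    return fields


def sw_lro_enabled(text: str) -> bool:
    """True when /Net/NetpollSwLRO reports Int Value 1 (enabled)."""
    return int(parse_advanced_setting(text)["Int Value"]) == 1


# Sample output as shown in this article (without the inline annotation).
sample_output = """\
   Path: /Net/NetpollSwLRO
   Type: integer
   Int Value: 1
   Default Int Value: 1
   Min Value: 0
   Max Value: 1
   String Value:
   Default String Value:
   Valid Characters:
   Description: Whether to perform SW LRO on pkts in netpoll
"""

print(sw_lro_enabled(sample_output))  # True
```

On a host where the workaround has been applied, the same check should return False. The text could be collected over SSH or with any other tooling already in use; that part is deliberately left out of the sketch.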