For any HPE ProLiant for Microsoft Azure Stack Hub Gen10 solutions listed in the Scope section below, the Azure Stack Hub nodes may experience excessive RDMA (Remote Direct Memory Access) frame drops for storage traffic.

When this issue occurs, Storage Spaces Direct CSV (Cluster Shared Volume) Auto-Pause events will be reported, and Storage Spaces Direct (S2D) will experience traffic interruptions. As a result, cluster storage performance will be affected.

The above issue occurs when the Azure Stack Hub nodes are connected to either HPE FlexFabric 5950 Switch Series or HPE Networking Comware Switch Series 5945.

To determine if this issue has occurred, perform the following steps:

1. From the Azure Stack Hub node, open the Administrator Portal.
2. Select Storage Resource Provider, and then Volume Metrics, as shown below.

Additionally, this issue can be confirmed at the switch level. Log into the corresponding switches, and execute the commands shown below to review the cumulative "dropped packets".

    screen-length disable
    display buffer queue
    display packet-drop interface
    display qos queue-statistics interface outbound

Dropped packets on Queue 3 and Queue 5 indicate incomplete RoCE (RDMA over Converged Ethernet) and buffer configuration on the switches.
The scenario described above applies to the following Azure Stack Hub nodes:

- HPE ProLiant DL360 Gen10 with Microsoft Azure Stack All Flash Node
- HPE ProLiant DL360 Gen10 with Microsoft Azure Stack All Flash Rugged Node
- HPE DL380 Gen10 with Microsoft Azure Stack All Flash Node
- HPE DL380 Gen10 with Microsoft Azure Stack Node

Additionally, the issue applies only when the above Azure Stack Hub nodes are connected to either of the following switches:

- HPE FlexFabric 5950 Switch Series running firmware versions prior to 7.10.R6301P03.
- HPE Networking Comware Switch Series 5945 running firmware versions prior to 7.10.R6710P03.
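The "prior to" check on the switch firmware can be sketched in Python as below. This is a minimal illustration, not an HPE tool. It assumes version strings follow the layout seen in this advisory (`<major>.<minor>.R<release>P<patch>`) and that versions order numerically field by field, which may not hold across every release branch. The model keys `"5950"` and `"5945"` and the function names are hypothetical and chosen only for this example.

```python
import re

# Fixed firmware versions quoted in this advisory. The short model
# keys are hypothetical labels used only in this sketch.
FIXED_VERSIONS = {
    "5950": "7.10.R6301P03",
    "5945": "7.10.R6710P03",
}

# Assumed layout: <major>.<minor>.R<release>, optionally followed by P<patch>.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.R(\d+)(?:P(\d+))?$")


def parse_version(text):
    """Turn a string such as '7.10.R6301P03' into a comparable tuple of ints."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, release, patch = match.groups()
    # A missing patch field is treated as patch 0 (an assumption).
    return (int(major), int(minor), int(release), int(patch or 0))


def is_affected(model, running_version):
    """Return True if running_version sorts before the fixed version for model."""
    return parse_version(running_version) < parse_version(FIXED_VERSIONS[model])
```

For example, `is_affected("5950", "7.10.R6301P02")` evaluates to true, while the fixed version itself does not. Version strings in any other layout raise `ValueError` and should be checked by hand against the Release Notes.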
To avoid this issue, update the switch firmware to the versions shown below. The Release Notes provide detailed update instructions.

- HPE FlexFabric 5950 Switch Series firmware version 7.10.R6301P03
- HPE Networking Comware Switch Series 5945 firmware version 7.10.R6710P03

Additional information can be found at the following link: Switch firmware

Once the switch firmware has been updated, log into the switch as admin, and execute the following commands in sys mode:

    int range twen1/0/1 to twen1/0/16
    qos wrr af1 group sp
    qos wrr af2 group sp
    qos wrr af4 group sp
    qos wrr cs6 group sp
    qos wrr cs7 group sp
    acl number 3001
    rule 0 permit tcp destination-port eq 445 counting
    rule 1 permit udp destination-port eq 4791 counting
    rule 2 permit tcp source-port eq 445 counting
    traffic classifier ROCE_class operator or
    if-match acl 3001
    traffic behavior ROCE_behavior
    remark dot1p 3
    accounting packet
    qos policy ROCE_policy
    classifier ROCE_class behavior ROCE_behavior mode dcbx
    int range twen1/0/1 to twen1/0/16
    qos apply policy ROCE_policy outbound
    int range twen1/0/1 to twen1/0/16
    lldp enable
    lldp tlv-enable dot1-tlv dcbx
    dcbx version rev101
    qos wred queue table queue-table3
    queue 3 weighting-constant 12
    queue 3 drop-level 0 low-limit 10 high-limit 20 discard-probability 30
    queue 3 ecn
    int hun1/0/25 <port connecting to peer ToR switch>
    qos wrr byte-count
    qos wrr 3 group sp
    qos wred apply queue-table3
    int hun1/0/26 <port connecting to peer ToR switch>
    qos wrr byte-count
    qos wrr 3 group sp
    qos wred apply queue-table3
    int range twen1/0/1 to twen1/0/16
    qos gts queue 3 cir 12500000 cbs 15680000
    qos gts queue 5 cir 500000 cbs 16000000
    buffer egress cell queue 3 shared ratio 100
    buffer egress cell queue 5 shared ratio 100
    buffer apply
    save force

Note: Disregard any warning messages indicating that sections of the above configuration are already configured.

To confirm the new configuration has been applied successfully, execute the following command:

    display current-configuration
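Reviewing the `display current-configuration` output by eye can be partly automated. The Python sketch below takes the captured output as text and reports which expected lines do not appear in it. It rests on assumptions: the output has been saved to a string or file by some means outside this sketch, and the switch prints each command in the same form in which it was entered. The switch may normalize or expand some commands (interface ranges, for instance), so a line reported as missing is a prompt for manual review, not proof of a failed configuration. The function name and the short `EXPECTED_LINES` sample are hypothetical.

```python
# A small, illustrative subset of the commands from this advisory.
# Extend or adjust it to match how the switch actually prints them.
EXPECTED_LINES = [
    "rule 1 permit udp destination-port eq 4791 counting",
    "traffic classifier ROCE_class operator or",
    "qos policy ROCE_policy",
    "queue 3 ecn",
]


def _normalize(line):
    """Collapse runs of whitespace and strip indentation."""
    return " ".join(line.split())


def missing_lines(config_text, expected=EXPECTED_LINES):
    """Return the expected lines that do not appear in config_text.

    Comparison is exact per line after whitespace normalization.
    """
    present = {_normalize(line) for line in config_text.splitlines()}
    return [line for line in expected if _normalize(line) not in present]
```

An empty list from `missing_lines(captured_output)` means every expected line was found verbatim. Any other result lists the lines to check by hand.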
Operating Systems Affected: Not Applicable