...
Document Version   Release Date     Details
2                  April 04, 2023   The Resolution section has been updated with the firmware version containing the solution.
1                  May 18, 2022     Original document release.

For any HPE Synergy 12000 Frame containing one or more pairs of Mellanox SH2200 Switch Modules for HPE Synergy configured with Multi-Chassis Link Aggregation Group (MLAG), any restart of the Frame Link Module (FLM) holding the active redundancy role may result in a connectivity disruption between one or more compute modules in the frame and their associated production Mellanox fabric(s).

The active FLM restarts when any of the following operations is performed:

- FLM firmware update
- Manually-requested FLM redundancy failover
- Manually-requested restart of the active FLM
- Physical removal of the active FLM
- FLM factory reset

The active FLM performs discovery of all components within the frame each time it is reset. As part of this discovery process, the FLM checks the compatibility of interconnect modules connected via the internal midplane crosslink connection. The disruption of production network connectivity described above is caused by a flaw in this compatibility check, which makes the FLM briefly disable the crosslink connection between the Mellanox SH2200 Switch Modules.

When this issue occurs, connectivity is typically disrupted for 60 seconds or less. However, connectivity may not recover automatically, depending on factors such as the specific deployed fabric types, the fabric redundancy configuration, the Operating System/driver/application configuration, and the resilience of the fabric technology.

Note that this issue only impacts Mellanox SH2200 Switch Modules where MLAG has been configured.
In the scenario described above, the following are affected:

- Any HPE Synergy Frame Link Module running a firmware version prior to 4.00.00.
- Any Mellanox SH2200 Switch Module for HPE Synergy running any firmware version and configured with Multi-Chassis Link Aggregation Group (MLAG).
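The scope conditions above can be expressed as a simple check. The following Python sketch is illustrative only: the FrameInventory record and its field names are hypothetical and do not correspond to any HPE tool or API, and it assumes FLM firmware versions are dotted numeric strings in the same form as 4.00.00.

```python
from dataclasses import dataclass
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '4.00.00' into (4, 0, 0) for comparison."""
    return tuple(int(part) for part in text.strip().split("."))


# First FLM firmware version that contains the fix, per this advisory.
FIXED_FLM_VERSION = parse_version("4.00.00")


@dataclass
class FrameInventory:
    """Hypothetical inventory record; field names are illustrative only."""
    flm_firmware: str        # FLM firmware version string
    has_sh2200_pair: bool    # frame contains at least one SH2200 pair
    mlag_configured: bool    # MLAG is configured on that pair


def is_in_scope(frame: FrameInventory) -> bool:
    """Return True when all scope conditions of the advisory are met."""
    return (
        frame.has_sh2200_pair
        and frame.mlag_configured
        and parse_version(frame.flm_firmware) < FIXED_FLM_VERSION
    )
```

For example, is_in_scope(FrameInventory("3.00.00", True, True)) evaluates to True, while the same frame with FLM firmware 4.00.00 evaluates to False. The version string 3.00.00 is a made-up sample value, not a reference to a specific release.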
To avoid this issue, install FLM firmware version 4.00.00 (or later). This version is included in HPE Synergy Service Pack (SSP) 2023.03.01. For any upcoming versions and additional information, refer to the following link: HPE Synergy Software Releases

As a workaround, if the configuration matches the Scope section above, schedule a maintenance window before performing any of the following operations:

- FLM firmware update
- Manually-requested FLM redundancy failover
- Manually-requested restart of the active FLM
- Physical removal of the active FLM
- FLM factory reset

To determine if the Mellanox SH2200 Switch Module MLAG connectivity was restored, run the "show mlag" and "show mlag statistics" commands, as shown below.

Perform either of the following steps to connect to the switch:

SSH into the switch directly using the mgmt0 IP address.

OR

i. SSH into the composer IP address and log in as Administrator. This will log into the main-view.
ii. Type "console-view" to enter the console.
iii. List available interconnect modules using command "show interconnect list".
iv. Select the affected Mellanox SH2200 Switch Module and run command:

    connect interconnect <enclosure name> <bay number>

    switch-xxx [MlagDomain1: master] # show mlag statistics
    IPL 1:
    RX Heartbeat                : 449
    TX Heartbeat                : 449
    Mgmt RX Heartbeat           : 449
    Mgmt TX Heartbeat           : 449
    RX IGMP tunnel              : 0
    TX IGMP tunnel              : 8
    RX XSTP tunnel              : 203
    TX XSTP tunnel              : 9
    RX ARP tunnel               : 19
    TX ARP tunnel               : 34
    RX mlag-notification        : 0
    TX mlag-notification        : 0
    RX port-notification        : 2
    TX port-notification        : 2
    RX FDB sync                 : 269
    TX FDB sync                 : 269
    RX LACP manager             : 1
    TX LACP manager             : 1
    Heartbeat TX errors         : 0
    Heartbeat RX miss           : 0
    Heartbeat remote defect     : 0
    Heartbeat local defect      : 0
    Mgmt Heartbeat TX errors    : 0
    Mgmt Heartbeat RX miss      : 0
    Mgmt Heartbeat remote defect: 0
    Mgmt Heartbeat local defect : 0

Additionally, check the status of the ports used to create the Inter Peer Link (IPL), which typically are Eth1/21 and Eth1/22.
The following is an example (some output was omitted):

    switch-xxx [MlagDomain1: master] # show int status | i 1/2
    ...
    Eth1/21 (Po1)   Up   Enabled   100G   9216   -
    Eth1/22 (Po1)   Up   Enabled   100G   9216   -

If the Mellanox SH2200 Switch Module MLAG connectivity is down, reboot the affected switch.

Note: In most instances, either the MLAG connectivity will recover automatically, or rebooting the affected switch will restore MLAG connectivity.
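When the command output is captured to text (for example, from a saved SSH session), the checks described in this section can be scripted. The following Python sketch is a minimal illustration, not an HPE or Mellanox tool. It assumes the output layout matches the samples in this advisory, that the operational state is the first field after the optional "(PoN)" tag, and that the IPL uses Eth1/21 and Eth1/22. The function names are hypothetical. The advisory does not define pass/fail thresholds for the counters, and the counters may be cumulative, so a non-zero error counter is only a prompt for further investigation.

```python
import re
from typing import Dict, List, Sequence

# Matches lines such as "Heartbeat RX miss : 0"; header lines like "IPL 1:" do not match.
COUNTER_RE = re.compile(r"^\s*([A-Za-z][A-Za-z \-]*?)\s*:\s*(\d+)\s*$")


def parse_mlag_statistics(output: str) -> Dict[str, int]:
    """Parse captured 'show mlag statistics' text into {counter name: value}."""
    counters: Dict[str, int] = {}
    for line in output.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def nonzero_error_counters(counters: Dict[str, int]) -> List[str]:
    """List error-style counters (errors, miss, defect) that are not zero."""
    keywords = ("errors", "miss", "defect")
    return [
        name for name, value in counters.items()
        if value != 0 and any(word in name for word in keywords)
    ]


def ipl_ports_up(output: str,
                 ports: Sequence[str] = ("Eth1/21", "Eth1/22")) -> Dict[str, bool]:
    """Report, per IPL port, whether captured 'show int status' text shows it Up."""
    state = {port: False for port in ports}
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0] in state:
            # Skip the optional port-channel tag, e.g. "(Po1)".
            rest = [field for field in fields[1:] if not field.startswith("(")]
            state[fields[0]] = bool(rest) and rest[0] == "Up"
    return state


# Sample text copied from the output shown in this advisory (abbreviated).
SAMPLE_STATS = """IPL 1:
RX Heartbeat : 449
TX Heartbeat : 449
Heartbeat TX errors : 0
Heartbeat RX miss : 0
Mgmt Heartbeat remote defect: 0
"""

SAMPLE_INT_STATUS = """Eth1/21 (Po1) Up Enabled 100G 9216 -
Eth1/22 (Po1) Up Enabled 100G 9216 -
"""

if __name__ == "__main__":
    stats = parse_mlag_statistics(SAMPLE_STATS)
    print("Non-zero error counters:", nonzero_error_counters(stats))
    print("IPL port state:", ipl_ports_up(SAMPLE_INT_STATUS))
```

A port that is missing from the captured output is reported as not Up, so a truncated capture is flagged rather than silently passed.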