...
On HPE ProLiant Gen10 Plus or Gen11 platforms with Intel Xeon Scalable Processors that are configured with any of the HPE NVIDIA Mellanox ConnectX7-based network adapters listed in the Scope section below, running firmware version 28.34.1002, the server may fail to boot. The server displays a Red Screen of Death (RSOD), and Uncorrectable Machine Check Exception (UMCE) errors are logged against the processor in the iLO Integrated Management Log (IML).

This issue is observed when the network ports of the HPE NVIDIA Mellanox ConnectX7-based network adapters are cable connected and Active, and a server cold boot or soft reboot is attempted.

Example of the Red Screen of Death (RSOD): [screenshot not included]

Example of messages in the iLO IML:

Critical,2995,1456,0x0005,CPU,0x0003,Hardware,05/25/2023 15:21:26,2582: Uncorrectable Machine Check Exception (Processor 1, APIC ID 0x00000000, Bank 0x00000006, Status 0xBA000000"00000E0B, Address 0x00000000"00000000, Misc 0x00000000"00A40000). ACTION: Update the system firmware. If the issue persists, contact support.

Critical,2996,255,0x000A,POST Message,0x3167,Hardware,05/25/2023 15:22:11,2583: X64 Exception Type 0x12 (Machine-Check Exception) occurred during the previous boot. Image name: Metronome ACTION: Check the Integrated Management Log (IML) for additional information.
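As a rough aid for recognizing this signature in exported IML text, the following Python sketch searches log lines for the two message types quoted above. It assumes the IML has already been exported as plain text, one entry per line; the function name `scan_iml` and the abridged sample entries are illustrative only and are not part of any HPE tool.

```python
import re

# Patterns copied from the two example IML messages quoted in this advisory.
UMCE_RE = re.compile(r"Uncorrectable Machine Check Exception \(Processor (\d+),")
MCE_POST_RE = re.compile(r"X64 Exception Type 0x12 \(Machine-Check Exception\)")


def scan_iml(lines):
    """Return (processors_with_umce, saw_mce_post_message) for IML text lines."""
    processors = set()
    saw_mce_post = False
    for line in lines:
        match = UMCE_RE.search(line)
        if match:
            processors.add(int(match.group(1)))
        if MCE_POST_RE.search(line):
            saw_mce_post = True
    return sorted(processors), saw_mce_post


# Abridged copies of the two example entries, one per list item.
sample = [
    'Critical,2995,1456,0x0005,CPU,0x0003,Hardware,05/25/2023 15:21:26,2582: '
    'Uncorrectable Machine Check Exception (Processor 1, APIC ID 0x00000000, '
    'Bank 0x00000006, Status 0xBA000000"00000E0B, Address 0x00000000"00000000, '
    'Misc 0x00000000"00A40000). ACTION: Update the system firmware.',
    'Critical,2996,255,0x000A,POST Message,0x3167,Hardware,05/25/2023 15:22:11,2583: '
    'X64 Exception Type 0x12 (Machine-Check Exception) occurred during the '
    'previous boot. Image name: Metronome',
]

print(scan_iml(sample))  # prints ([1], True)
```

A match only indicates that the log resembles the examples in this advisory. Machine check exceptions can have other causes, so the adapter, firmware, and cable conditions still need to be confirmed.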
Any HPE ProLiant Gen10 Plus or Gen11 server platform with Intel Xeon Scalable Processors, and configured with any of the following HPE NVIDIA Mellanox ConnectX7-based network adapters with firmware version 28.34.1002:

HPE NVIDIA Mellanox ConnectX7-based network adapters
  HPE InfiniBand NDR 1-port OSFP PCIe5 x16 MCX75310AAS-NEAT Adapter (P45641-B21 / P45641-H21)
  HPE InfiniBand NDR200 1-port OSFP PCIe5 x16 MCX75310AAS-HEAT Adapter (P45642-B21 / P45642-H21)

In addition, the network adapter ports are connected to an NDR switch with any of the following NDR/NDR200 Direct Attach Copper (DAC) cables:

NDR/NDR200 DAC cables
  HPE InfiniBand NDR OSFP to 2xOSFP 1m Splitter Direct Attach Copper Cable (P45697-B22)
  HPE InfiniBand NDR OSFP to 2xOSFP 1.5m Splitter Direct Attach Copper Cable (P45697-B23)
  HPE InfiniBand NDR OSFP to 2xOSFP 2m Splitter Direct Attach Copper Cable (P45697-B24)
  HPE InfiniBand NDR OSFP to 2xOSFP 2.5m Splitter Direct Attach Copper Cable (P45697-B25)
  HPE InfiniBand NDR OSFP to 2xOSFP 3m Splitter Direct Attach Copper Cable (P45697-B26)
  HPE InfiniBand NDR200 OSFP to 4xOSFP 1m Splitter Direct Attach Copper Cable (P45698-B22)
  HPE InfiniBand NDR200 OSFP to 4xOSFP 1.5m Splitter Direct Attach Copper Cable (P45698-B24)
  HPE InfiniBand NDR200 OSFP to 4xOSFP 2m Splitter Direct Attach Copper Cable (P45698-B25)
  HPE InfiniBand NDR200 OSFP to 4xOSFP 2.5m Splitter Direct Attach Copper Cable (P45698-B26)
  HPE InfiniBand NDR200 OSFP to 4xOSFP 3m Splitter Direct Attach Copper Cable (P45698-B23)
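The scope conditions above amount to a three-way match on adapter part number, adapter firmware version, and cable part number. A minimal Python sketch of that check follows. The part numbers and firmware version are copied from the lists above; the function name `in_scope` is hypothetical, how the inventory values are collected is left out, and the server platform condition (Gen10 Plus or Gen11 with Intel Xeon Scalable Processors) is assumed to have been verified separately.

```python
# Values below are copied from the Scope lists in this advisory.
AFFECTED_ADAPTERS = {"P45641-B21", "P45641-H21", "P45642-B21", "P45642-H21"}
AFFECTED_FIRMWARE = "28.34.1002"
# P45697-B22 to B26 (OSFP to 2xOSFP) and P45698-B22 to B26 (OSFP to 4xOSFP).
AFFECTED_DAC_CABLES = {
    f"{base}-B{suffix}"
    for base in ("P45697", "P45698")
    for suffix in range(22, 27)
}


def in_scope(adapter_pn, firmware, cable_pn):
    """True when adapter, firmware, and cable all match the advisory scope."""
    return (
        adapter_pn in AFFECTED_ADAPTERS
        and firmware == AFFECTED_FIRMWARE
        and cable_pn in AFFECTED_DAC_CABLES
    )


print(in_scope("P45641-B21", "28.34.1002", "P45697-B24"))  # True
print(in_scope("P45641-B21", "28.35.1012", "P45697-B24"))  # False
```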
To resolve the issue, download and install firmware version 28.35.1012 (or later), available at the following URLs:

For HPE NVIDIA Mellanox ConnectX7-based network adapters configured on HPE ProLiant Servers running a Linux Operating System:
  Firmware for HPE InfiniBand NDR 1-port OSFP PCIe5 x16 MCX75310AAS-NEAT Adapter: HPE Part Numbers P45641-B21 and P45641-H21
  Firmware for HPE InfiniBand NDR200 1-port OSFP PCIe5 x16 MCX75310AAS-HEAT Adapter: HPE Part Numbers P45642-B21 and P45642-H21

For HPE NVIDIA Mellanox ConnectX7-based network adapters configured on HPE ProLiant Servers running a Windows Operating System:
  Firmware for HPE InfiniBand NDR 1-port OSFP PCIe5 x16 MCX75310AAS-NEAT Adapter: HPE Part Numbers P45641-B21 and P45641-H21
  Firmware for HPE InfiniBand NDR200 1-port OSFP PCIe5 x16 MCX75310AAS-HEAT Adapter: HPE Part Numbers P45642-B21 and P45642-H21

Procedure to update the firmware:

1. Bring down the port connections of the HPE NVIDIA Mellanox ConnectX7-based network adapter in one of the following two ways:
   Disconnect each NDR/NDR200 cable from the NDR switch.
   OR
   Remove power from the NDR switch.
2. Power-cycle the server.
3. Update the network adapter firmware using the URLs listed above.
4. Reboot the server.
5. Bring up the network adapter port connections as follows:
   If the NDR/NDR200 cable was disconnected from the NDR switch, connect the cable back to the switch.
   OR
   If the NDR switch was powered down, restore power to the switch.
6. Reboot the server.

Important note: This issue is not observed when NDR/NDR200 MPO or ACC cables are used to connect the HPE NVIDIA Mellanox ConnectX7-based network adapter ports to the NDR switch.
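The "28.35.1012 (or later)" condition in the resolution can be checked by comparing dotted version strings field by field rather than as text. The Python sketch below shows one way to do that. The helper names `parse_version` and `has_fix` are hypothetical, the sketch assumes firmware versions always consist of dot-separated integers, and the version 28.36.1010 in the usage lines is a made-up value for illustration. It does not address versions older than 28.34.1002, which this advisory does not list as affected.

```python
# Fixed firmware version named in the resolution of this advisory.
FIXED_VERSION = "28.35.1012"


def parse_version(text):
    """Turn a dotted version such as '28.35.1012' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(installed):
    """True when the installed version is at or above the fixed version."""
    return parse_version(installed) >= parse_version(FIXED_VERSION)


print(has_fix("28.34.1002"))  # False: the affected version
print(has_fix("28.35.1012"))  # True: the fixed version
print(has_fix("28.36.1010"))  # True: hypothetical later version
```

Comparing integer tuples avoids the ordering mistakes that plain string comparison makes when fields have different widths.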