...
On HPE ProLiant Gen10 Plus platforms with Intel 3rd Generation Xeon Scalable Processors that are configured with any of the HPE NVIDIA Mellanox Virtual Protocol Interconnect (VPI) ConnectX6-based network adapters listed in the Scope section below (firmware version 20.34.1002 or prior, supporting both InfiniBand HDR and Ethernet modes), the boot fails when a server reboot is performed after changing certain BIOS settings in the BIOS/Platform Configuration (RBSU). A Red Screen of Death (RSOD) is displayed, and Uncorrectable Machine Check Exception (UMCE) errors are logged against Processor 1.

The following are some example cases where the errors are experienced:

- Changing the WorkloadProfile back and forth between GeneralPowerEfficientCompute and HighPerformanceCompute (HPC), followed by a reboot.
- While using GeneralPowerEfficientCompute, changing persistent memory (SCM) from Memory mode to AppDirect mode and back using ipmctl commands, followed by a reboot.

The errors do not occur if a cold boot is performed instead of a reboot. In addition, the errors may not occur when multiple reboots are performed after the above changes.

When this occurs, HPE iLO is using an incorrect Endpoint ID (EID) in a Management Component Transport Protocol (MCTP) message that it sends. In the NVIDIA Mellanox network adapter, the PLDM AEN event receiver is not cleared on PCIe reset when the media type is MCTP over PCIe VDM, so asynchronous messages are sent over MCTP before endpoint discovery is complete. As a result, the HPE NVIDIA-based network adapter sends out a malformed MCTP packet that the Intel chipset cannot handle gracefully, which leads to the RSOD.

Verification of the processor information

To check and verify the processor information, perform the following steps:

1. Go to iLO and log in to the iLO web interface.
2. Go to System Information.
3. Go to Processors.
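Once the processor name has been read from the Processors page, the comparison against the affected processor families can be sketched in a few lines of Python. This is only an illustration, not an HPE tool: the function name `is_affected_processor` is hypothetical, and the sample name strings are assumptions about how iLO typically reports the processor, so adjust the pattern to what your system actually displays.

```python
import re

# Model-number prefixes for the affected families named in this advisory:
# Platinum 83xx, Gold 53xx/63xx, Silver 43xx.
AFFECTED_PREFIXES = {
    "Platinum": ("83",),
    "Gold": ("53", "63"),
    "Silver": ("43",),
}

# Assumed name format, e.g. "Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz".
_NAME_PATTERN = re.compile(r"Xeon(?:\(R\))?\s+(Platinum|Gold|Silver)\s+(\d{4})")


def is_affected_processor(name: str) -> bool:
    """Return True if the processor name matches an affected family."""
    match = _NAME_PATTERN.search(name)
    if match is None:
        return False
    tier, model = match.group(1), match.group(2)
    return model.startswith(AFFECTED_PREFIXES[tier])


print(is_affected_processor("Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz"))  # True
print(is_affected_processor("Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz"))  # False
```

A match only indicates that the processor belongs to an affected family. The server is in scope only if it is also fitted with one of the listed adapters at an affected firmware version.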
Any HPE ProLiant Gen10 Plus server platform with Intel 3rd Generation Xeon Scalable Processors (Intel Xeon Platinum 83xx, Intel Xeon Gold 53xx/63xx and Intel Xeon Silver 43xx processors), and configured with any of the following HPE NVIDIA Mellanox VPI ConnectX6-based network adapters with firmware version 20.34.1002 (or prior):

- HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 OCP3 MCX653436A-HDAI Adapter (P31348-B21 / P31348-H21)
- HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe4 x16 OCP3 MCX653435A-HDAI Adapter (P31323-B21 / P31323-H21)
- HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 MCX653106A-HDAT Adapter (P31324-B21 / P31324-H21)
- HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe4 x16 MCX653105A-HDAT Adapter (P23664-B21 / P23664-H21)
- HPE InfiniBand HDR100/Ethernet 100Gb 1-port QSFP56 PCIe4 x16 MCX653105A-ECAT Adapter (P23665-B21 / P23665-H21)
- HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe3 x16 MCX653105A-HDAT Adapter (P06154-B21 / P06154-H21)
- HPE InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe4 x16 MCX653106A-ECAT Adapter (P23666-B21 / P23666-H21)
- HPE InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe3 x16 MCX653106A-ECAT Adapter (P06251-B21 / P06251-H21)
- HPE InfiniBand HDR100/Ethernet 100Gb 1-port QSFP56 PCIe3 x16 MCX653105A-ECAT Adapter (P06250-B21 / P06250-H21)
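The adapter part of the scope check (part number plus firmware version) can be expressed as a short Python sketch. The part numbers and the 20.34.1002 threshold come from the list above; the function names `parse_version` and `is_affected_adapter` are hypothetical, and the sketch assumes the firmware version is available as a plain dotted string such as "20.34.1002".

```python
# HPE part numbers from the scope list, without the -B21 / -H21 suffix.
AFFECTED_PARTS = {
    "P31348", "P31323", "P31324", "P23664", "P23665",
    "P06154", "P23666", "P06251", "P06250",
}

# Last firmware version affected by the issue (20.34.1002 or prior).
LAST_AFFECTED_FIRMWARE = (20, 34, 1002)


def parse_version(text: str) -> tuple:
    """Turn a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected_adapter(part_number: str, firmware_version: str) -> bool:
    """Return True if the adapter is listed and its firmware is affected."""
    base_part = part_number.strip().upper().split("-")[0]
    if base_part not in AFFECTED_PARTS:
        return False
    return parse_version(firmware_version) <= LAST_AFFECTED_FIRMWARE


print(is_affected_adapter("P31348-B21", "20.34.1002"))  # True
print(is_affected_adapter("P31348-B21", "20.35.1012"))  # False
print(is_affected_adapter("P06250-H21", "20.9.5000"))   # True
```

Versions are compared as integer tuples rather than as strings, because a plain string comparison would rank "20.9.5000" above "20.34.1002".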
To resolve the issue, download and install the firmware version 20.35.1012, available at the following URLs:

For HPE NVIDIA Mellanox VPI ConnectX6-based network adapters configured on HPE ProLiant Servers running a Linux Operating System

- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 OCP3 MCX653436A-HDAI Adapter: HPE part numbers P31348-B21 and P31348-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe4 x16 OCP3 MCX653435A-HDAI Adapter: HPE part numbers P31323-B21 and P31323-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 MCX653106A-HDAT Adapter: HPE part numbers P31324-B21 and P31324-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe4 x16 MCX653105A-HDAT Adapter: HPE part numbers P23664-B21 and P23664-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 1-port QSFP56 PCIe4 x16 MCX653105A-ECAT Adapter: HPE part numbers P23665-B21 and P23665-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe3 x16 MCX653105A-HDAT Adapter (Original Name: HPE InfiniBand HDR/Ethernet 200Gb 1-port 940QSFP56 x16 Adapter): HPE part numbers P06154-B21 and P06154-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe4 x16 MCX653106A-ECAT Adapter: HPE part numbers P23666-B21 and P23666-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe3 x16 MCX653106A-ECAT Adapter (Original Name: HPE InfiniBand HDR100/Ethernet 100Gb 2-port 940QSFP56 x16 Adapter): HPE part numbers P06251-B21 and P06251-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 1-port QSFP56 PCIe3 x16 MCX653105A-ECAT Adapter (Original Name: HPE InfiniBand HDR100/Ethernet 100Gb 1-port 940QSFP56 x16 Adapter): HPE part numbers P06250-B21 and P06250-H21

For HPE NVIDIA Mellanox VPI ConnectX6-based network adapters configured on HPE ProLiant Servers running a Windows Operating System

- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 OCP3 MCX653436A-HDAI Adapter: HPE part numbers P31348-B21 and P31348-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe4 x16 OCP3 MCX653435A-HDAI Adapter: HPE part numbers P31323-B21 and P31323-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 MCX653106A-HDAT Adapter: HPE part numbers P31324-B21 and P31324-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe4 x16 MCX653105A-HDAT Adapter: HPE part numbers P23664-B21 and P23664-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 1-port QSFP56 PCIe4 x16 MCX653105A-ECAT Adapter: HPE part numbers P23665-B21 and P23665-H21
- Firmware for HPE InfiniBand HDR/Ethernet 200Gb 1-port QSFP56 PCIe3 x16 MCX653105A-HDAT Adapter (Original Name: HPE InfiniBand HDR/Ethernet 200Gb 1-port 940QSFP56 x16 Adapter): HPE part numbers P06154-B21 and P06154-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe4 x16 MCX653106A-ECAT Adapter: HPE part numbers P23666-B21 and P23666-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 2-port QSFP56 PCIe3 x16 MCX653106A-ECAT Adapter (Original Name: HPE InfiniBand HDR100/Ethernet 100Gb 2-port 940QSFP56 x16 Adapter): HPE part numbers P06251-B21 and P06251-H21
- Firmware for HPE InfiniBand HDR100/Ethernet 100Gb 1-port QSFP56 PCIe3 x16 MCX653105A-ECAT Adapter (Original Name: HPE InfiniBand HDR100/Ethernet 100Gb 1-port 940QSFP56 x16 Adapter): HPE part numbers P06250-B21 and P06250-H21

Important notes:

- The firmware version 20.35.1012 also adds support for using the "SetEventReceiver" PLDM command with mode polling for HPE NVIDIA Mellanox VPI ConnectX6-based network adapters.
- HPE recommends using HPE iLO 5 firmware v2.72 (or later) along with the network adapter firmware version indicated above.

RECEIVE PROACTIVE UPDATES: Receive support alerts (such as Customer Advisories), as well as updates on drivers, software, firmware, and customer replaceable components, proactively in your e-mail through HPE Support Alerts.
Sign up for Support Alerts at the following URL: HPE Email Preference Center