An HPE Superdome Flex or HPE Compute Scale-up server 3200 system configured with a large number of I/O cards, including multiple Mellanox devices, may experience out-of-memory or interrupt enablement issues during kdump.

When this occurs, the following error messages are displayed on the console during "kdump kexec" (you must be watching the console to see them):

For the out-of-memory case:

    [24.251426] dracut-initqueue[
    [ 25.845410] systemd-udevd invoked oom-killer: gfp_mask=0x80d0, order=0, oom_score_adj=0
    ..
    [ 25.932106] [<ffffffff81184dce>] oom_kill_process+0x24e/0x3c0
    ..
    [ 25.960507] [<ffffffff81185606>] out_of_memory+0x4b6/0x4f0

For the interrupt enablement case:

    [ 55.816329] lpfc 0005:c1:00.0: 22:0426 Failed to enable interrupt.
    [ 55.823927] ------------[ cut here ]------------
    [ 55.829083] kernel BUG at arch/x86/kernel/apic/io_apic.c:1356!

If this occurs, reduce the memory footprint of the crash kernel boot by blacklisting the drivers for devices that are not needed as dump devices.

Note: If you need to use one of these devices during kdump, you will need to increase the size of the crashkernel allocation and/or the number of CPUs used during kdump kexec.
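If the console output was captured to a file, the two failure signatures quoted above can be told apart with a simple pattern search. The following is a minimal sketch, not an HPE tool: the function name classify_kdump_failure and the idea of a saved console log file are assumptions made for illustration, and the patterns are taken only from the messages shown in this advisory.

```shell
# Classify a saved kdump console log by the failure signatures quoted in
# this advisory. Prints "oom" and/or "irq" (one per line), or "none" if
# neither signature is present. The function name is hypothetical.
classify_kdump_failure() (
    log="$1"
    found=0
    # Out-of-memory case: the OOM killer runs inside the crash kernel.
    if grep -Eq 'invoked oom-killer|out_of_memory\+' "$log"; then
        printf '%s\n' "oom"
        found=1
    fi
    # Interrupt enablement case: a driver cannot enable its interrupt,
    # or the kernel hits the BUG in io_apic.c.
    if grep -Eq 'Failed to enable interrupt|kernel BUG at arch/x86/kernel/apic/io_apic\.c' "$log"; then
        printf '%s\n' "irq"
        found=1
    fi
    [ "$found" -eq 1 ] || printf '%s\n' "none"
)
```

A result of "oom" points toward blacklisting drivers or increasing the crashkernel allocation, while "irq" points toward increasing the number of CPUs used by kdump.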
HPE Superdome Flex or HPE Compute Scale-up server 3200 with large I/O configurations including multiple Mellanox devices.
To blacklist device drivers, list them in the KDUMP_MODULE_BLACKLIST environment variable in the /etc/sysconfig/hpe-auto-config file, then run hpe-auto-config.

For example, to blacklist the Mellanox driver:

a. Add mlx5_core to KDUMP_MODULE_BLACKLIST in the /etc/sysconfig/hpe-auto-config file as follows:

    KDUMP_MODULE_BLACKLIST="<snip>..mlx5_core"

b. Run 'systemctl restart hpe-auto-config'.

If you need to use a device during kdump whose driver is causing out-of-memory issues (for example, dumping via a Mellanox Ethernet interface), you will need to increase the crashkernel allocation. To do this, perform the following steps:

a. Capture the current 'crashkernel' setting from /proc/cmdline. It should look similar to one of the following:

    crashkernel=512M,high

or

    crashkernel=16G-4096G:512M,4096G-16384G:1G,16384G-32768G:2G,32768G-:3G@4G

b. Create a new file, /etc/hpe-auto-config/97_kdump_io.sh, that sets the new crashkernel setting. The crashkernel allocation needs to be increased according to the size of the system.

If the crashkernel= setting has the format crashkernel=X,high, the contents of the 97_kdump_io.sh file would look as follows for an increase from 512M to 2G:

    #!/bin/bash
    # Include the shared library of hpe-auto-config commands
    source /usr/lib64/hpe-auto-config/shlib
    boot_option replace crashkernel=2G,high

If the crashkernel= setting follows the "crashkernel=16G-4096G:512M,4096G-16384G:1G,16384G-32768G:2G,32768G-:3G@4G" format, substitute the "boot_option replace" line with (for example):

    boot_option replace crashkernel=16G-4096G:2G,4096G-16384G:3G,16384G-32768G:4G,32768G-:5G@4G

c. Set executable permission on /etc/hpe-auto-config/97_kdump_io.sh:

    chmod 700 /etc/hpe-auto-config/97_kdump_io.sh

d. Run 'systemctl restart hpe-auto-config'.

e. Reboot the system.

If the kdump device you are using is causing IRQ enablement issues during kdump, you may need to increase the number of CPUs that kdump executes with. To do this, perform the following steps:

a. Collect the current setting of KDUMP_COMMANDLINE from /etc/sysconfig/kdump.

If KDUMP_COMMANDLINE contains "nr_cpus=4 disable_cpu_apicid=Y":

b. Change the line in /etc/hpe-auto-config/10_kdump.sh matching "add+=(nr_cpus=4)" to "add+=(nr_cpus=X)", where X is the desired number of CPUs with which kdump should kexec.

OR

b. Edit the KDUMP_COMMANDLINE_APPEND setting in /etc/sysconfig/kdump, adding "nr_cpus=X disable_cpu_apicid=0", where X is the desired number of CPUs with which kdump should kexec.

NOTE: You must replace or remove any nr_cpus or maxcpus values that already exist in KDUMP_COMMANDLINE_APPEND.

c. Run 'systemctl restart hpe-auto-config'.

Document Version    Release Date        Details
3                   August 12, 2025     Updated to clarify syntax in several examples
2                   March 28, 2025      Updated to include Compute Scale-up server 3200 and updated the Resolution section with more information
1                   December 6, 2017    Original document release
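Two of the resolution steps are plain string handling: capturing the current crashkernel= argument, and replacing stale CPU-count values in an append string. The sketch below shows one way to do each in portable shell. Both function names (get_crashkernel, set_kdump_cpus) are hypothetical and not part of hpe-auto-config, and dropping an existing disable_cpu_apicid= token along with nr_cpus= and maxcpus= is an assumption made so the new value is not duplicated. The functions only print strings; they do not edit /etc/sysconfig/kdump.

```shell
# Print the first crashkernel= argument found in a kernel command line
# string passed as $1. Returns non-zero if none is present.
# (Hypothetical helper, not part of hpe-auto-config.)
get_crashkernel() (
    set -f                      # no filename expansion while word splitting
    for arg in $1; do
        case "$arg" in
            crashkernel=*) printf '%s\n' "$arg"; exit 0 ;;
        esac
    done
    exit 1
)

# Print a rewritten append string: $1 is the desired CPU count, $2 is the
# existing value. Existing nr_cpus= and maxcpus= tokens are removed, as the
# advisory requires. Removing disable_cpu_apicid= as well is an assumption.
set_kdump_cpus() (
    set -f
    n="$1"
    out=""
    for arg in $2; do
        case "$arg" in
            nr_cpus=*|maxcpus=*|disable_cpu_apicid=*) ;;   # drop stale values
            *) out="${out:+$out }$arg" ;;
        esac
    done
    printf '%s\n' "${out:+$out }nr_cpus=$n disable_cpu_apicid=0"
)
```

For instance, get_crashkernel "$(cat /proc/cmdline)" prints the value to record in step a of the crashkernel procedure, and set_kdump_cpus 8 "$KDUMP_COMMANDLINE_APPEND" prints a candidate value to review before placing it in /etc/sysconfig/kdump by hand.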
Operating Systems Affected: Not Applicable