Intermittent performance issues in rare scenarios caused by 100% full disk drives, leading to locking contention.

/var/log/messages may show errors similar to:

/boot/kernel.amd64/kernel: [bam_alloc.c:3712](pid 22982="lwio")(tid=102121) bam_s_alloc_wait stalled for 60 seconds (sysctl efs.bam.dump_alloc)

If hangdumps are generated, the blocked threads may show stack traces waiting in bam_s_alloc_block, dev_local_alloc_dinodes, or dev_local_alloc_blocks. You can follow KB article https://support.emc.com/kb/483388 to determine the locking thread and its stack trace.

The bam_s_alloc_block example:

11060 isi_migr_sworker: Waiting on 0xfffff804d177a780 with msg bam_s_alloc
Stack:
--------------------------------------------------
kernel:sched_switch+0x6df
kernel:mi_switch+0x255
kernel:sleepq_timedwait+0x39
kernel:_sleep+0x28e
kernel:bam_s_alloc_block+0x144
kernel:mds_txn_exec_impl+0x24d
kernel:ifm_ext_txn_exec_impl+0x15
kernel:ifm_write_attribute_mods+0x2bf
kernel:ifm_end_operation+0x1c3f
kernel:txn_i_end+0x4f7
kernel:bam_snap_paint_gather+0x977
kernel:bam_snap_paint+0x300
kernel:bam_snap_cow_inodes+0x7f
kernel:bam_replace_dead_inodes_inner+0x101e
kernel:bam_replace_dead_inodes+0x1b
kernel:ifs_vnop_wrapremove+0x108
kernel:VOP_REMOVE_APV+0xaa
kernel:isi_kern_unlinkat+0x1f6
kernel:amd64_syscall+0x396
--------------------------------------------------

The dev_local_alloc_dinodes example:

4317 kt: dxt-worker: thread 0xfffff802759c7760:
Stack:
--------------------------------------------------
kernel:_sx_xlock_hard+0x1b6
kernel:lbm_alloc_invalidate_cg+0x49
kernel:lbm_alloc_prepare+0x57b
kernel:lbm_alloc_inodes+0x80
kernel:dev_local_alloc_dinodes+0xcf
kernel:handle_alloc_dinodes+0x14c
kernel:efsidp_call_handler+0x25f
kernel:dxt_main+0x71d
kernel:kt_main+0x1ee
kernel:fork_exit+0x74
--------------------------------------------------

The dev_local_alloc_blocks example:

71089 kt: dxt-worker: thread 0xfffff8039e1b0760: Waiting on 0xfffffe0bc264d3f8 with msg "getblk"
Stack:
--------------------------------------------------
kernel:sched_switch+0x5bd
kernel:mi_switch+0x21d
kernel:sleepq_timedwait+0x39
kernel:sleeplk+0x155
kernel:__lockmgr_args+0x58a
kernel:getblk_locked+0x3bd
kernel:drv_getblk+0x7a
kernel:j_getblk+0x317
kernel:lbm_aread+0x338
kernel:lbm_read+0x65
kernel:lbm_read_cgblock_helper+0xe6
kernel:lbm_read_cgblock+0x11d
kernel:lbm_alloc_prepare+0x545
kernel:lbm_alloc_blocks+0x5c
kernel:dev_local_alloc_blocks+0xf0
kernel:handle_alloc_blocks+0x1d3
kernel:efsidp_call_handler+0x25f
kernel:dxt_main+0x71d
kernel:kt_main+0x1ee
kernel:fork_exit+0x74
--------------------------------------------------
The issue is caused by some drives on some nodes being close to 100% full. To find the drives with the lowest blkfree:

# isi_for_array -X sysctl efs.lbm.drive_space | grep blkfree | sort -nk3 -t '=' | head

Any drive listed with close to only 64,000 free blocks needs attention.
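The "close to 64,000 free blocks" check above can be scripted. The sketch below applies the same filter to captured sample output; the exact drive_space line format is an assumption and may vary by OneFS release:

```shell
# Sketch: flag drives whose blkfree is at or below a low-water mark.
# The sample lines below mimic `sysctl efs.lbm.drive_space` output;
# the real field names may differ on your cluster.
LOW=64000
printf '%s\n' \
  'efs.lbm.drive_space.drive.0.blkfree=58000' \
  'efs.lbm.drive_space.drive.1.blkfree=912345' \
  'efs.lbm.drive_space.drive.2.blkfree=63500' |
awk -F= -v low="$LOW" '$2 <= low { print $1, $2 }'
```

On a cluster, the `printf` of sample lines would be replaced with the `isi_for_array -X sysctl efs.lbm.drive_space | grep blkfree` pipeline shown above.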
1. Current solution/workaround: set the following sysctl values across the cluster to avoid a similar issue should any drive reach close to 100% full. Note down the current values before changing the settings:

# sysctl efs.bam.layout.drive_inode_threshold efs.bam.layout.low_drive_threshold efs.bam.layout.high_drive_threshold efs.bam.layout.drive_metablock_threshold

Then change to the settings below:

# isi_sysctl_cluster efs.bam.layout.drive_inode_threshold=10240000
# isi_sysctl_cluster efs.bam.layout.low_drive_threshold=128000
# isi_sysctl_cluster efs.bam.layout.high_drive_threshold=256000
# isi_sysctl_cluster efs.bam.layout.drive_metablock_threshold=256000

2. If the issue is not relieved after setting the above sysctls, force-reboot the node with full disks, or temporarily disconnect it from the InfiniBand network.

3. Once the node is back in the cluster, run an AutoBalance job (preferably AutoBalanceLin, even if there are no SSDs) soon after the issue has been worked around:

# isi job start AutoBalanceLin

4. Finally, advise the customer to:
- Keep data balanced between drives, nodes, and pools.
- Ensure that the AutoBalance, AutoBalanceLin, or MultiScan job is run regularly, especially after a device/drive failure or replacement.
- If any node pools are too full to maintain balance, add a new node or move data off to alternative node pools.

Also, in OneFS 9.4 and later, the Drive Draining KB can assist.
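Step 1 above can be scripted so the original values are preserved before the cluster-wide change. This is a sketch only: the threshold values are the ones recommended above, the backup path is arbitrary, and the `isi_sysctl_cluster` commands are only echoed as a dry run:

```shell
# Sketch: record current threshold values, then emit the
# isi_sysctl_cluster override commands as a dry run.
# Remove the leading `echo` to actually apply them on a cluster.
BACKUP="${TMPDIR:-/tmp}/sysctl_thresholds.before"
: > "$BACKUP"
for kv in \
  efs.bam.layout.drive_inode_threshold=10240000 \
  efs.bam.layout.low_drive_threshold=128000 \
  efs.bam.layout.high_drive_threshold=256000 \
  efs.bam.layout.drive_metablock_threshold=256000
do
  oid=${kv%%=*}
  sysctl "$oid" >> "$BACKUP" 2>/dev/null || true   # record current value
  echo isi_sysctl_cluster "$kv"                    # dry-run the change
done
```

Keeping the backup file makes it straightforward to restore the defaults once the drives have been rebalanced.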