Intermittent performance issues in rare scenarios caused by 100% full disk drives, leading to locking contention.

/var/log/messages may show errors similar to:

/boot/kernel.amd64/kernel: [bam_alloc.c:3712](pid 22982="lwio")(tid=102121) bam_s_alloc_wait stalled for 60 seconds (sysctl efs.bam.dump_alloc)

If hangdumps are generated, the blocked threads may show stack traces waiting in bam_s_alloc_block, dev_local_alloc_dinodes, or dev_local_alloc_blocks. You can follow KB article https://support.emc.com/kb/483388 to determine the locking thread and its stack trace.

The bam_s_alloc_block example:

11060 isi_migr_sworker: Waiting on 0xfffff804d177a780 with msg bam_s_alloc
Stack:
--------------------------------------------------
kernel:sched_switch+0x6df
kernel:mi_switch+0x255
kernel:sleepq_timedwait+0x39
kernel:_sleep+0x28e
kernel:bam_s_alloc_block+0x144
kernel:mds_txn_exec_impl+0x24d
kernel:ifm_ext_txn_exec_impl+0x15
kernel:ifm_write_attribute_mods+0x2bf
kernel:ifm_end_operation+0x1c3f
kernel:txn_i_end+0x4f7
kernel:bam_snap_paint_gather+0x977
kernel:bam_snap_paint+0x300
kernel:bam_snap_cow_inodes+0x7f
kernel:bam_replace_dead_inodes_inner+0x101e
kernel:bam_replace_dead_inodes+0x1b
kernel:ifs_vnop_wrapremove+0x108
kernel:VOP_REMOVE_APV+0xaa
kernel:isi_kern_unlinkat+0x1f6
kernel:amd64_syscall+0x396
--------------------------------------------------

The dev_local_alloc_dinodes example:

4317 kt: dxt-worker: thread 0xfffff802759c7760:
Stack:
--------------------------------------------------
kernel:_sx_xlock_hard+0x1b6
kernel:lbm_alloc_invalidate_cg+0x49
kernel:lbm_alloc_prepare+0x57b
kernel:lbm_alloc_inodes+0x80
kernel:dev_local_alloc_dinodes+0xcf
kernel:handle_alloc_dinodes+0x14c
kernel:efsidp_call_handler+0x25f
kernel:dxt_main+0x71d
kernel:kt_main+0x1ee
kernel:fork_exit+0x74
--------------------------------------------------

The dev_local_alloc_blocks example:

71089 kt: dxt-worker: thread 0xfffff8039e1b0760: Waiting on 0xfffffe0bc264d3f8 with msg "getblk"
Stack:
--------------------------------------------------
kernel:sched_switch+0x5bd
kernel:mi_switch+0x21d
kernel:sleepq_timedwait+0x39
kernel:sleeplk+0x155
kernel:__lockmgr_args+0x58a
kernel:getblk_locked+0x3bd
kernel:drv_getblk+0x7a
kernel:j_getblk+0x317
kernel:lbm_aread+0x338
kernel:lbm_read+0x65
kernel:lbm_read_cgblock_helper+0xe6
kernel:lbm_read_cgblock+0x11d
kernel:lbm_alloc_prepare+0x545
kernel:lbm_alloc_blocks+0x5c
kernel:dev_local_alloc_blocks+0xf0
kernel:handle_alloc_blocks+0x1d3
kernel:efsidp_call_handler+0x25f
kernel:dxt_main+0x71d
kernel:kt_main+0x1ee
kernel:fork_exit+0x74
--------------------------------------------------
The issue is caused by some drives on some nodes being close to 100% full. To find the drives with the lowest blkfree:

# isi_for_array -X sysctl efs.lbm.drive_space | grep blkfree | sort -nk3 -t '=' | head

Any drive listed with close to only 64,000 free blocks needs attention.
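The "close to 64,000 free blocks" check above can be scripted. The sketch below applies the same filter to captured sample output; the exact drive_space line format is an assumption and may vary by OneFS release:

```shell
# Sketch: flag drives whose blkfree is at or below a low-water mark.
# The sample lines below mimic `sysctl efs.lbm.drive_space` output;
# the real field names may differ on your cluster.
LOW=64000
printf '%s\n' \
  'efs.lbm.drive_space.drive.0.blkfree=58000' \
  'efs.lbm.drive_space.drive.1.blkfree=912345' \
  'efs.lbm.drive_space.drive.2.blkfree=63500' |
awk -F= -v low="$LOW" '$2 <= low { print $1, $2 }'
```

On a cluster, the `printf` of sample lines would be replaced with the `isi_for_array -X sysctl efs.lbm.drive_space | grep blkfree` pipeline shown above.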
1. Current solution/workaround: set the following sysctl values across the cluster to avoid a similar issue should any drive reach close to 100% full. Note down the current values before changing the settings:

# sysctl efs.bam.layout.drive_inode_threshold efs.bam.layout.low_drive_threshold efs.bam.layout.high_drive_threshold efs.bam.layout.drive_metablock_threshold

Then change to the settings below:

# isi_sysctl_cluster efs.bam.layout.drive_inode_threshold=10240000
# isi_sysctl_cluster efs.bam.layout.low_drive_threshold=128000
# isi_sysctl_cluster efs.bam.layout.high_drive_threshold=256000
# isi_sysctl_cluster efs.bam.layout.drive_metablock_threshold=256000

2. If the issue is not relieved after setting the above sysctls, force-reboot the node with full disks, or temporarily disconnect it from the InfiniBand network.

3. Once the node is back in the cluster, run an AutoBalance job (preferably AutoBalanceLin, even if there are no SSDs) soon after the issue has been worked around:

# isi job start AutoBalanceLin

4. Finally, advise the customer to:
- Keep data balanced between drives, nodes, and pools.
- Ensure that the AutoBalance, AutoBalanceLin, or MultiScan job is run regularly, especially after a device/drive failure or replacement.
- If any node pools are too full to maintain balance, add a new node or move data off to alternative node pools.

Also, in OneFS 9.4 and later, the Drive Draining KB can assist.
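Step 1 above can be scripted so the original values are preserved before the cluster-wide change. This is a sketch only: the threshold values are the ones recommended above, the backup path is arbitrary, and the `isi_sysctl_cluster` commands are only echoed as a dry run:

```shell
# Sketch: record current threshold values, then emit the
# isi_sysctl_cluster override commands as a dry run.
# Remove the leading `echo` to actually apply them on a cluster.
BACKUP="${TMPDIR:-/tmp}/sysctl_thresholds.before"
: > "$BACKUP"
for kv in \
  efs.bam.layout.drive_inode_threshold=10240000 \
  efs.bam.layout.low_drive_threshold=128000 \
  efs.bam.layout.high_drive_threshold=256000 \
  efs.bam.layout.drive_metablock_threshold=256000
do
  oid=${kv%%=*}
  sysctl "$oid" >> "$BACKUP" 2>/dev/null || true   # record current value
  echo isi_sysctl_cluster "$kv"                    # dry-run the change
done
```

Keeping the backup file makes it straightforward to restore the defaults once the drives have been rebalanced.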