One or more of the following symptoms could be noticed:

- One or more PowerScale nodes panic with a string similar to one of these in /var/log/messages:

    panic @ time 1778169056.246, thread 0xfffffe8ec0574100: Journal BTL drain on buf 0xfffffe8143e0c038 failed due to timeout. BTL was held by transaction (5:5143063721) [0xfffffe8efdbd3a40] tracking: getblk_core

    panic @ time 1778435497.146, thread 0xfffffe98692b3000: BUF_TIMELOCK: Waited more than 300 seconds for lock on 0xfffffe82c3b03ea8 (lock access type: 0x89900; wmesg: getblk) -- lockinfo: lock state: EXCL (recursed 0), held by: 0xffffff1e8e18b000; buf_track: jt_flush_block extrainfos: (0: td: 0x0; flags: 200200; time: 4108081318; 1: td: 0x0; flags: 201200; time: 676860600; 2: td: 0x0; flags: 201200; time: 676860439); ext_fields: (b_ext = 0xfffffe82c3b04240; b_trans_item: 0xfffff82333d71de0; b_shadow_item: 0x0; b_ifs_type: 2; b_source_baddr: 2cade5b7000e0029); vnode: 0xfffff80a562026a8; ldnum: 14; disk: 0xfffff80509c67800 da2; iosched: (total_inqueue: 41820; total_inprog: 18; bio_in_prog: 18);

    panic @ time 1777739689.534, thread 0xfffffe9dff52bb00: BUF_TIMELOCK: Waited more than 240 seconds for lock on 0xfffffe837d0e17b0 (lock access type: 0x89900; wmesg: getblk) -- lockinfo: lock state: EXCL (recursed 0), held by: 0xfffffffffffffff0; buf_track: biodone extrainfos: (0: td: 0x0; flags: 200200; time: 3583726706; 1: td: 0x0; flags: 201200; time: 2710420770; 2: td: 0x0; flags: 201200; time: 1737195872); ext_fields: (b_ext = 0xfffffe837d0e1b48; b_trans_item: 0x0; b_shadow_item: 0x0; b_ifs_type: 0; b_source_baddr: 660d66b900020002); geom bio fields: (bp->b_bio: 0xfffff8325b43dce8; bio_cmd: 2; bio_tdflags: 0; bio_disk: 0x0);

- Locking contention due to a thread waiting for txn_i_commit, which could potentially cause client latency. Dell support must analyze any hangdumps to confirm this symptom.

- High disk queue on one or more drives on the affected node in /var/log/vmlog. For example:

    Drive  Type  OpsIn  BytesIn  OpsOut  BytesOut  TimeAvg  TimeInQ  Queued  Busy
    ------------------------------------------------------------------------------
    . . .
    2:20   SAS   330.7  11.4M    18.7    284.0k    2.2ms    2.9ms    624.9   76.3
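As a rough aid for spotting the first and third symptoms, the sketch below filters log text for the two panic signatures quoted above and lists drives whose Queued column exceeds a limit. The function names and the limit are hypothetical, the article does not define what counts as a "high" queue, and the awk filter assumes vmlog rows follow the ten-column layout shown in the example.

```shell
#!/bin/sh
# Sketch only: function names and the queue limit are assumptions,
# not part of OneFS or of the article.

# Print lines from stdin that match either panic signature quoted above.
find_panic_signatures() {
    grep -E 'Journal BTL drain on buf .* failed due to timeout|BUF_TIMELOCK: Waited more than [0-9]+ seconds for lock'
}

# Print "<drive> <queued>" for ten-column rows whose Queued value (column 9)
# is greater than the limit passed as the first argument.
flag_high_queue() {
    awk -v limit="$1" 'NF == 10 && $9 + 0 > limit + 0 { print $1, $9 }'
}

# Example using the row from the article and an illustrative limit of 100.
printf '%s\n' '2:20 SAS 330.7 11.4M 18.7 284.0k 2.2ms 2.9ms 624.9 76.3' | flag_high_queue 100
```

On a node, the same functions could read the real logs, for example `find_panic_signatures < /var/log/messages` or `flag_high_queue 100 < /var/log/vmlog`. A match is only a hint; the hangdump symptom still requires analysis by Dell support.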
This is a newly identified issue in which frequent flushing leads to disk unresponsiveness.

Affected versions: Only these exact versions are affected:

- OneFS 9.10.1.7
- OneFS 9.13.0.1
- OneFS 9.13.0.2
- OneFS 9.13.1.0

OneFS releases before and after these versions are not affected.
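Because only exact version strings are affected, a check against the list can be a literal match. The following is a minimal sketch; the function name is hypothetical, and the version string is passed in by hand rather than read from the cluster.

```shell
#!/bin/sh
# Sketch only: is_affected_version is a hypothetical helper.
# It returns success (0) only for the exact versions listed in the article.
is_affected_version() {
    case "$1" in
        9.10.1.7|9.13.0.1|9.13.0.2|9.13.1.0) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: check a version string supplied by the administrator.
if is_affected_version "9.13.0.2"; then
    echo "affected"
else
    echo "not affected"
fi
```

An exact match is used instead of a range comparison because the article states that releases before and after the listed versions are not affected.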
Workaround: If you plan to upgrade to one of the affected releases, proactively apply this workaround before the upgrade. The workaround can also be applied after upgrading to an affected version. The workaround value has been increased as of 2026-June-03, especially for A-class nodes with 96 GB of memory or less.

Increase the sysctl vfs.dirtybufthresh in the local sysctl.conf file on all nodes by performing these steps from any single node:

1. Comment out any workaround that was applied according to updates of this article made before 2026-June-03 in /etc/local/sysctl.conf on all nodes. To do so, run:

    isi_for_array "sed -i '' 's/^vfs.dirtybufthresh/#vfs.dirtybufthresh/g' /etc/local/sysctl.conf"

2. Apply the updated workaround in /etc/local/sysctl.conf on all nodes. To do so, run:

    isi_for_array -s 'echo "vfs.dirtybufthresh=$((16 * (`sysctl -n kern.nbuf` / 2 + 20) * 9 / 10 ))" >> /etc/local/sysctl.conf'

   This command adds the following line to /etc/local/sysctl.conf on all nodes:

    vfs.dirtybufthresh=<value_for_specific_node>

This workaround is NOT applicable to Compliance clusters. For Compliance clusters, upgrading to a fixed code version is recommended.

Permanent solution: Engineering is planning a permanent solution in future OneFS releases. Until those releases are available, keep the workaround values set in /etc/local/sysctl.conf.
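To preview the value that the workaround command writes for a node, the arithmetic can be reproduced on its own. This is a minimal sketch: the function name is hypothetical, and the kern.nbuf value of 1000 is purely illustrative, not taken from a real node.

```shell
#!/bin/sh
# Sketch only: dirtybufthresh_for is a hypothetical helper that repeats the
# integer arithmetic from the workaround command for a given kern.nbuf value.
dirtybufthresh_for() {
    nbuf=$1
    # Shell integer arithmetic, evaluated left to right with truncating division:
    # 16 * (nbuf / 2 + 20), then * 9, then / 10.
    echo $(( 16 * (nbuf / 2 + 20) * 9 / 10 ))
}

# Illustrative input of 1000: 1000/2 + 20 = 520, 16 * 520 = 8320,
# 8320 * 9 = 74880, 74880 / 10 = 7488.
dirtybufthresh_for 1000
```

On a node, the input would come from `sysctl -n kern.nbuf`, as in the workaround command. Since kern.nbuf can differ between nodes, the resulting line in /etc/local/sysctl.conf is node-specific.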