One or more PowerScale nodes panic with a string similar to one of these in /var/log/messages:

panic @ time 1778169056.246, thread 0xfffffe8ec0574100: Journal BTL drain on buf 0xfffffe8143e0c038 failed due to timeout. BTL was held by transaction (5:5143063721) [0xfffffe8efdbd3a40] tracking: getblk_core

panic @ time 1778435497.146, thread 0xfffffe98692b3000: BUF_TIMELOCK: Waited more than 300 seconds for lock on 0xfffffe82c3b03ea8 (lock access type: 0x89900; wmesg: getblk) -- lockinfo: lock state: EXCL (recursed 0), held by: 0xffffff1e8e18b000; buf_track: jt_flush_block extrainfos: (0: td: 0x0; flags: 200200; time: 4108081318; 1: td: 0x0; flags: 201200; time: 676860600; 2: td: 0x0; flags: 201200; time: 676860439); ext_fields: (b_ext = 0xfffffe82c3b04240; b_trans_item: 0xfffff82333d71de0; b_shadow_item: 0x0; b_ifs_type: 2; b_source_baddr: 2cade5b7000e0029); vnode: 0xfffff80a562026a8; ldnum: 14; disk: 0xfffff80509c67800 da2; iosched: (total_inqueue: 41820; total_inprog: 18; bio_in_prog: 18);

panic @ time 1777739689.534, thread 0xfffffe9dff52bb00: BUF_TIMELOCK: Waited more than 240 seconds for lock on 0xfffffe837d0e17b0 (lock access type: 0x89900; wmesg: getblk) -- lockinfo: lock state: EXCL (recursed 0), held by: 0xfffffffffffffff0; buf_track: biodone extrainfos: (0: td: 0x0; flags: 200200; time: 3583726706; 1: td: 0x0; flags: 201200; time: 2710420770; 2: td: 0x0; flags: 201200; time: 1737195872); ext_fields: (b_ext = 0xfffffe837d0e1b48; b_trans_item: 0x0; b_shadow_item: 0x0; b_ifs_type: 0; b_source_baddr: 660d66b900020002); geom bio fields: (bp->b_bio: 0xfffff8325b43dce8; bio_cmd: 2; bio_tdflags: 0; bio_disk: 0x0);

Hangdumps may also occur because locking threads are waiting for txn_i_commit, which can cause client latency. Dell support must analyze any hangdumps to confirm this symptom.

A high disk queue should also be visible on one or more drives on the affected node in /var/log/vmlog.
For example:

Drive  Type  OpsIn  BytesIn  OpsOut  BytesOut  TimeAvg  TimeInQ  Queued  Busy
------------------------------------------------------------------------------
. . .
2:20   SAS   330.7  11.4M    18.7    284.0k    2.2ms    2.9ms    624.9   76.3
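As a rough way to spot such drives, the sketch below filters drive rows by the Queued column. It assumes rows have exactly the ten columns shown in the sample, with Queued in the ninth; real /var/log/vmlog lines may carry extra prefixes that would need stripping first. The helper name high_queue_drives and the threshold of 100 are hypothetical and not taken from the article.

```shell
#!/bin/sh
# Sketch: list drives whose Queued value exceeds a threshold.
# Assumes the ten-column row layout from the sample; high_queue_drives is a
# hypothetical helper name, and the threshold is an illustrative value.
high_queue_drives() {
    # $1: queue-depth threshold
    awk -v limit="$1" '
        # Keep data rows only: drive id such as 2:20 and exactly ten fields.
        NF == 10 && $1 ~ /^[0-9]+:[0-9]+$/ && ($9 + 0) > limit {
            print $1, $9
        }'
}

# Feed the sample row from the article through the filter.
printf '%s\n' '2:20 SAS 330.7 11.4M 18.7 284.0k 2.2ms 2.9ms 624.9 76.3' |
    high_queue_drives 100
```

The filter prints the drive identifier and its Queued value for each matching row, and prints nothing when no drive exceeds the threshold.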
This is a newly identified issue in which frequent flushing leads to disk unresponsiveness.

Affected versions:
Only these exact versions are affected:
OneFS 9.10.1.7
OneFS 9.13.0.1
OneFS 9.13.0.2
OneFS 9.13.1.0
OneFS releases before and after the above versions are not affected.
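Because only exact versions are affected, the check is a plain string match rather than a range comparison. A minimal sketch follows; is_affected_onefs is a hypothetical helper name, not a OneFS command, and the version string is assumed to be obtained separately (for example, read from the output of isi version).

```shell
#!/bin/sh
# Sketch: exact-match test against the affected OneFS releases listed above.
# is_affected_onefs is a hypothetical helper, not part of OneFS.
is_affected_onefs() {
    # $1: four-part OneFS version string, e.g. 9.13.0.2
    case "$1" in
        9.10.1.7|9.13.0.1|9.13.0.2|9.13.1.0) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative call with a hard-coded version string.
if is_affected_onefs "9.13.0.2"; then
    echo "affected"
else
    echo "not affected"
fi
```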
Workaround:
If you plan to upgrade to one of the affected releases, proactively apply this workaround before the upgrade. The workaround can also be applied after upgrading to an affected version.

Increase the sysctl vfs.dirtybufthresh to four times its default value in the local sysctl.conf file on all nodes by running the following command from any single node:

    isi_for_array -s 'grep -v ^# /etc/local/sysctl.conf 2>/dev/null | grep -q vfs.dirtybufthresh || echo "vfs.dirtybufthresh=$(( `sysctl -n vfs.dirtybufthresh` * 4 ))" >> /etc/local/sysctl.conf'

The above command adds the following line to /etc/local/sysctl.conf on all nodes, unless an uncommented vfs.dirtybufthresh entry is already present:

    vfs.dirtybufthresh=<value_for_specific_node>

IMPORTANT: The above workaround is not applicable to Compliance clusters. For Compliance clusters, upgrading to a fixed code version is recommended.

Permanent solution:
If the cluster is already on one of the affected releases, apply the above workaround or upgrade to one of these releases or later:
9.10.1.X (Upcoming release)
9.13.1.X (Upcoming release)
9.14 (Released in April 2026)

After upgrading to a fixed version, remember to comment out the workaround value in /etc/local/sysctl.conf if the workaround was applied. Commenting out that line is not essential, because the default value in the fixed version matches the workaround value, but it is best practice. To comment it out on all nodes, run the command below:

    isi_for_array "sed -i '' 's/^vfs.dirtybufthresh/#vfs.dirtybufthresh/g' /etc/local/sysctl.conf"

The above command comments out (with #) the line beginning with "vfs.dirtybufthresh=" in /etc/local/sysctl.conf on each node.
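To make the logic of the two one-liners easier to follow, here is a minimal sketch that replays them against a scratch file rather than /etc/local/sysctl.conf. The helper names add_workaround and comment_out_workaround are hypothetical, and the starting value of 1000 is made up for illustration; on a real node the value comes from sysctl -n vfs.dirtybufthresh. The sketch writes through a temporary file instead of using sed -i, since the in-place flag takes different arguments in BSD and GNU sed.

```shell
#!/bin/sh
# Sketch of the workaround logic on a scratch file (not the real sysctl.conf).
# add_workaround and comment_out_workaround are hypothetical helper names.

add_workaround() {
    # $1: config file, $2: current vfs.dirtybufthresh value
    # Append the 4x entry only if no uncommented entry already exists.
    grep -v '^#' "$1" 2>/dev/null | grep -q 'vfs\.dirtybufthresh' ||
        echo "vfs.dirtybufthresh=$(( $2 * 4 ))" >> "$1"
}

comment_out_workaround() {
    # $1: config file. Prefix the entry with # via a temporary copy.
    sed 's/^vfs\.dirtybufthresh/#vfs.dirtybufthresh/' "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

conf=$(mktemp)
add_workaround "$conf" 1000   # 1000 is an illustrative value only
add_workaround "$conf" 1000   # second call appends nothing
cat "$conf"
comment_out_workaround "$conf"
cat "$conf"
rm -f "$conf"
```

Running the guard twice shows why the original command is safe to repeat: the second invocation finds the existing uncommented entry and leaves the file unchanged.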