BugZero | Red Hat BugID RHEL-8053 - iSCSI initiator cannot recover after target restar...

OPERATIONAL DEFECT DATABASE

Red Hat - Defect ID: RHEL-8053

iSCSI initiator cannot recover after target restart with large number of sessions

Last updated on March 15th, 2025

BugZero Risk Score
5.8 Medium

Overall: 5.8

Severity: 6.4

Community: 5.1

Lifecycle: 9.1

Bug Details

Priority: Major
Status: Closed
Impact Category: iscsi-initiator-utils

Description

Issue

Description of problem:

When an iSCSI initiator has a large number of sessions (1024 in the example below) open with a target server, rebooting the target or setting its firewall to reject traffic for a short period of time (15-30s) leaves some or all of the iSCSI sessions in a broken state. They cannot be logged out or logged in again using 'iscsiadm' - "Logging out of session …" messages are printed, but session/block device state is not affected. The only way to clear the state is to reboot the initiator. Also, 'iscsiadm' hangs when trying to print more session information.

Version-Release number of selected component (if applicable):

    iscsi-initiator-utils.x86_64 6.2.1.4-4.git095f59c
    kernel.x86_64 4.18.0-372.26.1

How reproducible:

Easily reproducible on every attempt.

Steps to Reproduce:

Setup the target with the following script (it requires 1024*512M space on /mnt, but disk size can be reduced):

    #!/bin/bash
    [[ -z $TARGETS ]] && TARGETS=1024
    [[ -z $BASEDIR ]] && BASEDIR="/mnt"
    [[ -z $BASENAME ]] && BASENAME="iqn.2022-10.com.example"

    yum install -y targetcli
    firewall-cmd --permanent --add-port=3260/tcp
    firewall-cmd --reload

    cmds=""
    for tgt in $(seq "$TARGETS"); do
        disk="disk${tgt}"
        target="${BASENAME}:tgt${tgt}"
        cmds="${cmds}cd /backstores/fileio\n"
        cmds="${cmds}create disk${tgt} ${BASEDIR}/${disk} 512M\n"
        cmds="${cmds}cd /iscsi\n"
        cmds="${cmds}create ${target}\n"
        cmds="${cmds}cd /iscsi/${target}/tpg1/luns\n"
        cmds="${cmds}create /backstores/fileio/${disk}\n"
        cmds="${cmds}cd /iscsi/${target}/tpg1/acls\n"
        cmds="${cmds}create iqn.2022-10.com.example:s26\n"
    done
    echo -e "$cmds" | targetcli
    systemctl restart target

Create sessions on the initiator:

    iscsiadm -m discoverydb --type sendtargets --portal 10.1.7.25 --discover   # replace 10.1.7.25 with target IP
    iscsiadm -m node --login all

One way to put the sessions in a broken state is to simply reboot the target server. Another is to reject iSCSI packets for a short interval, e.g. by running:

    iptables -A INPUT -p tcp --dport 3260 -j REJECT; sleep 30; iptables -D INPUT -p tcp --dport 3260 -j REJECT

Actual results:

The vast majority of iSCSI block devices on the initiator go from "running" into a "blocked" state (as per '/sys/block/sd*/device/state'), and after a while reach "transport-offline". Trying to use the "iscsiadm -m session -P3" command hangs with the following output:

    [root@s26 ~]# iscsiadm -m session -P3
    iSCSI Transport Class version 2.0-870
    version 6.2.1.4-1
    Target: iqn.2022-10.com.example:tgt1 (non-flash)
        Current Portal: 10.1.7.25:3260,1
        Persistent Portal: 10.1.7.25:3260,1
        **********
        Interface:
        **********
        Iface Name: default
        Iface Transport: tcp
        Iface Initiatorname: iqn.2022-10.com.example:s26
        Iface IPaddress: 10.1.7.26
        Iface HWaddress: default
        Iface Netdev: default
        SID: 1

When running the above command with strace, it seems to get stuck polling for a response:

    socket(AF_UNIX, SOCK_STREAM, 0) = 3
    connect(3, {sa_family=AF_UNIX, sun_path=@"ISCSIADM_ABSTRACT_NAMESPACE"}, 30) = 0
    write(3, "\r\0\0\0\0\0\0\0\1\0\0\0\0[...]", 16104) = 16104
    poll([{fd=3, events=POLLIN}], 1, 1000) = 0 (Timeout)

Increasing the 'node.session.timeo.replacement_timeout' parameter in /etc/iscsi/iscsid.conf might allow for some devices to return back to a 'running' state (and they can be used as normal), but still leaves the system in an overall broken state.

Expected results:

The iSCSI sessions should either recover, or at least be able to be manually reconnected by doing a logout & login.
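The symptoms above are read from two places: the per-device state files under /sys/block and the replacement timeout in iscsid.conf. The following is a minimal sketch of read-only helpers for checking both, not part of the original report. The function names and the optional path arguments are illustrative assumptions, and the config parser assumes the usual "key = value" layout of iscsid.conf.

```shell
#!/bin/sh
# Read-only helpers for inspecting the symptoms described in this report.
# Function names and the optional path arguments are hypothetical.

# Print "<state> <count>" for every distinct value found in
# <base>/sd*/device/state (default base: /sys/block).
count_device_states() (
    base="${1:-/sys/block}"
    for f in "$base"/sd*/device/state; do
        # An unmatched glob stays literal; skip anything unreadable.
        [ -r "$f" ] || continue
        cat "$f"
    done | sort | uniq -c | awk '{ print $2, $1 }'
)

# Print the last uncommented node.session.timeo.replacement_timeout
# value in the given file (default: /etc/iscsi/iscsid.conf).
# Assumes one "key = value" setting per line.
get_replacement_timeout() (
    conf="${1:-/etc/iscsi/iscsid.conf}"
    [ -r "$conf" ] || exit 1
    awk -F= '
        /^[ \t]*node\.session\.timeo\.replacement_timeout[ \t]*=/ {
            gsub(/[ \t]/, "", $2)
            value = $2
        }
        END { if (value != "") print value }
    ' "$conf"
)
```

Calling count_device_states with no argument on the initiator, before and after the target restart, would show how many devices moved from "running" to "blocked" or "transport-offline". Neither helper talks to iscsid, so they should stay usable even while 'iscsiadm -m session -P3' hangs.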

Release Notes

No. of Comments

Resolution

Won't Do

Change history

2025-03-15 Added: 8.6.0

Links

Relevant Products

Affected versions: 8.6.0

Fixed versions: No known fixed versions

Top Red Hat Defects

9.3 Defect ID: RHEL-90416
Autoregv2: Cloud Images do not have immediate access to content from CDN
9.3 Defect ID: RHEL-17164
System cannot boot when usr is a separate file system with latest systemd-219-78.el7_9.8
9.3 Defect ID: RHEL-71547
glibc: Fix transliteration regression in iconv tool
9.2 Defect ID: RHEL-105250
Fix redhat-cloud-client-configuration to support the new autoregistration v2 flow
9.2 Defect ID: RHEL-80090
Upgrade libseccomp on both CentOS 9 & 10
