
OPERATIONAL DEFECT DATABASE
Description of problem:
When an iSCSI initiator has a large number of sessions (1024 in the example below) open with a target server, rebooting the target or setting its firewall to reject traffic for a short period (15-30s) leaves some or all of the iSCSI sessions in a broken state. They cannot be logged out or logged in again using 'iscsiadm': "Logging out of session …" messages are printed, but the session and block device state is not affected. The only way to clear the state is to reboot the initiator. 'iscsiadm' also hangs when trying to print more session information.

Version-Release number of selected component (if applicable):
iscsi-initiator-utils.x86_64 6.2.1.4-4.git095f59c
kernel.x86_64 4.18.0-372.26.1

How reproducible:
Easily reproducible on every attempt.

Steps to Reproduce:
Set up the target with the following script (it requires 1024*512M of space on /mnt, but the disk size can be reduced):

    #!/bin/bash
    [[ -z $TARGETS ]] && TARGETS=1024
    [[ -z $BASEDIR ]] && BASEDIR="/mnt"
    [[ -z $BASENAME ]] && BASENAME="iqn.2022-10.com.example"
    yum install -y targetcli
    firewall-cmd --permanent --add-port=3260/tcp
    firewall-cmd --reload
    cmds=""
    for tgt in $(seq "$TARGETS"); do
        disk="disk${tgt}"
        target="${BASENAME}:tgt${tgt}"
        cmds="${cmds}cd /backstores/fileio\n"
        cmds="${cmds}create disk${tgt} ${BASEDIR}/${disk} 512M\n"
        cmds="${cmds}cd /iscsi\n"
        cmds="${cmds}create ${target}\n"
        cmds="${cmds}cd /iscsi/${target}/tpg1/luns\n"
        cmds="${cmds}create /backstores/fileio/${disk}\n"
        cmds="${cmds}cd /iscsi/${target}/tpg1/acls\n"
        cmds="${cmds}create iqn.2022-10.com.example:s26\n"
    done
    echo -e "$cmds" | targetcli
    systemctl restart target

Create sessions on the initiator:

    iscsiadm -m discoverydb --type sendtargets --portal 10.1.7.25 --discover # replace 10.1.7.25 with target IP
    iscsiadm -m node --login all

One way to put the sessions in a broken state is to simply reboot the target server. Another is to reject iSCSI packets for a short interval, e.g. by running:

    iptables -A INPUT -p tcp --dport 3260 -j REJECT; sleep 30; iptables -D INPUT -p tcp --dport 3260 -j REJECT

Actual results:
The vast majority of iSCSI block devices on the initiator go from "running" into a "blocked" state (as per '/sys/block/sd*/device/state'), and after a while reach "transport-offline". The "iscsiadm -m session -P3" command hangs after printing the following output:

    [root@s26 ~]# iscsiadm -m session -P3
    iSCSI Transport Class version 2.0-870
    version 6.2.1.4-1
    Target: iqn.2022-10.com.example:tgt1 (non-flash)
        Current Portal: 10.1.7.25:3260,1
        Persistent Portal: 10.1.7.25:3260,1
        **********
        Interface:
        **********
        Iface Name: default
        Iface Transport: tcp
        Iface Initiatorname: iqn.2022-10.com.example:s26
        Iface IPaddress: 10.1.7.26
        Iface HWaddress: default
        Iface Netdev: default
        SID: 1

When the above command is run under strace, it appears to get stuck polling for a response:

    socket(AF_UNIX, SOCK_STREAM, 0) = 3
    connect(3, {sa_family=AF_UNIX, sun_path=@"ISCSIADM_ABSTRACT_NAMESPACE"}, 30) = 0
    write(3, "\r\0\0\0\0\0\0\0\1\0\0\0\0[...]", 16104) = 16104
    poll([{fd=3, events=POLLIN}], 1, 1000) = 0 (Timeout)

Increasing the 'node.session.timeo.replacement_timeout' parameter in /etc/iscsi/iscsid.conf might allow some devices to return to a "running" state (after which they can be used as normal), but it still leaves the system in an overall broken state.

Expected results:
The iSCSI sessions should either recover on their own, or at least be recoverable manually with a logout and login.
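To see how many devices sit in each of the states described under "Actual results", the per-device state files can be tallied. The following is a minimal sketch, not part of the original report: the function name count_states and the SYSFS_BLOCK override are hypothetical names introduced here for illustration, and the script assumes the '/sys/block/sd*/device/state' layout quoted above.

```shell
#!/bin/bash
# Sketch: count SCSI block devices per state ("running", "blocked",
# "transport-offline", ...) by reading <base>/sd*/device/state.
# count_states and SYSFS_BLOCK are hypothetical names, not iscsiadm features.
count_states() {
    base="${1:-/sys/block}"
    for state_file in "$base"/sd*/device/state; do
        # Skip the unexpanded glob when no sd* devices exist.
        if [ -r "$state_file" ]; then
            cat "$state_file"
        fi
    done | sort | uniq -c
}

# Print one line per state, prefixed with the number of devices in it.
count_states "${SYSFS_BLOCK:-/sys/block}"
```

Running this before and after the firewall rejection window should show devices moving out of "running" if the problem reproduces. Reading the sysfs files directly avoids going through 'iscsiadm', which the report says can hang in this condition.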
Won't Do