...
Description of problem:

The "sockperf pingpong multicast pkey/vlan" test case consistently fails when libvma is tested on MLX4 IB0 devices.

vma test results on rdma-virt-00/rdma-virt-01 & Beaker job J:7282087:
4.18.0-438.el8.x86_64, rdma-core-41.0-1.el8, mlx4, ib0, ConnectX-3 Pro & mlx4_0

    Result | Status | Test
-------------------------------------------------
      PASS |      0 | sockperf pingpong multicast
      PASS |      0 | sockperf throughput multicast
      PASS |      0 | sockperf throughput unicast
      PASS |      0 | sockperf pingpong unicast
      PASS |      0 | sockperf (100 sockets) pingpong multicast
      PASS |      0 | sockperf (100 sockets) pingpong unicast
      FAIL |      1 | sockperf pingpong multicast pkey/vlan <<<============
      PASS |      0 | sockperf pingpong unicast pkey/vlan

Checking for failures and known issues:
  sockperf pingpong multicast pkey/vlan is NOT a known issue on any environment - consider filing a BZ

+++++++++++++++++++++++++

This is a regression: with RHEL-8.7.0, the same test passes.

vma test results on rdma-virt-00/rdma-virt-01 & Beaker job J:7282212:
4.18.0-425.3.1.el8.x86_64, rdma-core-41.0-1.el8, mlx4, ib0, ConnectX-3 Pro & mlx4_0

    Result | Status | Test
-------------------------------------------------
      PASS |      0 | sockperf pingpong multicast
      PASS |      0 | sockperf throughput multicast
      PASS |      0 | sockperf throughput unicast
      PASS |      0 | sockperf pingpong unicast
      PASS |      0 | sockperf (100 sockets) pingpong multicast
      PASS |      0 | sockperf (100 sockets) pingpong unicast
      PASS |      0 | sockperf pingpong multicast pkey/vlan <<<=====
      PASS |      0 | sockperf pingpong unicast pkey/vlan

Checking for failures and known issues:
  no test failures

Version-Release number of selected component (if applicable):

Clients: rdma-virt-01
Servers: rdma-virt-00

DISTRO=RHEL-8.8.0-20221120.2

+ [22-11-27 12:50:17] cat /etc/redhat-release
Red Hat Enterprise Linux release 8.8 Beta (Ootpa)

+ [22-11-27 12:50:17] uname -a
Linux rdma-virt-01.rdma.lab.eng.rdu2.redhat.com 4.18.0-438.el8.x86_64 #1 SMP Mon Nov 14 13:08:07 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

+ [22-11-27 12:50:17] cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-438.el8.x86_64 root=UUID=7a33441c-1f35-4485-a8ce-fb7f97d8ec9b ro intel_idle.max_cstate=0 processor.max_cstate=0 intel_iommu=on iommu=on console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=38129101-96b6-4039-9d07-de4355860d19 console=ttyS1,115200n81

+ [22-11-27 12:50:17] rpm -q rdma-core linux-firmware
rdma-core-41.0-1.el8.x86_64
linux-firmware-20220726-110.git150864a4.el8.noarch

+ [22-11-27 12:50:17] tail /sys/class/infiniband/mlx4_0/fw_ver /sys/class/infiniband/mlx4_1/fw_ver
==> /sys/class/infiniband/mlx4_0/fw_ver <==
2.42.5000

==> /sys/class/infiniband/mlx4_1/fw_ver <==
2.42.5000

+ [22-11-27 12:50:17] lspci
+ [22-11-27 12:50:17] grep -i -e ethernet -e infiniband -e omni -e ConnectX
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
03:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
03:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
04:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
04:00.1 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.2 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.3 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.4 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.5 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.6 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
04:00.7 Network controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
06:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

+ [22-11-27 12:50:18] rpm -q libvma
libvma-9.6.4-1.el8.x86_64

How reproducible:

100%

Steps to Reproduce:

On both server and client hosts:

+ [22-11-27 12:58:15] export SERVER_IP2=172.31.2.200
+ [22-11-27 12:58:15] SERVER_IP2=172.31.2.200

1. Prepare both the RDMA server and client hosts for the libvma test with the current sockperf, using the RHEL-8.8 build specified above on MLX4 IB0 devices.

2. On the server host, issue the following command:

LD_PRELOAD=libvma.so timeout --preserve-status --kill-after=5m 3m sockperf server -i 172.31.2.200

3. On the client host, issue the following command:

LD_PRELOAD=libvma.so timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472

Actual results:

 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [22-11-27 13:00:53] LD_PRELOAD=libvma.so
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [22-11-27 13:00:53] timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.10-0.git5ebd327da983 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
[ 0] IP = 172.31.2.200 PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: No messages were received from the server. Is the server down?
+ [22-11-27 13:01:07] result=0
+ [22-11-27 13:01:07] '[' 0 -ne 0 ']'
+ [22-11-27 13:01:07] grep -qi -e ' error ' -e 'no messages were received' /tmp/vma.txt
+ [22-11-27 13:01:07] return 1

Expected results:

On RHEL-8.7.0, the same test case produces the following output:

 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [22-11-27 16:42:52] LD_PRELOAD=libvma.so
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: date ++ [%y-%m-%d %H:%M:%S]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
+ [22-11-27 16:42:52] timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: timeout --preserve-status --kill-after=5m 3m sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.6.4-1 Release built on Aug 17 2022 14:23:56
 VMA INFO: Cmd Line: sockperf pp -i 172.31.2.200 -t 10 --msg-size=1472
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level INFO [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.10-0.git5ebd327da983 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
[ 0] IP = 172.31.2.200 PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=10.000 sec; Warm up time=400 msec; SentMessages=1785444; ReceivedMessages=1785443
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=9.550 sec; SentMessages=1711702; ReceivedMessages=1711702
sockperf: ====> avg-latency=2.772 (std-dev=0.286, mean-ad=0.081, median-ad=0.036, siqr=0.028, cv=0.103, std-error=0.000, 99.0% ci=[2.771, 2.773])
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 2.772 usec
sockperf: Total 1711702 observations; each percentile contains 17117.02 observations
sockperf: ---> <MAX> observation = 81.937
sockperf: ---> percentile 99.999 = 55.007
sockperf: ---> percentile 99.990 = 4.831
sockperf: ---> percentile 99.900 = 4.301
sockperf: ---> percentile 99.000 = 3.303
sockperf: ---> percentile 90.000 = 2.901
sockperf: ---> percentile 75.000 = 2.763
sockperf: ---> percentile 50.000 = 2.728
sockperf: ---> percentile 25.000 = 2.706
sockperf: ---> <MIN> observation = 2.621
+ [22-11-27 16:43:06] result=0
+ [22-11-27 16:43:06] '[' 0 -ne 0 ']'
+ [22-11-27 16:43:06] grep -qi -e ' error ' -e 'no messages were received' /tmp/vma.txt
+ [22-11-27 16:43:06] return 0
+ [22-11-27 16:43:06] RQA_check_result -r 0 -t 'sockperf pingpong multicast pkey/vlan'

Additional info:
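The pass/fail decision visible in the shell trace above can be sketched as a small function. This is a reconstruction from the trace only, not the test harness source: the function name `check_vma_log`, its arguments, and the behaviour on a non-zero exit status are assumptions made for illustration.

```shell
# Hypothetical reconstruction of the check seen in the trace.
# $1: exit status of the sockperf command
# $2: path to the captured output (the trace uses /tmp/vma.txt)
check_vma_log() {
    local result=$1
    local log=$2

    # Assumption: a non-zero exit status fails the test outright.
    # The trace only shows the result=0 path.
    if [ "$result" -ne 0 ]; then
        return 1
    fi

    # From the trace: fail when the log contains " error " or
    # "no messages were received" (case-insensitive match).
    if grep -qi -e ' error ' -e 'no messages were received' "$log"; then
        return 1
    fi

    return 0
}
```

Under this reading, the failing run reports result=0 yet still returns 1 because sockperf exits cleanly while its "No messages were received from the server" line matches the grep pattern.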
Won't Do