The vWitness daemon is shutting down every few days. From the storvwlsd logs:

<Error> [1089 vwlsListen] pdsThrdCreate #3008 : Error Creating Thread: [Errno: 11 - Resource temporarily unavailable]
<Error> [1089 vwlsListen] : [vwlsListen()] pdsThrdCreate() error. rc 700002011 ([PDS/Thrd] Error while trying to create a thread [Errno: 11 - Resource temporarily unavailable])
<Error> [1089 0005979xxxxx_MGMT-0] pdsSSLRead #3382 : SSL_read() failed: [Errno: 104 - Connection reset by peer]
<Error> [1089 0005979xxxxx_MGMT-0] pdsIpcRecvMsg #3269 : Error during receive, [fd=507], handle=0x7fadfc4e9930, nRc=700004012
<Error> [1089 0005979xxxxx_MGMT-0] : [vwlsConn()] pdsIpcRecvMsg() failed rc 700004012 ([PDS/Sock] recv/recvfrom() failed [Errno: 104 - Connection reset by peer])
[1089 0005979xxxxx_MGMT-0] : [vwlsConn()] internal error pdsIpcRecvMsg, rc 700004012 ([PDS/Sock] recv/recvfrom() failed [Errno: 104 - Connection reset by peer])
[1089 0005979xxxxx_MGMT-0] : [vwlsConn()] connection to vWMD at ::ffff:xxx.xxx.xxx.xxx (::ffff:xxx.xxx.xxx.xxx) terminating
[1089 Shutdown] : [daemonInst_shutdownCB()] storvwlsd shutting down

Similar errors are seen on the MGMT containers in the storvwmd log every hour:

<Error> [851 vWitnessName] pdsSSLRead #3382 : SSL_read() failed: [Errno: 104 - Connection reset by peer]
<Error> [851 vWitnessName] pdsIpcRecvMsg #3269 : Error during receive, [fd=9], handle=0x7f2e6c0008c0, nRc=700004012
<Error> [851 vWitnessName] : [pingVWLS()] Failed to receive PING response from vWitness vWitnessName: [PDS/Sock] recv/recvfrom() failed [Errno: 104 - Connection reset by peer]
<Warn > [851 vWitnessName] : [pingVWLS()] Connection to vWitness vWitnessName is now closed
<Dbg > [851 vWitnessName] : [attemptConnect()] Connecting to vWitness vWitnessName @ xxx.xxx.xxx.xxx:10123
[851 vWitnessName] : [attemptConnect()] Successfully reconnected to vWitness vWitnessName, UID: TEQbDim5a0Rh

/var/log/messages shows:

localhost kernel: [457290.134557] cgroup: fork rejected by pids controller in /system.slice/emc_storvwlsd.service
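The kernel message above can be confirmed from any saved kernel log. As a minimal sketch, the filter below reads from stdin so it works against `journalctl -k`, /var/log/messages, or an archived log file (the grep pattern simply matches the message format quoted above):

```shell
# Sketch: filter a kernel log stream for pids-controller fork rejections
# that hit the emc_storvwlsd cgroup.
find_fork_rejections() {
  grep -E "cgroup: fork rejected by pids controller in .*emc_storvwlsd"
}

# Example usage (the input line mirrors the /var/log/messages entry above):
printf '%s\n' \
  'localhost kernel: [457290.134557] cgroup: fork rejected by pids controller in /system.slice/emc_storvwlsd.service' \
  | find_fork_rejections
```

Any lines this prints confirm that the kernel refused to create new tasks for the service because its pids-controller limit was already reached.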
The customer's environment is configured to drop network connections after they have been open for one hour. Each dropped connection is reset and re-established, and each new connection creates a new task in the storvwlsd daemon, so the task count gradually increases. The same one-hour drops are seen on SSH connections to the vApp and on Secure Remote Services (SRS) connections to the arrays.

The growing task count is visible in systemctl:

# systemctl status emc_storvwlsd
emc_storvwlsd.service - LSB: EMC Solutions Enabler Witness Lock Service Daemon
   Loaded: loaded (/etc/init.d/emc_storvwlsd; bad; vendor preset: disabled)
   Active: active (running) since Fri 2022-04-01 12:02:33 -03; 2 days ago
     Docs: man:systemd-sysv-generator(8)
  Process: 873 ExecStart=/etc/init.d/emc_storvwlsd start (code=exited, status=0/SUCCESS)
    Tasks: 251 (limit: 512)
   CGroup: /system.slice/emc_storvwlsd.service
           └─1076 storvwlsd start -name storvwlsd

The daemon reports the same counts:

# stordaemon action storvwlsd -cmd list -stats

storvwlsd Statistics:
    # running threads             : 321
    # thread pools                : 2
    # active Mutex vars           : 101
    # active CondVars             : 11
    # active RW-locks             : 1
    # open IPC channels           : 322
    # active sockets ipV4 (total) : 320
    # active sockets (secure)     : 316
    # files open                  : 3
    # Page Faults                 : 64
    Proc Size (KB)                : 908500

Once the number of tasks reaches the limit of 512, the daemon can no longer acquire the resources required to create a thread for a new connection, and the service shuts down.
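To watch the leak in progress, the daemon's task (thread) entries under /proc can be counted directly. A minimal sketch, assuming a Linux /proc filesystem; the PID 1076 in the systemctl output above is specific to that system, so substitute the live PID (for example from `pidof storvwlsd`):

```shell
# Sketch: count the tasks (threads) of a process via /proc/<pid>/task.
# Running this repeatedly (e.g. under watch) shows the count climbing
# toward the 512-task cgroup limit.
task_count() {
  pid=$1
  ls "/proc/$pid/task" | wc -l
}

# Example usage against the current shell; on the vApp, pass the
# storvwlsd PID instead:
task_count $$
```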
The customer must work with their network team to identify what is dropping the connections every hour. In this case, vApps on different subnets were affected as well, suggesting that something in the routing path was triggering the disconnects.
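When a packet capture is not practical, the hourly drops can be corroborated by periodically snapshotting the established connections on the vWitness port. A minimal sketch, assuming `ss` is available and that port 10123 (seen in the attemptConnect log line above) is the vWitness listener:

```shell
# Sketch: take a timestamped snapshot of established connections on the
# vWitness port (10123). Comparing snapshots taken once a minute shows
# whether connections consistently die at the 60-minute mark.
snapshot_connections() {
  echo "=== $(date '+%F %T') ==="
  # Filter syntax may need adjusting on older distributions
  # (or fall back to: netstat -tn | grep 10123)
  ss -tn state established '( dport = :10123 or sport = :10123 )' 2>/dev/null || true
}

# Example usage: append a snapshot every minute, then diff across a reset:
#   while true; do snapshot_connections >> /tmp/vwitness_conns.log; sleep 60; done
```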