...
This issue is present in all OneFS versions released before the fix was developed. It can impact any application that uses NFSv4 and locking operations. It has most often been observed with "Message Queue" (MQ) applications such as ActiveMQ or OpenMQ, because they make extensive use of NFSv4 locking mechanisms. For MQ applications, it often manifests as a Secondary MQ server taking over for the Primary when it should not. This causes outages or data inconsistencies that require manual intervention on the application side to resolve.

When observed in packet captures, the first client locks a file successfully while a second client attempts to access it. When the node the locking client is connected to reboots, the second client is granted the lock. The first client receives an error when its lock ends, since it no longer holds the lock.

The issue can be reproduced as follows:
1. On the first client, mount an export using NFSv4.0.
2. On the second client, mount the same export using NFSv4.0.
3. On the first client, lock a file for 5 minutes.
4. On the second client, start a loop attempting to lock the same file.
5. Reboot the OneFS node that the first client is connected to.
6. If the issue is present, the second client is granted a lock before the first client's lock ends.
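The two locking steps of the reproduction (holding a lock for 5 minutes on the first client, and looping lock attempts on the second) could be sketched as follows. This is a minimal illustration, not a vendor-supplied test: it assumes Linux clients, where POSIX byte-range locks taken through Python's `fcntl.lockf` on an NFSv4 mount are sent to the server as NFSv4 lock operations. The function names `hold_lock`, `try_lock`, and `poll_lock` and the path in the usage note are hypothetical.

```python
import errno
import fcntl
import time


def hold_lock(path, seconds):
    """First client: take an exclusive lock on path and hold it for `seconds`."""
    with open(path, "a+") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until the lock is granted
        print(time.ctime(), "lock acquired", flush=True)
        time.sleep(seconds)
        # Per the description above, the first client may see an error
        # around this point if its lock was given away during the reboot.
        fcntl.lockf(f, fcntl.LOCK_UN)
        print(time.ctime(), "lock released", flush=True)


def try_lock(path):
    """Return True if a non-blocking exclusive lock on path is granted."""
    with open(path, "a+") as f:
        try:
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # EACCES/EAGAIN mean another owner holds the lock.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False
            raise  # anything else is a real I/O problem, not a lock conflict
        fcntl.lockf(f, fcntl.LOCK_UN)
        return True


def poll_lock(path, interval=1.0, attempts=None):
    """Second client: retry until the lock is granted.

    Returns the attempt number on which the lock was granted, or None if
    `attempts` tries were made without success (None means retry forever).
    """
    n = 0
    while attempts is None or n < attempts:
        n += 1
        if try_lock(path):
            print(time.ctime(), f"lock granted on attempt {n}", flush=True)
            return n
        time.sleep(interval)
    return None
```

For example, the first client would call `hold_lock("/mnt/export/lockfile", 300)` and the second client would call `poll_lock("/mnt/export/lockfile")`, where `/mnt/export` is an assumed mount point for the shared NFSv4.0 export. If the second client reports a granted lock before the first client's 5 minutes have elapsed, the behavior matches the issue described here.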
There was an error in our logic for moving NFSv4 connections to other nodes that caused the locks to be released on node reboot.
Fix: Upgrade or patch to one of the following versions of OneFS: 9.1.0.19+, 9.2.1.12+, 9.4.0.3+, 9.5.0.0+
Workaround: There are no workarounds for this issue.