...
The neighborhood status of a node shows it in failure domain 0 even though it is part of a multi-node cluster. For example:

Isilon-24# sysctl efs.lin.lock.initiator.coordinator_weights
efs.lin.lock.initiator.coordinator_weights:
{
    gen: 454
    failure_domains:
    {
        0 (down):
        {
            all_nodes: { 26 }
            up_nodes: { 26 }
            weights: [ 670000 ]
        }

"0 (down)" would normally indicate that a node is down, or split and part of domain zero (0). In this case, however, no nodes are down, as shown in the group statement below: Node-24 (devid-26) is up, online, and part of the group.

Isilon-24# isi_group_info
efs.gmp.group: (54) :{ 1-2:0-14, 3:0-6,8-15, 4:0-10,12-15, 5:0,2,4-16, 6-9:0-14, 10:0,2-7,9-16, 11-12:0-14, 13:0-9,11-15, 14:0-14, 15:0-1,3-15, 16-17:0-14, 18:1-15, 19-20,25-26:0-14, 27:0-1,3-12,15-17, 28:0-14, smb: 1-20,25-28, nfs: 1-20,25-28, all_enabled_protocols: 1-20,25-28, isi_cbind_d: 1-20,25-28, lsass: 1-20,25-28 }
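To check for this condition on a running cluster, the two commands above can be combined into a short shell check. The following is a minimal sketch, assuming root shell access on any node in the cluster; it only reports the symptom and changes nothing.

# Minimal check (assumption: run as root from any node in the cluster).
# Count how many failure domains the lock coordinator reports as "(down)".
DOWN_DOMAINS=$(sysctl efs.lin.lock.initiator.coordinator_weights | grep -c "(down)")

# Print the current GMP group for a manual comparison.
isi_group_info

# If a domain is flagged "(down)" while every node appears in the group above,
# the cluster is likely showing this defect rather than a real node outage.
if [ "$DOWN_DOMAINS" -gt 0 ]; then
    echo "Found $DOWN_DOMAINS failure domain(s) marked (down); compare against the group statement above."
fi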
This is a code defect, and the impact of this condition is low, almost negligible. This issue would not prevent an upgrade. The potential impact is that the affected node is not chosen as a Locking Framework (LKF) delegate (primary or secondary) when locks are taken on a resource. Three delegates are chosen for every lock, and delegates must be in the same neighborhood (failure domain) to avoid the silent lock loss (and corruption) that could occur if all LKF delegates were rebooted simultaneously.
No additional actions are necessary. The issue corrects itself when the first node splits out of the group during a parallel upgrade: the split triggers a recalculation of the failure domains, which moves the node from the default failure domain into failure domain 1.
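If confirmation is desired once the parallel upgrade has begun, the same sysctl shown earlier can be re-run. This is a hedged example only; the expected domain number (1) follows from the behavior described above and may differ on clusters with more neighborhoods.

# Re-check the coordinator weights after the first node has split out of the group
# during the parallel upgrade (assumption: run from any remaining node).
sysctl efs.lin.lock.initiator.coordinator_weights

# The previously affected node should now appear under a non-zero failure domain
# (for example, failure domain 1) instead of under "0 (down)".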