...
Hi, this was witnessed twice, with the events happening in the following order:

1. Primary (A) is down.
2. Secondary (B) detects this and starts an election.
3. It wins the dry-run election and then, on every subsequent election timeout (10s), prints the following; this goes on until it is restarted:
[ReplicationExecutor] Not starting an election, since we are not electable due to: Not standing for election again; already candidate
4. A is started. It runs an election and becomes primary.
5. B is restarted. In one instance it took a long time to shut down, and in the other it had to be SIGKILLed.
6. A is shut down.
7. B detects this, starts an election, and becomes primary.

Relevant logs from node A for steps 1-4:

2017-08-17T05:32:08.534-0700 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
2017-08-17T05:32:08.534-0700 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
2017-08-17T05:32:08.534-0700 I REPL [ReplicationExecutor] VoteRequester(term 8 dry run) received a yes vote from nodeB:27017; response message: { term: 8, voteGranted: true, reason: "", ok: 1.0 }
2017-08-17T05:32:08.534-0700 I REPL [ReplicationExecutor] dry election run succeeded, running for election
2017-08-17T05:32:08.621-0700 I REPL [ReplicationExecutor] VoteRequester(term 9) received a yes vote from node-arbiter:27022; response message: { term: 9, voteGranted: true, reason: "", ok: 1.0 }
2017-08-17T05:32:08.621-0700 I REPL [ReplicationExecutor] election succeeded, assuming primary role in term 9

Relevant logs from node B for steps 1-4:

2017-08-17T05:18:32.066-0700 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
2017-08-17T05:18:32.066-0700 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
2017-08-17T05:18:32.068-0700 I REPL [ReplicationExecutor] VoteRequester(term 7 dry run) failed to receive response from nodeA:27017: HostUnreachable: Connection refused
2017-08-17T05:18:32.069-0700 I REPL [ReplicationExecutor] VoteRequester(term 7 dry run) received a yes vote from node-Arbiter:27022; response message: { term: 7, voteGranted: true, reason: "", ok: 1.0 }
2017-08-17T05:18:32.069-0700 I REPL [ReplicationExecutor] dry election run succeeded, running for election
2017-08-17T05:18:42.441-0700 I REPL [ReplicationExecutor] Not starting an election, since we are not electable due to: Not standing for election again; already candidate
2017-08-17T05:18:53.473-0700 I REPL [ReplicationExecutor] Not starting an election, since we are not electable due to: Not standing for election again; already candidate
2017-08-17T05:19:04.837-0700 I REPL [ReplicationExecutor] Not starting an election, since we are not electable due to: Not standing for election again; already candidate
...

Note that, although node B didn't log any VoteRequester messages after the dry run, it did increase the election term from 7 to 8. This can be confirmed by the term number node A later used for its own dry-run election, 8. Assuming that nothing went wrong in starting the election, I believe this loop can happen when a node waits indefinitely for responses to its vote requests after starting a real election. I couldn't find any existing information related to this.
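To make the hypothesis above concrete, here is a minimal, purely illustrative Python sketch (not MongoDB source; all names are made up) of how a node could get wedged as a candidate: the dry run succeeds, the real election hangs indefinitely, and every subsequent election timeout is skipped because the node is still marked as a candidate.

import threading
import time

class Candidate:
    def __init__(self):
        self.is_candidate = False

    def on_election_timeout(self):
        # Fires every electionTimeoutMillis (10s in the reported logs).
        if self.is_candidate:
            print("Not starting an election; already candidate")
            return
        self.is_candidate = True
        print("dry election run succeeded, running for election")
        threading.Thread(target=self._run_real_election, daemon=True).start()

    def _run_real_election(self):
        # Hypothetical hang: the real election never completes (for example,
        # blocked while persisting the last-vote document), so is_candidate
        # is never cleared and every later timeout becomes a no-op.
        threading.Event().wait()      # stand-in for an un-acquirable lock
        self.is_candidate = False     # never reached

node = Candidate()
for _ in range(3):
    node.on_election_timeout()
    time.sleep(0.1)

Running this prints the dry-run line once and then only "already candidate" messages, which mirrors the loop seen in node B's log.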
matthew.russotto commented on Thu, 11 Jul 2019 19:15:30 +0000:
This makes sense; WiredTigerRecordStore::reclaimOplog never releases the IX lock on "local" as long as there is work to do.

bartle commented on Mon, 22 Apr 2019 04:51:54 +0000:
I think I figured out what happened. These particular replsets were being run on EBS volumes created from EBS snapshots, and the oplog was also truncated (via replSetResizeOplog). The latter would have caused the background oplog thread to start pruning oplog entries. Obviously, that process takes a lock (IX) on "local", and it appears that reading local.system.replset also happens relatively frequently, which also takes a lock (IX) on "local". With pv1, every time an election is called, the "local.replset.election" collection is written, which takes an X lock on "local". I believe that latter operation was being blocked by all the other IX locks on "local", and by the time the X lock was acquired, the election timers had expired. I suspect that if I had waited long enough, the background thread might have finished chewing through the oplog.

bartle commented on Sun, 21 Apr 2019 05:21:55 +0000:
Oh, and the situation we saw this in was a 3-node replset where at least 1 member was down initially. However, even when all members eventually come back up, they're still unable to elect a primary because it looks like 2 out of 3 nodes end up in the wedged state.

bartle commented on Sun, 21 Apr 2019 04:32:16 +0000:
We recently ran into this issue (on 3.4.20). I think the deadlock theory is on the right track, since running currentOp indicates that "replExecDBWorker-1" is stuck waiting for a lock while holding an X lock on "local"; calling "show collections" and "show dbs" also hangs, presumably because they can't get an IX lock on "local". I couldn't figure out what was actually holding the lock, but I can try attaching gdb next time and dumping all thread stacks. I might be misreading currentOp, but I think it's the case that if "waitingForLock" is true, there isn't any indication of which lock an operation is waiting on? i.e. the "locks" document only contains locks held, and not those being waited on? It'd probably be good to surface the lock being waited on as well.

william.schultz commented on Tue, 16 Jan 2018 15:56:13 +0000:
Closing this for now, since I was unable to reproduce the issue. If it occurs again, we can possibly re-open it and investigate further.

william.schultz commented on Mon, 18 Dec 2017 19:23:41 +0000:
My suspicion is that there may be a deadlock on node5-2 after the dry run election succeeds, causing the thread that is going to start the real election to hang. Election timeouts keep firing and being serviced by the ReplicationExecutor, causing us to see the "Not running for election" messages repeatedly, but the election thread is hung, so we indefinitely remain a candidate. I have not figured out the root cause yet, but it may be related to how ReplicationCoordinatorExternalStateImpl::storeLocalLastVoteDocument takes locks.

william.schultz commented on Mon, 18 Dec 2017 18:32:12 +0000:
shredder12 Is this issue easily reproducible, or has it only been seen rarely?

william.schultz commented on Thu, 14 Sep 2017 20:33:33 +0000:
I did look into this briefly, but have not come up with a conclusive diagnosis yet.
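For reference, a small pymongo sketch of the currentOp-based check bartle describes above, listing in-progress operations that report waitingForLock: true. It assumes MongoDB 3.6+, where currentOp is available as a server command (on 3.4 the db.currentOp() shell helper returns the same document), and the hostname is illustrative.

from pymongo import MongoClient

client = MongoClient("mongodb://node5-2:27017")   # illustrative hostname
inprog = client.admin.command("currentOp")["inprog"]

for op in inprog:
    if op.get("waitingForLock"):
        # As noted above, "locks" lists locks the operation already holds;
        # currentOp does not report which lock it is actually waiting for.
        print(op.get("opid"), op.get("desc"), op.get("locks"))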
shredder12 commented on Fri, 18 Aug 2017 21:26:07 +0000:
Hi Thomas, thank you for the quick response. Please find attached the logs for the replica set nodes during the incident:
node5-1.log (node5-1, aka nodeA)
node5-2.log (node5-2, aka nodeB)
node5-arb.log (node5-arb, the arbiter of the replica set)

thomas.schubert commented on Fri, 18 Aug 2017 02:28:57 +0000:
Hi shredder12, thanks for reporting this behavior. So we can better understand what is going on here, would you please provide the complete log files for each node in the replica set as well as the replica set configuration, e.g. the output of rs.conf()? Thank you, Thomas
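For anyone gathering the requested information programmatically rather than from the shell, a rough pymongo equivalent of rs.conf() and rs.status() might look like the sketch below (the connection string is illustrative).

from pymongo import MongoClient
from bson.json_util import dumps

client = MongoClient("mongodb://node5-1:27017")   # illustrative hostname

conf = client.admin.command("replSetGetConfig")["config"]   # rs.conf()
status = client.admin.command("replSetGetStatus")           # rs.status()

print(dumps(conf, indent=2))
for m in status["members"]:
    print(m["name"], m["stateStr"])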