...
The subject replica set has 3 nodes (see rs.conf() below):

t1 IP address is 10.3.1.12
t2 IP address is 10.3.1.13
t3 IP address is 10.3.1.16

After a transient network failure (switch ports were disabled and then re-enabled) on the secondary (t3), it became primary, causing rollbacks on the previous primary (t1) and the other secondary (t2). All writes are done with w:majority, so this is really strange. Logs from all three machines are attached.

rs.conf():

{
    "_id" : "driveFS-temp-1",
    "version" : 4,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : false,
    "members" : [
        {
            "_id" : 0,
            "host" : "t1.s1.fs.drive.bru:27231",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : { },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "t2.s1.fs.drive.bru:27231",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : { },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "t3.s1.fs.drive.bru:27231",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : { },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 5000,
        "catchUpTimeoutMillis" : 2000,
        "getLastErrorModes" : { },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("58c9657b40aba377920b23f2")
    }
}
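Note that the settings above lower electionTimeoutMillis to 5000 ms (the server default is 10000 ms), so a few seconds of lost heartbeats are enough for a member such as t3 to call an election. A minimal mongo-shell sketch of restoring the default, assuming it is run against the current primary:

    // Run on the current primary of driveFS-temp-1.
    var cfg = rs.conf();
    // Raise the election timeout back to the 10-second default so that
    // short switch-port flaps are less likely to trigger a failover.
    cfg.settings.electionTimeoutMillis = 10000;
    // The rs.reconfig() helper bumps the config version itself.
    rs.reconfig(cfg);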
onyxmaster commented on Mon, 1 May 2017 15:47:10 +0000:

Thank you for the information. I was more surprised that the election allowed a secondary to be elected as primary while the primary was available and connected to the other secondary. Well, since this preserves the acknowledged majority writes, it's okay.

thomas.schubert commented on Mon, 1 May 2017 02:37:17 +0000:

Hi onyxmaster,

After reviewing the logs, there is no indication of a bug during this failover. While w: majority guarantees that acknowledged writes will not be rolled back, writes issued with this write concern that have not yet been acknowledged to the application are liable to be rolled back on failover. In this case, it appears that the writes were completed on the secondary, but the rest of the replica set (and the application, by extension) was not yet aware that these writes had been completed. Consequently, the secondary and the old primary rolled back on failover.

Kind regards,
Thomas

thomas.schubert commented on Wed, 19 Apr 2017 14:25:42 +0000:

Hi onyxmaster,

Thank you for the detailed report and logs. We're investigating this behavior and will update this ticket after we've finished reviewing the logs.

Kind regards,
Thomas
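To make the explanation above concrete: w: "majority" only protects writes that the application has actually seen acknowledged. A minimal mongo-shell sketch of the window thomas.schubert describes, using a hypothetical files collection and document:

    try {
        db.files.insertOne(
            { _id: "chunk-42", node: "t3" },  // hypothetical document
            { writeConcern: { w: "majority", wtimeout: 5000 } }
        );
        // Only once this call returns is the write guaranteed
        // not to be rolled back on failover.
    } catch (e) {
        // On a network error or wtimeout the outcome is unknown: the write
        // may exist on some members without a majority acknowledgment, and
        // such a write is still eligible for rollback after an election.
        print("write outcome unknown: " + e);
    }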