Info
In ReplicationCoordinatorImpl::_scheduleNextLivenessUpdate_inlock(), we do not schedule a new liveness update if the nextTimeout would be in the past. This is wrong; we should schedule an immediate liveness update in that case.
One scenario: we have just run our liveness check and the earliest live member was only barely fresh ("almost stale"), so we do nothing. A short time passes before we schedule the next check, and by then that member is stale, so the next timeout is already in the past. No new check is scheduled, and liveness checks stop altogether.
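The scenario above can be sketched in isolation. This is a minimal illustration, not the actual ReplicationCoordinatorImpl code: the names Schedule, nextDeadline, buggySchedule and fixedSchedule are hypothetical, time is modeled as plain millisecond counts, and the exact comparison used by the real code is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Millis = std::chrono::milliseconds;

// Hypothetical result type: whether a liveness check was scheduled, and after what delay.
struct Schedule {
    bool scheduled;
    Millis delay;
};

// Deadline at which the least recently updated live member becomes stale.
Millis nextDeadline(Millis earliestUpdate, Millis timeout) {
    assert(timeout > Millis{0});  // a non-positive timeout makes no sense here
    return earliestUpdate + timeout;
}

// Behavior described in the ticket: if the deadline is already in the past,
// nothing is scheduled, so liveness checking silently stops.
Schedule buggySchedule(Millis now, Millis earliestUpdate, Millis timeout) {
    const Millis deadline = nextDeadline(earliestUpdate, timeout);
    if (deadline <= now) {
        return {false, Millis{0}};
    }
    return {true, deadline - now};
}

// Proposed behavior: a deadline in the past is clamped to a zero delay,
// so the check runs immediately instead of being skipped.
Schedule fixedSchedule(Millis now, Millis earliestUpdate, Millis timeout) {
    const Millis deadline = nextDeadline(earliestUpdate, timeout);
    return {true, std::max(deadline - now, Millis{0})};
}
```

With a 10-second timeout and a member last heard from at time 0, both versions schedule a check while the current time is before the 10-second mark. Once the current time passes it, only the clamped version still schedules one, with zero delay.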
Top User Comments
xgen-internal-githook commented on Wed, 15 Nov 2017 15:55:22 +0000:
Author: Judah Schvimer (judahschvimer, judah@mongodb.com)
Message: SERVER-29937 Make sure liveness timeouts cannot be missed
(cherry picked from commit f1bf0b33b4f1ce7bb50f208ef5e2d736ef5eba68)
Branch: v3.4
https://github.com/mongodb/mongo/commit/e267cc9db06685424a3b8e074b5aeedc95746e87
xgen-internal-githook commented on Mon, 30 Oct 2017 16:32:58 +0000:
Author: Judah Schvimer (judahschvimer, judah@mongodb.com)
Message: SERVER-29937 Make sure liveness timeouts cannot be missed
(cherry picked from commit f1bf0b33b4f1ce7bb50f208ef5e2d736ef5eba68)
Branch: v3.2
https://github.com/mongodb/mongo/commit/6fdbdf619aed482bbe24ac3c27f8d4a9700a5937
ramon.fernandez commented on Fri, 15 Sep 2017 17:11:40 +0000:
Author: Judah Schvimer (judahschvimer, judah@mongodb.com)
Message: SERVER-29937 Make sure liveness timeouts cannot be missed
Branch: master
https://github.com/mongodb/mongo/commit/f1bf0b33b4f1ce7bb50f208ef5e2d736ef5eba68