...
Because arbiters don't persist any data, they never replicate the admin.system.keys collection, which replica set members read from locally to load the keys used for clusterTime signing and validation. Currently, this issue is masked in our tests because clusterTimes aren't signed or validated when FCV is not fully upgraded to v3.6, and since arbiters also don't persist the admin.system.version collection, they never update their in-memory FCV and stay at v3.4. After we remove the FCV checks in SERVER-32463, non-__system users will no longer be able to communicate with arbiters in standalone replica sets with auth on, unless they have the advanceClusterTime privilege or don't gossip clusterTime. This shouldn't be a problem in sharded clusters, since keys are only persisted on the CSRS and are cached in memory on every other node in the cluster.

Implementation:

The simplest and most efficient way is not to install the logical clock if the node is an arbiter: https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/db.cpp#L780 However, at the time of this check the node has not yet processed its configuration and cannot tell whether it is an arbiter. Hence another approach is chosen.

Sharding part: jack.mulrow please ack.

1. Add
bool LogicalClock::isEnabled() const;
bool _isEnabled {true}; // initially enabled to allow normal RS initialization
void LogicalClock::setEnabled(bool);
Alternative: do not create LogicalClock optionally anymore. I prefer not to change more than needed, mostly because performance is hurt by taking the lock every time LogicalClock data is accessed.
2. Other LogicalClock public methods should invariant(_isEnabled); so there are no accidental calls to a disabled logical clock. Therefore the responsibility to check is on the caller.
3. Do not validate or advance logicalTime if the logical clock is not enabled: https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/rpc/metadata.cpp#L102
4. Do not append LogicalTime metadata if LogicalClock is not enabled at https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/service_entry_point_common.cpp#L264, https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/service_entry_point_common.cpp#L292

Replication part: judah.schvimer please ack.

If the node is a replica set arbiter member but its sharding state is not enabled, then the logical clock should not be enabled.

5. https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/db.cpp#L539 initializes the replication coordinator, so it can tell whether the current node is an arbiter. This is the place to disable the logical clock if the node is an arbiter.

Note: Monitor keys (https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/db/db.cpp#L527) happens before this line. I don't think it can be moved later, as keys may be needed for proper initialization of the oplog.
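A rough sketch of how steps 1 through 5 could fit together. This is not the server code: SketchLogicalClock, processGossipedTime and onReplCoordStartup are hypothetical names made up for illustration, cluster time is reduced to a plain integer, and assert stands in for the server's invariant. Only isEnabled(), setEnabled(bool) and the _isEnabled {true} default come from the plan above.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for LogicalClock; cluster time is a bare integer here.
class SketchLogicalClock {
public:
    // Step 1: enabled flag, true by default so normal RS initialization works.
    bool isEnabled() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _isEnabled;
    }

    void setEnabled(bool enabled) {
        std::lock_guard<std::mutex> lk(_mutex);
        _isEnabled = enabled;
    }

    // Step 2: every other public method asserts that the clock is enabled,
    // so the responsibility to check isEnabled() is on the caller.
    std::uint64_t getClusterTime() const {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_isEnabled);  // stands in for invariant(_isEnabled)
        return _clusterTime;
    }

    void advanceClusterTime(std::uint64_t newTime) {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_isEnabled);  // stands in for invariant(_isEnabled)
        if (newTime > _clusterTime) {
            _clusterTime = newTime;
        }
    }

private:
    mutable std::mutex _mutex;
    bool _isEnabled{true};
    std::uint64_t _clusterTime{0};
};

// Steps 3 and 4 (hypothetical caller): skip advancing the clock and skip
// appending cluster time to the reply when the clock is disabled.
// Returns true and fills *replyTime only when the clock is enabled.
bool processGossipedTime(SketchLogicalClock& clock,
                         std::uint64_t gossipedTime,
                         std::uint64_t* replyTime) {
    if (!clock.isEnabled()) {
        return false;  // no validation, no advance, no metadata in the reply
    }
    clock.advanceClusterTime(gossipedTime);
    *replyTime = clock.getClusterTime();
    return true;
}

// Step 5 (hypothetical hook): once the replication coordinator knows the
// node's role, disable the clock on arbiters.
void onReplCoordStartup(SketchLogicalClock& clock, bool isArbiter) {
    if (isArbiter) {
        clock.setEnabled(false);
    }
}
```

The sketch checks isEnabled() and then calls the clock in two separate lock acquisitions, which is only safe if the flag is not flipped concurrently. That matches the assumption that arbiter status is fixed at startup. If the status can change dynamically, the check and the call would need to be combined.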
misha.tyulenev commented on Mon, 16 Apr 2018 20:02:00 +0000: 3.6 is not affected by this issue only because of another bug / feature (the arbiter node "thinks" that it's 3.4 and skips cluster time processing). If that is fixed, it will expose this bug. Hence I think it's better to backport the fix in advance.

xgen-internal-githook commented on Fri, 13 Apr 2018 00:47:12 +0000:
Author: {'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}
Message: SERVER-32639 skip signing and validating clusterTime in arbiters
Branch: master
https://github.com/mongodb/mongo/commit/b7b55e75bbf18bcd7e38fdee430e0fd972183f68

judah.schvimer commented on Wed, 11 Apr 2018 15:48:10 +0000: siyuan.zhou, the code you linked prevents a direct reconfig from arbiter to non-arbiter. Is it possible to first reconfig a node out of a replica set and then reconfig it in as an arbiter, or reconfig an arbiter out of a replica set and then reconfig it in as a normal node, without a restart?

milkie commented on Wed, 11 Apr 2018 15:08:38 +0000: Arbiters do have storage (they store the replica set config in the local database), and user writes to the local db are allowed on nodes in ARBITER state. Not sure if that changes the solution possibilities here.

siyuan.zhou@10gen.com commented on Wed, 11 Apr 2018 08:27:50 +0000: judah.schvimer, I believe you cannot convert a secondary to an arbiter via reconfig without shutting down the server. The documentation confirms that: "You may operate the arbiter on the same port as the former secondary. In this procedure, you must shut down the secondary and remove its data before restarting and reconfiguring it as an arbiter." I'm not aware of places where the replica set uses the logical clock other than the oplog.

judah.schvimer commented on Tue, 10 Apr 2018 17:51:32 +0000: A reconfig cannot change arbiter status unless the node gets added as a new member. That is certainly a case you'll want to test (removing and adding the node back in as a new member).
I think you could do that without ever shutting the node down, but something I'm not thinking of may be preventing that. siyuan.zhou, any thoughts? And arbiters can accept writes to non-replicated collections. I don't think arbiters can participate in any causal relations meaningfully, though that may be something to document. I think disabling the clock after reading the config from disk and on receiving the first config via a heartbeat would be fine.

misha.tyulenev commented on Tue, 10 Apr 2018 15:27:15 +0000: Thanks jack.mulrow, I lettered the questions to make them easier to refer to.
a. Yes, no $clusterTime will be returned. This is the current behavior requested by drivers when there are no keys available, so I guess they should be ok with it.
b. Good point, thanks. As we discussed offline, since arbiters can't be writers, they cannot tick clusterTime.
c. I kept it running, assuming that arbiters in a sharded cluster can use it. It can now be disabled when the logical clock is not enabled.
If arbiter status can be changed dynamically, then the enable / disable logic for the clock becomes more complex, since it will need to start/stop the keys manager as well.

jack.mulrow commented on Tue, 10 Apr 2018 15:10:19 +0000: I think the high level approach makes sense, I just have a few comments/questions:

"Other LogicalClock public methods should invariant(_isEnabled); so there is no accidental calls to a disabled logical clock"
Two questions:
Do we know for certain that nothing can be inserted into an arbiter, even into the local database? If we disable their logical clock and add those invariants, then any write would fail trying to reserve an optime here.
Is it possible for a node to transition from arbiter back to a regular node (maybe through a reconfig)? If so, we would need to re-enable the logical clock wherever that happens.
judah.schvimer, do you know if we need to worry about either of these?

"Do not append LogicalTime metadata if LogicalClock is not enabled"
a. I'm guessing it's fine for arbiters to not return $clusterTime even though other nodes in the set do? Maybe it's worth checking with drivers that this won't mess something up in their protocols for gossiping $clusterTime.

"If the node is a ReplicaSet arbiter member but its sharding state is not enabled then logical clock should not be enabled."
b. Since arbiters don't store the sharding identity document (because they have no storage), I don't think they will ever have their sharding state enabled. This shouldn't matter if we just disable the clock for arbiters in all cases though.

"This will be the place where to disable the logical clock if it is an arbiter."
c. When we disable the logical clock for arbiters, should we also have the keys collection manager stop monitoring?

misha.tyulenev commented on Tue, 10 Apr 2018 14:29:02 +0000: The reason the clock is disabled on an arbiter in a standalone RS is that it can't get the keys for cryptography, because they are read from the local database. In a sharded cluster, keys are stored on the config shard, so they are not required locally, and any dataless node can still process requests.

judah.schvimer commented on Tue, 10 Apr 2018 14:17:33 +0000:
"If the node is a ReplicaSet arbiter member but its sharding state is not enabled then logical clock should not be enabled."
Can you elaborate on why the sharding state being disabled is required? If it's an arbiter, aren't we saying that we always want to disable the logical clock?