...
When the resharding coordinator aborts, it performs the following steps:
1. Transition the state document to kAbort.
2. Send the _shardsvrAbortReshardCollection command to the participants.
3. Proceed with cleaning up the resharding temporary collection metadata.

However, by the time (3) executes there is no guarantee that the shards have seen the transition to kAbort from (1). This is because (2) only clears the filtering metadata on the primary nodes (and issues a best-effort asynchronous sharding metadata refresh, which in turn will also only asynchronously flush the ShardServerCatalogCacheLoader). This can be problematic in the case of a failover to a new secondary that is not yet aware of kAbort.

One solution could be to make (2) perform this sharding metadata refresh and durably (majority) flush the ShardServerCatalogCacheLoader, as sketched below. Another solution, more in line with other callers of _updateCoordinatorDocStateAndCatalogEntries, would be to call _tellAllDonorsToRefresh() (and _tellAllRecipientsToRefresh() too?) right after this line in the abort procedure.
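For illustration, a minimal sketch of the first option: making the participant side of _shardsvrAbortReshardCollection refresh the collection placement information and wait for a majority-committed flush before acknowledging. The helper functions below are illustrative stand-ins, not the server's actual APIs.

    // Sketch only: participant-side handling of _shardsvrAbortReshardCollection with the
    // proposed "refresh + wait for majority flush" steps added. The helpers declared here
    // are assumed/illustrative, not the real server entry points.
    void abortLocalReshardingMachines(OperationContext* opCtx, const NamespaceString& nss);
    void refreshCollectionPlacementInfo(OperationContext* opCtx, const NamespaceString& nss);
    void waitForPersistedCatalogCacheFlush(OperationContext* opCtx, const NamespaceString& nss);
    void waitForMajorityWriteConcern(OperationContext* opCtx);

    void abortReshardCollectionOnParticipant(OperationContext* opCtx, const NamespaceString& nss) {
        // Existing behaviour: abort the local donor/recipient state machines, which also
        // clears the filtering metadata, but only on this primary node.
        abortLocalReshardingMachines(opCtx, nss);

        // Proposed addition (1): synchronously refresh the collection placement
        // information from the config server so this shard observes the abort.
        refreshCollectionPlacementInfo(opCtx, nss);

        // Proposed addition (2): wait for the refreshed metadata to be written through the
        // ShardServerCatalogCacheLoader's persisted cache and replicated to a majority, so
        // a secondary that later steps up cannot recover pre-abort filtering metadata.
        waitForPersistedCatalogCacheFlush(opCtx, nss);
        waitForMajorityWriteConcern(opCtx);
    }

With those two additions, by the time the coordinator reaches step (3) every participant has durably recorded the post-abort placement information, so a failover on a participant shard no longer resurrects the pre-abort state.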
xgen-internal-githook commented on Thu, 16 May 2024 23:36:18 +0000:
Author: {'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}
Message: SERVER-20911 Refresh coll placement info and wait for majority (#22241)
(cherry picked from commit cefaf3ae63503c751d30ff529f171e6b6df4518d)
(cherry picked from commit 4d0fa8ccb3c3295f9b4cda07c8c9f45fbc48ed23)
Issue: BACKPORT-20911
Description: This is a backport to v6.0 for SERVER-88978
Testing: [x] All green: https://spruce.mongodb.com/version/6644e00e88f1120007ae9569/tasks
GitOrigin-RevId: e58a10985e5344972b695f550caa27a8ee74c83a
Branch: v6.0
https://github.com/mongodb/mongo/commit/ccaa24a8f62882fa410a1aca82d305d8035106cc

xgen-internal-githook commented on Thu, 16 May 2024 23:21:37 +0000:
Author: {'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}
Message: SERVER-20912 Refresh coll placement info and wait for majority (#22242)
(cherry picked from commit cefaf3ae63503c751d30ff529f171e6b6df4518d)
(cherry picked from commit 4d0fa8ccb3c3295f9b4cda07c8c9f45fbc48ed23)
(cherry picked from commit c087a0311b851485d6e830735779486fae98e2bf)
Issue: [BACKPORT-20912](https://jira.mongodb.org/browse/BACKPORT-20912)
Description: This is a backport to v5.0 for SERVER-88978 (https://jira.mongodb.org/browse/SERVER-88978)
Testing: [x] Unrelated test failures: https://spruce.mongodb.com/version/6644e1836bb3c10007f54dd8/tasks
GitOrigin-RevId: 1fd9fad83f7267b0ddbd2ccf7cccddafc46a7693
Branch: v5.0
https://github.com/mongodb/mongo/commit/f7ac984ce7155b89741efbfd8a49096fe3b6f66b

xgen-internal-githook commented on Tue, 14 May 2024 18:34:41 +0000:
Author: {'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}
Message: SERVER-88978 Refresh coll placement info and wait for majority (#21517)
(cherry picked from commit cefaf3ae63503c751d30ff529f171e6b6df4518d)
(cherry picked from commit 4d0fa8ccb3c3295f9b4cda07c8c9f45fbc48ed23)
GitOrigin-RevId: 3dcda5f80d5b0b73dd22cef4347f58dad68eb4db
Branch: v7.0
https://github.com/mongodb/mongo/commit/d04d3ebab78cb95f089e70ee3a175889db368a0e

xgen-internal-githook commented on Mon, 13 May 2024 22:55:18 +0000:
Author: {'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}
Message: SERVER-88978 Refresh coll placement info and wait for majority (#21517)
(cherry picked from commit cefaf3ae63503c751d30ff529f171e6b6df4518d)
GitOrigin-RevId: 43e5970244062567425bdd584bc3500a4d6d7f8d
Branch: v8.0
https://github.com/mongodb/mongo/commit/f4c7b4d3a3f85cd00ac480ceb9e90723afe4cdee

xgen-internal-githook commented on Mon, 13 May 2024 21:00:17 +0000:
Author: {'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}
Message: SERVER-88978 Refresh coll placement info and wait for majority (#21517)
(cherry picked from commit cefaf3ae63503c751d30ff529f171e6b6df4518d)
GitOrigin-RevId: 4d0fa8ccb3c3295f9b4cda07c8c9f45fbc48ed23
Branch: v7.3
https://github.com/mongodb/mongo/commit/baed401b42865bae7116b09b6532959135ce645f

xgen-internal-githook commented on Sat, 11 May 2024 00:43:57 +0000:
Author: {'name': 'Abdul Qadeer', 'email': 'abdul.qadeer@mongodb.com', 'username': 'zorro786'}
Message: SERVER-88978 Refresh coll placement info and wait for majority (#21517)
GitOrigin-RevId: cefaf3ae63503c751d30ff529f171e6b6df4518d
Branch: master
https://github.com/mongodb/mongo/commit/289c9b9bab6182833b6511f21d02777d31639b82

JIRAUSER1257318 commented on Wed, 10 Apr 2024 17:16:06 +0000:
> Thanks Jordi Serra Torrens, I understand now how the _shardsvrAbortReshardCollection command had responded with ok:1 after waiting for write concern w:majority. I don't believe it'll be possible to call _tellAllDonorsToRefresh() during the resharding coordinator's abort procedure because the participant shards may be holding the critical section to block writes and would prevent a sharding metadata refresh from completing

Good point, I hadn't considered the critical section. I'm thinking we could wedge the _tellAll{Donors,Recipients}ToRefresh() inside _awaitAllParticipantShardsDone, at a point after _abortParticipant has completed but before proceeding with deleting the temporary collection metadata.

> I like your first proposal of changing the _shardsvrAbortReshardCollection command to additionally refresh the sharding metadata and wait for collection flush.

I agree this proposal might be easier to reason about.

> My feeling is we would need to also change the _shardsvrCommitReshardCollection command in a similar way. Do you agree?

I suspect that it might not be strictly needed in the commit case. My reasoning is that when committing, all participants (donors and recipients) would have the recoverable critical section active. Before _shardsvrCommitReshardCollection finishes, that critical section is released, followed by waiting for majority at the end of the command. Secondaries will react to the oplog entry corresponding to releasing the critical section by clearing their filtering metadata. (Note: in the BF, that didn't happen because the participants had not yet reached the point where they activate the critical section.) Regardless, it seems much easier to reason about if we are explicit about refresh + flush on the _shardsvrCommitReshardCollection command, so let's consider doing it there too.

max.hirschhorn@10gen.com commented on Wed, 10 Apr 2024 14:19:48 +0000:
Thanks jordi.serra-torrens@mongodb.com, I understand now how the _shardsvrAbortReshardCollection command had responded with ok:1 after waiting for write concern w:majority. I don't believe it'll be possible to call _tellAllDonorsToRefresh() during the resharding coordinator's abort procedure because the participant shards may be holding the critical section to block writes and would prevent a sharding metadata refresh from completing (see also SERVER-56638). The _flushReshardingStateChange command only schedules a sharding metadata refresh and doesn't wait on it to complete. I like your first proposal of changing the _shardsvrAbortReshardCollection command to additionally refresh the sharding metadata and wait for collection flush. My feeling is we would need to also change the _shardsvrCommitReshardCollection command in a similar way. Do you agree?

JIRAUSER1257318 commented on Wed, 10 Apr 2024 13:55:05 +0000:
max.hirschhorn@mongodb.com that clearFilteringMetadata on step-up would only occur if the secondaries still see the config.localReshardingOperations.{donor,recipient} document. I believe that was not the case in this failure – otherwise _shardsvrAbortReshardingOperation would not have completed, and therefore the configsvr would not have proceeded with removing the temporary resharding collection metadata.
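For completeness, the coordinator-side wedge discussed in the comment above would look roughly like the sketch below. Only _awaitAllParticipantShardsDone, _abortParticipant, _tellAllDonorsToRefresh and _tellAllRecipientsToRefresh are names taken from the discussion; the future-chaining shape and _cleanupTemporaryCollectionMetadata are illustrative guesses, and, as Max's comment points out, this approach may be blocked by participants holding the critical section.

    // Rough sketch of the coordinator-side wedge: refresh all participants after the
    // abort has been acknowledged but before deleting the temporary collection metadata.
    ExecutorFuture<void> ReshardingCoordinator::_awaitAllParticipantShardsDone(
        const std::shared_ptr<executor::ScopedTaskExecutor>& executor) {
        return _abortParticipant(executor)  // sends _shardsvrAbortReshardCollection
            .then([this, executor] {
                // Proposed wedge: make every donor and recipient refresh its placement
                // information before the coordinator touches the catalog.
                _tellAllDonorsToRefresh(executor);
                _tellAllRecipientsToRefresh(executor);
            })
            .then([this, executor] {
                // Only then remove the temporary resharding collection metadata.
                return _cleanupTemporaryCollectionMetadata(executor);  // illustrative name
            });
    }

The alternative the thread converges on instead keeps this logic inside the _shardsvrAbortReshardCollection (and possibly _shardsvrCommitReshardCollection) command handlers, as in the earlier sketch.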
max.hirschhorn@10gen.com commented on Wed, 10 Apr 2024 12:04:57 +0000:
jordi.serra-torrens@mongodb.com, I would have expected the new primary's clearing of the filtering metadata for the user data collection on step-up to mean that the new primary would also learn the resharding operation has aborted and getReshardingKeyIfShouldForwardOps() == false before accepting any reads or writes. Can you clarify the sequence further to explain how the refresh from the config server saw more stale sharding metadata?