Info
A collection drop happens in two writes. The first write renames the collection to a "drop pending" namespace and must be timestamped with the optime of the collection drop. The second write removes the collection from the catalog. It must happen only after the rename has become majority committed, and it must not be timestamped[1].
Commands being processed via oplog application use a timestamp block that sets a timestamp at commit time.
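The two-write sequence can be sketched as a toy model. This is an illustration only, not the server's implementation: `ToyCatalog`, `CatalogWrite`, `begin_drop`, `finish_drop`, and the drop-pending name format are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogWrite:
    op: str                   # "rename" or "remove"
    namespace: str
    timestamp: Optional[int]  # None means the write is untimestamped


@dataclass
class ToyCatalog:
    """Hypothetical in-memory catalog used only to illustrate the ordering rules."""
    collections: set = field(default_factory=set)
    log: list = field(default_factory=list)
    majority_committed: int = 0  # highest optime known to be majority committed

    def begin_drop(self, ns: str, drop_optime: int) -> str:
        """First write: rename to a drop-pending namespace, timestamped with the drop optime."""
        pending = f"{ns}.drop-pending.{drop_optime}"  # made-up name format
        self.collections.remove(ns)
        self.collections.add(pending)
        self.log.append(CatalogWrite("rename", pending, drop_optime))
        return pending

    def finish_drop(self, pending: str, rename_optime: int) -> None:
        """Second write: remove from the catalog, untimestamped, after majority commit."""
        if self.majority_committed < rename_optime:
            raise RuntimeError("rename is not majority committed yet")
        self.collections.remove(pending)
        self.log.append(CatalogWrite("remove", pending, None))
```

In this sketch, calling `finish_drop` before `majority_committed` reaches the rename's optime raises an error, and the resulting "remove" entry always carries `timestamp=None`.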
[1] KVStorageEngine::dropDatabase has a mechanism to disable this timestamp for non-replicated collections, but not for replicated collections. This was introduced back when appropriately timestamping collection drops was still a goal. However, table drops are not transactional: a dropped table does not come back after a crash, even if the last checkpoint was taken before the table was dropped. If the write that removes a collection from the catalog is timestamped, it is possible that this write will not have become "stable" when a clean shutdown is performed. Because the table has already been dropped, the data files on disk would then be inconsistent: a collection would exist in the catalog without its backing table.
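The failure mode in the footnote can be reproduced in a small simulation. This is a deliberately simplified sketch, assuming that a timestamped catalog write survives a restart only if its timestamp is at or before the stable timestamp, while a table drop takes effect immediately and permanently. `ToyEngine` and its methods are hypothetical and do not correspond to the storage engine's real API.

```python
from typing import Optional


class ToyEngine:
    """Toy model: catalog removals may be timestamped, table drops never are."""

    def __init__(self) -> None:
        self.tables = {"table-1"}                 # backing tables on disk
        self.catalog = {"test.coll": "table-1"}   # durable catalog state
        self.pending = []                         # timestamped removals not yet stable

    def drop(self, ns: str, timestamp: Optional[int]) -> None:
        ident = self.catalog[ns]
        if timestamp is None:
            # Untimestamped: durable regardless of the stable timestamp.
            del self.catalog[ns]
        else:
            self.pending.append((timestamp, ns))
        # Non-transactional: the table is gone no matter what happens later.
        self.tables.discard(ident)

    def shutdown_and_restart(self, stable_timestamp: int) -> None:
        # Only timestamped writes at or before the stable timestamp survive.
        for ts, ns in self.pending:
            if ts <= stable_timestamp:
                self.catalog.pop(ns, None)
        self.pending.clear()

    def orphans(self) -> list:
        """Catalog entries whose backing table no longer exists."""
        return sorted(ns for ns, ident in self.catalog.items()
                      if ident not in self.tables)


def run(timestamp: Optional[int], stable_timestamp: int = 5) -> list:
    engine = ToyEngine()
    engine.drop("test.coll", timestamp)
    engine.shutdown_and_restart(stable_timestamp)
    return engine.orphans()
```

Under these assumptions, `run(10)` leaves `test.coll` in the catalog with no backing table, because the removal at timestamp 10 is newer than the stable timestamp 5. `run(None)` leaves no orphan, which is the behavior the ticket asks for.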
Top User Comments
xgen-internal-githook commented on Wed, 23 May 2018 14:24:39 +0000:
Author: Daniel Gottlieb (dgottlieb, daniel.gottlieb@mongodb.com)
Message: SERVER-35127: Do not timestamp any collection drops committed by dropDatabase.
(cherry picked from commit 7730b27795b642ac772411cfba25f165ddb265e7)
Branch: v4.0
https://github.com/mongodb/mongo/commit/04a684897ae63c60d7e52acf432d03cc341757f6
xgen-internal-githook commented on Wed, 23 May 2018 14:19:11 +0000:
Author: Daniel Gottlieb (dgottlieb, daniel.gottlieb@mongodb.com)
Message: SERVER-35127: Do not timestamp any collection drops committed by dropDatabase.
Branch: master
https://github.com/mongodb/mongo/commit/7730b27795b642ac772411cfba25f165ddb265e7