...
It is possible for the listDatabases command to erroneously omit a database if the database contains a single collection and that collection is concurrently renamed.

The problem stems from the locks involved. The listDatabases command takes a GlobalLock in MODE_IS. The renameCollection command acquires a GlobalLock in MODE_IX and a MODE_X database lock on the database on which it is performing the rename. Global locks of type IX and IS do not conflict, so the listDatabases command and the renameCollection command are allowed to run concurrently.

When the renameCollection command executes a rename within the same database, it calls DatabaseImpl::renameCollection, which, as part of the rename operation, calls KVDatabaseCatalogEntryBase::renameCollection and removes the entry for the source collection from KVDatabaseCatalogEntryBase::_collections. It then inserts the entry for the destination collection into the same structure before it finishes. If there was only one collection in the database, KVDatabaseCatalogEntryBase::_collections is empty until the entry for the destination collection is added. If a listDatabases command is running during this window, it can observe the database object in this state and consider it empty; the emptiness check on KVDatabaseCatalogEntryBase::_collections is made in KVStorageEngine::listDatabases. As a result, the database can be omitted from the results even though it exists.

This can be a problem internally, for example for initial sync, which relies on the correctness of the results returned by the listDatabases command for its collection cloning process.

There is a repro attached demonstrating how the listDatabases command can produce incorrect results. There is also a repro attached demonstrating how this issue could lead to a collection missing on a node following initial sync. Running these tests on repeat for a few runs should produce the respective error cases.
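
The window described above can be illustrated with a small, single-threaded toy model. This is only a sketch under stated assumptions, not MongoDB's code: ToyCatalogEntry, renameRemoveFirst, renameInsertFirst, the observe callback, and visibleMidRename are all hypothetical names invented for illustration. The callback stands in for a concurrent listDatabases-style reader that checks emptiness partway through the rename. The insert-first variant is one ordering that would keep the set non-empty, in the spirit of the fix's commit title; whether the actual commit does exactly this is not stated in this ticket.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Toy stand-in for a per-database catalog entry. Hypothetical; it only models
// a set of collection names and an emptiness check.
class ToyCatalogEntry {
public:
    // Hypothetical hook invoked at the midpoint of a rename to simulate a
    // concurrent reader looking at the entry.
    using Observer = std::function<void(const ToyCatalogEntry&)>;

    void createCollection(const std::string& name) { _collections.insert(name); }
    bool isEmpty() const { return _collections.empty(); }
    bool hasCollection(const std::string& name) const { return _collections.count(name) > 0; }

    // Ordering described in the ticket: remove the source, then insert the
    // destination. With a single collection the set is empty in between.
    void renameRemoveFirst(const std::string& from, const std::string& to, const Observer& observe) {
        if (from == to) return;  // nothing to do
        _collections.erase(from);
        observe(*this);
        _collections.insert(to);
    }

    // Alternative ordering (an assumption): insert the destination first, then
    // remove the source, so the set is never empty during the rename.
    void renameInsertFirst(const std::string& from, const std::string& to, const Observer& observe) {
        if (from == to) return;  // nothing to do
        _collections.insert(to);
        observe(*this);
        _collections.erase(from);
    }

private:
    std::set<std::string> _collections;
};

// Returns true if a reader at the midpoint of the rename sees a non-empty
// database, i.e. would still list it.
bool visibleMidRename(bool insertFirst) {
    ToyCatalogEntry db;
    db.createCollection("source");  // the database's only collection

    bool visible = false;
    ToyCatalogEntry::Observer observe = [&visible](const ToyCatalogEntry& entry) {
        visible = !entry.isEmpty();
    };

    if (insertFirst) {
        db.renameInsertFirst("source", "destination", observe);
    } else {
        db.renameRemoveFirst("source", "destination", observe);
    }

    // Both orderings end in the same final state.
    assert(db.hasCollection("destination") && !db.hasCollection("source"));
    return visible;
}
```

Calling visibleMidRename(false) models the reported behavior, where the reader finds the entry empty, while visibleMidRename(true) models an ordering in which the database stays visible throughout. In the real server the reader runs on another thread, so the attached repros need repeated runs to hit the window; the callback here simply makes that interleaving deterministic.
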
xgen-internal-githook commented on Tue, 15 May 2018 13:33:47 +0000:
Author: Maria van Keulen (mvankeulen94, maria@mongodb.com)
Message: SERVER-34531 Ensure KVCatalog _collections is not empty during rename
Branch: master
https://github.com/mongodb/mongo/commit/0192520fa62db28787a5fb6ad828c1723d7d992c

maria.vankeulen commented on Tue, 8 May 2018 16:07:49 +0000:
I believe the bug occurs due to the KVDatabaseCatalogEntryBase::_collections object rather than the DatabaseImpl::_collections object. Hence, I believe this can be fixed at the KVDatabaseCatalogEntryBase level. I have updated the ticket description.

william.schultz commented on Wed, 18 Apr 2018 17:53:37 +0000:
I was able to reproduce this on 3.6 and 3.4.

spencer commented on Tue, 17 Apr 2018 22:18:01 +0000:
Ah I see, I was thinking about the collection getting renamed out of the database, not about it being renamed within the same db. Agreed that is a bug.

william.schultz commented on Tue, 17 Apr 2018 22:01:42 +0000:
spencer, you are correct that if a database has no collections, it effectively does not exist, and so we should not need to copy anything for initial sync. But if a database contains a single collection that is in the middle of being renamed within that database, it is incorrect for a listDatabases command to not include that database in its result set, since the database most certainly exists. The problem is that the listDatabases command is not viewing the renameCollection as a single atomic operation. If a collection exists in database "test", and that collection gets renamed to a different name in the same database, I would consider it an invariant that listDatabases always includes "test" in its list of results: before, during, and after the renameCollection operation.

spencer commented on Tue, 17 Apr 2018 21:49:55 +0000:
I'm confused how this causes an issue in initial sync. If the database has no collections in it, then effectively the database doesn't exist. Why would it be a problem for it to not show up in listDatabases?