...
A primary running dropDatabase first majority commits dropCollection oplog entries and then writes a final dropDatabase oplog entry, with which the in-memory database is closed atomically. However, if replication rollback goes back to right before the dropDatabase and after the dropCollection(s), this leaves the node with no in-memory database state. Whereas nodes that were secondaries when the dropDatabase was running on the primary will still have the database open in-memory. This sets the replica set up to allow a createCollection on a conflicting database on the primary, and then crash on oplog application on the secondaries with a DatabaseDifferCase error. Potential solution: Secondaries could check whether a database is empty before returning name conflict errors. SERVER-74703 should be re-examined to see whether or not it is simple to change something back to fix this new bug, without undoing the fix for the previous bug.
xgen-internal-githook commented on Thu, 7 Sep 2023 20:26:36 +0000: Author: {'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': ''} Message: SERVER-77278 Secondaries close empty database on receipt of a conflicting createCollection database name to handle primary dropDatabase oplog rollback (cherry picked from commit 7d6513611f97a60746af4c9ebd7e1cec2d4b844a) Branch: v6.0 https://github.com/mongodb/mongo/commit/a247065b7edf958c9fcbb672cd97e5648cdbe9ca xgen-internal-githook commented on Wed, 24 May 2023 20:41:43 +0000: Author: {'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'} Message: SERVER-77278 Secondaries close empty database on receipt of a conflicting createCollection database name to handle primary dropDatabase oplog rollback (cherry picked from commit 7d6513611f97a60746af4c9ebd7e1cec2d4b844a) Branch: v7.0 https://github.com/mongodb/mongo/commit/6481f5f4dd925395d8ac6271ee294cb6dc040d62 xgen-internal-githook commented on Wed, 24 May 2023 02:12:49 +0000: Author: {'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'} Message: SERVER-77278 Secondaries close empty database on receipt of a conflicting createCollection database name to handle primary dropDatabase oplog rollback Branch: master https://github.com/mongodb/mongo/commit/7d6513611f97a60746af4c9ebd7e1cec2d4b844a dianna.hohensee commented on Mon, 22 May 2023 15:31:20 +0000: So, interestingly, I've ascertained that this is not a hot BF because it is also broken in 6.0, not just 7.0. Turns out that the test on which I reproduced the bug was actually broken from 4.3 to just recently during 7.0 development: Greg's work in SERVER-74703 actually unintentionally fixed the test. So my reproducer didn't catch anything earlier because the base test was not testing anything. The test looks like it was broken since v4.3. The test wasn't working properly because a failpoint got moved to after onDropDatabase is called, when the test was expecting to hang before the dropDatabase oplog entry is written. I adjusted the failpoint and can reproduce the crash in 6.0. dianna.hohensee commented on Thu, 18 May 2023 17:32:02 +0000: I got this to reproduce modifying an existing replication JS test. The test hangs because I have a w:2 that never gets acknowledged (secondary crashes, other secondary isn't taking writes), but the logs show [js_test:rollback_drop_database] [js_test:rollback_drop_database] [js_test:rollback_drop_database] [jsTest] ---- [js_test:rollback_drop_database] [jsTest] ~~~after step up node [js_test:rollback_drop_database] [jsTest] ---- [js_test:rollback_drop_database] [js_test:rollback_drop_database] d20050| \{"t":{"$date":"2023-05-18T16:29:15.402+00:00"},"s":"I", "c":"STORAGE", "id":20320, "ctx":"conn32","msg":"createCollection","attr":\{"namespace":"OLDDATABASE.afterRollback","uuidDisposition":"generated","uuid":{"uuid":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"}},"options":{}}} [js_test:rollback_drop_database] d20052| \{"t":{"$date":"2023-05-18T16:29:15.417+00:00"},"s":"I", "c":"REPL", "id":21079, "ctx":"BackgroundSync","msg":"bgsync - stopReplProducer fail point enabled. Blocking until fail point is disabled."} [js_test:rollback_drop_database] d20050| \{"t":{"$date":"2023-05-18T16:29:15.421+00:00"},"s":"I", "c":"REPL", "id":7360103, "ctx":"conn32","msg":"Wrote oplog entry for create","attr":\{"namespace":"OLDDATABASE.$cmd","uuid":{"uuid":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"}},"opTime":\{"ts":{"$timestamp":{"t":1684427355,"i":3}},"t":3},"object":\{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}}}} [js_test:rollback_drop_database] d20050| \{"t":{"$date":"2023-05-18T16:29:15.422+00:00"},"s":"I", "c":"INDEX", "id":20345, "ctx":"conn32","msg":"Index build: done building","attr":\{"buildUUID":null,"collectionUUID":{"uuid":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"}},"namespace":"OLDDATABASE.afterRollback","index":"_id_","ident":"index-3--2789427611038464271","collectionIdent":"collection-2--2789427611038464271","commitTimestamp":\{"$timestamp":{"t":1684427355,"i":3}}}} [js_test:rollback_drop_database] d20051| \{"t":{"$date":"2023-05-18T16:29:15.424+00:00"},"s":"I", "c":"REPL", "id":7360109, "ctx":"ReplBatcher","msg":"Processing DDL command oplog entry in OplogBatcher","attr":\{"oplogEntry":{"oplogEntry":{"op":"c","ns":"OLDDATABASE.$cmd","ui":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"},"o":\{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"ts":\{"$timestamp":{"t":1684427355,"i":3}},"t":3,"v":2,"wall":\{"$date":"2023-05-18T16:29:15.421Z"}}}}} [js_test:rollback_drop_database] d20051| \{"t":{"$date":"2023-05-18T16:29:15.427+00:00"},"s":"I", "c":"REPL", "id":7360110, "ctx":"ReplWriterWorker-0","msg":"Applying DDL command oplog entry","attr":\{"oplogEntry":{"oplogEntry":{"op":"c","ns":"OLDDATABASE.$cmd","ui":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"},"o":\{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"ts":\{"$timestamp":{"t":1684427355,"i":3}},"t":3,"v":2,"wall":\{"$date":"2023-05-18T16:29:15.421Z"}}},"oplogApplicationMode":"Secondary"}} [js_test:rollback_drop_database] d20051| \{"t":{"$date":"2023-05-18T16:29:15.429+00:00"},"s":"E", "c":"REPL", "id":21262, "ctx":"ReplWriterWorker-0","msg":"Failed command during oplog application","attr":\{"command":{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"db":"OLDDATABASE","error":\{"code":13297,"codeName":"DatabaseDifferCase","errmsg":"db already exists with different case already have: [olddatabase] trying to create [OLDDATABASE]"}}} [js_test:rollback_drop_database] d20051| \{"t":{"$date":"2023-05-18T16:29:15.429+00:00"},"s":"F", "c":"REPL", "id":21237, "ctx":"ReplWriterWorker-0","msg":"Error applying operation","attr":\{"oplogEntry":{"oplogEntry":{"op":"c","ns":"OLDDATABASE.$cmd","ui":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"},"o":\{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"ts":\{"$timestamp":{"t":1684427355,"i":3}},"t":3,"v":2,"wall":\{"$date":"2023-05-18T16:29:15.421Z"}}},"error":" :: caused by :: DatabaseDifferCase: db already exists with different case already have: [olddatabase] trying to create [OLDDATABASE]"}} [js_test:rollback_drop_database] d20051| \{"t":{"$date":"2023-05-18T16:29:15.430+00:00"},"s":"F", "c":"REPL", "id":21235, "ctx":"OplogApplier-0","msg":"Failed to apply batch of operations","attr":\{"numOperationsInBatch":1,"firstOperation":{"oplogEntry":{"op":"c","ns":"OLDDATABASE.$cmd","ui":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"},"o":\{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"ts":\{"$timestamp":{"t":1684427355,"i":3}},"t":3,"v":2,"wall":\{"$date":"2023-05-18T16:29:15.421Z"}}},"lastOperation":\{"oplogEntry":{"op":"c","ns":"OLDDATABASE.$cmd","ui":{"$uuid":"8a3554d9-d148-4726-9067-e48b8ee35ce1"},"o":\{"create":"afterRollback","idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"ts":\{"$timestamp":{"t":1684427355,"i":3}},"t":3,"v":2,"wall":\{"$date":"2023-05-18T16:29:15.421Z"}}},"failedWriterThread":2,"error":"DatabaseDifferCase: db already exists with different case already have: [olddatabase] trying to create [OLDDATABASE]"}} [js_test:rollback_drop_database] d20051| \{"t":{"$date":"2023-05-18T16:29:15.430+00:00"},"s":"F", "c":"ASSERT", "id":23095, "ctx":"OplogApplier-0","msg":"Fatal assertion","attr":\{"msgid":34437,"error":"DatabaseDifferCase: db already exists with different case already have: [olddatabase] trying to create [OLDDATABASE]","file":"src/mongo/db/repl/oplog_applier_impl.cpp","line":436}}