...
Issue Status as of March 27, 2014 ISSUE SUMMARY If a background index build is already in progress, issuing the same index creation command will report success immediately. While this in itself is not a problem, if replication is active, the index creation command will be replicated causing possible problems on secondaries. This will become a problem if the index build ultimately has an error, due to a unique key constraint violation for example. Because of how replication works this can lead to secondary nodes shutting down and not recovering on a restart, since this is an unrecoverable data replication error which causes the server to to shutdown (Fatal Assertion 16360, exception during application of oplog operations on secondaries) for data integrity reasons. USER IMPACT Users are affected if they create the same background index multiple times concurrently, for example in environments with automated index creation. Due to the fact that the secondaries will shut down during an unrecoverable replication error (to protect data integrity and consistency), this can cause loss of majority and a primary step-down, rendering the replica set read-only. SOLUTION The fix is to only replicate index creation commands if they successfully build an index, and not in the case the index already exists. This will prevent secondaries from attempting an index build that could fail, and wasn't successful on the primary. WORKAROUNDS To avoid the issue, ensure that background indexes creation commands are only issued one at a time, for each index definition. AFFECTED VERSIONS All versions from 2.4.0 to 2.4.9 are affected. The 2.2 and 2.6 production release cycles are unaffected by this issue. PATCHES The fix is included in the production release version 2.4.10. Original Description If you attempt to build an index that is already in the process of being built (e.g. a background index), the attempt succeeds immediately and also is replicated immediately, even if the in-progress index build might eventually fail (due to, say, a unique constraint violation). This could result in SECONDARY nodes in a replica set attempting index builds that are destined to fail. Such SECONDARY nodes will shut down when the index build fails, since this is a data replication error that cannot be automatically resolved by the server. The code fix was to not replicate index builds in the case that such an index already exists. ---------------------------------------------------------------------------- Original description and example reproducer: On two-member replica set prepare collection like this: function prepare() { db.c.drop() for (var i=0; i<10000; i++) db.c.insert({x:i}) db.c.insert({x:0}) printjson(db.getLastError(2)) } Then start two threads in parallel each doing the following index creations: function index() { while (true) { db.c.ensureIndex({x:1}, {unique:true, background:true}) } } (I think the while loop isn't necessary as long as the two threads start at close enough to the same time.) Index build fails as expected on primary: Sun Feb 9 10:46:53.380 [conn7] background addExistingToIndex exception E11000 duplicate key error index: test.c.$x_1 dup key: { : null } Sun Feb 9 10:46:53.380 [conn7] end connection 127.0.0.1:52421 (0 connections now open) But secondary fasserts: Sun Feb 9 10:46:52.164 [repl writer worker 1] info: indexing in foreground on this replica; was a background index build on the primary Sun Feb 9 10:46:52.164 [repl writer worker 1] build index test.c { x: 1.0 } Sun Feb 9 10:46:52.182 [repl writer worker 1] ERROR: writer worker caught exception: E11000 duplicate key error index: test.c.$x_1 dup key: { : null } on: { ts: Timestamp 1391960812000|1778, h: 5159704\ 80723141670, v: 2, op: "i", ns: "test.system.indexes", o: { _id: ObjectId('52f7a2ec7ddfdca0480c6170'), ns: "test.c", key: { x: 1.0 }, name: "x_1", unique: true, background: true } } Sun Feb 9 10:46:52.183 [repl writer worker 1] Fatal Assertion 16360 0xde46e1 0xda44d3 0xc2ca7c 0xdb2021 0xe2cf69 0x30040079d1 0x30014e8b6d mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde46e1] mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xda44d3] mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc2ca7c] mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xdb2021] mongod() [0xe2cf69] /lib64/libpthread.so.0() [0x30040079d1] /lib64/libc.so.6(clone+0x6d) [0x30014e8b6d] Sun Feb 9 10:46:52.184 [repl writer worker 1] It appears that an index creation oplog entry gets made in this case in spite of the index creation failure.
xgen-internal-githook commented on Fri, 14 Mar 2014 14:49:44 +0000: Author: {u'username': u'erh', u'name': u'Eliot Horowitz', u'email': u'eliot@10gen.com'} Message: SERVER-12662: fix index name creation Branch: v2.4 https://github.com/mongodb/mongo/commit/317cb98a19baa2e9731c735ee223a746b60bb334 xgen-internal-githook commented on Fri, 14 Mar 2014 13:29:04 +0000: Author: {u'username': u'erh', u'name': u'Eliot Horowitz', u'email': u'eliot@10gen.com'} Message: SERVER-12662: Two background index builds in parallel with unique key constraint violations crashes secondary Branch: v2.4 https://github.com/mongodb/mongo/commit/8dd9cf06d7df65baf16eac7725e837c8303eef07 amalia.hawkins@10gen.com commented on Thu, 13 Mar 2014 17:51:49 +0000: Attached is a js script which replicates the problem. xgen-internal-githook commented on Thu, 13 Feb 2014 14:11:40 +0000: Author: {u'username': u'milkie', u'name': u'Eric Milkie', u'email': u'milkie@10gen.com'} Message: Revert "SERVER-12662 do not logOp inserts that have no associated diskloc (such as index-already-exists)" This reverts commit 2f3bd6fc8f3b50bf4802f4477c55660355cff9fe. The commit does not work for _id index builds and will require a more comprehensive solution. Branch: v2.4 https://github.com/mongodb/mongo/commit/a003add92d538a4de60d64fa382376177149a0e7 milkie commented on Thu, 13 Feb 2014 13:25:10 +0000: Adding an _id index explicitly (on a collection that didn't have one already) doesn't work. Will fix this. xgen-internal-githook commented on Wed, 12 Feb 2014 22:28:27 +0000: Author: {u'username': u'milkie', u'name': u'Eric Milkie', u'email': u'milkie@10gen.com'} Message: SERVER-12662 do not logOp inserts that have no associated diskloc (such as index-already-exists) Branch: v2.4 https://github.com/mongodb/mongo/commit/2f3bd6fc8f3b50bf4802f4477c55660355cff9fe milkie commented on Wed, 12 Feb 2014 17:34:18 +0000: I diagnosed the issue and have a proposed fix for 2.4 branch. I think the problem should already not occur in the master branch; can you check? For now I am targeting this only at 2.4.10. The issue is that if the index already exists, we still log the index-create in the oplog. In 2.6.0rc0 this behavior is changed such that nothing is logged if the index already exists.