...
Issue Status as of April 13, 2021 ISSUE DESCRIPTION AND IMPACT When multiple (size-neutral) updates are made to a single document in a capped collection, it is possible for a secondary to abort on errors like: 2018-12-18T15:19:00.857-0700 F REPL [repl writer worker 7] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 33 != 61 on: { ts: Timestamp 1545171540000|1, t: 3, h: 7402520059475909511, v: 2, op: "u", ns: "test.my_capped", o2: { _\ id: ObjectId('5c12e0f1ea151ba34bd0ae11') }, o: { $set: { y: "a size-neutral value" } } } 2018-12-18T15:19:00.863-0700 I - [repl writer worker 7] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 33 != 61 at src/mongo/db/repl/sync_tail.cpp 1061 Additional issue: when an update is made to a single document in a capped collection and the document is about to be removed as part of the collection cap, it is possible for a secondary to abort with a similar error. DIAGNOSIS AND AFFECTED VERSIONS This can occur when a document is overwritten on the Secondary before it is overwritten on the Primary, and is then subject to two size-neutral updates on two fields. On the mmapV1 storage engine, differences in storage fragmentation between the Primary and the Secondary are one of the known triggers of this issue. This can also occur when a document is deleted by capped collection enforcement on the Secondary before an update to that document is replicated from the Primary. This affects all MongoDB versions 3.2 through 4.4. SERVER-21700 fixes this issue in MongoDB 5.0. REMEDIATION AND WORKAROUNDS If this occurs, resync the affected Secondaries. Ensure that capped collections are the same size across the whole replica set. Do not update documents that are frequently rolled over capped collections. Consider using a TTL for this use-case instead. Original description When multiple (size-neutral) updates are made to a single document in a capped collection, it is possible for a secondary to abort on errors like: 2018-12-18T15:19:00.857-0700 F REPL [repl writer worker 7] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 33 != 61 on: { ts: Timestamp 1545171540000|1, t: 3, h: 7402520059475909511, v: 2, op: "u", ns: "test.my_capped", o2: { _\ id: ObjectId('5c12e0f1ea151ba34bd0ae11') }, o: { $set: { y: "a size-neutral value" } } } 2018-12-18T15:19:00.863-0700 I - [repl writer worker 7] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 33 != 61 at src/mongo/db/repl/sync_tail.cpp 1061
brian.lane commented on Mon, 11 Feb 2019 20:47:39 +0000: Hi raffis, You can't configure this. We would need to do code changes on our side for this and then backport to older versions. We would look to only backport this change as far as 3.6 once completed. -Brian raffis commented on Mon, 4 Feb 2019 11:35:21 +0000: Hi Brian No TTL is not an alternative at all. The capped collection is used because of the tailable functionality. > If you can upgrade to a later version of MongoDB and migrate to WiredTiger as your storage engine, we may be able to address this issue by skipping the capped update check on secondaries That would be fine, how can I configure this? Can't find any relevant documentation. What I did now is to place this collection in the local db, which does not get touched by the replset mechanism. brian.lane commented on Fri, 25 Jan 2019 03:23:11 +0000: Hi raffis, You might want to consider a TTL index on an uncapped collection as an alternative. If you can upgrade to a later version of MongoDB and migrate to WiredTiger as your storage engine, we may be able to address this issue by skipping the capped update check on secondaries. This check is a hard requirement with MMAPv1, but with WiredTiger we may be able to relax this requirement. FYI: We have deprecated MMAPv1 as of version 4.0. -Brian raffis commented on Thu, 24 Jan 2019 07:34:19 +0000: Hi Brian It' s 3.4.18 and mmapv1. Let me know if you need something else. Cheers brian.lane commented on Thu, 24 Jan 2019 05:47:13 +0000: Hi raffis Thanks for the additional details. Can you also let me know which version of MongoDB you are using as well as which storage engine - WiredTiger / MMAPv1. Cheers! -Brian raffis commented on Tue, 22 Jan 2019 08:18:45 +0000: Hi Brian Well the collection where the problem occurs is heavily used by https://github.com/gyselroth/mongodb-php-task-scheduler This is queue job processor and the capped collection makes perfectly sense there. Theoretically the library could be changed to use the newer watch functionality on the oplog but still, a capped collection makes perfectly sense. (And it would need some time to release a newer version for this functionality). The bug occurs now (randomly) daily. Basically our replset is not functional at all. I can resync entire slave nodes and quite fast they crash with the error mentioned above. Interesting fact: In all cases the consumer crashed at { $set: { status: 3, ended: new Date(1545406647813) } } while a a new datetime object is set. (The library makes other changes as well before changing this end timestamp, for example setting status to 1 without any date changes). $ cat /var/log/mongodb/mongod.log*gz | gunzip | grep -a "Cannot change"; cat /var/log/mongodb/mongod.log* | grep "Cannot change" -a 2019-01-03T10:25:08.782+0100 F REPL [repl writer worker 11] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1546507508000|2, h: -496745303623253444, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c2c8373cd9132000910d522') } , o: { $set: { status: 3, ended: new Date(1546507507990) } } } 2019-01-03T10:25:08.905+0100 I - [repl writer worker 11] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2019-01-03T13:27:55.107+0100 I STORAGE [initandlisten] exception in initAndListen: 10003 Cannot change the size of a document in a capped collection: 71 != 86, terminating 2018-12-21T16:37:27.830+0100 F REPL [repl writer worker 5] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1545406647000|4, h: -308296782776005596, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c1d087b95aa0a02386e9843') } , o: { $set: { status: 3, ended: new Date(1545406647813) } } } 2018-12-21T16:37:27.830+0100 I - [repl writer worker 5] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2018-12-21T18:33:15.567+0100 F REPL [repl writer worker 2] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1545406647000|4, h: -308296782776005596, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c1d087b95aa0a02386e9843') } , o: { $set: { status: 3, ended: new Date(1545406647813) } } } 2018-12-21T18:33:15.567+0100 I - [repl writer worker 2] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2018-12-21T18:54:41.777+0100 F REPL [repl writer worker 14] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1545406647000|4, h: -308296782776005596, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c1d087b95aa0a02386e9843') } , o: { $set: { status: 3, ended: new Date(1545406647813) } } } 2018-12-21T18:54:41.777+0100 I - [repl writer worker 14] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2019-01-21T08:43:21.477+0100 I STORAGE [initandlisten] exception in initAndListen: 10003 Cannot change the size of a document in a capped collection: 71 != 86, terminating 2019-01-21T18:37:32.433+0100 F REPL [repl writer worker 11] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1548092252000|2, h: 8891639960730045349, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c4592d8a7621c000a270723') } , o: { $set: { status: 3, ended: new Date(1548092252319) } } } 2019-01-21T18:37:32.433+0100 I - [repl writer worker 11] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2019-01-17T07:23:58.732+0100 F REPL [repl writer worker 6] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1547706238000|2, h: -1287491416624820144, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c3ecdfd81fa83000704e0f6') } , o: { $set: { status: 3, ended: new Date(1547706238359) } } } 2019-01-17T07:23:58.770+0100 I - [repl writer worker 6] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2019-01-17T07:52:46.610+0100 I STORAGE [initandlisten] exception in initAndListen: 10003 Cannot change the size of a document in a capped collection: 71 != 86, terminating 2019-01-18T10:36:02.219+0100 F REPL [repl writer worker 0] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1547804162000|2, h: -9155582286729571842, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c404c817047201a535de75f') } , o: { $set: { status: 3, ended: new Date(1547804162166) } } } 2019-01-18T10:36:02.219+0100 I - [repl writer worker 0] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 2019-01-19T10:36:03.302+0100 F REPL [repl writer worker 13] writer worker caught exception: 10003 Cannot change the size of a document in a capped collection: 71 != 86 on: { ts: Timestamp 1547890562000|3, h: -9125484584779178179, v: 2, op: "u", ns: "cloudfs.taskscheduler.jobs", o2: { _id: ObjectId('5c419e02863a75000b7a54b9') } , o: { $set: { status: 3, ended: new Date(1547890562963) } } } 2019-01-19T10:36:03.315+0100 I - [repl writer worker 13] Fatal assertion 16359 CannotGrowDocumentInCappedNamespace: Cannot change the size of a document in a capped collection: 71 != 86 at src/mongo/db/repl/sync_tail.cpp 1087 brian.lane commented on Mon, 21 Jan 2019 22:18:35 +0000: Hi raffis, Have you considered using a TTL index instead? Here is a tutorial on how to do this. If you are unable to use TTL indexes would be interested in understanding why. -Brian raffis commented on Fri, 18 Jan 2019 09:48:05 +0000: This happened now three times in a row within 3 days. Definitely not rare. raffis commented on Thu, 17 Jan 2019 13:37:59 +0000: This is certainly not a rare condition. See SERVER-38741, We encountered this bug multiple times over the last 1-2 months. Resync the affected nodes helps but it takes ages for big nodes. daniel.hatcher commented on Thu, 20 Dec 2018 17:46:10 +0000: SERVER-8972 is the root cause of this issue but that ticket just mentioned possible data inconsistency. This ticket shows that that inconsistency can cause a server to crash.