...
Running an update that uses an aggregation pipeline with an `$unset` stage produces a `replace` change stream event, instead of an `update` event with the affected fields listed in `removedFields`.
JIRAUSER1265262 commented on Wed, 1 May 2024 20:18:50 +0000:

Closing this since it works as designed currently; however, we are proceeding with documentation changes. Thank you for your report kartalkaan.bozdogan@ocell.io!

bernard.gorman commented on Tue, 30 Apr 2024 22:08:13 +0000:

mihai.andrei@mongodb.com: no, this is expected behaviour. Prior to PM-1460, pipeline updates always generated full-document replacements in the oplog and were reported as such by change streams. PM-1460 introduced a new oplog format that allowed pipeline updates (or indeed any updates) to be recorded as diffs against the original document. However, because the purpose of the project was to minimize the size of oplog events, the diff algorithm includes a provision where we bail out and record the full document - or a full subdocument at any level of diffing - if doing so requires fewer bytes than recording the delta would.

In the above case, the document produced after removing the "a" field is just the _id. Because it's cheaper to record the complete remaining document, that's what we do. Here's the oplog entry generated by this update:

    replset:PRIMARY> db.test.insertOne({_id: 'foo', 'a': 0});
    replset:PRIMARY> const watchCursor = db.test.watch();
    replset:PRIMARY> db.test.updateOne({_id: 'foo'}, [{$unset: ['a']}]);
    replset:PRIMARY> db.getSiblingDB("local").oplog.rs.find({ns: "test.test", op: "u"}).pretty()
    {
        "op" : "u",
        "ns" : "test.test",
        "ui" : UUID("59e7a38c-febb-4640-8987-3b0cfb60808e"),
        "o" : { "_id" : "foo" },
        "o2" : { "_id" : "foo" },
        "ts" : Timestamp(1714513754, 1),
        "t" : NumberLong(4),
        "v" : NumberLong(2),
        "wall" : ISODate("2024-04-30T21:49:14.251Z")
    }

That's a full-document replacement oplog event, which produces a replace change event:

    replset:PRIMARY> watchCursor.next()
    {
        "_id" : { "_data" : "..." },
        "operationType" : "replace",
        "clusterTime" : Timestamp(1714513754, 1),
        "wallTime" : ISODate("2024-04-30T21:49:14.251Z"),
        "fullDocument" : { "_id" : "foo" },
        "ns" : { "db" : "test", "coll" : "test" },
        "documentKey" : { "_id" : "foo" }
    }

If, however, we perform exactly the same test but include another, larger field in the original document:

    replset:PRIMARY> db.test.insertOne({_id: 'foo', 'a': 0, 'b': "x".repeat(10)});
    replset:PRIMARY> const watchCursor = db.test.watch();
    replset:PRIMARY> db.test.updateOne({_id: 'foo'}, [{$unset: ['a']}]);

... then it is no longer cheaper to record the full document in the oplog, and we record a diff instead:

    replset:PRIMARY> db.getSiblingDB("local").oplog.rs.find({ns: "test.test", op: "u"}).pretty()
    {
        "op" : "u",
        "ns" : "test.test",
        "ui" : UUID("a0e91ca1-0deb-4f96-80b5-52ae0423356c"),
        "o" : { "$v" : 2, "diff" : { "d" : { "a" : false } } },
        "o2" : { "_id" : "foo" },
        "ts" : Timestamp(1714514731, 3),
        "t" : NumberLong(4),
        "v" : NumberLong(2),
        "wall" : ISODate("2024-04-30T22:05:31.238Z")
    }

This in turn produces an update change event:

    replset:PRIMARY> watchCursor.next()
    {
        "_id" : { "_data" : "..." },
        "operationType" : "update",
        "clusterTime" : Timestamp(1714514731, 3),
        "wallTime" : ISODate("2024-04-30T22:05:31.238Z"),
        "ns" : { "db" : "test", "coll" : "test" },
        "documentKey" : { "_id" : "foo" },
        "updateDescription" : {
            "updatedFields" : { },
            "removedFields" : [ "a" ],
            "truncatedArrays" : [ ]
        }
    }

JIRAUSER1265262 commented on Thu, 28 Mar 2024 22:15:41 +0000:

Thanks for your report. It certainly sounds odd that this is being reported as a "replace" instead of an update. Per the docs, I would expect this on replaceOne but not updateOne. I'll pass this to the relevant team to confirm whether this should be the case.
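The size-based bail-out described in the 30 Apr comment can be illustrated with a toy comparison. This is only a sketch under loud assumptions: `approxSize` and `chooseOplogForm` are hypothetical names, not server APIs, and JSON text length stands in for BSON byte size, so the crossover point will not match the server's real accounting. It only shows why a tiny post-image can win over a delta while a larger one does not.

```javascript
// Toy model only: JSON text length is a stand-in for BSON byte size.
const approxSize = (value) => JSON.stringify(value).length;

// Hypothetical helper (not a MongoDB API): pick whichever encoding is smaller.
// 'replace' means "record the whole post-image", 'update' means "record the delta".
function chooseOplogForm(postImage, delta) {
  return approxSize(postImage) < approxSize(delta) ? 'replace' : 'update';
}

// Delta shape copied from the "diff" field of the second oplog entry in this ticket.
const unsetADelta = { d: { a: false } };

// First test: only _id remains, so the whole document is the smaller encoding.
console.log(chooseOplogForm({ _id: 'foo' }, unsetADelta)); // prints "replace"

// Second test: the extra 10-character field makes the delta the smaller encoding.
console.log(chooseOplogForm({ _id: 'foo', b: 'x'.repeat(10) }, unsetADelta)); // prints "update"
```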
    db.test.insertOne({_id: 'foo', 'a': 0});

    // Watch the change stream
    const watchCursor = db.test.watch();
    while (!watchCursor.isClosed()) {
      let next = watchCursor.tryNext();
      while (next !== null) {
        printjson(next);
        next = watchCursor.tryNext();
      }
    }

    // In another shell
    db.test.updateOne({_id: 'foo'}, [{$unset: ['a']}]);
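Since the same `$unset` can surface as either `replace` or `update`, a consumer that cares about removed fields may want to handle both. The following is a minimal sketch, not an official pattern: `applyChangeEvent` is a hypothetical helper that keeps its own copy of each document in a `Map` and derives the removed top-level fields in both cases. It assumes primitive `_id` values (usable as `Map` keys) and ignores dotted paths in `updatedFields` and `truncatedArrays`. The sample events are trimmed copies of the two events shown in this ticket.

```javascript
// Hypothetical helper: applies one change event to a local cache and reports
// which top-level fields disappeared, however the server encoded the change.
function applyChangeEvent(cache, event) {
  const id = event.documentKey._id;
  const before = cache.get(id) || {};
  let after;
  let removed;

  switch (event.operationType) {
    case 'insert':
      after = event.fullDocument;
      removed = [];
      break;
    case 'replace':
      // No updateDescription here: compare the cached copy with the post-image.
      after = event.fullDocument;
      removed = Object.keys(before).filter((key) => !(key in after));
      break;
    case 'update': {
      const desc = event.updateDescription;
      after = { ...before, ...desc.updatedFields };
      for (const field of desc.removedFields) delete after[field];
      removed = [...desc.removedFields];
      break;
    }
    default:
      return { removed: [] }; // other event types are ignored in this sketch
  }

  cache.set(id, after);
  return { removed, after };
}

// Case 1 from the ticket: the server reported a full replacement.
const smallCache = new Map([['foo', { _id: 'foo', a: 0 }]]);
const replaceEvent = {
  operationType: 'replace',
  documentKey: { _id: 'foo' },
  fullDocument: { _id: 'foo' },
};
console.log(applyChangeEvent(smallCache, replaceEvent).removed); // [ 'a' ]

// Case 2 from the ticket: the server reported a delta.
const largeCache = new Map([['foo', { _id: 'foo', a: 0, b: 'x'.repeat(10) }]]);
const updateEvent = {
  operationType: 'update',
  documentKey: { _id: 'foo' },
  updateDescription: { updatedFields: {}, removedFields: ['a'], truncatedArrays: [] },
};
console.log(applyChangeEvent(largeCache, updateEvent).removed); // [ 'a' ]
```

Both calls report `a` as removed, so downstream logic does not need to depend on which oplog form the server chose.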