...
In Mongo 4.4, shard keys can be missing, see https://docs.mongodb.com/manual/core/sharding-shard-key/#shard-key-missing This works fine for insert of documents, however merge in aggregation-framework is failing: db = db.getSiblingDB("mip") db.createCollection("sharded_col") sh.enableSharding("mip") sh.shardCollection("mip.sharded_col", { tsi: 1 }) db.sharded_col.insertOne({ a: 1 }) db.sharded_col.aggregate([ { $set: { a: 2 } }, { $unset: "_id" }, { $merge: { into: "sharded_col" } } ]) { "ok" : 0.0, "errmsg" : "$merge write error: 'on' field 'tsi' cannot be missing, null, undefined or an array", "code" : NumberInt(51132), "codeName" : "Location51132", "operationTime" : Timestamp(1601372348, 4), "$clusterTime" : { "clusterTime" : Timestamp(1601372351, 40), "signature" : { "hash" : BinData(0, "wv5eJ/xcJoaS1m14aJeoSA3WSqo="), "keyId" : NumberLong(6854861305854033921) } } } If MongoDB permits missing shard key then it should also apply for merge.
xgen-internal-githook commented on Thu, 19 Sep 2024 19:42:36 +0000: Author: {'name': 'Ivan Fefer', 'email': 'ivan.fefer@mongodb.com', 'username': 'Fefer-Ivan'} Message: SERVER-51207 Allow $merge to have nullish keys when supporting index is not sparse (#24789) GitOrigin-RevId: 6093ae3dc64da0a0296f8707bab437c512e6fdc4 Branch: master https://github.com/mongodb/mongo/commit/f0f78a2f8069795158bec1cc19a768cdd5d31bc0 JIRAUSER1270969 commented on Wed, 17 Jul 2024 09:43:29 +0000: We currently allow $merge to be backed by a sparse index. We need to make sure that we only allow nullish values when the backing index is not sparse. Also, in a sharded cluster, mongos need to check for the index presence, not mongod, so we need to pass the decision to allow nullish values or not from mongos to mongod. JIRAUSER1270969 commented on Tue, 9 Jul 2024 13:56:48 +0000: Currently if $merge doesn't have "on" field, we get collection shard key + _id here: https://github.com/mongodb/mongo/blob/b4714ce943896b384d75ecabb965e37748807b24/src/mongo/db/pipeline/process_interface/mongos_process_interface.cpp#L434 We also have a clause in the documentation that fields in on should be present, not null and not arrays: https://www.mongodb.com/docs/manual/reference/operator/aggregation/merge/#syntax So missing shard key fails this check here: https://github.com/mongodb/mongo/blob/b4714ce943896b384d75ecabb965e37748807b24/src/mongo/db/pipeline/merge_processor.cpp#L193 I will look into possibly just relaxing this requirement. JIRAUSER1274254 commented on Fri, 18 Aug 2023 11:24:41 +0000: I did not have time to work on this yet. I am putting it back to open. JIRAUSER1265607 commented on Wed, 22 Mar 2023 17:45:27 +0000: Suggest to add to quick win candidate. JIRAUSER1257066 commented on Thu, 1 Oct 2020 19:05:42 +0000: wernfried.domscheit@sunrise.net thank you for the reproduction you provided and additional context about your use-case. I was able to reproduce the error and will hand it over to the sharding team for further investigation. JIRAUSER1257089 commented on Wed, 30 Sep 2020 07:03:01 +0000: My use case it to group documents in order to save space. It looks like this one: db.sharded_col.insertMany([ { a: 1, grouped: false, b: "foo" }, { a: 1, grouped: false, b: "bar" }, { a: 1, grouped: false, b: "jeff" } { a: 2, grouped: false, tsi: 123, b: "foo-foo" }, { a: 2, grouped: false, tsi: 123, b: "bar-bar" }, ]) db.sharded_col.aggregate([ { $match: { grouped: false } }, { $group: { _id: { a: "$a", tsi: "$tsi" }, b: { $push: "$b" }, ids: { $push: "$_id" } } }, { $set: { grouped: true, a: "$_id.a", tsi: "$_id.tsi" } }, { $unset: "_id" }, { $merge: { into: "sharded_col" } } ]) db.sharded_col.find({ grouped: true }).forEach(function (row) { db.sharded_col.deleteMany({ _id: { $in: row.ids } }); }) db.sharded_col.updateMany({ grouped: true }, { $unset: { _ids: '' } }) The vast majority of documents have a shard key, of course. However, there are a few documents where shard key does not exits. My current workaround is to use a dummy shard-key: `tsi: false` The documents are log messages which are inserted by an external program, one-by-one. It is not possible to inserted them already as groups. The grouping is done by a job once per hour. I like to have grouped and un-groupped documents in the same collection. asya commented on Tue, 29 Sep 2020 18:30:18 +0000: wernfried.domscheit@sunrise.net what's the use case for this? Usually $merge would be used to update existing documents based on some calculation - is the main use case here updating documents that happen to have null shard keys? Or using $merge to insert a document or documents that have null shard key? JIRAUSER1257089 commented on Tue, 29 Sep 2020 10:06:45 +0000: With proper line break: db = db.getSiblingDB("mip") db.createCollection("sharded_col") sh.enableSharding("mip") sh.shardCollection("mip.sharded_col", { tsi: 1 }) db.sharded_col.insertOne({ a: 1 }) db.sharded_col.aggregate([ { $set: { a: 2 } }, { $unset: "_id" }, { $merge: { into: "sharded_col" } } ]) { "ok" : 0.0, "errmsg" : "$merge write error: 'on' field 'tsi' cannot be missing, null, undefined or an array", "code" : NumberInt(51132), "codeName" : "Location51132", "operationTime" : Timestamp(1601372348, 4), "$clusterTime" : { "clusterTime" : Timestamp(1601372351, 40), "signature" : { "hash" : BinData(0, "wv5eJ/xcJoaS1m14aJeoSA3WSqo="), "keyId" : NumberLong(6854861305854033921) } } }