...
We have a collection that is sharded on the key "s". There is a compound index on { s : 1, s2 : 1 }, but no index on just { s : 1 }. Sharding itself works fine. However, there is no keyPattern that the dataSize command will accept when run against a mongos. Passing it the shard key results in:

db.runCommand({ datasize : "database.collection", keyPattern : { "s" : 1 }, min : { "s" : -999999 }, max : { "s" : 0 }})
{
    "estimate" : false,
    "ok" : 0,
    "errmsg" : "couldn't find valid index containing key pattern",
    "$gleStats" : { "lastOpTime" : Timestamp(0, 0), "electionId" : ObjectId("000000000000000000000000") }
}

Passing in the compound key results in:

mongos> db.runCommand({ datasize : "database.collection", keyPattern : { "s" : 1, "s2" : 1 }, min : { "s" : -9999999, "s2" : -9999999 }, max : { "s" : 0, "s2" : 0 }})
{ "code" : 13408, "ok" : 0, "errmsg" : "exception: keyPattern must equal shard key" }

It seems that either MongoDB shouldn't allow you to shard a collection on only the prefix of a compound index, or the dataSize command should be smart enough to use the compound index in this situation. I think the latter is preferable.
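For context, here is a condensed, hypothetical shell sketch of the setup described above; all names and data are placeholders. As the comments below establish, the compound index also has to be multikey (s2 holding array values) before dataSize starts rejecting it:

// Hypothetical reproduction sketch run from a mongos shell; test_db/test_coll are placeholder names.
db = db.getSiblingDB("test_db")
db.test_coll.createIndex({ s : 1, s2 : 1 })          // only the compound index, no plain { s : 1 }
sh.enableSharding("test_db")
sh.shardCollection("test_db.test_coll", { s : 1 })   // sharding on the prefix is accepted while the index is not yet multikey
// Inserting array values for s2 afterwards turns the compound index into a
// multikey index, matching the "isMultiKey" : true seen in the explain output below.
for (var i = -50; i < 50; i++) { db.test_coll.insert({ s : i, s2 : [i, i + 1] }) }
// dataSize then rejects both the shard key and the full compound key, as shown above:
db.runCommand({ datasize : "test_db.test_coll", keyPattern : { s : 1 }, min : { s : -999999 }, max : { s : 0 } })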
thomas.schubert commented on Wed, 22 Jun 2016 16:41:28 +0000:
Hi dai@foursquare.com,
After additional investigation and consulting with our Sharding and Query teams, we have concluded that this is expected behavior. The dataSize command requires a single-key index. Before reaching this conclusion, we considered two alternatives. The first option would be to put stronger checks in place to enforce that sharded clusters always have a single-key shard key index. Please note this approach would not change the behavior of dataSize and would not resolve your issue. The second option would be to allow dataSize to use multikey indexes. If dataSize used a multikey index, its performance would be significantly impacted. The work required to ease this constraint does not appear to be worth the benefit, given the performance implications of using multikey indexes and the number of users impacted by this behavior. We expect sharded clusters to be sharded using a single-key index; if this restriction is observed, dataSize behaves as expected. Therefore, we will be closing this ticket as 'works as designed'. As you identified, there is a simple workaround: create a single-key index. Thank you for your help investigating this issue.
Kind regards,
Thomas

thomas.schubert commented on Fri, 29 Jan 2016 06:37:10 +0000:
Hi dai@foursquare.com,
Thank you for the additional information. We have a working reproduction of this behavior: the dataSize command does not recognize a compound multikey index as a valid index. Please continue to watch this ticket for updates.
Kind regards,
Thomas

dai@foursquare.com commented on Thu, 28 Jan 2016 23:44:44 +0000:
Here is the output you requested. I have edited out the original shard, host, database, and collection names along with the ports. This is on production, where the {s:1} index exists:

mongos> db.test_coll.find({"s": 1, "s2": 1}).limit(1).explain()
{
    "queryPlanner" : {
        "mongosPlannerVersion" : 1,
        "winningPlan" : {
            "stage" : "SINGLE_SHARD",
            "shards" : [
                {
                    "shardName" : "shard0",
                    "connectionString" : "shard0/host1:27000,host2:27000,host3:27000",
                    "serverInfo" : { "host" : "host1", "port" : 27000, "version" : "3.0.6", "gitVersion" : "1ef45a23a4c5e3480ac919b28afcba3c615488f2" },
                    "plannerVersion" : 1,
                    "namespace" : "test_db.test_coll",
                    "indexFilterSet" : false,
                    "parsedQuery" : { "$and" : [ { "s" : { "$eq" : 1 } }, { "s2" : { "$eq" : 1 } } ] },
                    "winningPlan" : {
                        "stage" : "LIMIT",
                        "limitAmount" : 1,
                        "inputStage" : {
                            "stage" : "SHARDING_FILTER",
                            "inputStage" : {
                                "stage" : "FETCH",
                                "filter" : { "s2" : { "$eq" : 1 } },
                                "inputStage" : {
                                    "stage" : "IXSCAN",
                                    "keyPattern" : { "s" : 1 },
                                    "indexName" : "s_1",
                                    "isMultiKey" : false,
                                    "direction" : "forward",
                                    "indexBounds" : { "s" : [ "[1.0, 1.0]" ] }
                                }
                            }
                        }
                    },
                    "rejectedPlans" : [
                        {
                            "stage" : "LIMIT",
                            "limitAmount" : 1,
                            "inputStage" : {
                                "stage" : "SHARDING_FILTER",
                                "inputStage" : {
                                    "stage" : "FETCH",
                                    "inputStage" : {
                                        "stage" : "IXSCAN",
                                        "keyPattern" : { "s" : 1, "s2" : 1 },
                                        "indexName" : "s_1_s2_1",
                                        "isMultiKey" : true,
                                        "direction" : "forward",
                                        "indexBounds" : { "s" : [ "[1.0, 1.0]" ], "s2" : [ "[1.0, 1.0]" ] }
                                    }
                                }
                            }
                        }
                    ]
                }
            ]
        }
    },
    "ok" : 1
}

This is in staging, where the {s:1} index does not exist:

mongos> db.test_coll.find({"s": 1, "s2": 1}).limit(1).explain()
{
    "queryPlanner" : {
        "mongosPlannerVersion" : 1,
        "winningPlan" : {
            "stage" : "SINGLE_SHARD",
            "shards" : [
                {
                    "shardName" : "shard0",
                    "connectionString" : "shard0/host1:27000,host2:27000,host3:27000",
                    "serverInfo" : { "host" : "host1", "port" : 27000, "version" : "3.0.6", "gitVersion" : "1ef45a23a4c5e3480ac919b28afcba3c615488f2" },
                    "plannerVersion" : 1,
                    "namespace" : "test_db.test_coll",
                    "indexFilterSet" : false,
                    "parsedQuery" : { "$and" : [ { "s" : { "$eq" : 1 } }, { "s2" : { "$eq" : 1 } } ] },
                    "winningPlan" : {
                        "stage" : "LIMIT",
                        "limitAmount" : 1,
                        "inputStage" : {
                            "stage" : "KEEP_MUTATIONS",
                            "inputStage" : {
                                "stage" : "SHARDING_FILTER",
                                "inputStage" : {
                                    "stage" : "FETCH",
                                    "inputStage" : {
                                        "stage" : "IXSCAN",
                                        "keyPattern" : { "s" : 1, "s2" : 1 },
                                        "indexName" : "s_1_s2_1",
                                        "isMultiKey" : true,
                                        "direction" : "forward",
                                        "indexBounds" : { "s" : [ "[1.0, 1.0]" ], "s2" : [ "[1.0, 1.0]" ] }
                                    }
                                }
                            }
                        }
                    },
                    "rejectedPlans" : [ ]
                }
            ]
        }
    },
    "ok" : 1
}

thomas.schubert commented on Thu, 28 Jan 2016 19:37:11 +0000:
Hi dai@foursquare.com,
Can you please post the output of the following command, with the appropriate substitution for the collection name?

db.test_coll.find({"s": 1, "s2": 1}).limit(1).explain()

Thank you,
Thomas

dai@foursquare.com commented on Thu, 28 Jan 2016 00:23:39 +0000:
Hi Ramon,
I just checked both our production and staging clusters; all shards in both clusters have the {s:1, s2:1} index. I further checked that all replicas do indeed have the index. We would definitely have noticed in production if this index were missing, as all queries hitting a shard without the index would likely time out (or at least latency would drastically increase). I forgot to update here that in production I did end up creating an index on {s:1} as a workaround, though I still think that should be unnecessary. In staging I have not yet added the {s:1} index, and am still getting the original errors. Thanks for following up on this!

ramon.fernandez commented on Wed, 27 Jan 2016 20:04:15 +0000:
dai@foursquare.com, apologies for the radio silence. I just managed to reproduce this problem with the help of a colleague, but to do so I had to go into one of my shards and drop the {s:1, s2:1} index from it. This leads me to believe the index catalog in at least one of your shards is incorrect or corrupt. While I think it would be very hard to find out how that happened, it should be easy to check the health/consistency of the index catalogs in all your shards. Can you please run db.test_coll.getIndexes() on all your shards and see if any is missing this index? Depending on the size of your dataset, one workaround may be to create a new index on {s:1}. If you find problems in the index catalog with {s:1, s2:1}, you may need to do this anyway while you repair the catalog.
dai@foursquare.com commented on Sat, 10 Oct 2015 00:39:52 +0000:
Ramon,
Your test does seem to be representative of the scenario I described. I'm not sure why I got those errors and you didn't. I can try dropping the index I added and running the dataSize command again on Monday. For now I tried doing the same on our staging cluster, which is an exact copy of production (its data is now a couple of weeks stale), and am still getting the same errors. See below for the config metadata, getIndexes() output, and dataSize command output (I've searched/replaced our db name to test_db and collection name to test_coll; everything else is the same):

mongos> use config
switched to db config
mongos> db.collections.find({ "_id" : "test_db.test_coll" })
{ "_id" : "test_db.test_coll", "lastmod" : ISODate("1970-01-17T04:33:27.914Z"), "dropped" : false, "key" : { "s" : 1 }, "unique" : false, "lastmodEpoch" : ObjectId("53601d6a3706bef02f5d2510") }
mongos> use test_db
switched to db test_db
mongos> db.test_coll.getIndexes()
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test_db.test_coll" },
    { "v" : 1, "key" : { "s" : 1, "s2" : 1 }, "name" : "s_1_s2_1", "ns" : "test_db.test_coll" },
    { "v" : 1, "key" : { "c" : 1 }, "name" : "c_1", "ns" : "test_db.test_coll" }
]
mongos> db.runCommand({ datasize : "test_db.test_coll", keyPattern : { "s" : 1 }, min : { "s" : -999999 }, max : { "s" : 0 }})
{
    "estimate" : false,
    "ok" : 0,
    "errmsg" : "couldn't find valid index containing key pattern",
    "$gleStats" : { "lastOpTime" : Timestamp(0, 0), "electionId" : ObjectId("55fb2c555523a6950356822f") }
}
mongos> db.runCommand({ datasize : "test_db.test_coll", keyPattern : { "s" : 1, "s2" : 1 }, min : { "s" : -999999 }, max : { "s" : 0 }})
{ "code" : 13408, "ok" : 0, "errmsg" : "exception: keyPattern must equal shard key" }

This environment is on version 3.0.6.

ramon.fernandez commented on Fri, 9 Oct 2015 22:17:42 +0000:
dai@foursquare.com, further investigation shows that SERVER-19640 was introduced in 3.1.2, so you were right about this not being a duplicate; apologies for the confusion. However, I wasn't able to reproduce the error message you describe using 3.0.5. I created a collection with documents of the following shape and created a compound index:

mongos> db.foo.findOne()
{ "_id" : ObjectId("5618380f8c98ff28700f88d3"), "a" : 1, "b" : 1, "c" : 1 }
mongos> db.foo.getIndexes()
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.foo" },
    { "v" : 1, "key" : { "a" : 1, "b" : 1 }, "name" : "a_1_b_1", "ns" : "test.foo" }
]

I then sharded the collection on {a:1}:

mongos> sh.status()
...
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : true, "primary" : "shard01" }
        test.foo
            shard key: { "a" : 1 }
            chunks:
                shard01 8
                shard02 8
            { "a" : { "$minKey" : 1 } } -->> { "a" : 15 } on : shard02 Timestamp(2, 0)
...

I then ran the dataSize command:

mongos> db.runCommand({ datasize : "test.foo", keyPattern : {a : 1}, min: {a:-9999}, max:{a:9999}})
{ "size" : 475632, "numObjects" : 9907, "millis" : 6, "ok" : 1 }

Is this a representative scenario? If not, did I miss any details where your use case differs from the reproducer above?

dai@foursquare.com commented on Fri, 9 Oct 2015 21:55:04 +0000:
Hi Ramon,
I do not believe this is a duplicate of SERVER-19640. That issue seems to have been introduced in 3.1.2, while the issue I'm describing is present in 3.0.5 and is not related to collection namespaces but to the keyPattern parameter of the dataSize command. The current workaround for my issue is to add an index identical to the shard key, in this case { s : 1 }, which duplicates the prefix of the existing compound index. That is unnecessary space wasted on a redundant index.

ramon.fernandez commented on Fri, 9 Oct 2015 21:46:33 +0000:
dai@foursquare.com, I believe the issue you describe is a duplicate of SERVER-19640, which has been fixed in the current development version. I'm working on confirming this.
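For completeness, a minimal sketch of the workaround that was eventually applied in production and recommended in the closing comment above: build a plain single-key index on the shard key so dataSize has a non-multikey index containing the key pattern. The names are the placeholder ones used throughout this ticket:

// Hypothetical workaround sketch; test_db/test_coll are placeholder names.
db = db.getSiblingDB("test_db")
// A single-key index on the shard key; as noted above, this costs extra space
// because it duplicates the prefix of the existing s_1_s2_1 index.
db.test_coll.createIndex({ s : 1 })
// With { s : 1 } in place, the original command should be able to find a valid index:
db.runCommand({ datasize : "test_db.test_coll", keyPattern : { "s" : 1 }, min : { "s" : -999999 }, max : { "s" : 0 } })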