Loading...
Loading...
When doing a AND query between text search and another field, the other field seems ignored Below, there should be only one result
xgen-internal-githook commented on Wed, 1 Feb 2017 18:41:40 +0000: Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'} Message: SERVER-27392 fix bug in FTS index's check for indexed arrays Branch: master https://github.com/mongodb/mongo/commit/521a57162820b3b1100d20408f5ad0cd89fdc80e caub commented on Tue, 13 Dec 2016 21:03:49 +0000: Thanks Thomas, great answers, This way of embedding multiple translations in the same document is really practical, and efficient, but you can't avoid this (rare) case of false-friends matches like shown in first example. It would be great if Text search could give which language matched and be able to $and/filter it I could store simple documents (not nested), then I can avoid false-friends, but it's not really good for performance I think, because you need to link those small multilanguage documents to the main document, then do _id: {$in: allIdsFound} + other queries in the main collection, without being able to .limit or .skip for finding this ids, and then having to possibly repopulated after... Anyway, this is getting far from the issue to raised, just wanting to explain the context thomas.schubert commented on Tue, 13 Dec 2016 19:59:06 +0000: Hi caub, Thank you for the report. The behavior you are observing with the first index is expected. $language:'fr' changes the tokenizer on the text search, and x.language':'fr' matches against any subdocument in the array containing this equality. Consequently it is is expected that { "_id" : ObjectId("584867ec4fb941395745c6ba"), "i" : 1, "x" : [ { "name" : "fart", "language" : "en" }, { "name" : "pet", "language" : "fr" } ] } is returned since the first document in the array contains the name fart and the second document in the array has language set to "fr". For more information, please review our documentation. The second example you have presented is a bug, and I will be modifying this ticket to highlight this issue. The index {'x.language':1, 'x.name': 'text'} does not support multikey values on x.language, and MongoDB should reject the index in this case. See the example below for its expected behavior: > db.c.insert({a:[1.2],b:'a'}) WriteResult({ "nInserted" : 1 }) > db.c.createIndex({'a':1,'b':'text'}) { "ok" : 0, "errmsg" : "cannot have a multi-key as a prefix to a text index", "code" : 16675, "codeName" : "Location16675" } Please continue to watch for updates regarding the fix for this issue. Kind regards, Kelsey caub commented on Tue, 13 Dec 2016 10:42:59 +0000: I've also tried: > db.a.dropIndexes() > db.a.createIndex({'x.language':1, 'x.name': 'text'}) > db.a.find({'x.language':'fr', $text:{$search: 'fart', $language:'fr'}}) > db.a.find({'x.language':'fr', $text:{$search: 'fart', $language:'en'}}) > db.a.find({'x.language':'en', $text:{$search: 'fart', $language:'en'}}) // empty results caub commented on Tue, 13 Dec 2016 08:22:40 +0000: steps to reproduce: > db.a.find() > db.a.createIndex({'x.name':'text'}) { "createdCollectionAutomatically" : true, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 } > db.a.insert({i:1, x:[{name:'pet', language:'en'}, {name:'animal domestique', language:'fr'}]}) WriteResult({ "nInserted" : 1 }) > db.a.insert({i:1, x:[{name:'fart', language:'en'}, {name:'pet', language:'fr'}]}) WriteResult({ "nInserted" : 1 }) > db.a.insert({i:1, x:[{name:'wax', language:'en'}, {name:'farter', language:'fr'}]}) WriteResult({ "nInserted" : 1 }) > db.a.find({$text:{$search:'fart', $language:'fr'}, 'x.language':'fr'}) { "_id" : ObjectId("584868134fb941395745c6bb"), "i" : 1, "x" : [ { "name" : "wax", "language" : "en" }, { "name" : "farter", "language" : "fr" } ] } { "_id" : ObjectId("584867ec4fb941395745c6ba"), "i" : 1, "x" : [ { "name" : "fart", "language" : "en" }, { "name" : "pet", "language" : "fr" } ] }
Issue summary for SERVER-27392 Issue Description: In certain situations, compound text indexes in MongoDB 3.4 may have incorrectly indexed documents where either non-text prefix or suffix fields are multikey. Documents containing an array in fields referenced by a compound text index will not be returned by queries which use this index. Expected Behavior: Documents containing multikey values for the prefix or suffix fields should have been rejected by the index, causing the index build to fail or rejecting certain inserts to the collection. Indexes Affected This only affects compound text indexes where the prefix or suffix fields are non-text. For example: collection.createIndex({ "a.b" : 1, "c" : "text" }) Other compound text indexes are unaffected. Example: // Only occurrs in MongoDB 3.4 db.foo.drop() db.foo.createIndex({ "a.b" : 1, "a.c" : "text" }) // Should be rejected due to multikey index prefix but is accepted db.foo.insert({ "a" : [ { "b" : 1, "c": "foo" }, { "b" : 2, "c" : "bar" } ] }) // As a result, the indexed value for db.foo.find({ "a.b" : 1, "$text" : { "$search" : "foo" } }) // nothing is returned Affected Versions 3.4.0 - 3.4.24 - Compound text indexes allowed invalid index keys. 3.6.0 and later - Compound text indexes no longer allow invalid index keys. However, indexes already containing invalid index keys would continue to be affected until those index keys were removed or the index was rebuilt. See the Remediation section below. Diagnosis Run the validate() command on collections with compound text indexes. The command will return the following error if impacted: text index contains an array in document Note the performance considerations of validate before proceeding. If you already know what indexes are impacted, proceed to remediation. Remediation If validate() reports that a "text index contains an array", this issue can be remediated by re-building the affected indexes after removing or updating any documents containing data that would be incompatible with the index. The following steps may help to guide this effort. 1. Use $type to query the collection for documents containing multi-key values in the index prefix. For example, for the index [ "a.b" : 1, "c" : "text" ], issuing the following query may identify all documents with multi-key values for "a.b": db.collection.find({ "$or" : [ { "a": { "$type" : 4 } }, { "a.b" : { "$type" : 4 } } ] } ) You may also want to consider creating partial indexes to improve the performance of these operations. For example: db.collection.createIndex({ "a" : 1 }, { "partialFilterExpression" : { "a" : { "$type" : 4 } } } ) db.collection.createIndex({ "a.b" : 1 }, { "partialFilterExpression" : { "a" : { "$type" : 4 } } } ) 2. Update or remove these documents to remove the invalid multi-key values from the collection. 3. Drop and rebuild the compound text index. Previous description: > db.a.find() > db.a.createIndex({'x.name':'text'}) { "createdCollectionAutomatically" : true, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 } > db.a.insert({i:1, x:[{name:'pet', language:'en'}, {name:'animal domestique', language:'fr'}]}) WriteResult({ "nInserted" : 1 }) > db.a.insert({i:1, x:[{name:'fart', language:'en'}, {name:'pet', language:'fr'}]}) WriteResult({ "nInserted" : 1 }) > db.a.insert({i:1, x:[{name:'wax', language:'en'}, {name:'farter', language:'fr'}]}) WriteResult({ "nInserted" : 1 }) > db.a.find({$text:{$search:'fart', $language:'fr'}, 'x.language':'fr'}) { "_id" : ObjectId("584868134fb941395745c6bb"), "i" : 1, "x" : [ { "name" : "wax", "language" : "en" }, { "name" : "farter", "language" : "fr" } ] } { "_id" : ObjectId("584867ec4fb941395745c6ba"), "i" : 1, "x" : [ { "name" : "fart", "language" : "en" }, { "name" : "pet", "language" : "fr" } ] } > db.a.dropIndexes() > db.a.createIndex({'x.language':1, 'x.name': 'text'}) > db.a.find({'x.language':'fr', $text:{$search: 'fart', $language:'fr'}}) > db.a.find({'x.language':'fr', $text:{$search: 'fart', $language:'en'}}) > db.a.find({'x.language':'en', $text:{$search: 'fart', $language:'en'}}) // empty results
MongoDB Integration
Learn more about where this data comes from
Bug Scrub Advisor
Streamline upgrades with automated vendor bug scrubs
BugZero Enterprise
Wish you caught this bug sooner? Get proactive today.