...
The text search feature appears to stem the search term but not the indexed data. For example, when a document contains 'physical' and I search for 'physical', the search term is stemmed to 'physic', but the search does not return what should be a straight match.
sderickson commented on Thu, 3 Oct 2013 18:28:33 +0000:
After more testing, I figured it out: these objects I'm actually working with are much larger, but I cut down on the number of properties to keep the issue succinct. One of the other properties these docs actually have is 'language' which specifies what programming language the 'code' field is in. I see now according to the docs: http://docs.mongodb.org/manual/tutorial/specify-language-for-text-index/ 'language' is a field used to specify what spoken language the document is in for the text search. In my case, the database was trying to stem the document for the language 'coffeescript'. We'll figure out a different property name for this case. Sorry for submitting a false issue, and thanks for helping me sort this one out for myself.

dan@10gen.com commented on Thu, 3 Oct 2013 04:12:26 +0000:
I am unable to reproduce this reported behavior in 2.5.2. All text searches return successfully. Are you doing anything different than the following steps?
> db.version()
2.5.2
> db.f.drop()
true
> db.f.insert({
    "name": "Physical",
    "index": true,
    "code": "...",
    "description": "This Thang has physical presence (position, size, shape).",
    "system": "physics",
  })
> db.f.ensureIndex({index: 1, name: 'text', description: 'text', system: 'text'}, {background:true, sparse:true})
> db.f.getIndexes()
[
  { "v" : 1, "key" : { "_id" : 1 }, "ns" : "fts.f", "name" : "_id_" },
  {
    "v" : 1,
    "key" : { "index" : 1, "_fts" : "text", "_ftsx" : 1 },
    "ns" : "fts.f",
    "name" : "index_1_name_text_description_text_system_text",
    "background" : true,
    "sparse" : true,
    "weights" : { "description" : 1, "name" : 1, "system" : 1 },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 1
  }
]
> db.f.runCommand('text', {search: 'physical', filter: {index:true}})
{
  "queryDebugString" : "physic||||||",
  "language" : "english",
  "results" : [
    {
      "score" : 2.5833333333333335,
      "obj" : {
        "_id" : ObjectId("524ced9663b033a904373eea"),
        "name" : "Physical",
        "index" : true,
        "code" : "...",
        "description" : "This Thang has physical presence (position, size, shape).",
        "system" : "physics"
      }
    }
  ],
  "stats" : { "nscanned" : 1, "nscannedObjects" : 0, "n" : 1, "nfound" : 1, "timeMicros" : 198 },
  "ok" : 1
}
> db.f.runCommand('text', {search: 'presence', filter: {index:true}})
{
  "queryDebugString" : "presenc||||||",
  "language" : "english",
  "results" : [
    {
      "score" : 0.5833333333333334,
      "obj" : {
        "_id" : ObjectId("524ced9663b033a904373eea"),
        "name" : "Physical",
        "index" : true,
        "code" : "...",
        "description" : "This Thang has physical presence (position, size, shape).",
        "system" : "physics"
      }
    }
  ],
  "stats" : { "nscanned" : 1, "nscannedObjects" : 0, "n" : 1, "nfound" : 1, "timeMicros" : 123 },
  "ok" : 1
}
> db.f.runCommand('text', {search: 'size', filter: {index:true}})
{
  "queryDebugString" : "size||||||",
  "language" : "english",
  "results" : [
    {
      "score" : 0.5833333333333334,
      "obj" : {
        "_id" : ObjectId("524ced9663b033a904373eea"),
        "name" : "Physical",
        "index" : true,
        "code" : "...",
        "description" : "This Thang has physical presence (position, size, shape).",
        "system" : "physics"
      }
    }
  ],
  "stats" : { "nscanned" : 1, "nscannedObjects" : 0, "n" : 1, "nfound" : 1, "timeMicros" : 165 },
  "ok" : 1
}
> db.f.runCommand('text', {search: 'physics', filter: {index:true}})
{
  "queryDebugString" : "physic||||||",
  "language" : "english",
  "results" : [
    {
      "score" : 2.5833333333333335,
      "obj" : {
        "_id" : ObjectId("524ced9663b033a904373eea"),
        "name" : "Physical",
        "index" : true,
        "code" : "...",
        "description" : "This Thang has physical presence (position, size, shape).",
        "system" : "physics"
      }
    }
  ],
  "stats" : { "nscanned" : 1, "nscannedObjects" : 0, "n" : 1, "nfound" : 1, "timeMicros" : 132 },
  "ok" : 1
}
Add a document to the collection like this:

{
  "name": "Physical",
  "index": true,
  "code": "...",
  "description": "This Thang has physical presence (position, size, shape).",
  "system": "physics",
}

Add a sparse index to the collection with the key {index: 1, name: 'text', description: 'text', system: 'text'}. The one in my system looks like this:

{
  "v": 1,
  "key": { "index": 1, "_fts": "text", "_ftsx": 1 },
  "ns": "coco.level.components",
  "name": "search index",
  "sparse": true,
  "background": true,
  "safe": null,
  "weights": { "description": 1, "name": 1, "system": 1 },
  "default_language": "english",
  "language_override": "language",
  "textIndexVersion": 1
}

Now from the client, run searches like:

db.level.components.runCommand('text', {search: 'physical', filter: {index:true}}) // fail
db.level.components.runCommand('text', {search: 'presence', filter: {index:true}}) // fail
db.level.components.runCommand('text', {search: 'size', filter: {index:true}}) // hit
db.level.components.runCommand('text', {search: 'physics', filter: {index:true}}) // fail

When I checked the results of each failure, the search term had been stemmed: 'physics' and 'physical' became 'physic', and 'presence' became 'presenc'. 'size' stayed the same and hit the sample. If I update the document to have 'presenc' instead of 'presence' and then search for 'presence', the search succeeds. So the search seems to work if the document text is stemmed manually. I don't know if the system is failing to stem the words before indexing them, but that's what it looks like based on these tests.
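
The hit/miss pattern above is consistent with the resolution given in the comments: the index's "language_override" is "language", so a document whose 'language' field holds 'coffeescript' is indexed under that language, while the query is still stemmed as English. The following is a toy model in plain JavaScript, not MongoDB's implementation. The helper names (stem, tokenize, indexTerms, matches) are hypothetical, the stem table only copies the stems visible in the queryDebugString values in this report, and it assumes, as the tests suggest, that an unrecognized language leaves the indexed terms unstemmed.

```javascript
// Toy model only: NOT MongoDB's implementation.
// Stems copied from the queryDebugString values shown in this report.
const ENGLISH_STEMS = new Map([
  ["physical", "physic"],
  ["physics", "physic"],
  ["presence", "presenc"],
]);

// Hypothetical helper: stem a term for a given language.
// Assumption: any language other than "english" leaves the term unstemmed.
function stem(term, language) {
  const t = term.toLowerCase();
  if (language !== "english") return t;
  return ENGLISH_STEMS.get(t) || t;
}

// Split text into lowercase alphabetic tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Collect the indexed terms of a document. The language comes from the
// field named by languageOverride when the document has one.
function indexTerms(doc, fields, defaultLanguage, languageOverride) {
  const language = doc[languageOverride] || defaultLanguage;
  const terms = new Set();
  for (const field of fields) {
    for (const token of tokenize(String(doc[field] || ""))) {
      terms.add(stem(token, language));
    }
  }
  return terms;
}

// The query term is stemmed with the default language, as in the output above.
function matches(terms, query, defaultLanguage) {
  return terms.has(stem(query, defaultLanguage));
}

const fields = ["name", "description", "system"];
const base = {
  name: "Physical",
  description: "This Thang has physical presence (position, size, shape).",
  system: "physics",
};
// Same document plus the extra property described in the comments.
const withLanguage = { ...base, language: "coffeescript" };

const plainTerms = indexTerms(base, fields, "english", "language");
const overriddenTerms = indexTerms(withLanguage, fields, "english", "language");

for (const q of ["physical", "presence", "size", "physics"]) {
  console.log(q, matches(plainTerms, q, "english"), matches(overriddenTerms, q, "english"));
}
```

In this model the document without a 'language' field matches all four queries, as in the 2.5.2 session, while the document with language 'coffeescript' matches only 'size', the one term whose stem equals the original word.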