...
I've attached a repro script, which is just the generated test trimmed down to make it run much faster. The generated test came from this failing task. The offending query is the following: [js_test:agg_fuzzer-8818-1618157020929-0] "command" : { [js_test:agg_fuzzer-8818-1618157020929-0] "aggregate" : "fuzzer_coll", [js_test:agg_fuzzer-8818-1618157020929-0] "pipeline" : [ [js_test:agg_fuzzer-8818-1618157020929-0] { [js_test:agg_fuzzer-8818-1618157020929-0] "$sort" : { [js_test:agg_fuzzer-8818-1618157020929-0] "_id" : 1 [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] }, [js_test:agg_fuzzer-8818-1618157020929-0] { [js_test:agg_fuzzer-8818-1618157020929-0] "$limit" : 20 [js_test:agg_fuzzer-8818-1618157020929-0] }, [js_test:agg_fuzzer-8818-1618157020929-0] { [js_test:agg_fuzzer-8818-1618157020929-0] "$graphLookup" : { [js_test:agg_fuzzer-8818-1618157020929-0] "from" : "fuzzer_coll", [js_test:agg_fuzzer-8818-1618157020929-0] "startWith" : { [js_test:agg_fuzzer-8818-1618157020929-0] "$setUnion" : [ [js_test:agg_fuzzer-8818-1618157020929-0] [ [js_test:agg_fuzzer-8818-1618157020929-0] /Dobra/ [js_test:agg_fuzzer-8818-1618157020929-0] ], [js_test:agg_fuzzer-8818-1618157020929-0] "$array", [js_test:agg_fuzzer-8818-1618157020929-0] "$obj.obj.obj.obj.array" [js_test:agg_fuzzer-8818-1618157020929-0] ] [js_test:agg_fuzzer-8818-1618157020929-0] }, [js_test:agg_fuzzer-8818-1618157020929-0] "connectFromField" : "obj.obj.obj.obj.obj.obj", [js_test:agg_fuzzer-8818-1618157020929-0] "connectToField" : "obj.obj.obj.obj.date", [js_test:agg_fuzzer-8818-1618157020929-0] "as" : "array", [js_test:agg_fuzzer-8818-1618157020929-0] "depthField" : "obj.obj.num", [js_test:agg_fuzzer-8818-1618157020929-0] "restrictSearchWithMatch" : { [js_test:agg_fuzzer-8818-1618157020929-0] "$or" : [ [js_test:agg_fuzzer-8818-1618157020929-0] { [js_test:agg_fuzzer-8818-1618157020929-0] "obj.obj.obj.num" : { [js_test:agg_fuzzer-8818-1618157020929-0] "$not" : { [js_test:agg_fuzzer-8818-1618157020929-0] "$bitsAnyClear" : 1001 [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] ] [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] } [js_test:agg_fuzzer-8818-1618157020929-0] ], [js_test:agg_fuzzer-8818-1618157020929-0] "maxTimeMS" : 30000, [js_test:agg_fuzzer-8818-1618157020929-0] "cursor" : { [js_test:agg_fuzzer-8818-1618157020929-0] [js_test:agg_fuzzer-8818-1618157020929-0] }, [js_test:agg_fuzzer-8818-1618157020929-0] "$db" : "fuzzer" Interestingly, due to SERVER-56083, the query does not run using SBE in either the control (4.4) or experimental (5.0) versions. I suspect, however, that the queries executed recursively by $graphLookup end up using SBE, which is where things have gone wrong. Looking at the output from the fuzzer, I can see that both servers returned 20 documents with the same _id values. However, none of the 20 documents returned by 4.4 were equal to any of the documents returned by 5.0. Drilling down, I looked specifically at the documents returned by each version with _id:0. I can see that the only difference between the two is the "array" field, which is what is produced by the $graphLookup stage. For the document with _id:0, I dumped the contents of "array" from the lefthand-side server into a collection called "l" and the contents of "array" from the righthand-side server into a collection called "r", and this shows that the lengths of the arrays were different: MongoDB Enterprise > db.l.count() 190 MongoDB Enterprise > db.r.count() 188 I believe this means that a few documents were missed by the $graphLookup when SBE was enabled. The following query shows exactly which were missed, specifically for the $graphLookup associated with _id:0: MongoDB Enterprise > db.l.aggregate([{$lookup: {from: "r", as: "match", localField: "_id", foreignField: "_id"}}, {$match: {match: {$size: 0}}}]).pretty() { "_id" : 1078, "str" : "orchid", "num" : NumberLong(2178), "date" : ISODate("2019-04-22T20:50:00.826Z"), "array" : [ NumberLong(32525), NumberLong(44047), ISODate("2019-06-13T02:41:44.348Z"), "Tools", "Grove Product", 28213, ISODate("2019-03-13T20:33:11.273Z"), ISODate("2019-09-15T05:04:20.672Z"), ISODate("2019-11-11T06:16:33.307Z"), { "_id" : 1079, "str" : "Georgia Iraqi Dinar Lead", "num" : 69643, "date" : NaN, "obj" : { "_id" : 1080, "str" : "generate THX virtual", "num" : NumberLong(94448), "array" : [ NumberDecimal("-643.826448748712"), [ ], NumberDecimal("272.28819709550726"), null, "e-tailers Mission" ], "obj" : { "_id" : 1081, "str" : "envisioneer evolve Forks", "num" : NumberDecimal("-397.33322040453436"), "date" : ISODate("2019-07-05T14:28:07.010Z"), "obj" : { } } } } ], "obj" : { "_id" : 1082, "str" : "Intranet", "num" : 74811, "array" : [ { "_id" : 1083, "date" : ISODate("2019-05-15T16:46:30.286Z"), "array" : [ { "_id" : 1084, "str" : null, "num" : 20959, "date" : null, "array" : [ ], "obj" : { } }, null, NaN, ISODate("2019-05-13T14:55:24.201Z") ] } ], "obj" : { "_id" : 1085, "array" : [ 66718, "back-end intangible", [ "Multi-tiered turquoise Credit Card Account", NumberDecimal("-4.27904291615323"), { "_id" : 1086, "date" : ISODate("2019-10-15T15:08:13.433Z"), "array" : [ ] } ] ], "obj" : { "_id" : 1087, "num" : NumberDecimal("-740.555235767481"), "date" : ISODate("2019-11-23T19:12:50.139Z") }, "num" : NumberLong(0) } }, "match" : [ ] } { "_id" : 469, "str" : "Tasty Rubber Pizza Bond Markets Units European Composite Unit (EURCO) robust", "num" : NumberLong(81616), "date" : ISODate("2019-08-05T01:32:15.829Z"), "array" : [ ISODate("2019-01-28T14:08:07.624Z"), "New Jersey Consultant", 23767, 72540, ISODate("2019-12-28T19:30:44.661Z"), NumberLong(32526), "SQL", [ ], [ ], 75037 ], "obj" : { "_id" : 470, "str" : "Checking Account", "num" : NumberLong(79901), "date" : ISODate("2019-02-14T12:40:50.831Z"), "array" : [ "CSS navigate Singapore", null, { "_id" : 471, "str" : "quantify", "num" : NumberDecimal("-793.7706100048798"), "date" : ISODate("2019-02-20T16:39:48.120Z"), "array" : [ NaN, NaN ], "obj" : { "_id" : 472, "obj" : { "_id" : 473, "num" : 29622, "date" : ISODate("2019-09-20T13:10:23.122Z"), "array" : [ null, ISODate("2019-10-15T02:26:25.924Z"), [ ] ], "obj" : { } } } } ], "obj" : { "_id" : 474, "date" : NaN, "array" : [ ], "obj" : { "_id" : 475, "str" : "Mobility", "num" : NumberDecimal("217.89759387808635") }, "num" : NumberLong(0) } }, "match" : [ ] }
david.storch commented on Thu, 15 Apr 2021 19:42:57 +0000: After some more investigation, it looks like the known failures involving $graphLookup are manifestations of other known issues. Most of them are SERVER-56102 as mentioned above. The failure involving $concat is part of SERVER-53900. I am closing this ticket as a duplicate. If we see future failures of $graphLookup, I would expect them to be manifestations of other underlying problems, not necessarily caused by $graphLookup itself. I suspect we are seeing $graphLookup fail because this is one of the few ways that we hit SBE at the moment in the aggregation_multiversion_fuzzer due to SERVER-56083. david.storch commented on Wed, 14 Apr 2021 16:50:31 +0000: I just noticed that the $graphLookup involves $bitsAnyClear inside the restrictWithMatch match expression. Therefore, it seems likely that the differing results are due to SERVER-56102, which I have linked as related! We should have the same assignee work on this ticket and SERVER-56102, as they now appear to be closely related (probably duplicates). We should still use this ticket to investigate the second test failure that I mentioned above. david.storch commented on Wed, 14 Apr 2021 14:54:56 +0000: I'm realizing that I'm not sure how "l" and "r" map to 4.4 and 5.0 in my description above. So when this gets investigated, we should keep in mind that either SBE is returning extraneous results or has missing results. I'm not sure which. david.storch commented on Wed, 14 Apr 2021 14:29:57 +0000: As part of this ticket, we should also investigate this task failure. In this case, there is a $graphLookup with the "restrictWithMatch" option. An expression inside "restrictWithMatch" throws an exception when running against 5.0 (SBE on) but the query succeeds when running against 4.4 (SBE off).
python3 buildscripts/resmoke.py run --suites=jstestfuzz --additionalFeatureFlags=featureFlagSBE repro.js
Click on a version to see all relevant bugs
MongoDB Integration
Learn more about where this data comes from
Bug Scrub Advisor
Streamline upgrades with automated vendor bug scrubs
BugZero Enterprise
Wish you caught this bug sooner? Get proactive today.