...
Summary

The aggregation query is executed through a Java web application server (Java WAS). When the $merge stage at the end of the aggregation runs, a "BSONObjectTooLarge" error sometimes occurs. The error shows up for result sets of varying size, anywhere from about 20k to 300k documents. Some hours after the error occurs, if I run the failed query again through the Java WAS, it raises the BSONObjectTooLarge error once more, but if I rerun the query immediately after that failure, it completes with no errors. In addition, when I run the same query through a mongo client such as the mongo shell or 'NoSQLBooster for MongoDB', it does NOT show any error. I use five or more different aggregation queries to compute large sets of documents; the one thing they have in common is a $merge stage at the end of the pipeline. A large result set seems to make the error more likely, but the error does not always appear.

Stack trace:

com.mongodb.internal.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:198)
com.mongodb.internal.connection.InternalStreamConnection$2$1.onResult(InternalStreamConnection.java:517)
com.mongodb.internal.connection.InternalStreamConnection$2$1.onResult(InternalStreamConnection.java:503)
com.mongodb.internal.connection.InternalStreamConnection$MessageHeaderCallback$MessageCallback.onResult(InternalStreamConnection.java:826)
com.mongodb.internal.connection.InternalStreamConnection$MessageHeaderCallback$MessageCallback.onResult(InternalStreamConnection.java:790)
com.mongodb.internal.connection.InternalStreamConnection$5.completed(InternalStreamConnection.java:650)
com.mongodb.internal.connection.InternalStreamConnection$5.completed(InternalStreamConnection.java:647)
com.mongodb.internal.connection.AsynchronousChannelStream$BasicCompletionHandler.completed(AsynchronousChannelStream.java:250)
com.mongodb.internal.connection.AsynchronousChannelStream$BasicCompletionHandler.completed(AsynchronousChannelStream.java:233)
java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:127)
java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:282)
java.base/sun.nio.ch.WindowsAsynchronousSocketChannelImpl$ReadTask.completed(WindowsAsynchronousSocketChannelImpl.java:581)
java.base/sun.nio.ch.Iocp$EventHandlerTask.run(Iocp.java:387)
java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:829)

Message:

Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 29719999 (0x16C57DBF) is invalid. Size must be between 0 and 16793600(16MB) First element: update: "indicator_by_flow_pt10m"' on server myserver.com:27617. The full response is:

{"ok": 0.0, "errmsg": "PlanExecutor error during aggregation :: caused by :: BSONObj size: 29719999 (0x16C57DBF) is invalid. Size must be between 0 and 16793600(16MB) First element: update: \"indicator_by_flow_pt10m\"", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1659918034, "i": 2031}}, "signature": {"hash": {"$binary": {"base64": "JqcdMT7Jg3PTSajRBz+J082pjPs=", "subType": "00"}}, "keyId": 7097070755241787396}}, "operationTime": {"$timestamp": {"t": 1659918034, "i": 1990}}}

Please provide the version of the driver. If applicable, please provide the MongoDB server version and topology (standalone, replica set, or sharded cluster).

MongoDB version: v5.5, replica set
Driver version: org.mongodb:mongodb-driver-reactivestreams:4.5.1

How to Reproduce

Below is one of the queries I used. There are several different aggregation queries like it; the pipelines differ, but the common part is the $merge stage at the end.

// mydata collection document
{
    "_id" : ObjectId("6188c90f6c7455e39997e209"),
    "sessiontime" : ISODate("2021-11-08T15:40:00.000+09:00"),
    "appid" : "fdjksjldglakdgladglkla",
    "appver" : "3.5.5",
    "devicdid" : "jkdggsjal;hjglkjdks;ajglkdgjsa;",
    "devver" : "9",
    "devno" : "my-device-name1",
    "sessiontime2" : ISODate("2021-11-09T00:40:00.000+09:00"),
    "stzd" : ISODate("2021-11-08T09:00:00.000+09:00"),
    "sstime" : ISODate("2021-11-08T15:40:00.000+09:00"),
    "devicecount" : 1
}

db.mydata.aggregate([
    { "$match": {
        "sessiontime": { "$gte": ISODate("2022-05-23T08:00:00Z"), "$lt": ISODate("2022-05-24T08:00:00Z") }
    } },
    { "$group": {
        "_id": {
            "sessiontime": "$sessiontime",
            "appid": "$appid",
            "appver": "$appver",
            "devicdid": "$devicdid",
            "devver": "$devver",
            "devno": "$devno"
        }
    } },
    { "$replaceWith": {
        "$mergeObjects": [ {
            "sessiontime": "$_id.sessiontime",
            "appid": "$_id.appid",
            "appver": "$_id.appver",
            "devicdid": "$_id.devicdid",
            "devver": "$_id.devver",
            "devno": "$_id.devno"
        } ]
    } },
    { "$lookup": {
        "from": "mycoll1",
        "let": {
            "sessiontime": "$sessiontime",
            "appid": "$appid",
            "appver": "$appver",
            "devicdid": "$devicdid",
            "devver": "$devver",
            "devno": "$devno"
        },
        "pipeline": [
            { "$match": {
                "$expr": {
                    "$and": [
                        { "$eq": [ "$sessiontime", "$$sessiontime" ] },
                        { "$eq": [ "$appid", "$$appid" ] },
                        { "$eq": [ "$appver", "$$appver" ] },
                        { "$eq": [ "$devicdid", "$$devicdid" ] },
                        { "$eq": [ "$devver", "$$devver" ] },
                        { "$eq": [ "$devno", "$$devno" ] }
                    ]
                }
            } }
        ],
        "as": "data"
    } },
    { "$unwind": "$data" },
    { "$group": {
        "_id": {
            "appid": "$data.appid",
            "appver": "$data.appver",
            "devicdid": "$data.devicdid",
            "devver": "$data.devver",
            "devno": "$data.devno",
            "sessiontime2": "$data.sessiontime"
        },
        "devicecount": { "$sum": "$data.devicecount" }
    } },
    { "$replaceWith": {
        "$mergeObjects": [ {
            "sessiontime2": "$_id.sessiontime2",
            "appid": "$_id.appid",
            "appver": "$_id.appver",
            "devicdid": "$_id.devicdid",
            "devver": "$_id.devver",
            "devno": "$_id.devno",
            "devicecount": "$devicecount"
        } ]
    } },
    { "$merge": {
        "into": "mycoll2",
        "on": [ "sessiontime2", "appid", "appver", "devicdid", "devver", "devno" ],
        "whenMatched": "merge",
        "whenNotMatched": "insert"
    } }
])

Additional Background
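The application-side Java code is not included in the ticket. As a rough illustration of the reported setup, a pipeline ending in $merge would typically be executed from org.mongodb:mongodb-driver-reactivestreams:4.5.1 along the following lines. The connection string, database name, the abbreviated pipeline, and the use of Project Reactor to subscribe are illustrative assumptions, not details taken from the actual application; only the driver APIs themselves are real.

import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.MergeOptions;
import com.mongodb.reactivestreams.client.MongoClient;
import com.mongodb.reactivestreams.client.MongoClients;
import com.mongodb.reactivestreams.client.MongoCollection;
import org.bson.Document;
import org.bson.conversions.Bson;
import reactor.core.publisher.Mono;

import java.time.Instant;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class MergeAggregationSketch {
    public static void main(String[] args) {
        // Connection string and database name are placeholders.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> mydata =
                    client.getDatabase("mydb").getCollection("mydata");

            List<Bson> pipeline = Arrays.asList(
                    // Same time-window $match as in the shell pipeline above.
                    Aggregates.match(Filters.and(
                            Filters.gte("sessiontime", Date.from(Instant.parse("2022-05-23T08:00:00Z"))),
                            Filters.lt("sessiontime", Date.from(Instant.parse("2022-05-24T08:00:00Z"))))),
                    // ... the $group / $replaceWith / $lookup / $unwind stages
                    // from the shell pipeline above would go here ...
                    Aggregates.merge("mycoll2",
                            new MergeOptions()
                                    .uniqueIdentifier(Arrays.asList("sessiontime2", "appid", "appver",
                                            "devicdid", "devver", "devno"))
                                    .whenMatched(MergeOptions.WhenMatched.MERGE)
                                    .whenNotMatched(MergeOptions.WhenNotMatched.INSERT)));

            // A pipeline ending in $merge returns no result documents; toCollection()
            // runs the aggregation and completes once the output collection is written.
            // Project Reactor is used here only as a convenient way to subscribe.
            Mono.from(mydata.aggregate(pipeline).toCollection()).block();
        }
    }
}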
mihai.andrei commented on Tue, 4 Oct 2022 21:31:00 +0000:
As david.storch@mongodb.com pointed out, this is likely a manifestation of SERVER-66289. Unless the example query is using primary read preference, I am going to go ahead and close this ticket as a duplicate of SERVER-66289. Please feel free to reach out if you have any other questions!

david.storch commented on Wed, 21 Sep 2022 14:56:19 +0000:
I didn't get a chance to review this carefully, but I'm wondering if this is a manifestation of SERVER-66289. Do we know if the failing operation is running on a secondary node or at least has non-primary read preference?

JIRAUSER1265262 commented on Tue, 20 Sep 2022 03:11:25 +0000:
Namhun,
Specifically, I am suspecting an issue in your aggregation pipeline here. When I investigate this in Compass on a standalone MongoDB 5.0.5 instance, I see you are generating a potentially massive array of documents in your $lookup stage. Once you have enough documents at this stage, this array can exceed the maximum document size value. This is an anti-pattern. For each record that ends up in your initial $match, an extra document is made in your $lookup stage, and each document contains a data value that is itself another copy of the document. If enough documents get to this stage, even if they are not individually over 16MB, they are being put into these nested data arrays, causing each of these documents to become larger. Then you will get this error.
Your end stages didn't seem to have anything obviously wrong with them. I'd encourage you to check this part of your pipeline out, and use explain() or MongoDB Compass to troubleshoot your query further. I will need more information from you (such as a code reproduction, and the output of your explain() when running the aggregation) to evaluate other possibilities here. I'm not exactly sure what your use case is, but you may find suggestions for using $lookup with $mergeObjects in our docs here. I also still encourage you to check out our MongoDB Developer Community Forums, as you will likely find further advice for improving your aggregation pipeline there.
However, I do see that in our Aggregation Pipeline Limits docs, we assert that:
> If any single document exceeds the BSON Document Size limit, the aggregation produces an error. The limit only applies to the returned documents. During the pipeline processing, the documents may exceed this size. The db.collection.aggregate() method returns a cursor by default.
And I think that this may be somewhat unclear in this case - I will follow up on that.
Regards,
Christopher

JIRAUSER1270688 commented on Thu, 15 Sep 2022 05:17:54 +0000:
Hi, Chris
These days I am working on another problem, so I couldn't check this. The last thing I did for this issue was to add one more aggregation execution (a retry) when the exception occurs. This helps somewhat, but not completely; I still hit BSONObjectTooLarge even after the retry.
I have found some interesting logs. Our API runs every 10 minutes, and if it fails, it tries again after 10 minutes, with a maximum of 6 tries. So I can see the logs of the 6 failures; these are the error cases from before I added the retry on BSONObjectTooLarge. In the 6 logs, the stack traces are the same and the error messages are the same except for the BSONObj size; see below. (Sorry, but I could not copy and paste the logs directly because of the company's security policy, so I typed them out.)
I'm not sure the logs I posted mean anything, but it feels strange to me that the 'BSONObj size' reported by the error varies for the same queries.
Anyway, I hope this is of some help.

// the first one is the fastest execution
{ "stack" : ..., "message" : "Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 20028216 (0x1319B38) is invalid. Size must be between 0 and 16793600(16MB) First element: ..." },
{ "stack" : ..., "message" : "Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 20003933 (0x1313C5D) is invalid. Size must be between 0 and 16793600(16MB) First element: ..." },
{ "stack" : ..., "message" : "Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 19968660 (0x130B294) is invalid. Size must be between 0 and 16793600(16MB) First element: ..." },
{ "stack" : ..., "message" : "Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 19926786 (0x1300F02) is invalid. Size must be between 0 and 16793600(16MB) First element: ..." },
{ "stack" : ..., "message" : "Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 19909481 (0x12FCB69) is invalid. Size must be between 0 and 16793600(16MB) First element: ..." }

By the way, what does your comment below mean? I think that the intermediate document size does not have any limitation, because the Mongo docs say "The limit only applies to the returned documents." Am I wrong?
> the aggregation pipeline has a chance of incidentally creating a single document larger than the max size with your workload just due to the amount of objects being merged (not that any of them are individually large).
Take care

JIRAUSER1265262 commented on Wed, 7 Sep 2022 19:24:33 +0000:
Hi Namhun,
We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide additional information to reproduce what you're experiencing?

JIRAUSER1265262 commented on Tue, 16 Aug 2022 17:23:45 +0000:
Namhun,
Thanks for the response. From what I can tell, the aggregation pipeline has a chance of incidentally creating a single document larger than the max size with your workload just due to the amount of objects being merged (not that any of them are individually large). However, I'm not sure if your Java workload differs from what you've reported. I would be interested in your assertion that it errors out on the first attempt in your Java server, but succeeds a second time. To better investigate this part, a more detailed reproduction featuring your use case would be helpful. Specifically, can you create a code reproduction we can use to verify your claim using the Java driver?
As it stands, I don't currently see a problem unless you run into the issue I mentioned earlier.
Regards,
Christopher

JIRAUSER1270688 commented on Tue, 16 Aug 2022 00:34:22 +0000:
Thanks for your reply, chris.kelly@mongodb.com
I understand your answer, and I fully understand that it is possible to have a document larger than 16MB when a field value is very large, such as 10MB, as you mentioned. I have also known that a single document over 16MB causes that error.
https://www.mongodb.com/docs/manual/core/aggregation-pipeline-limits/
> Each document in the result set is subject to the 16 megabyte BSON Document Size limit. If any single document exceeds the BSON Document Size limit, the aggregation produces an error.
> The limit only applies to the returned documents.
However, in our database there is no string field over 10MB, and, as I mentioned in the question, the weird thing is that the query that raises BSONObjectTooLarge works fine when it is executed from a mongo client such as the mongo shell or 'NoSQLBooster for MongoDB' with the same dataset. Moreover, it also works when the query is executed twice: the first execution fails, and the second execution, right after the failure, completes successfully.
Anyway, thanks for your reply, and I will investigate more, too.
Take care

JIRAUSER1265262 commented on Mon, 15 Aug 2022 18:26:20 +0000:
Namhun,
The BSONObjectTooLarge error you're getting is likely due to the max document size restriction in MongoDB.
EDIT: I misunderstood initially - in your case, you are hitting this BSONObjectTooLarge error not because your documents start out being large, but potentially because of the $mergeObjects step in your pipeline:

{ "$replaceWith": {
    "$mergeObjects": [ {
        "sessiontime2": "$_id.sessiontime2",
        "appid": "$_id.appid",
        "appver": "$_id.appver",
        "devicdid": "$_id.devicdid",
        "devver": "$_id.devver",
        "devno": "$_id.devno",
        "devicecount": "$devicecount"
    } ]
} }

If the amount of data you end up merging here is over 16MB, you will receive this error. This is because $mergeObjects combines multiple documents into a single document. I was able to reproduce this issue when the documents being merged at this stage were over 16MB when combined (I did this by inflating the size of devicdid and appid to ~10MB, then merging).
Since this is working as designed, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project.
Regards,
Christopher
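Given that the ticket was closed as a likely duplicate of SERVER-66289 (which, per mihai.andrei's note, is only expected when the aggregation does not use primary read preference), and given the reporter's own partial workaround of re-running the aggregation after a failure, the two mitigations discussed in this thread look roughly as follows with mongodb-driver-reactivestreams 4.5.1 and Project Reactor. The class and helper method names are hypothetical sketches; neither approach is a confirmed fix for this ticket.

import com.mongodb.MongoCommandException;
import com.mongodb.ReadPreference;
import com.mongodb.reactivestreams.client.MongoCollection;
import org.bson.Document;
import org.bson.conversions.Bson;
import reactor.core.publisher.Mono;

import java.util.List;

public final class MergeWorkarounds {

    // Run the $merge aggregation against the primary. SERVER-66289 was discussed
    // above in the context of non-primary read preference, so pinning the read
    // preference is one thing to try while on an affected server version.
    static void runOnPrimary(MongoCollection<Document> coll, List<Bson> pipeline) {
        Mono.from(coll.withReadPreference(ReadPreference.primary())
                      .aggregate(pipeline)
                      .toCollection())
            .block();
    }

    // Mirror of the reporter's own workaround: retry once when the command fails
    // with error code 10334 (BSONObjectTooLarge). The reporter found that a retry
    // often, but not always, succeeds, and it does not address the root cause.
    static void runWithSingleRetry(MongoCollection<Document> coll, List<Bson> pipeline) {
        Mono.from(coll.aggregate(pipeline).toCollection())
            .onErrorResume(MongoCommandException.class,
                    e -> e.getErrorCode() == 10334
                            ? Mono.from(coll.aggregate(pipeline).toCollection())
                            : Mono.error(e))
            .block();
    }
}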