...
Hi Team,

We recently migrated from MongoDB 4.0.27 to 4.2.23, and performance has degraded since the migration. The workload, client, storage engine, database parameters, client driver version, and indexing are all unchanged; only the MongoDB version was upgraded.

We run CRUD operations against a replica set. With 4.2, response times climb and clients time out (readTimeout/socketTimeout is set to 1200). Because of the slow responses, maxWaitQueueSize is also exceeded, which results in database requests being dropped. If we downgrade only the MongoDB version on the same setup, performance improves again.

We understand that many features were introduced in MongoDB 4.2. We also tried setting featureCompatibilityVersion to 4.0 on a mongod running 4.2, but didn’t see any improvement. To reiterate, both tests were performed on the same setup, with no change in hardware, resources, database parameters, storage engine, or CRUD operations.

MongoDB server version: 4.2.23
MongoDB Java driver version: 3.12.9
Standalone or replica set: Replica set (PSA). The issue is also seen in a 5-member replica set (4 data-bearing + 1 arbiter).
Storage engine: WiredTiger
dbPath: on RAM disk/tmpfs

Server options:
/usr/bin/mongod --ipv6 --slowms 500 --storageEngine wiredTiger --wiredTigerCacheSizeGB 8 --enableMajorityReadConcern false --bind_ip_all --port 27737 --dbpath=/var/data/sessions.1/3/set08 --replSet set08 --fork --pidfilepath /var/run/sessionmgr-27737.pid --oplogSize 5120 --logpath /var/log/mongodb-27737.log --logappend --quiet

Replica set config:
set08:PRIMARY> rs.conf()
{
    "_id" : "set08",
    "version" : 9,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : false,
    "members" : [
        {
            "_id" : 0,
            "host" : "sessionmgr04:27737",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 2,
            "tags" : { },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "sessionmgr03:27737",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 3,
            "tags" : { },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "arbitervip:27737",
            "arbiterOnly" : true,
            "buildIndexes" : true,
            "hidden" : true,
            "priority" : 0,
            "tags" : { },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 1,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : { },
        "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 },
        "replicaSetId" : ObjectId("636d3c916a35f79c756653c8")
    }
}

Please give us the portal link to upload the mongotop and mongostat output, the logs, and diagnostic.data.
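The link the description draws between slow responses and an exceeded maxWaitQueueSize can be illustrated with Little's law (requests in flight = throughput x mean latency). The following is only a rough sketch under stated assumptions, not a model of the reporter's system: the 4,000 ops/s figure combines the roughly 2k reads/s and 2k writes/s mentioned later in this ticket, the pool size of 100 and wait-queue limit of 500 are assumed Java driver 3.x defaults (100 connections per host times a multiplier of 5) rather than settings confirmed by the reporter, all traffic is treated as if it went through a single client pool, and the latencies are illustrative.

```python
def in_flight(throughput_ops_per_s: float, mean_latency_ms: float) -> float:
    """Little's law: average concurrent requests = arrival rate x mean latency."""
    return throughput_ops_per_s * mean_latency_ms / 1000.0


def queued_waiters(concurrent: float, pool_size: int) -> float:
    """Requests that cannot get a pooled connection and must wait for one."""
    return max(0.0, concurrent - pool_size)


def wait_queue_overflows(throughput_ops_per_s: float, mean_latency_ms: float,
                         pool_size: int, wait_queue_limit: int) -> bool:
    """True when the number of waiters would exceed the wait-queue limit."""
    concurrent = in_flight(throughput_ops_per_s, mean_latency_ms)
    return queued_waiters(concurrent, pool_size) > wait_queue_limit


THROUGHPUT = 4000        # assumption: ~2k reads/s + ~2k writes/s from the ticket
POOL_SIZE = 100          # assumption: driver default, not confirmed by the reporter
WAIT_QUEUE_LIMIT = 500   # assumption: 100 connections x multiplier of 5

for latency_ms in (5, 20, 100, 500):   # illustrative latencies only
    concurrent = in_flight(THROUGHPUT, latency_ms)
    waiters = queued_waiters(concurrent, POOL_SIZE)
    overflow = wait_queue_overflows(THROUGHPUT, latency_ms, POOL_SIZE, WAIT_QUEUE_LIMIT)
    print(f"{latency_ms:>4} ms -> in flight {concurrent:7.1f}, "
          f"waiting {waiters:7.1f}, overflow={overflow}")
```

With these assumed numbers the wait queue would overflow once mean latency exceeds about 150 ms, which is consistent in spirit with requests being dropped only when response times climb, even though throughput itself has not changed.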
JIRAUSER1258850 commented on Wed, 7 Dec 2022 18:13:21 +0000:
Hi Edwin,
Thanks for your initial analysis. As I mentioned in the case notes, simply changing the MongoDB server version makes the issue go away, so how do you explain this as an application/client issue? The rise in connections from 66 to 88 is not a sharp spike. Also, with 4.2.x the load on MongoDB is not reflected in CPU or system load. So it does prove that MongoDB 4.2+ is suffering from performance issues. Even at a lower DB transaction rate (TPS), the response time is very high in MongoDB 4.2. We are looking for your input: please review the parameters/config and recommend tuning that would restore the performance. As I said, we see a similar issue with MongoDB 4.4 as well. Your input is highly appreciated.
Regards,
Venkataraman

JIRAUSER1257066 commented on Tue, 6 Dec 2022 12:19:50 +0000:
Hi veramasu@hcl.com
Thank you for your patience while I investigated this issue. It appears that the workload on 4.2 is noisier than the workload on 4.0, but the throughput remains the same at 2k reads per second and 2k writes per second.
[image: sessionmgr03_27737]
Looking closer into the noisiness, there is evidence that latency increases as you've described. Prior to the increased latency we see sharp increases in connections (B), which are subsequently reaped once operations complete (C). Following the decrease in connection count, throughput appears to return to normal. This behavior appears to be indicative of a connection storm occurring in the application.
For this issue we'd like to encourage you to ask our community for help by posting on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project.
Best,
Edwin

JIRAUSER1258850 commented on Mon, 5 Dec 2022 06:11:22 +0000:
Hi Edwin,
Can you please share your findings with us?
Regards,
Venkat

JIRAUSER1258850 commented on Wed, 30 Nov 2022 19:21:54 +0000:
Hi Edwin,
Thanks for your reply. As I said, the upload has 2 tarballs containing the 4.0 and 4.2 test results, namely Test_4.0_Run.tar.gz and Test_4_2.Run.tar.gz. The timelines are:
4.2 -> from "Nov 19 03:35" to "Nov 19 04:10". You can check mongostat-sessionmgr03-27737-2022-11-19_03-31-36.txt and mongotop-sessionmgr03-27737-2022-11-19_03-31-36.txt in Test_4_2.Run.tar.gz, where you can observe the response time reaching 4 or 5 digits.
4.0 -> from "Nov 21 04:00" to "Nov 21 06:00". Again, you can check mongostat-sessionmgr03-27737-2022-11-21_02-28-18.txt and mongotop-sessionmgr03-27737-2022-11-21_02-28-18.txt in Test_4.0_Run.tar.gz.
We had been using 3.6 for a very long time, and the 4.0 migration went smoothly with no performance degradation. But as soon as we moved to 4.2, we started seeing the degradation. We don't use any of the MongoDB 4.x features (the client application remains the same). We also tried 4.4 and observed similar issues. So right now we are stuck, unable to upgrade MongoDB from 4.0 to any higher version. Your help is greatly appreciated.

JIRAUSER1257066 commented on Wed, 30 Nov 2022 10:29:00 +0000:
Hi veramasu@hcl.com,
Thank you for your patience. Can you please clarify the timestamps for when you ran the tests on both 4.2 and 4.0?
Best,
Edwin

JIRAUSER1258850 commented on Tue, 29 Nov 2022 05:13:53 +0000:
Hi Edwin,
Let me know if you need any further details from us. Can we expect some initial analysis?
Regards,
Venkat

JIRAUSER1258850 commented on Thu, 24 Nov 2022 23:16:00 +0000:
Thanks, Edwin, for your reply. I uploaded a set of logs for both the 4.0 and 4.2 runs. It contains the MongoDB logs and diagnostic.data, as well as the mongotop and mongostat output collected during the runs. You can observe that MongoDB 4.2 has the higher response time.

JIRAUSER1257066 commented on Thu, 24 Nov 2022 16:34:07 +0000:
I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time.
For each node in the replica set, spanning a time period that includes the workload for both 4.0.27 and 4.2.23, would you please archive (tar or zip) and upload to that link:
  the mongod logs
  the $dbpath/diagnostic.data directory (the contents are described here)
Including diagnostic data from before the upgrade will be important for comparing performance between the two versions.
Best,
Edwin
The test runs a CRUD operation in a loop. With MongoDB 4.2 the response time climbs, whereas with 4.0.27 it does not.
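A test of this shape can be sketched as a timed loop that records per-operation latency and summarises it, so that the 4.0 and 4.2 runs can be compared on the same figures. This is a minimal sketch, not the reporter's actual test: `run_timed_loop`, `summarize`, and the `placeholder_crud` stand-in are hypothetical names, and a real test would call the database driver where the placeholder is.

```python
import statistics
import time
from typing import Callable, Dict, List


def run_timed_loop(operation: Callable[[int], None], iterations: int) -> List[float]:
    """Run `operation` repeatedly and return each call's latency in milliseconds."""
    latencies_ms: List[float] = []
    for i in range(iterations):
        start = time.perf_counter()
        operation(i)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Summarise latencies using nearest-rank percentiles."""
    if not latencies_ms:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def percentile(p: int) -> float:
        # Nearest-rank: smallest value with at least p% of samples at or below it.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {
        "count": float(n),
        "mean": statistics.fmean(ordered),
        "p50": percentile(50),
        "p99": percentile(99),
        "max": ordered[-1],
    }


def placeholder_crud(i: int) -> None:
    """Hypothetical stand-in for one create/read/update/delete round trip."""
    store = {}
    store[i] = {"value": i}      # create
    _ = store[i]                 # read
    store[i]["value"] = i + 1    # update
    del store[i]                 # delete


if __name__ == "__main__":
    stats = summarize(run_timed_loop(placeholder_crud, 1000))
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

Comparing p99 and max between runs, rather than only the mean, would make intermittent latency spikes of the kind discussed in this ticket easier to see.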