...
I have the following MongoDB configuration:
1. 2 data nodes in a replica set
2. 1 arbiter for the replica set
3. A Java-based client

Both data nodes are configured as follows:

replication:
  oplogSizeMB: 1024
  replSetName: arb
storage:
  dbPath: /mnt/raid10/mongo
  journal:
    enabled: true
    commitIntervalMs: 500
  directoryPerDB: true
  syncPeriodSecs: 60
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
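For reference, a minimal sketch of how a Java client might connect to such a replica set. The hostnames are placeholders, and the code uses the current synchronous Java driver API, which may differ from the driver version actually used by the reporter:

import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

public class ReplicaSetConnection {
    public static void main(String[] args) {
        // Hypothetical hostnames; only the two data-bearing nodes are listed,
        // since the arbiter holds no data and never serves reads.
        ConnectionString uri = new ConnectionString(
                "mongodb://data1:27017,data2:27017/?replicaSet=arb");

        MongoClientSettings settings = MongoClientSettings.builder()
                .applyConnectionString(uri)
                .build();

        try (MongoClient client = MongoClients.create(settings)) {
            // Listing database names is enough to verify the connection works.
            client.listDatabaseNames().forEach(System.out::println);
        }
    }
}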
thomas.schubert commented on Fri, 15 Sep 2017 04:05:57 +0000:
Hi dorlov, Sorry for the delay getting back to you. I've examined the diagnostic.data and believe that the behavior you're observing is expected. During checkpoints, the system becomes I/O bound and queries are impacted. Due to the nature of replication, this bottleneck may have a larger impact as reads queue behind the oplog applier. From the provided data, I do not see anything to indicate a bug in the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag; a question like this, which involves more discussion, would be best posted on the mongodb-user group. Kind regards, Kelsey

dorlov commented on Wed, 16 Aug 2017 07:34:33 +0000:
Files are uploaded.

thomas.schubert commented on Tue, 15 Aug 2017 18:55:07 +0000:
I've created a secure upload portal for you to use. Files uploaded to this portal are only visible to MongoDB employees investigating this issue and are routinely deleted after some time.

thomas.schubert commented on Tue, 15 Aug 2017 16:00:25 +0000:
Hi dorlov, Thanks for reporting this behavior. So we can continue to investigate, would you please provide an archive of the diagnostic.data in the $dbpath? Please be sure to include diagnostic.data for both primary and secondary nodes so we can compare. Thank you, Thomas
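To correlate the latency spikes with checkpoints from the client side, one option (a sketch only, not part of the original report) is to poll serverStatus against a secondary and watch the wiredTiger.transaction section, whose checkpoint-related counters change while a checkpoint is running; the exact counter names vary between server versions:

import com.mongodb.ConnectionString;
import com.mongodb.ReadPreference;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import org.bson.Document;

public class CheckpointWatcher {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical hostnames; the replicaSet name matches the reported config.
        try (MongoClient client = MongoClients.create(new ConnectionString(
                "mongodb://data1:27017,data2:27017/?replicaSet=arb"))) {
            while (true) {
                // Route the command to a secondary so the stats come from the
                // node that shows the slow reads.
                Document status = client.getDatabase("admin").runCommand(
                        new Document("serverStatus", 1), ReadPreference.secondary());
                // The wiredTiger.transaction section holds the checkpoint
                // counters; print it and correlate changes with read latency.
                Document wt = status.get("wiredTiger", Document.class);
                if (wt != null) {
                    System.out.println(wt.get("transaction"));
                }
                Thread.sleep(1000);
            }
        }
    }
}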
Works fine steps:
1. Set up the Java client with read preference "primaryPreferred".
2. Check read performance (approx. 2200 rps with avg. time 0.3 ms).
3. During the fsync operation (every minute), response time rises to 0.5 ms.
Swapping secondary and primary via rs.stepDown gives the same time pattern.

Works strange steps:
1. Set up the Java client with read preference "secondaryPreferred".
2. Check read performance (approx. 2200 rps with avg. time 0.3 ms).
3. During the fsync operation (every minute), response time rises to 10 ms (20 times slower).
Swapping secondary and primary via rs.stepDown gives the same time pattern.

It looks like the secondary node behaves incorrectly during fsync events and blocks most read operations. A sketch of the kind of measurement client used is shown below.
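A minimal sketch of the client-side measurement described above, assuming the synchronous Java driver; the hostnames, database, and collection names are placeholders, and the latency threshold is arbitrary:

import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.ReadPreference;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class ReadLatencyProbe {
    public static void main(String[] args) {
        MongoClientSettings settings = MongoClientSettings.builder()
                .applyConnectionString(new ConnectionString(
                        "mongodb://data1:27017,data2:27017/?replicaSet=arb"))
                // Switch to ReadPreference.primaryPreferred() to reproduce the
                // "works fine" case described above.
                .readPreference(ReadPreference.secondaryPreferred())
                .build();

        try (MongoClient client = MongoClients.create(settings)) {
            MongoCollection<Document> coll =
                    client.getDatabase("test").getCollection("docs");

            // Issue simple point reads and log any that take unusually long;
            // in the reported scenario the slow reads cluster around the
            // once-per-minute checkpoints.
            for (int i = 0; i < 100_000; i++) {
                long start = System.nanoTime();
                coll.find(new Document("_id", i % 1000)).first();
                long elapsedMicros = (System.nanoTime() - start) / 1_000;
                if (elapsedMicros > 1_000) { // slower than 1 ms
                    System.out.println("slow read: " + elapsedMicros + " µs");
                }
            }
        }
    }
}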