...
While running some benchmarks, I found a strange performance degradation in MongoDB under a read-only YCSB workload (WorkloadC). You can see it on the right side of the graph, where the number of clients gets larger. I investigated with the perf tool and found that the only two metrics that grow significantly are the number of CPU migrations and the number of `sched_yield` syscalls. The next graph shows the dynamics of the CPU migration events; the number of `sched_yield` calls also tripled. Judging from the source code, I assume this is somehow related to spin locks, since the only place where I found `sched_yield` is `spin_lock.cpp`.
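To make the suspected mechanism concrete, here is a minimal sketch of a spin lock that yields the CPU whenever it fails to take the lock. It is not the code from `spin_lock.cpp`; the names `YieldingSpinLock` and `run_contended` are hypothetical, and a non-blocking `threading.Lock` stands in for the atomic test-and-set flag. On Linux, `os.sched_yield()` issues the same `sched_yield` syscall that shows up in the perf output.

```python
import os
import threading


class YieldingSpinLock:
    """Illustrative only: spin on a flag and yield the CPU after each failed attempt."""

    def __init__(self):
        # A non-blocking acquire on a Lock acts as an atomic test-and-set (assumption
        # made for this sketch; a C++ implementation would use an atomic flag).
        self._flag = threading.Lock()
        self.yields = 0  # total sched_yield calls, updated only while holding the lock

    def acquire(self):
        spins = 0
        while not self._flag.acquire(blocking=False):
            spins += 1
            os.sched_yield()  # give up the CPU instead of busy-waiting
        self.yields += spins  # safe: we hold the lock at this point

    def release(self):
        self._flag.release()


def run_contended(num_threads, iterations):
    """Have num_threads threads increment a shared counter under the spin lock."""
    lock = YieldingSpinLock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(iterations):
            lock.acquire()
            try:
                total += 1
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total, lock.yields


if __name__ == "__main__":
    for n in (1, 4, 16):
        total, yields = run_contended(n, 2000)
        print(f"threads={n} total={total} sched_yield calls={yields}")
```

Each failed acquisition costs one syscall and hands the core to another runnable thread, so with many more threads than cores a lock like this could plausibly drive up both `sched_yield` counts and scheduler activity. Whether that is what happens inside MongoDB is exactly the open question of this report.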
thomas.schubert commented on Thu, 25 Oct 2018 19:00:45 +0000: Hi erthalion, I'm going to close this ticket for the time being, since a number of fixes have landed that we expect to improve performance on this branch. Please feel free to comment with an update after running new tests and we'll reopen the ticket. Thank you, Kelsey

erthalion commented on Tue, 27 Mar 2018 06:52:32 +0000: Hi Kelsey, Thank you for the information. Yes, back then I was testing MongoDB 3.4.4. Soon I'm going to do another round of benchmarks with new versions of all the databases, and we can compare the performance.

thomas.schubert commented on Thu, 18 Jan 2018 22:27:36 +0000: Hi erthalion, My understanding is that these tests were executed with MongoDB 3.4.4, is that correct? I'm curious whether you see the same behavior on a more recent version of MongoDB 3.4, which would include WT-3345. As you can see in WT-3345, we made significant improvements to the rwlocks in WiredTiger, which may affect the performance you're observing in these benchmarks. Thank you, Kelsey

erthalion commented on Sat, 30 Dec 2017 08:37:26 +0000: Hi Henrik, Thanks for your response. Yes, I'm aware of that. But I ran the same kind of test for PostgreSQL and MySQL, and the performance degradation was not that significant there - that's why I thought it was strange and maybe worth mentioning.

henrik.edin commented on Tue, 12 Dec 2017 19:35:36 +0000: Hi erthalion, MongoDB uses a thread-per-connection model. This means that when the number of connections increases, the number of context switches between threads also increases. As context switches aren’t free, this is unfortunately a behavior that can be expected. There are several reasons why this can be expensive: cache misses, page faults, contention on spin locks, etc. In the public source depot, you can see experimentation on a thread pool model where different connections can execute on the same thread. It is too early to promise any results, but you might find it worthwhile to follow that project. Henrik

ian@10gen.com commented on Fri, 1 Dec 2017 15:26:03 +0000: Thanks for filing this, Dmitry! We believe that some upcoming work by the Platforms team on rate-limiting AsyncIO will improve the behavior you're seeing towards the right side of the graph.

henrik.ingo@10gen.com commented on Tue, 14 Nov 2017 15:56:53 +0000: Just adding a note that I met Dmitry at Highload++ and asked him to file this. (Thanks Dmitry!) I don't know more about this than is described here, but his observation that a spinlock would be involved caught my interest. Also note that our current performance testing is on much more powerful instances than this, so we wouldn't experience these conditions (at least not with the same YCSB setup).

thomas.schubert commented on Mon, 13 Nov 2017 21:34:20 +0000: Hi erthalion, Thank you for the report; I've assigned this issue to the Storage Team for evaluation. Kind regards, Kelsey

Run YCSB/WorkloadC on MongoDB 3.4/3.2 (m4.xlarge).
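
A rough sketch of how this reproduction step could be scripted follows. It only builds command lines and does not run them. The client counts, the MongoDB URL, the YCSB install path, and the `mongod` PID are placeholders chosen for illustration, not values from the report; the helper names are hypothetical. The sketch also assumes the data set has already been loaded with a matching `ycsb load mongodb` invocation.

```python
import shlex


def ycsb_run_command(threads, mongo_url="mongodb://localhost:27017/ycsb", ycsb_home="."):
    """Build a YCSB WorkloadC run against MongoDB with the given number of client threads."""
    return [
        f"{ycsb_home}/bin/ycsb", "run", "mongodb", "-s",
        "-P", f"{ycsb_home}/workloads/workloadc",
        "-threads", str(threads),
        "-p", f"mongodb.url={mongo_url}",
    ]


def perf_stat_command(mongod_pid, seconds):
    """Build a perf stat call that counts CPU migrations and sched_yield syscalls for mongod."""
    events = "cpu-migrations,context-switches,syscalls:sys_enter_sched_yield"
    return ["perf", "stat", "-e", events, "-p", str(mongod_pid), "--", "sleep", str(seconds)]


if __name__ == "__main__":
    # Hypothetical sweep of client counts; the report does not list the values used.
    for threads in (8, 32, 128):
        print(shlex.join(ycsb_run_command(threads)))
    # 12345 is a placeholder PID for the running mongod process.
    print(shlex.join(perf_stat_command(12345, 30)))
```

Running the perf command alongside each YCSB run would give per-client-count numbers for the two metrics named in the description. Note that the syscall tracepoint usually requires elevated privileges.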