...
I have a 32 GB WiredTiger server running 3.0.8. The "bytes currently in the cache" value always hovers around ~1 GB even though I have configured the cache size to be ~26 GB. The total data size is around 40 GB. Why is the rest of the cache not being used?

> db.serverStatus().wiredTiger.cache
{
	"tracked dirty bytes in the cache" : 0,
	"tracked bytes belonging to internal pages in the cache" : 27634342,
	"bytes currently in the cache" : 950556531,
	"tracked bytes belonging to leaf pages in the cache" : 27889653082,
	"maximum bytes configured" : 27917287424,
	"tracked bytes belonging to overflow pages in the cache" : 0,
	"bytes read into cache" : 17875008775,
	"bytes written from cache" : 120063944822,
	"pages evicted by application threads" : 0,
	"checkpoint blocked page eviction" : 4,
	"unmodified pages evicted" : 0,
	"page split during eviction deepened the tree" : 1,
	"modified pages evicted" : 90924,
	"pages selected for eviction unable to be evicted" : 20,
	"pages evicted because they exceeded the in-memory maximum" : 1289,
	"pages evicted because they had chains of deleted items" : 14418,
	"failed eviction of pages that exceeded the in-memory maximum" : 15,
	"hazard pointer blocked page eviction" : 6,
	"internal pages evicted" : 0,
	"maximum page size at eviction" : 10499573,
	"eviction server candidate queue empty when topping up" : 1241,
	"eviction server candidate queue not empty when topping up" : 3,
	"eviction server evicting pages" : 0,
	"eviction server populating queue, but not evicting pages" : 1243,
	"eviction server unable to reach eviction goal" : 0,
	"pages split during eviction" : 1374,
	"pages walked for eviction" : 55451388,
	"eviction worker thread evicting pages" : 2086,
	"in-memory page splits" : 1238,
	"percentage overhead" : 8,
	"tracked dirty pages in the cache" : 0,
	"pages currently held in the cache" : 16807,
	"pages read into cache" : 2583552,
	"pages written from cache" : 7389684
}

The only thing unusual about this server is that it has a large number of databases, around 4,000, each with around 3 collections.
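For context on the numbers above, here is a minimal mongo shell sketch (not part of the original report) that converts the two relevant counters, "maximum bytes configured" and "bytes currently in the cache", into gigabytes for a quick side-by-side comparison; on 3.0.x the cache ceiling itself is set at startup, for example with the --wiredTigerCacheSizeGB mongod option.

	// Compare the configured WiredTiger cache ceiling with current usage.
	// Field names match the 3.0.x serverStatus output shown above.
	var cache = db.serverStatus().wiredTiger.cache;
	var toGB = function (bytes) { return (bytes / (1024 * 1024 * 1024)).toFixed(2); };
	print("maximum bytes configured : " + toGB(cache["maximum bytes configured"]) + " GB");
	print("bytes currently in cache : " + toGB(cache["bytes currently in the cache"]) + " GB");

With the figures reported above, this prints roughly 26.00 GB configured against 0.89 GB in use, which is the discrepancy being asked about.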
thomas.schubert commented on Wed, 24 May 2017 08:34:01 +0000:
Hi dharshanr@scalegrid.net,
We haven't heard back from you for some time, so I'm going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.
Regards,
Thomas

mark.agarunov commented on Fri, 7 Apr 2017 19:12:03 +0000:
Hello dharshanr@scalegrid.net,
Thank you for providing the data. Looking over this, the configured cache size is reported as 18 GB in the diagnostic data, whereas above you have it configured for 26 GB. Is it possible the diagnostic data is from a different node? Additionally, from the diagnostic data it seems the cache is being used as expected - there simply haven't been enough requests to read much data into the cache.
Thanks,
Mark

dharshanr@scalegrid.net commented on Thu, 6 Apr 2017 16:56:06 +0000:
Mark - were you able to glean anything from the logs?

dharshanr@scalegrid.net commented on Sat, 25 Mar 2017 18:25:44 +0000:
Done. I have uploaded the zip file of the logs.

mark.agarunov commented on Thu, 23 Mar 2017 17:31:16 +0000:
Hello dharshanr@scalegrid.net,
I've generated a secure upload portal so that you can send us the files.
Thanks,
Mark

dharshanr@scalegrid.net commented on Thu, 23 Mar 2017 16:15:34 +0000:
Is there a location where I can upload the traces? I think they are too big to attach to the bug.

dharshanr@scalegrid.net commented on Wed, 22 Mar 2017 22:22:25 +0000:
The issue is persistent. The cache size never goes beyond 1 GB. How long a duration do you want the traces for?

mark.agarunov commented on Wed, 22 Mar 2017 20:43:19 +0000:
Hello dharshanr@scalegrid.net,
Thank you for the report. To better investigate this behavior, please run the following commands and provide the ss.log and iostat.log files that are created:

	delay=1
	mongo --eval "while(true) {print(JSON.stringify(db.serverStatus({tcmalloc:true}))); sleep(1000*${delay:?})}" >ss.log &
	iostat -k -t -x ${delay:?} >iostat.log &

Please leave this running until the issue happens again so that there is a complete log.
Thanks,
Mark
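As a lighter-weight variant of the serverStatus loop Mark suggests above, the following mongo shell sketch (illustrative only; the one-minute interval and 60-sample count are arbitrary choices) logs just the two cache counters over time, which makes it easy to see whether "bytes currently in the cache" ever approaches "maximum bytes configured":

	// Sample the two cache counters once a minute for an hour (mongo shell).
	for (var i = 0; i < 60; i++) {
	    var c = db.serverStatus().wiredTiger.cache;
	    print(new Date().toISOString() + " configured=" + c["maximum bytes configured"] +
	          " inCache=" + c["bytes currently in the cache"]);
	    sleep(60 * 1000);
	}

Redirect the shell's output to a file if the samples need to be attached to the ticket alongside ss.log and iostat.log.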