...
Environment: standalone mongod 3.1.6, WiredTiger configured with zlib compression and a 3GB cache size.

Observation/Issues:
When running with 100+ threads, the heap footprint reached 8GB (~2.6x the configured cache size).
When running with 9 threads, the heap footprint reached 5GB (~1.6x the configured cache size).

Breakdown of the 5GB highlights the following:
3 GB in the WiredTiger cache (as expected)
1.7 GB in current_total_thread_cache_bytes (higher than expected)
0.45 GB in total central_cache_free_bytes (higher than expected)

Problem: our memory cache policy is set for 1GB of free memory, yet the process accumulates free memory well above this threshold.
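The counters quoted above (current_total_thread_cache_bytes and central_cache_free_bytes) are reported under the tcmalloc section of serverStatus, so they can be sampled while the workload runs. A minimal sketch, assuming the mongod instance started with the command at the end of this ticket (port 27200); the exact field names in that section vary by server version:

rem Sample the tcmalloc counters from the running instance under test.
mongo --port 27200 --eval "printjson(db.serverStatus().tcmalloc)"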
mark.benvenuto commented on Thu, 22 Oct 2015 21:54:28 +0000: alexander.gorrod Are you seeing thread_cache_free_bytes similar to what you see in SERVER-20306? Do you think this is operating system specific? At first glance, I do not believe it is an OS-specific issue. The Windows-specific version makes the TCMalloc heap private to WiredTiger, so all allocations in that TCMalloc heap come only from WT. I can dig into the TCMalloc statistics (MallocExtension::instance()->GetStats()) to see if there is something about our memory allocation patterns that is causing issues. I wonder if the thread free lists grow too large, or something else is going wrong. I am planning on modifying the server to call GetStats. If you attach GDB, this script may help get detailed information: https://gist.github.com/alk/1148755.

alexander.gorrod commented on Thu, 22 Oct 2015 05:05:34 +0000: I ran the same test with tcmalloc 2.4, and the behavior is not improved. My current recommendation is to leave this alone in the MongoDB code. It is possible to replicate the aggressive decommit behavior by setting an environment variable: TCMALLOC_AGGRESSIVE_DECOMMIT=true. Upgrading to tcmalloc 2.4 in MongoDB shows a degradation in behavior for this workload. mark.benvenuto, could you review this ticket and let me know if you can think of anything else we could do to alleviate the issue?

alexander.gorrod commented on Wed, 21 Oct 2015 02:36:48 +0000: A further note: there has been a change in the upstream tcmalloc implementation to add a maximum memory budget for tcmalloc: https://github.com/gperftools/gperftools/issues/451. The configuration option is TCMALLOC_HEAP_LIMIT_MB. That code hasn't made it into a release of tcmalloc yet, but it should help with this problem in the future.

alexander.gorrod commented on Wed, 21 Oct 2015 02:17:31 +0000: I reproduced this on Windows 2012 R2. Running with the latest 3.2 release candidate, I see the same behavior as reported here: the run went for between 6 and 8 hours, and the total thread cache bytes grew to about 3.5GB. I then adjusted the tcmalloc configuration to enable aggressive decommit, and it generated the following results: the run went for between 12 and 15 hours, and the total thread cache bytes only grew to 1.4GB, which is still above the configured 1GB maximum but much less worrying than 3.5x the configured maximum. I intend to re-run with a newer release of tcmalloc (2.4, up from 2.2); the more recent release enables aggressive decommit by default.

nick@innsenroute.com commented on Sat, 17 Oct 2015 16:18:04 +0000: As a further note, from a practical perspective this isn't much of an issue for me, although from a technical perspective I'm sure you want to track it down. A WT caching strategy that addresses the issues with the Windows file system cache has a far bigger impact in my world than this (SERVER-19795, SERVER-20991).

nick@innsenroute.com commented on Sat, 17 Oct 2015 16:13:14 +0000: A quick note that in some recent tests with 3.2 RC0, I only see overage with lower WT caps. For example, setting the cache size to

nick@innsenroute.com commented on Mon, 12 Oct 2015 16:27:08 +0000: @Alexander Gorrod - I can provide you with the repro for this, although SERVER-20306 looks like it is probably the same issue and the repro there looks more straightforward. Email me if you want the repro.
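Since the reproduction above was on Windows 2012 R2, a minimal sketch of the aggressive-decommit workaround alexander.gorrod describes is to set the environment variable before launching mongod with the same options listed at the bottom of this ticket. This assumes mongod is linked against the bundled tcmalloc (the default build) and picks the variable up at startup:

rem Enable tcmalloc aggressive decommit for this session, then start the server under test.
set TCMALLOC_AGGRESSIVE_DECOMMIT=true
mongod --dbpath=d:\mongo --port=27200 --wiredTigerCacheSizeGB=3 --wiredTigerJournalCompressor=zlib --wiredTigerCollectionBlockCompressor=zlib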
alexander.gorrod commented on Mon, 12 Oct 2015 04:59:48 +0000: I haven't been able to reproduce this behavior by reconstructing the described workload. There is another ticket for a similar memory consumption issue, SERVER-20306. I think we should close this ticket as cannot reproduce. ramon.fernandez, do you agree?

eitan.klein commented on Wed, 26 Aug 2015 17:36:46 +0000: More details from Nick J. Configuration: 3.1.7, standalone, WiredTiger configured for 3GB of cache. ~100 threads produced a footprint of 8GB of heap allocation, but with ~5GB of free memory held in TCMalloc (per-thread cache + central cache).

michael.cahill commented on Tue, 25 Aug 2015 02:33:29 +0000: It looks like this really concerns caching within tcmalloc: from the stats, the WiredTiger cache stays flat at ~2.4GB and tcmalloc reports ~2.5GB of memory allocated, so there is no significant amount of additional memory allocated by MongoDB or WiredTiger. MongoDB sets tcmalloc.max_total_thread_cache_bytes to 1GB by default, and it seems to be going over that in this case. acm, is there someone on your team who can investigate?
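For experimenting with the 1GB thread cache cap michael.cahill mentions, gperftools also exposes the cap as the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable. This is only a sketch: whether it takes effect here is an assumption, since mongod sets tcmalloc.max_total_thread_cache_bytes itself at startup and the server's own setting may take precedence over the variable.

rem Try a lower thread cache cap (256MB) before starting the server.
rem Assumption: mongod's own startup setting of this property does not override the variable.
set TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456
mongod --dbpath=d:\mongo --port=27200 --wiredTigerCacheSizeGB=3 --wiredTigerJournalCompressor=zlib --wiredTigerCollectionBlockCompressor=zlib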
mongod --dbpath=d:\mongo --port=27200 --wiredTigerCacheSizeGB=3 --wiredTigerJournalCompressor=zlib --wiredTigerCollectionBlockCompressor=zlib

User workload (Nick J)