...
Using mongo::Mutex instead of raw std::mutex appears to inhibit TSAN's ability to detect lock-order inversions. We should either fix the issue or run TSAN only on --use-diagnostic-latches=off variants. Reproducer attached in 'steps to reproduce'.
xgen-internal-githook commented on Tue, 9 Jul 2024 01:13:27 +0000: Author: {'name': 'George Wangensteen', 'email': 'george.wangensteen@mongodb.com', 'username': 'gewa24'} Message: SERVER-88159 Run TSAN with diagnostic latches off (#24446) GitOrigin-RevId: 281f3b93b6d7cacf4b158b49f5b9adcc3db75a4d Branch: v8.0 https://github.com/mongodb/mongo/commit/7c279a50d614af568eacc05e8e004f2696fde288
xgen-internal-githook commented on Tue, 2 Jul 2024 08:42:50 +0000: Author: {'name': 'auto-revert-app[bot]', 'email': '166078896+auto-revert-app[bot]@users.noreply.github.com', 'username': 'auto-revert-app[bot]'} Message: Revert "SERVER-88159 Run TSAN with diagnostic latches off (#24177)" (#24211) GitOrigin-RevId: d2fa6d8a3eebef5c8c40f1430e7b00b13d817308 Branch: v8.0 https://github.com/mongodb/mongo/commit/4b0329cb2a704c0bd2eff3e2bf6bcbe3e62e1459
xgen-internal-githook commented on Mon, 1 Jul 2024 18:21:48 +0000: Author: {'name': 'George Wangensteen', 'email': 'george.wangensteen@mongodb.com', 'username': 'gewa24'} Message: SERVER-88159 Run TSAN with diagnostic latches off (#24177) GitOrigin-RevId: 07bdebf640a21b1215ecbd24d23ad076c160fc8b Branch: v8.0 https://github.com/mongodb/mongo/commit/65d8bc8ee07935e5312ba6e9efd95b7ce1c33e58
george.wangensteen commented on Thu, 27 Jun 2024 13:51:23 +0000: In this ticket, we changed the configuration of our TSAN builders to use standard library mutexes (diagnostic latches disabled), so that the environment matches our prod configuration and coverage improves, and we filed tickets for all the new bugs found. SERVER-91934 was filed to track root-causing the negative interaction between TSAN and diagnostic latches.
xgen-internal-githook commented on Wed, 26 Jun 2024 21:19:12 +0000: Author: {'name': 'George Wangensteen', 'email': 'george.wangensteen@mongodb.com', 'username': 'gewa24'} Message: SERVER-88159 Run TSAN with diagnostic latches off (#23999) GitOrigin-RevId: a111441f5746afdb31bbe6ab88aae38e61c7edf0 Branch: master https://github.com/mongodb/mongo/commit/d99e481430058da3fd1f6dd83cd37c8dab5fc873
Add a unittest like this that contains a simple lock-order inversion:

    TEST(GeorgeTest, LockOrderInversion) {
    #ifndef MONGO_CONFIG_USE_RAW_LATCHES
        std::cout << "YYYY Using mongo::Mutex" << std::endl;
    #else
        std::cout << "YYYY Using raw std::mutex" << std::endl;
    #endif
        LOGV2(24148, "XXXX IN George Test");
        auto m1 = MONGO_MAKE_LATCH("m1");
        auto m2 = MONGO_MAKE_LATCH("m2");
        // t1 acquires m1, then m2.
        stdx::thread t1([&] {
            stdx::lock_guard lg(m1);
            LOGV2(24148, "XXXX Got m1 t1");
            stdx::lock_guard lg2(m2);
            LOGV2(24148, "XXXX Got m2 t1");
        });
        t1.join();
        // t2 acquires the same latches in the opposite order: m2, then m1.
        stdx::thread t2([&] {
            stdx::lock_guard lg(m2);
            LOGV2(24148, "XXXX Got m2 t2");
            stdx::lock_guard lg2(m1);
            LOGV2(24148, "XXXX Got m1 t2");
        });
        t2.join();
    }

Then, create two TSAN compile configurations, one with and one without diagnostic latches:

    ./buildscripts/scons.py --dbg=on --opt=on --use-libunwind=off --link-model=dynamic --variables-files=./etc/scons/mongodbtoolchain_stable_clang.vars --ninja ICECC=icecc CCACHE=ccache --sanitize=thread --allocator=system --use-diagnostic-latches=on NINJA_PREFIX=tsan-latches

and

    ./buildscripts/scons.py --dbg=on --opt=on --use-libunwind=off --link-model=dynamic --variables-files=./etc/scons/mongodbtoolchain_stable_clang.vars --ninja ICECC=icecc CCACHE=ccache --sanitize=thread --allocator=system NINJA_PREFIX=tsan

Then, run the test under each configuration. On my VWS, with the no-diagnostic-latches variant (raw std::mutex), TSAN deterministically identifies the lock-order inversion and aborts the program. With the diagnostic-latches variant (mongo::Mutex), running the test multiple times produces only success outputs and TSAN does not report any issues.
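For comparison outside the MongoDB tree, the same pattern can be sketched with plain standard-library types. This is a minimal illustration, not code from the ticket: the function name runInvertedLockOrders is hypothetical, and it uses std::mutex and std::thread directly instead of MONGO_MAKE_LATCH and stdx::thread. Because the first thread is joined before the second starts, the program never actually deadlocks; only the acquisition order is inconsistent.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper, not part of the MongoDB tree: acquires two plain
// std::mutex objects in opposite orders on two threads that never overlap.
// Returns the number of critical sections that ran to completion.
int runInvertedLockOrders() {
    std::mutex m1;
    std::mutex m2;
    int completed = 0;

    // First thread: m1, then m2.
    std::thread t1([&] {
        std::lock_guard<std::mutex> lg1(m1);
        std::lock_guard<std::mutex> lg2(m2);
        ++completed;
    });
    t1.join();  // t1 is finished before t2 starts, so no real deadlock occurs

    // Second thread: m2, then m1 (the inverted order).
    std::thread t2([&] {
        std::lock_guard<std::mutex> lg1(m2);
        std::lock_guard<std::mutex> lg2(m1);
        ++completed;
    });
    t2.join();

    return completed;
}
```

Calling runInvertedLockOrders() from main and building with something like clang++ -fsanitize=thread -g should, under the assumption that TSAN's deadlock detector is enabled (it normally is by default), produce a "lock-order-inversion (potential deadlock)" warning, which gives a baseline to compare against the mongo::Mutex behavior described above.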