...
For certain queries where there is no suitable index available, WiredTiger takes far too long compared with MMAPv1. In the example below the difference is 6 min vs 0.2 sec.

Example:

    db.forms.find({'dimension': 2, 'degree': 2}).count()

There is no entry with the key "degree".

WiredTiger config:

    engine: 'wiredTiger'
    wiredTiger:
      collectionConfig:
        blockCompressor: none

We have also tried snappy and zlib. The filesystem is ext4, and we have also tried xfs without a significant difference (when compared to MMAPv1).

log entry:

    2016-06-28T05:09:22.454+0000 I COMMAND [conn4] command hmfs.forms command: count { count: "forms", query: { dimension: 2, degree: 2 } } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } fromMultiPlanner:1 keyUpdates:0 writeConflicts:0 numYields:13831 reslen:62 locks:{ Global: { acquireCount: { r: 27664 } }, Database: { acquireCount: { r: 13832 } }, Collection: { acquireCount: { r: 13832 } } } protocol:op_query 369464ms

MMAPv1 config:

    engine: 'mmapv1'

log entry:

    2016-06-28T05:10:12.412+0000 I COMMAND [conn15] command hmfs.forms command: count { count: "forms", query: { dimension: 2, degree: 2 } } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } keyUpdates:0 writeConflicts:0 numYields:447 reslen:62 locks:{ Global: { acquireCount: { r: 896 } }, MMAPV1Journal: { acquireCount: { r: 448 } }, Database: { acquireCount: { r: 448 } }, Collection: { acquireCount: { R: 448 } } } protocol:op_query 181ms

These two examples were run on the same machine, on different empty dbs, where the only difference is the storage engine. In both cases the collection was restored from the same dump file. Perhaps we are doing something wrong; if so, please enlighten us, as we are planning to switch all our servers to MMAPv1.
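The planSummary above shows the count running against the existing { dimension: -1, field_label: 1, level_norm: -1 } index, so each candidate document still has to be examined for "degree". As a minimal mongo shell sketch, one could confirm the chosen plan and, if appropriate, try a compound index on both filter fields; the index key below is only an illustration and is not something proposed in the original report:

    // Inspect the plan and execution stats the server chooses for this count
    db.forms.explain("executionStats").count({ dimension: 2, degree: 2 })

    // Hypothetical compound index covering both filter fields, which should let the
    // count be satisfied from index keys alone; weigh the build cost on a 70GB
    // collection before creating it.
    db.forms.createIndex({ dimension: -1, degree: 1 })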
thomas.schubert commented on Mon, 3 Oct 2016 21:44:18 +0000:

Hi edgarcosta,

We have not been able to reproduce this issue. I see in your logs that once the WiredTiger cache is hot, the query takes ~600ms to complete. As you know, MMAPv1 relies on the filesystem cache, whereas WiredTiger uses both its own cache and the filesystem cache. This difference may explain the results you are observing.

Kind regards,
Thomas

edgarcosta commented on Fri, 1 Jul 2016 06:29:37 +0000:

Thank you. Also, I ran these experiments on an "n1-highmem-4" machine from Google Compute Engine, i.e., 4 CPUs with 26 GB of memory. Further, MongoDB was installed directly from the MongoDB repository "http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2":

    mongodb-org/trusty,now 3.2.7 amd64 [installed]
    mongodb-org-mongos/trusty,now 3.2.7 amd64 [installed,automatic]
    mongodb-org-server/trusty,now 3.2.7 amd64 [installed,automatic]
    mongodb-org-shell/trusty,now 3.2.7 amd64 [installed,automatic]
    mongodb-org-tools/trusty,now 3.2.7 amd64 [installed,automatic]

I have attached ticket.tar, where:

    $ tar -tf ticket.tar
    ticket/
    ticket/diagnostic.data.tar.bz2
    ticket/mongod-mmap.conf
    ticket/log-mmap
    ticket/log-wt
    ticket/mongod-wt.conf

    $ tar -tf diagnostic.data.tar.bz2
    mongodb-mmap/diagnostic.data/
    mongodb-mmap/diagnostic.data/metrics.2016-06-28T03-03-13Z-00000
    mongodb-mmap/diagnostic.data/metrics.interim
    mongodb-wt/diagnostic.data/
    mongodb-wt/diagnostic.data/metrics.interim
    mongodb-wt/diagnostic.data/metrics.2016-06-28T05-03-02Z-00000
    mongodb-wt/diagnostic.data/metrics.2016-06-28T03-03-04Z-00000

thomas.schubert commented on Wed, 29 Jun 2016 05:49:02 +0000:

Hi edgarcosta,

Thank you for opening this ticket. I downloaded the dataset and ran the same query twice on WiredTiger with the default configuration (the first run with the cache empty, the second with the cache hot). When the cache was empty, the query took 14 seconds; with the cache hot, it took 0.2 seconds. The performance is on the same order of magnitude for MMAPv1. Please see the logs of the queries below.
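For context on the cache comparison above, a minimal mongo shell sketch to check how the WiredTiger cache is being used on the WiredTiger node; the printed labels are mine, while the statistic names come from the standard serverStatus() output:

    // Compare WiredTiger's configured cache size with its current occupancy.
    // A cold cache on the first run of a large scan would explain a slow first query.
    var cache = db.serverStatus().wiredTiger.cache;
    print("configured cache (GB): " + (cache["maximum bytes configured"] / 1e9).toFixed(1));
    print("bytes in cache (GB):   " + (cache["bytes currently in the cache"] / 1e9).toFixed(1));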
WiredTiger:

    2016-06-29T01:38:57.747-0400 I COMMAND [conn3] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } fromMultiPlanner:1 keyUpdates:0 writeConflicts:0 numYields:1700 reslen:47 locks:{ Global: { acquireCount: { r: 3402 } }, Database: { acquireCount: { r: 1701 } }, Collection: { acquireCount: { r: 1701 } } } protocol:op_command 14476ms

    2016-06-29T01:39:01.427-0400 I COMMAND [conn3] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } keyUpdates:0 writeConflicts:0 numYields:447 reslen:47 locks:{ Global: { acquireCount: { r: 896 } }, Database: { acquireCount: { r: 448 } }, Collection: { acquireCount: { r: 448 } } } protocol:op_command 216ms

MMAPv1:

    2016-06-30T15:35:24.301-0400 I COMMAND [conn8] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } fromMultiPlanner:1 keyUpdates:0 writeConflicts:0 numYields:3015 reslen:47 locks:{ Global: { acquireCount: { r: 6032 } }, MMAPV1Journal: { acquireCount: { r: 3016 } }, Database: { acquireCount: { r: 3016 } }, Collection: { acquireCount: { R: 3016 } } } protocol:op_command 18768ms

    2016-06-30T15:35:27.790-0400 I COMMAND [conn8] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } keyUpdates:0 writeConflicts:0 numYields:447 reslen:47 locks:{ Global: { acquireCount: { r: 896 } }, MMAPV1Journal: { acquireCount: { r: 448 } }, Database: { acquireCount: { r: 448 } }, Collection: { acquireCount: { R: 448 } } } protocol:op_command 161ms

To continue to investigate this issue, would you please archive (tar or zip) the $dbpath/diagnostic.data directory and attach it to this ticket for both the WiredTiger and MMAPv1 nodes?

Thank you,
Thomas
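A rough way to reproduce the cold-vs-hot comparison above from the mongo shell, rather than reading the slow-query log, is sketched below; the loop and labels are illustrative only:

    // Run the count twice and print the server-reported execution time for each run;
    // the first run populates the cache, the second should be much faster.
    for (var i = 1; i <= 2; i++) {
        var res = db.forms.explain("executionStats").count({ dimension: 2, degree: 2 });
        print("run " + i + ": " + res.executionStats.executionTimeMillis + " ms");
    }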
Start two mongod servers, one with MMAPv1 and the other with WiredTiger.

Download:

http://lmfdb.warwick.ac.uk/data/dump/latest/hmfs/forms.bson (70GB)
http://lmfdb.warwick.ac.uk/data/dump/latest/hmfs/forms.metadata.json

Restore that collection, and perform the query on both servers.
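A rough terminal sketch of those steps, assuming the hmfs database seen in the logs above; the ports, dbpaths, and log paths are placeholders and not part of the original report:

    # Start one mongod per storage engine (dbpaths/ports are placeholders)
    mongod --storageEngine mmapv1     --dbpath /data/mmapv1 --port 27017 --fork --logpath mmapv1.log
    mongod --storageEngine wiredTiger --dbpath /data/wt     --port 27018 --fork --logpath wt.log

    # Restore the downloaded collection into each instance
    mongorestore --port 27017 --db hmfs --collection forms forms.bson
    mongorestore --port 27018 --db hmfs --collection forms forms.bson

    # Run the query against each server
    mongo --port 27017 hmfs --eval 'print(db.forms.find({dimension: 2, degree: 2}).count())'
    mongo --port 27018 hmfs --eval 'print(db.forms.find({dimension: 2, degree: 2}).count())'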