...
Running ../mongo-latest/bin/mongod --version

    db version v3.2.0-rc4-21-g86e7b69
    git version: 86e7b69a6c52c926d28a60d816faefa6db81eb96
    OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
    allocator: tcmalloc
    modules: enterprise
    build environment:
        distmod: ubuntu1404
        distarch: x86_64
        target_arch: x86_64

which I believe is equivalent to what ended up being 3.2.0-rc5.

I'm observing the following at the end of a run that inserts 200M documents. When the server is started with the default (snappy) block compression, the stats at the end of the run are approximately:

    "db" : "iibench",
    "collections" : 1,
    "objects" : 200000000,
    "avgObjSize" : 1119,
    "dataSize" : 208.4299921989441,
    "storageSize" : 51.31330490112305,
    "numExtents" : 0,
    "indexes" : 4,
    "indexSize" : 22.347862243652344,
    "ok" : 1

When the same run is done with zlib block compression, the stats are:

    "db" : "iibench",
    "collections" : 1,
    "objects" : 200000000,
    "avgObjSize" : 1119,
    "dataSize" : 208.4299921989441,
    "storageSize" : 21.949535369873047,
    "numExtents" : 0,
    "indexes" : 4,
    "indexSize" : 2.2369346618652344,
    "ok" : 1

That's roughly a 10x difference in indexSize. On a longer run the difference was only 2x, but I would not have expected any size difference at all, since we don't apply block compression to indexes.
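For context, db.stats() reports byte counts by default; the fractional values above come from passing a scale factor. A minimal sketch of how output like this can be gathered, assuming the iibench database from these runs (note that 200000000 objects * 1119 bytes avgObjSize is about 208.4 GiB, which matches the dataSize shown):

    > use iibench
    switched to db iibench
    > db.stats(1024 * 1024 * 1024)    // scale of 1024^3 reports sizes in GiB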
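The claim that indexes are not block-compressed can be spot-checked from the shell, since collection stats expose the WiredTiger creation string for the collection and, with the indexDetails option, for each index. A sketch, assuming the purchases_index collection that iibench creates (exact index names will vary):

    > use iibench
    switched to db iibench
    > var s = db.purchases_index.stats({indexDetails: true})
    > s.wiredTiger.creationString           // collection: expect block_compressor=zlib (or snappy)
    > Object.keys(s.indexDetails)           // index names, e.g. "_id_"
    > s.indexDetails["_id_"].creationString // index: expect an empty block_compressor= setting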
keith.bostic commented on Sun, 12 Jun 2016 22:31:46 +0000:

I've looked at this pretty carefully now, and I don't think there are any unexpected behaviors here. The short version is that the actual data size in the index-5 file is roughly 4GB, and the data rates for this index file are such that it's not unlikely for a significant percentage of the file's blocks to be dirtied between checkpoints, which means a real-data-to-file-size ratio of 2-3x isn't unreasonable (which is what I'm seeing). I've verified the number of blocks being written between and during checkpoints, and verified that in-use blocks appear at the end of the file (which prevents any compaction of the file). There are probably changes we could make to improve the likelihood of the file being compacted (for example, a longer pause between checkpoints makes things much better, and we could adjust when we switch between best-fit and first-fit allocation as blocks are chosen from the freelist), but as it stands, this behavior isn't unexpected. I'm going to go ahead and close this one.

alexander.gorrod commented on Fri, 4 Dec 2015 06:35:44 +0000:

This is interesting, asya. I was able to reproduce without an oplog. I don't think there is a problem with the volume of data being inserted: if I compare using the standalone WiredTiger utility, the indexes contain the right information. I've also noticed that if I re-open the files, the sizes shrink down to be similar. I suspect that there are timing differences between running with zlib and snappy that mean we keep a different volume of checkpoint data pinned in the index files, which causes the file size to vary. It's worth investigating, but we aren't applying block compression to the indexes or storing invalid data, so I don't think this is a high priority.

FWIW, my process to reproduce is:

    $ cat s21767.conf
    processManagement:
        fork: true
    systemLog:
        destination: file
        path: "/home/alexg/work/mongo/data/oplog-27017.log"
    storage:
        dbPath: "/home/alexg/work/mongo/data/db/"
        engine: "wiredTiger"
        wiredTiger:
            collectionConfig:
                blockCompressor: "zlib"
            engineConfig:
                cacheSizeGB: 12

    $ rm -rf ./data/db/* && ./mongod --config=./s21767.conf
    $ (cd ../iibench-mongodb && sh ./run.simple.bash)

I can then attach to the mongod in a shell and run:

    > use iibench
    switched to db iibench
    > show collections
    purchases_index
    > db.purchases_index.stats()

asya commented on Fri, 4 Dec 2015 05:52:10 +0000:

The sizes of the index files in the data directory correspond to what we output in stats - I'm running another test now, but I had checked that the sizes of the files matched before. These tests are all with the oplog on, but I saw similar differences without the oplog. My mongod config looks like this (zlib example):

    processManagement:
        fork: true
    systemLog:
        destination: file
        path: "/home/asya/logs/oplog-27017.log"
    storage:
        dbPath: "/disk1/db/oplog-27017"
        engine: "wiredTiger"
        wiredTiger:
            collectionConfig:
                blockCompressor: "zlib"
            engineConfig:
                cacheSizeGB: 24
    replication:
        oplogSizeMB: 2000
        replSetName: "oplog"

The journal is on a separate disk from the data files. Both disks are ext4, not xfs.

alexander.gorrod commented on Fri, 4 Dec 2015 05:44:18 +0000:

Thanks asya. Could you include the command lines/configuration files you are using for mongod to generate the different results? It would also be helpful if you can upload a sample output database directory - it could be from a very small run.

michael.cahill commented on Fri, 4 Dec 2015 05:42:10 +0000:

asya, thanks for opening a ticket.
Can you include ls -l output for the database directory? I'm curious about whether we're chasing how the index is written, or how "indexSize" is calculated...
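For reference, that listing is easy to collect because mongod (without directoryPerDB) keeps each WiredTiger table in its own file under dbPath, named collection-<N>-<ident>.wt or index-<N>-<ident>.wt. A sketch against the dbPath from the config above; the exact file names vary per deployment:

    $ ls -lh /disk1/db/oplog-27017/index-*.wt
    $ ls -lh /disk1/db/oplog-27017/collection-*.wt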
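The standalone-utility cross-check alexander.gorrod mentions can be done with WiredTiger's wt command, run against the data directory while mongod is shut down (wt needs exclusive access). A sketch only, assuming a wt binary built from the matching WiredTiger source; "index-5-1234567890" is a hypothetical ident standing in for the real index-5 file, and since indexes are not block-compressed, no compressor extension should be needed to read them:

    $ ./wt -h /home/alexg/work/mongo/data/db list | grep index
    $ # dump the index and compare the raw key/value volume against the file size on disk;
    $ # if wt complains about log/metadata settings, mongod's journal (log) configuration
    $ # may need to be passed in via -C
    $ ./wt -h /home/alexg/work/mongo/data/db dump "table:index-5-1234567890" | wc -c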
Steps to reproduce:

    git clone https://github.com/asya999/iibench-mongodb.git
    cd iibench-mongodb
    sed -i "s/export NUM_LOADER_THREADS=.*/export NUM_LOADER_THREADS=20/" run.simple.bash
    sed -i "s/export MAX_ROWS=.*/export MAX_ROWS=300000000/" run.simple.bash
    ./run.simple.bash
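To produce the two sets of stats being compared, the same iibench run is repeated after wiping dbPath and restarting mongod with the other block compressor. A sketch; the collectionConfig setting in the config files above and this command-line flag are equivalent, and snappy is the default:

    $ # zlib run, per s21767.conf above:
    $ rm -rf ./data/db/* && ./mongod --config=./s21767.conf
    $ # snappy baseline: the same config with blockCompressor: "snappy", or equivalently:
    $ ./mongod --dbpath ./data/db --storageEngine wiredTiger --wiredTigerCollectionBlockCompressor snappy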