...
BugZero found this defect 2827 days ago.
2016-12-13T13:46:32.739+0300 E STORAGE [TTLMonitor] WiredTiger error (0) [1481625992:739037][11026:0x7f48f45a1700], file:index-86--4095658065937658735.wt, WT_CURSOR.remove: read checksum error for 12288B block at offset 1547100160: block header checksum of 1781300651 doesn't match expected checksum of 1690850977
2016-12-13T13:46:32.739+0300 E STORAGE [TTLMonitor] WiredTiger error (0) [1481625992:739112][11026:0x7f48f45a1700], file:index-86--4095658065937658735.wt, WT_CURSOR.remove: index-86--4095658065937658735.wt: encountered an illegal file format or internal value
2016-12-13T13:46:32.739+0300 E STORAGE [TTLMonitor] WiredTiger error (-31804) [1481625992:739133][11026:0x7f48f45a1700], file:index-86--4095658065937658735.wt, WT_CURSOR.remove: the process must exit and restart: WT_PANIC: WiredTiger library panic
2016-12-13T13:46:32.739+0300 I - [TTLMonitor] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2016-12-13T13:46:32.739+0300 I - [TTLMonitor] ***aborting after fassert() failure
2016-12-13T13:46:32.770+0300 F - [TTLMonitor] Got signal: 6 (Aborted).
0x7f48fe8dd9b1 0x7f48fe8dcaa9 0x7f48fe8dcf8d 0x7f48fbfb4370 0x7f48fbc191d7 0x7f48fbc1a8c8 0x7f48fdb725ab 0x7f48fe5f76e6 0x7f48fdb7c948 0x7f48fdb7ca3c 0x7f48fdb7cc94 0x7f48ff1e30b5 0x7f48ff200a0d 0x7f48ff204cb8 0x7f48ff224dfd 0x7f48ff1f63cb 0x7f48ff243e62 0x7f48fe5d200d 0x7f48fe5c8a46 0x7f48fdfca67a 0x7f48fdfccbf3 0x7f48fdd72291 0x7f48fdd72666 0x7f48fdd4648f 0x7f48fdee8b3e 0x7f48fdf0e2f3 0x7f48fe21103a 0x7f48fe21195b 0x7f48fe211a8d 0x7f48fe5ffe9c 0x7f48fe601260 0x7f48fe6018f8 0x7f48fe84d04d 0x7f48ff352860 0x7f48fbfacdc5 0x7f48fbcdb73d ----- BEGIN BACKTRACE ----- {"backtrace":[{"b":"7F48FD35F000","o":"157E9B1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F48FD35F000","o":"157DAA9"},{"b":"7F48FD35F000","o":"157DF8D"},{"b":"7F48FBFA5000","o":"F370"},{"b":"7F48FBBE4000","o":"351D7","s":"gsignal"},{"b":"7F48FBBE4000","o":"368C8","s":"abort"},{"b":"7F48FD35F000","o":"8135AB","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"7F48FD35F000","o":"12986E6"},{"b":"7F48FD35F000","o":"81D948","s":"__wt_eventv"},{"b":"7F48FD35F000","o":"81DA3C","s":"__wt_err"},{"b":"7F48FD35F000","o":"81DC94","s":"__wt_panic"},{"b":"7F48FD35F000","o":"1E840B5","s":"__wt_bm_read"},{"b":"7F48FD35F000","o":"1EA1A0D","s":"__wt_bt_read"},{"b":"7F48FD35F000","o":"1EA5CB8","s":"__wt_page_in_func"},{"b":"7F48FD35F000","o":"1EC5DFD","s":"__wt_row_search"},{"b":"7F48FD35F000","o":"1E973CB","s":"__wt_btcur_remove"},{"b":"7F48FD35F000","o":"1EE4E62"},{"b":"7F48FD35F000","o":"127300D","s":"_ZN5mongo21WiredTigerIndexUnique8_unindexEP11__wt_cursorRKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"7F48FD35F000","o":"1269A46","s":"_ZN5mongo15WiredTigerIndex7unindexEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"7F48FD35F000","o":"C6B67A","s":"_ZN5mongo17IndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb"},{"b":"7F48FD35F000","o":"C6DBF3","s":"_ZN5mongo17IndexAccessMethod6removeEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPl"},{"b"
:"7F48FD35F000","o":"A13291","s":"_ZN5mongo12IndexCatalog14_unindexRecordEPNS_16OperationContextEPNS_17IndexCatalogEntryERKNS_7BSONObjERKNS_8RecordIdEbPl"},{"b":"7F48FD35F000","o":"A13666","s":"_ZN5mongo12IndexCatalog13unindexRecordEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEbPl"},{"b":"7F48FD35F000","o":"9E748F","s":"_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEPNS_7OpDebugEbb"},{"b":"7F48FD35F000","o":"B89B3E","s":"_ZN5mongo11DeleteStage6doWorkEPm"},{"b":"7F48FD35F000","o":"BAF2F3","s":"_ZN5mongo9PlanStage4workEPm"},{"b":"7F48FD35F000","o":"EB203A","s":"_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE"},{"b":"7F48FD35F000","o":"EB295B","s":"_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE"},{"b":"7F48FD35F000","o":"EB2A8D","s":"_ZN5mongo12PlanExecutor11executePlanEv"},{"b":"7F48FD35F000","o":"12A0E9C","s":"_ZN5mongo10TTLMonitor13doTTLForIndexEPNS_16OperationContextENS_7BSONObjE"},{"b":"7F48FD35F000","o":"12A2260","s":"_ZN5mongo10TTLMonitor9doTTLPassEv"},{"b":"7F48FD35F000","o":"12A28F8","s":"_ZN5mongo10TTLMonitor3runEv"},{"b":"7F48FD35F000","o":"14EE04D","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"7F48FD35F000","o":"1FF3860"},{"b":"7F48FBFA5000","o":"7DC5"},{"b":"7F48FBBE4000","o":"F773D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.0", "gitVersion" : "f4240c60f005be757399042dc12f6addbc3170c1", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-514.2.2.el7.x86_64", "version" : "#1 SMP Tue Dec 6 23:06:41 UTC 2016", "machine" : "x86_64" }, "somap" : [ { "b" : "7F48FD35F000", "elfType" : 3, "buildId" : "ACE5CADA1313A0B04B71DBBEB60CC944FA9ACDD6" }, { "b" : "7FFC6BFFD000", "elfType" : 3, "buildId" : "183CE4B56A9471419F233CCEF078E0504837ABF5" }, { "b" : "7F48FCECF000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "D0018CA5E24522ED0DC1844556FA8DBC4B39D5C3" }, { "b" : "7F48FCAE5000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, 
"buildId" : "8756D2315BF50F8610875B1AFF128198FB9D202D" }, { "b" : "7F48FC8DD000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "82E77ADE22BC9FFF8D3458BD37331E7EDF174C28" }, { "b" : "7F48FC6D9000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "C5F560504E1AF52E29679C3B52FF11121015D6BB" }, { "b" : "7F48FC3D7000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "721C7CC9488EFA25F83B48AF713AB27DBE48EF3E" }, { "b" : "7F48FC1C1000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "408B46E291B2D4C9612E27C0509D165D7E186D40" }, { "b" : "7F48FBFA5000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "C3DEB1FA27CD0C1C3CC575B944ABACBA0698B0F2" }, { "b" : "7F48FBBE4000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "8B2C421716985B927AA0CAF2A05D0B1F452367F7" }, { "b" : "7F48FD13D000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "8F3E366E2DB73C330A3791DEAE31AE9579099B44" }, { "b" : "7F48FB996000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "A2499C359AA179EE23324ED949C0E508E4434F10" }, { "b" : "7F48FB6AF000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "E09A34D9083DC6FEAF7018C09D55631DEEE2836D" }, { "b" : "7F48FB4AB000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "BF54B7C8932E450769FBBB8B18864D1DD70BBC67" }, { "b" : "7F48FB279000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "BF8F00D7CB849ADB0B7A4703BC7B8D66AEE6A49C" }, { "b" : "7F48FB063000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "EA8E45DC8E395CC5E26890470112D97A1F1E0B65" }, { "b" : "7F48FAE54000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "1E7A92FDD6FB3871DA97F4BCA2E147E72B6B6E1F" }, { "b" : "7F48FAC50000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7F48FAA36000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : 
"FE7AE845A123A3DFC0FDC2408BCBC2BA8B61B158" }, { "b" : "7F48FA80F000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "76687CA31A406854DF3BCF8D03055656F56E6892" }, { "b" : "7F48FA5AE000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "AE64AA461A26E01F60408013D361749D56DD0AE1" } ] }} mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f48fe8dd9b1] mongod(+0x157DAA9) [0x7f48fe8dcaa9] mongod(+0x157DF8D) [0x7f48fe8dcf8d] libpthread.so.0(+0xF370) [0x7f48fbfb4370] libc.so.6(gsignal+0x37) [0x7f48fbc191d7] libc.so.6(abort+0x148) [0x7f48fbc1a8c8] mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x7f48fdb725ab] mongod(+0x12986E6) [0x7f48fe5f76e6] mongod(__wt_eventv+0x422) [0x7f48fdb7c948] mongod(__wt_err+0x9D) [0x7f48fdb7ca3c] mongod(__wt_panic+0x24) [0x7f48fdb7cc94] mongod(__wt_bm_read+0x135) [0x7f48ff1e30b5] mongod(__wt_bt_read+0x20D) [0x7f48ff200a0d] mongod(__wt_page_in_func+0x1138) [0x7f48ff204cb8] mongod(__wt_row_search+0x66D) [0x7f48ff224dfd] mongod(__wt_btcur_remove+0x31B) [0x7f48ff1f63cb] mongod(+0x1EE4E62) [0x7f48ff243e62] mongod(_ZN5mongo21WiredTigerIndexUnique8_unindexEP11__wt_cursorRKNS_7BSONObjERKNS_8RecordIdEb+0x12D) [0x7f48fe5d200d] mongod(_ZN5mongo15WiredTigerIndex7unindexEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x86) [0x7f48fe5c8a46] mongod(_ZN5mongo17IndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEb+0x3A) [0x7f48fdfca67a] mongod(_ZN5mongo17IndexAccessMethod6removeEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdERKNS_19InsertDeleteOptionsEPl+0xD3) [0x7f48fdfccbf3] mongod(_ZN5mongo12IndexCatalog14_unindexRecordEPNS_16OperationContextEPNS_17IndexCatalogEntryERKNS_7BSONObjERKNS_8RecordIdEbPl+0xD1) [0x7f48fdd72291] mongod(_ZN5mongo12IndexCatalog13unindexRecordEPNS_16OperationContextERKNS_7BSONObjERKNS_8RecordIdEbPl+0x96) [0x7f48fdd72666] mongod(_ZN5mongo10Collection14deleteDocumentEPNS_16OperationContextERKNS_8RecordIdEPNS_7OpDebugEbb+0x17F) [0x7f48fdd4648f] 
mongod(_ZN5mongo11DeleteStage6doWorkEPm+0x51E) [0x7f48fdee8b3e] mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x7f48fdf0e2f3] mongod(_ZN5mongo12PlanExecutor11getNextImplEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x19A) [0x7f48fe21103a] mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x4B) [0x7f48fe21195b] mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x6D) [0x7f48fe211a8d] mongod(_ZN5mongo10TTLMonitor13doTTLForIndexEPNS_16OperationContextENS_7BSONObjE+0x184C) [0x7f48fe5ffe9c] mongod(_ZN5mongo10TTLMonitor9doTTLPassEv+0x460) [0x7f48fe601260] mongod(_ZN5mongo10TTLMonitor3runEv+0x308) [0x7f48fe6018f8] mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x16D) [0x7f48fe84d04d] mongod(+0x1FF3860) [0x7f48ff352860] libpthread.so.0(+0x7DC5) [0x7f48fbfacdc5] libc.so.6(clone+0x6D) [0x7f48fbcdb73d] ----- END BACKTRACE -----
thomas.schubert commented on Fri, 16 Dec 2016 22:31:49 +0000:
Hi nexcode,
I'm glad to hear the repair process was successful. If you encounter this issue again, let us know and we will continue to investigate. Please note that we recommend running MongoDB with RAID-10.
Kind regards,
Thomas

nexcode commented on Wed, 14 Dec 2016 05:47:39 +0000:
Now it is working, but we will monitor it!

nexcode commented on Wed, 14 Dec 2016 05:08:30 +0000:
2016-12-14T04:03:27.264+0300 I STORAGE [initandlisten] finished checking dbs
2016-12-14T04:03:27.264+0300 I NETWORK [initandlisten] shutdown: going to close listening sockets...
2016-12-14T04:03:27.265+0300 I NETWORK [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2016-12-14T04:03:27.265+0300 I NETWORK [initandlisten] shutdown: going to flush diaglog...
2016-12-14T04:03:27.268+0300 I STORAGE [initandlisten] WiredTigerKVEngine shutting down
2016-12-14T04:03:27.757+0300 I STORAGE [initandlisten] shutdown: removing fs lock...
2016-12-14T04:03:27.757+0300 I CONTROL [initandlisten] now exiting
2016-12-14T04:03:27.757+0300 I CONTROL [initandlisten] shutting down with code:0
Now I will try to start the database...

nexcode commented on Tue, 13 Dec 2016 21:48:00 +0000:
1. WiredTiger on local SSD. Software RAID 0.
2. It's OK!
3. db.copyDatabase() from 3.2.8 (migrated to a new server)
4. No
5. No
6. Yes
7. TTL, compound, etc.
8. No
I am running the --repair process. It takes a long time: the DB size is 100GB. We use a 20-thread CPU and have a large number of requests. Server status is constantly monitored. We use CentOS 7.

thomas.schubert commented on Tue, 13 Dec 2016 18:38:03 +0000:
Hi nexcode,
This assertion failure generally indicates that some or all of the data files have become corrupt in some way. It's not clear if the corruption is in the index or the data itself, and in cases like this, it's very difficult to be confident that the corruption is isolated beyond the file level.
To help us understand what's going on here, I've assembled a list of routine questions about data storage and the configuration of your environment. But please understand that it is unlikely that we will be able to determine the root cause of this issue.
1. What kind of underlying storage mechanism are you using? Are the storage devices attached locally or over the network? Are the disks SSDs or HDDs? What kind of RAID and/or volume management system are you using?
2. Would you please check the integrity of your disks?
3. Has the database always been running MongoDB 3.4.0? If not, please describe the upgrade/downgrade cycles the database has been through.
4. Have you manipulated (copied or moved) the underlying database files? If so, was mongod running?
5. Preceding the corruption, were there any other server errors logged?
6. Are you using journaling?
7. What kinds of indexes do you have (TTL, etc.)?
8. Have you run out of disk space recently?
To resolve this issue, I would recommend performing an initial sync, or starting mongod with --repair.
Thank you,
Thomas
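
For anyone triaging a similar crash, it can help to list which WiredTiger files the log reports checksum failures for before deciding between a repair and an initial sync. The following is only a minimal sketch, assuming the log lines have the same shape as the ones quoted in this ticket; the function name find_checksum_errors and the regular expression are illustrative and not part of any MongoDB tooling.

```python
import re

# Assumed line shape, taken from the log excerpt in this ticket:
#   "... file:<name>.wt, ...: read checksum error for <N>B block at offset <M>: ..."
CHECKSUM_RE = re.compile(
    r"file:(?P<file>\S+?\.wt),.*?"
    r"read checksum error for (?P<size>\d+)B block at offset (?P<offset>\d+)"
)


def find_checksum_errors(log_text):
    """Return (file, block_size, offset) for each checksum error line in log_text."""
    return [
        (m.group("file"), int(m.group("size")), int(m.group("offset")))
        for m in CHECKSUM_RE.finditer(log_text)
    ]


# Two lines copied from the report above; only the first is a checksum error.
SAMPLE = (
    "2016-12-13T13:46:32.739+0300 E STORAGE [TTLMonitor] WiredTiger error (0) "
    "[1481625992:739037][11026:0x7f48f45a1700], "
    "file:index-86--4095658065937658735.wt, WT_CURSOR.remove: "
    "read checksum error for 12288B block at offset 1547100160: "
    "block header checksum of 1781300651 doesn't match expected checksum of 1690850977\n"
    "2016-12-13T13:46:32.739+0300 E STORAGE [TTLMonitor] WiredTiger error (-31804) "
    "[1481625992:739133][11026:0x7f48f45a1700], "
    "file:index-86--4095658065937658735.wt, WT_CURSOR.remove: "
    "the process must exit and restart: WT_PANIC: WiredTiger library panic\n"
)

if __name__ == "__main__":
    for name, size, offset in find_checksum_errors(SAMPLE):
        print(f"{name}: {size}-byte block at offset {offset}")
```

A file name starting with index- suggests the damaged block was read from an index file rather than a collection file, but as noted above this does not prove the corruption is limited to that file.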