BugZero | MongoDB BugID 785333 - repair after crash due to a full storage

MongoDB - Defect ID: 785333

repair after crash due to a full storage

Last updated on 6/11/2019

Overall: 6.3

Severity: 6.4

Community: 6.4

Lifecycle: 9.1

Vendor details

Priority: Major - P3
Status: Closed

Info

Hi, I have an issue with mongod. One day ago, my hard disk filled up, so MongoDB shut down. I added more space (GB) to the disk, but now when I try to restart mongod, it doesn't work. I'm on Ubuntu 16 and I tried this:

mongod --dbpath /var/lib/mongodb --repair

but, systematically, after a while I get this error:

2019-05-31T14:13:25.375+0000 E STORAGE [initandlisten] WiredTiger error (-31802) [1559312005:375886][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, WT_CURSOR.insert: __wt_block_read_off, 302: collection-2818--1460086324816199596.wt: fatal read error: WT_ERROR: non-specific WiredTiger error Raw: [1559312005:375886][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, WT_CURSOR.insert: __wt_block_read_off, 302: collection-2818--1460086324816199596.wt: fatal read error: WT_ERROR: non-specific WiredTiger error
2019-05-31T14:13:25.375+0000 E STORAGE [initandlisten] WiredTiger error (-31804) [1559312005:375912][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, WT_CURSOR.insert: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic Raw: [1559312005:375912][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, WT_CURSOR.insert: __wt_panic, 523: the process must exit and restart: WT_PANIC: WiredTiger library panic
2019-05-31T14:13:25.375+0000 E STORAGE [initandlisten] WiredTiger error (-31804) [1559312005:375939][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, txn-recover: __txn_op_apply, 281: operation apply failed during recovery: operation type 4 at LSN 225/73807232: WT_PANIC: WiredTiger library panic Raw: [1559312005:375939][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, txn-recover: __txn_op_apply, 281: operation apply failed during recovery: operation type 4 at LSN 225/73807232: WT_PANIC: WiredTiger library panic
2019-05-31T14:13:25.376+0000 E STORAGE [initandlisten] WiredTiger error (-31804) [1559312005:375995][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, txn-recover: __wt_txn_recover, 740: Recovery failed: WT_PANIC: WiredTiger library panic Raw: [1559312005:375995][8390:0x7fddddd90a40], file:collection-2818--1460086324816199596.wt, txn-recover: __wt_txn_recover, 740: Recovery failed: WT_PANIC: WiredTiger library panic
2019-05-31T14:13:25.383+0000 E STORAGE [initandlisten] WiredTiger error (0) [1559312005:383068][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 384: cache server: exiting with 7 pages in memory and 0 pages evicted Raw: [1559312005:383068][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 384: cache server: exiting with 7 pages in memory and 0 pages evicted
2019-05-31T14:13:25.383+0000 E STORAGE [initandlisten] WiredTiger error (0) [1559312005:383126][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 389: cache server: exiting with 149858 image bytes in memory Raw: [1559312005:383126][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 389: cache server: exiting with 149858 image bytes in memory
2019-05-31T14:13:25.383+0000 E STORAGE [initandlisten] WiredTiger error (0) [1559312005:383146][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 393: cache server: exiting with 154498 bytes in memory Raw: [1559312005:383146][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 393: cache server: exiting with 154498 bytes in memory
2019-05-31T14:13:25.383+0000 E STORAGE [initandlisten] WiredTiger error (0) [1559312005:383163][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 400: cache server: exiting with 32496 bytes dirty and 1 pages dirty Raw: [1559312005:383163][8390:0x7fddddd90a40], connection: __wt_cache_destroy, 400: cache server: exiting with 32496 bytes dirty and 1 pages dirty
2019-05-31T14:13:25.388+0000 E - [initandlisten] Assertion: Location28718: -31809: WT_TRY_SALVAGE: database corruption detected src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 504
2019-05-31T14:13:25.396+0000 I STORAGE [initandlisten] exception in initAndListen: Location28718: -31809: WT_TRY_SALVAGE: database corruption detected, terminating
2019-05-31T14:13:25.406+0000 I CONTROL [initandlisten] now exiting
2019-05-31T14:13:25.406+0000 I CONTROL [initandlisten] shutting down with code:100

I tried this more than once, but I still get the same issue. Even adding --nojournal doesn't work. I removed mongod.lock before each attempt. Do you have any solution to resolve it?

Thanks a lot,
Thomas
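When triaging a log like the one above, it can help to reduce it to the WiredTiger error codes and the table files they name. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes the log line format shown above and a `grep` that supports `-o`.

```shell
# Hypothetical helpers; they only parse text in the log format shown above.

# Read mongod log lines on stdin and print each distinct table file
# named in a "file:<name>.wt" token on a "WiredTiger error" line.
wt_files_in_errors() {
    grep 'WiredTiger error' \
        | grep -o 'file:[^,]*\.wt' \
        | sed 's/^file://' \
        | sort -u
}

# Read mongod log lines on stdin and print each distinct numeric code
# that appears as "WiredTiger error (<code>)".
wt_error_codes() {
    sed -n 's/.*WiredTiger error (\(-\{0,1\}[0-9][0-9]*\)).*/\1/p' | sort -u
}
```

For example, `wt_files_in_errors < /var/log/mongodb/mongod.log` would list the affected files; the log path here is an assumption and depends on the deployment's configuration.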

Top User Comments

eric.sedor commented on Tue, 11 Jun 2019 20:47:32 +0000:
tigrou34, I did want to add one more thing: we don't support the method described in the blog you linked. Our recommended course of action for this error in the future would be to first attempt an initial sync from an unaffected replica set node, and then try --repair using MongoDB version 4.0.9.

eric.sedor commented on Wed, 5 Jun 2019 16:59:33 +0000:
tigrou34, we are glad to hear this! Thank you for letting us know.

tigrou34 commented on Wed, 5 Jun 2019 08:30:16 +0000:
I found the solution! I followed this tutorial: http://www.alexbevi.com/blog/2016/02/10/recovering-a-wiredtiger-collection-from-a-corrupt-mongodb-installation/

So I did the following:

- Installed WiredTiger.
- WiredTiger salvage:
  ./wt -v -h /var/lib/mongodb -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" -R salvage collection-2818--1460086324816199596.wt
- WiredTiger dump:
  ./wt -v -h /var/lib/mongodb -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" -R dump -f ../collection.dump collection-2818--1460086324816199596.wt
- WiredTiger load, to replace the old corrupt file with the new one:
  ./wt -v -h /var/lib/mongodb -C "extensions=[./ext/compressors/snappy/.libs/libwiredtiger_snappy.so]" -R load -f ../collection.dump -r collection-2818--1460086324816199596
- Removed /var/lib/mongodb/mongod.lock and relaunched MongoDB like this:
  mongod --dbpath /var/lib/mongodb --storageEngine wiredTiger --nojournal
- And also, to be sure:
  mongod --repair --dbpath /var/lib/mongodb

I hope it helps someone else.

tigrou34 commented on Tue, 4 Jun 2019 13:53:37 +0000:
Here is the mongod log after the failed repair: log_after_repair_failed.log

tigrou34 commented on Tue, 4 Jun 2019 10:31:04 +0000:
Thanks, Eric. Here is everything you requested.

First, the description of my storage. It's a simple standalone MongoDB deployment. The disk is a 50 GB SSD with no RAID (we have upgraded it since, but nothing changed; still the same issue). I started storing data about 4 months ago. It's the beginning of the project, which is why it's a standalone database with no cluster and no backup for the moment. But the data on it is important to us. All our infrastructure is at OVH, on a cloud server running Ubuntu 16.04. I haven't upgraded anything on this server since the beginning of the project (hardware or MongoDB).

I have attached the log from when the crash occurred, in which you can see the error after the reboot (crash_mongo_first_error.log). You can also see the error I get when running this command: mongod --repair --dbpath /var/lib/mongodb/ in the log log_repair.log.

When I restart mongod without --repair, I get the same error as before the repair.

For information, I tried creating a new cloud server and running MongoDB there, but I still get the same error, so it's not a disk error, is it?

Thanks a lot for your help.

Attachments: crash_mongo_first_error.log, log_repair.log

eric.sedor commented on Mon, 3 Jun 2019 22:15:49 +0000:
tigrou34, this error message leads us to suspect some form of physical corruption.

Please make a complete copy of the database's $dbpath directory to work off of and safeguard the current $dbpath.

Our ability to determine the source of this corruption depends greatly on your ability to provide:

- The logs for the affected node, including before, leading up to, and after the first sign of corruption.
- The complete logs of the repair operation.
- The logs of any attempt to start mongod after the repair operation completed.
- A description of the underlying storage mechanism in use, including details like:
  - What file system and/or volume management system is in use?
  - Is data storage locally attached or network-attached?
  - Are disks RAIDed and if so how?
  - Are disks SSDs or HDDs?
- A description of your backup method, if any.
- Whether your disks have been recently checked for integrity.
- A history of the deployment, including:
  - a timeline of version changes
  - a timeline of hardware upgrade/downgrade cycles or configuration changes
  - a timeline of disaster recovery or backup restoration activities
  - a timeline of any manipulations of the underlying database files, including copies or moves, and information about whether mongod was running during each manipulation.

The ideal resolution is to perform a clean resync from an unaffected node.

Can you please also provide specific detail around what you mean when you say "systematically, after a while"?

Thanks in advance.
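The advice above to copy the $dbpath directory before working on it could be sketched as follows. This is an illustration under stated assumptions rather than a MongoDB-provided procedure: `backup_dbpath` is a hypothetical helper, both paths are chosen by the caller, and mongod is assumed to be stopped while the copy runs.

```shell
# Hypothetical helper: copy a data directory to a new location, preserving
# permissions and timestamps. It refuses to write into an existing target so
# that an earlier safeguard copy is never overwritten.
# Assumes mongod is NOT running against $1 during the copy.
backup_dbpath() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite: $dest" >&2
        return 1
    fi
    # -R copies recursively, -p preserves mode, ownership and timestamps.
    cp -Rp "$src" "$dest"
}
```

A call might look like `backup_dbpath /var/lib/mongodb /var/lib/mongodb.bak`, where the target path is only an example. Any repair or salvage attempt can then be run against one copy while the other stays untouched.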

Steps to Reproduce

mongod --dbpath /var/lib/mongodb --repair
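Because the original crash followed a full disk, it may be worth confirming that the data volume has free space before rerunning the repair. The sketch below uses the POSIX `df -P` output format; the function names and any threshold passed to them are assumptions, not values from the report.

```shell
# Hypothetical helpers for a pre-repair free-space check.

# Print the free space, in kilobytes, of the filesystem holding directory $1.
# With "df -P", the fourth column of the second line is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if directory $1 has at least $2 kilobytes available.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

For instance, `has_free_kb /var/lib/mongodb 10485760 && mongod --dbpath /var/lib/mongodb --repair` would only start the repair when roughly 10 GB are available; that figure is arbitrary and should be sized to the data set.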
