...
On a 28-shard cluster, with each shard being a 3-node replica set running MongoDB 3.2.5 with WiredTiger, I saw a single secondary mongod fail with a read checksum error as shown below. The environment is CentOS Linux release 7.0.1406 and the mongod process writes to local disk. Also attaching the full log file of the secondary that failed.

2016-12-28T07:11:01.763-0500 I INDEX [repl writer worker 13] build index done. scanned 6723 total records. 0 secs
2016-12-28T07:11:07.614-0500 I COMMAND [conn18188] command local.oplog.rs command: getMore { getMore: 14529216539, collection: "oplog.rs", maxTimeMS: 5000, term: 41, lastKnownCommittedOpTime: { ts: Timestamp 1482927067000|217, t: 41 } } cursorid:14529216539 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:3 reslen:105611 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 287484 } }, oplog: { acquireCount: { r: 1 } } } protocol:op_command 287ms
2016-12-28T07:11:19.374-0500 E STORAGE [thread2] WiredTiger (0) [1482927079:374811][17531:0x7faa199e1700], file:trancheinfodb_20161228/collection-392--4692130608470797293.wt, WT_SESSION.checkpoint: read checksum error for 4096B block at offset 339968: block header checksum of 1570021396 doesn't match expected checksum of 111389135
2016-12-28T07:11:19.374-0500 E STORAGE [thread2] WiredTiger (0) [1482927079:374861][17531:0x7faa199e1700], file:trancheinfodb_20161228/collection-392--4692130608470797293.wt, WT_SESSION.checkpoint: trancheinfodb_20161228/collection-392--4692130608470797293.wt: encountered an illegal file format or internal value
2016-12-28T07:11:19.374-0500 E STORAGE [thread2] WiredTiger (-31804) [1482927079:374871][17531:0x7faa199e1700], file:trancheinfodb_20161228/collection-392--4692130608470797293.wt, WT_SESSION.checkpoint: the process must exit and restart: WT_PANIC: WiredTiger library panic
2016-12-28T07:11:19.374-0500 I - [thread2] Fatal Assertion 28558
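For context, the "read checksum error" in the log above means WiredTiger recomputed a checksum over a 4096-byte block it read back from disk and got a different value than the one stored in that block's header, which is why it treats the file as corrupt and panics. The following is only a rough Python sketch of that idea, not WiredTiger's actual implementation (WiredTiger uses CRC32C and its own block header layout; zlib.crc32, the block size, and the function names here are assumptions for illustration):

import zlib

BLOCK_SIZE = 4096  # assumed block size, matching the 4096B block in the log

def verify_block(file_bytes: bytes, offset: int, expected_checksum: int) -> bool:
    """Recompute a CRC over one on-disk block and compare it to the checksum
    recorded for that block; a mismatch indicates the block was damaged after
    it was written (illustrative only, not WiredTiger's real format)."""
    block = file_bytes[offset:offset + BLOCK_SIZE]
    actual_checksum = zlib.crc32(block) & 0xFFFFFFFF
    if actual_checksum != expected_checksum:
        print(f"read checksum error for {BLOCK_SIZE}B block at offset {offset}: "
              f"computed checksum {actual_checksum} doesn't match expected "
              f"checksum {expected_checksum}")
        return False
    return True

Because the stored checksum was correct when the block was written, a mismatch like this generally points at something below the database: disk, controller, filesystem, or memory.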
thomas.schubert commented on Tue, 3 Jan 2017 22:51:18 +0000: Hi darshan.shah@interactivedata.com, Thank you for reporting this issue. This assertion failure generally indicates that some or all of the data files have become corrupt in some way. Unfortunately, in cases like this without a clear reproduction it is challenging to determine the root cause of the corruption. Please ensure the integrity of your disk layer, and let us know if this issue recurs so we can continue to investigate. My recommendation to resolve this issue would be to resync the affected node. Kind regards, Thomas
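For readers landing on this ticket: resyncing a corrupted secondary usually means stopping it, clearing its dbPath, and restarting it so it performs a full initial sync from another replica set member. The sketch below is only an illustration of those steps; the dbPath, service name, and use of systemctl are assumptions for a typical CentOS 7 install, not values from this report, so substitute your own and make sure the rest of the replica set is healthy first:

import shutil
import subprocess
from pathlib import Path

# Hypothetical values for this example; replace with your own deployment's.
DBPATH = Path("/var/lib/mongo")   # the affected secondary's dbPath
SERVICE = "mongod"                # the node's service name

def resync_secondary() -> None:
    """Stop the corrupted secondary, empty its data directory, and restart it
    so it rejoins the replica set via initial sync from a healthy member."""
    subprocess.run(["systemctl", "stop", SERVICE], check=True)
    for entry in DBPATH.iterdir():   # empty the dbPath but keep the directory itself
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    subprocess.run(["systemctl", "start", SERVICE], check=True)

if __name__ == "__main__":
    resync_secondary()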