...
We've a MongoDB 3.0.2 replica running on Windows. We had 2 nodes in the replica with security enabled, but had to remove one for maintenance, so the issue happened while the replica had only one node. We than see the following Mongo log lines: 2015-08-08T06:37:19.115-0400 E STORAGE [conn37692] WiredTiger (-28936) [1439030239:114410][4392:2004759424], session.commit_transaction: journal/WiredTigerLog.0000000714 write error: failed to write 22528 bytes at offset 23841280: The specified network name is no longer available. 2015-08-08T06:37:19.115-0400 I - [conn37692] Invariant failure: s->commit_transaction(s, NULL) resulted in status UnknownError -28936: The specified network name is no longer available. at src\mongo\db\storage\wiredtiger\wiredtiger_recovery_unit.cpp 290 2015-08-08T06:37:19.381-0400 E STORAGE [conn37617] WiredTiger (-28936) [1439030239:380436][4392:2004759424], session.commit_transaction: journal/WiredTigerLog.0000000714 write error: failed to write 4352 bytes at offset 23863808: The specified network name is no longer available. 2015-08-08T06:37:19.382-0400 I - [conn37617] Invariant failure: s->commit_transaction(s, NULL) resulted in status UnknownError -28936: The specified network name is no longer available. at src\mongo\db\storage\wiredtiger\wiredtiger_recovery_unit.cpp 290 2015-08-08T06:37:19.488-0400 E STORAGE [WiredTigerRecordStoreThread for local.oplog.rs] WiredTiger (-28936) [1439030239:488447][4392:2004759424], session.commit_transaction: journal/WiredTigerLog.0000000714 write error: failed to write 128 bytes at offset 23868160: The specified network name is no longer available. 2015-08-08T06:37:19.488-0400 I - [WiredTigerRecordStoreThread for local.oplog.rs] Invariant failure: s->commit_transaction(s, NULL) resulted in status UnknownError -28936: The specified network name is no longer available. at src\mongo\db\storage\wiredtiger\wiredtiger_recovery_unit.cpp 290 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0x141cf3 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0xfaf17 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0xedd21 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0xb4ab 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0xb16b 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x1b5ffe 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x1b9ad3 2015-08-08T06:37:20.951-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x2efe5d 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x2efa9b 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x14c05b 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x1495e2 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x148230 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x149c3f 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x14f251 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x15ef1a 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x1614b3 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x15f380 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x2c8ce3 2015-08-08T06:37:20.952-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x2c9103 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x215e1b 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] mongod.exe mongo::parseNumberFromStringWithBase+0x2109f2 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0x43e568 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0x1071c3 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] mongod.exe index_collator_extension+0x161a91 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] MSVCR120.dll beginthreadex+0x107 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] MSVCR120.dll endthreadex+0x192 2015-08-08T06:37:20.953-0400 I CONTROL [conn37692] kernel32.dll BaseThreadInitThunk+0xd 2015-08-08T06:37:20.957-0400 I CONTROL [conn37692] 2015-08-08T06:37:20.957-0400 I - [conn37692] ***aborting after invariant() failure 2015-08-08T06:38:47.685-0400 I CONTROL ***** SERVER RESTARTED ***** 2015-08-08T06:38:47.795-0400 I CONTROL Hotfix KB2731284 or later update is not installed, will zero-out data files 2015-08-08T06:38:48.159-0400 I CONTROL Trying to start Windows service 'MongoDB' 2015-08-08T06:38:48.160-0400 I STORAGE Service running 2015-08-08T06:38:48.324-0400 W - [initandlisten] Detected unclean shutdown - \\STORAGE\FOLDER\mongod.lock is not empty. 2015-08-08T06:38:48.326-0400 W STORAGE [initandlisten] Recovering data from the last clean checkpoint. 2015-08-08T06:38:48.329-0400 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=4G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0), 2015-08-08T06:40:05.322-0400 E STORAGE [initandlisten] WiredTiger (-28998) [1439030405:322774][4460:2004759424], connection: \\STORAGE\FOLDER\journal/WiredTigerLog.0000000715: The system cannot find the file specified. a\journal 2015-08-08T06:40:05.331-0400 E STORAGE [initandlisten] WiredTiger (-28994) [1439030405:331775][4460:2004759424], file:WiredTiger.wt, sweep-server: WiredTiger.wt SetEndOfFile error: The handle is invalid. he file specified. a\journal 2015-08-08T06:40:05.332-0400 E STORAGE [initandlisten] WiredTiger (-28994) [1439030405:332775][4460:2004759424], file:WiredTiger.wt, sweep-server: Final close of file:WiredTiger.wt failed: The handle is invalid. he file specified. a\journal 2015-08-08T06:40:05.338-0400 I - [initandlisten] Assertion: 28595:-28998: The system cannot find the file specified. 2015-08-08T06:40:05.366-0400 I STORAGE [initandlisten] exception in initAndListen: 28595 -28998: The system cannot find the file specified. , terminating 2015-08-08T06:40:05.366-0400 I CONTROL [initandlisten] dbexit: rc: 49 2015-08-08T06:40:23.877-0400 I CONTROL ***** SERVER RESTARTED ***** 2015-08-08T06:40:23.877-0400 I CONTROL Hotfix KB2731284 or later update is not installed, will zero-out data files 2015-08-08T06:40:24.279-0400 I CONTROL Trying to start Windows service 'MongoDB' 2015-08-08T06:40:24.281-0400 I STORAGE Service running 2015-08-08T06:40:24.510-0400 W - [initandlisten] Detected unclean shutdown - \\STORAGE\FOLDER\mongod.lock is not empty. 2015-08-08T06:40:24.513-0400 W STORAGE [initandlisten] Recovering data from the last clean checkpoint. 2015-08-08T06:40:24.514-0400 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=4G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0), 2015-08-08T06:40:34.965-0400 I CONTROL [initandlisten] MongoDB starting : pid=1728 port=27017 dbpath=\STORAGE\FOLDER 64-bit host=miragemgt 2015-08-08T06:40:34.965-0400 I CONTROL [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2 2015-08-08T06:40:34.965-0400 I CONTROL [initandlisten] db version v3.0.2 2015-08-08T06:40:34.966-0400 I CONTROL [initandlisten] git version: 6201872043ecbbc0a4cc169b5482dcf385fc464f 2015-08-08T06:40:34.966-0400 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1m-fips 19 Mar 2015 2015-08-08T06:40:34.966-0400 I CONTROL [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49 2015-08-08T06:40:34.966-0400 I CONTROL [initandlisten] allocator: system 2015-08-08T06:40:34.966-0400 I CONTROL [initandlisten] options: { config: "C:\mongod1.cfg", net: { port: 27017 }, operationProfiling: { slowOpThresholdMs: 5000 }, replication: { oplogSizeMB: 2048, replSetName: "REPLICA-NAME" }, security: { authorization: "enabled", keyFile: "C:\securityFile.txt" }, service: true, storage: { dbPath: "\\STORAGE\FOLDER", engine: "wiredTiger", journal: { enabled: true }, wiredTiger: { engineConfig: { cacheSizeGB: 4 } } }, systemLog: { destination: "file", logAppend: true, path: "C:\logs\mongod1.log", quiet: true } } 2015-08-08T06:40:35.012-0400 I REPL [initandlisten] Did not find local replica set configuration document at startup; NoMatchingDocument Did not find replica set configuration document in local.system.replset After which we identify the replica configuration disappeared. We recall rs.initiate() which succeeds, but than find the entire database data is gone, we've only one database (other than the default admin) with only one collection, and we see it contains no data. Why did this happen? what should've we done differently?
ramon.fernandez commented on Thu, 20 Aug 2015 13:27:56 +0000: Thanks for your report sternr-windowslive, and sorry you run into this. From the log snippets above it seems that the following sequence of events took place: mongod aborted normal operation when the storage disappeared under it mongod was restarted, detected the unclean shutdown, and tried to recover from the last checkpoint, but could not find the necessary journal files mongod then tried to open the WiredTiger.wt file, but failed. Chances are this file was corrupted due to an incorrect filesystem shutdown. Unable to find this critical file, mongod stopped. The last restart in the logs appears to have found a clean environment, without database files or replica set configuration Without knowing more details about what exactly happened during these events is very difficult to piece the puzzle together, but it seems clear that mongod was not able to cope with the corruption created by the incident with the storage layer. As for future recommendations, the MongoDB Operations Best Practices white paper is a good place to start. Since the SERVER project is for reporting bugs or feature suggestions for the MongoDB server I'm going to close this ticket. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources. Regards, Ramón.