...
We had a single database instance (version 3.2.11, no replica set) running on Windows Server. Yesterday morning we realized that all data written before approximately 02:30 that morning was lost. Looking in the MongoDB logs, we found this entry at that time:

e:/ourfolder/data/mongodb/data\WiredTiger.turtle.set to e:/ourfolder/data/mongodb/data\WiredTiger.turtle: file-rename: MoveFileW: Cannot create a file when that file already exists.

I have attached the part of the log that shows the problem. The logs show that the server restarted and tried to recover the database, but after the restart the databases were empty; we found only data generated after the restart. There are no errors in the operating system log and no record of a disk failure for the virtual machine.

On disk there are still approximately 7.5 GB of .wt collection files, so the old data is still there. We tried to recover and dump these files with the wt utility, without success. It seems that the WiredTiger.wt file was completely rewritten when the server restarted: the wt dump command can dump all the .wt files created after the restart, but not the files that predate it.

Is there any way to dump the content of a .wt collection file without the WiredTiger.wt file?

In a situation like this, would it be better to skip the automatic restart and repair, and instead stop the server so that a human can check what happened? (Perhaps we could then preserve the old state and run wt dump before the recovery process.)

This may be related to SERVER-18850.
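For illustration: the failing operation in the log is a rename of WiredTiger.turtle.set over an existing WiredTiger.turtle, which plain MoveFileW refuses to do when the destination exists. The Python sketch below shows a single-call replace on Windows using MoveFileExW with the MOVEFILE_WRITE_THROUGH flag mentioned in the fix discussed in the comments. The helper name replace_file, the added MOVEFILE_REPLACE_EXISTING flag, and the os.replace fallback on other platforms are assumptions made for this sketch; it is not WiredTiger's actual code.

```python
import ctypes
import os
import sys

# Flag values as documented for the Win32 MoveFileEx function.
MOVEFILE_REPLACE_EXISTING = 0x1  # assumption: overwrite the destination in the same call
MOVEFILE_WRITE_THROUGH = 0x8     # return only after the move has been flushed to disk


def replace_file(src: str, dst: str) -> None:
    """Move src over dst in a single call, replacing dst if it already exists.

    Hypothetical helper for illustration only.
    """
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.MoveFileExW.argtypes = [
            ctypes.c_wchar_p,  # lpExistingFileName
            ctypes.c_wchar_p,  # lpNewFileName
            ctypes.c_uint32,   # dwFlags
        ]
        kernel32.MoveFileExW.restype = ctypes.c_int
        flags = MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH
        if not kernel32.MoveFileExW(src, dst, flags):
            raise ctypes.WinError(ctypes.get_last_error())
    else:
        # Off Windows, os.replace overwrites the destination in one step.
        os.replace(src, dst)
```

A call such as replace_file("WiredTiger.turtle.set", "WiredTiger.turtle") either raises or leaves the destination holding the new content, so there is no separate delete step that a crash could land in the middle of.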
keith.bostic commented on Thu, 23 Mar 2017 10:47:02 +0000:
"Does this mean that there's still a chance of losing data from the same issue?"
We do not believe there's any chance of losing data from the same issue. We're keeping the window of vulnerability as short as possible as a defensive measure, just in case we're wrong!

mtb.snowboard@gmail.com commented on Thu, 23 Mar 2017 07:31:33 +0000:
"so the window of vulnerability is as short as possible"
Does this mean that there's still a chance of losing data from the same issue?

michael.cahill commented on Thu, 23 Mar 2017 06:28:31 +0000:
The fix for this issue has been merged into WiredTiger's develop branch; it will be in the next development release of MongoDB.

xgen-internal-githook commented on Thu, 23 Mar 2017 06:27:58 +0000:
Author: Keith Bostic (keithbostic) <keith.bostic@mongodb.com>
Message: SERVER-28194 Missing WiredTiger.turtle file loses data (#3337)
There's a two-step process on Windows to rename files (including the turtle file): remove the original and then move the replacement into place, a DeleteFileW followed by a MoveFileW. If we crash in the middle (and in SERVER-28194, it looks like there's a weirder failure mode, where the DeleteFileW succeeded, but the file was still there), we can be left without a turtle file, which will lose all of the data in the database.
Add the MOVEFILE_WRITE_THROUGH flag to the MoveFileEx call. If we somehow end up in a copy-then-delete path, that flag adds a disk flush after the copy phase, so the window of vulnerability is as short as possible.
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/6bd63027a6fd00db3f0f379acb929c22cd1b7f6f

JIRAUSER1269368 commented on Thu, 16 Mar 2017 13:52:45 +0000:
Thanks Mark, I've uploaded two files. The original files were compressed with 7z (mongo runs on Windows); then, to stay under the 5 GB limit, I split the archive into two files with tar. The end result is a tar archive split into two files, containing a 7zip file with the data files and the full mongo log file (maybe the log file can be of some help). I hope that this is OK. Thanks a lot for the support.
Gian Maria.

mark.agarunov commented on Tue, 14 Mar 2017 16:40:53 +0000:
Hello alkampfer,
I've generated a secure upload portal for you to send us the data. Note that there is a 5GB file size limit; however, if your data is greater than 5GB, this limitation is easy to work around:
split -d -b 5300000000 filename.tgz part.
This will produce a series of part.XX files, where XX is a number; you can then upload these files via the secure portal and we'll stitch them back together. Please note that while we will attempt to restore the data, there is no guarantee that it will be successful.
Thanks,
Mark

JIRAUSER1269368 commented on Thu, 9 Mar 2017 10:44:24 +0000:
Hi Mark,
As far as I know, the database was never restored from a backup, but if it was, it was certainly done with a mongodump followed by a mongorestore. We never manipulate the data directory directly and we never back up the data directory itself; we always use mongodump. Other people have access to that server, but I strongly believe they never touched anything.
The backup of the files is approximately 5 GB. If I need to upload it, do you have a secure way for me to do so, or should I privately give you FTP access or some other mechanism to transfer the file?
Also, for future reference, is there any tool that can read the raw content of collection .wt files and extract the documents? From what I saw, the wt.exe utility can dump data only if the WiredTiger.wt file is OK.
Thanks a lot for the help.
Gian Maria

mark.agarunov commented on Mon, 6 Mar 2017 23:41:43 +0000:
Hello alkampfer,
Thank you for the report. Looking over the output you've provided, there may be a few causes of the behavior you're seeing. To better investigate the root of the issue, there are a couple of things I'd like to clarify:
Was the database restored from a backup at any point? If it was, what was the method involved (mongorestore, manual file copy, etc.)?
Was there any manipulation of the database files directly?
With regard to the repair, we can attempt to recover the database, but first we will need to narrow down what caused the failure in the first place. Additionally, provided the cause of the issue becomes apparent, you would need to upload the entire database for us to attempt a recovery.
Thanks,
Mark

JIRAUSER1269368 commented on Sat, 4 Mar 2017 09:18:27 +0000:
I was pretty sure I had chosen Bug as the issue type, but it seems the issue was filed as a New Feature. I cannot change the type myself; could an admin please change it? Thanks a lot.
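For readers without the split utility (the server in this report runs Windows), the split-and-stitch workaround for the upload size limit described in the comments above can be sketched in Python. The function names split_file and join_parts, the 1 MiB copy buffer, and the two-digit numeric suffix are assumptions chosen to mimic the part.XX naming; this is not a tool provided by MongoDB.

```python
import shutil
from pathlib import Path

BUFFER = 1024 * 1024  # copy in 1 MiB blocks so a multi-gigabyte part never sits in memory


def split_file(path, part_bytes, prefix="part."):
    """Write consecutive slices of `path` to prefix00, prefix01, ... beside it.

    Hypothetical helper; returns the list of part paths in order.
    """
    if part_bytes <= 0:
        raise ValueError("part_bytes must be positive")
    path = Path(path)
    parts = []
    with path.open("rb") as src:
        while True:
            block = src.read(min(BUFFER, part_bytes))
            if not block:
                break  # end of input: do not create an empty trailing part
            part = path.with_name(f"{prefix}{len(parts):02d}")
            with part.open("wb") as out:
                written = 0
                while block:
                    out.write(block)
                    written += len(block)
                    if written >= part_bytes:
                        break  # this part is full
                    block = src.read(min(BUFFER, part_bytes - written))
            parts.append(part)
    return parts


def join_parts(parts, dest):
    """Concatenate the parts, in the order given, into `dest`."""
    with Path(dest).open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out, BUFFER)
```

Joining the parts in the order returned by split_file reproduces the original file byte for byte, which is what stitching the uploaded pieces back together relies on.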