...
Recently, full collection validation has started taking an exceptionally long time in the backup_restore tests. Because the validate command takes the PBWM lock, it blocks oplog application, which causes any concurrent calls to awaitReplication to time out. Ideally, we would fix this by improving the performance of collection validation, but a quicker fix might be to avoid calling awaitReplication while a validate command is running.
jason.chan commented on Tue, 26 May 2020 18:18:13 +0000: We closed this since we expect WT-6229 to have fixed the underlying issue of validation being too slow.
suganthi.mani commented on Mon, 18 May 2020 16:35:40 +0000: FYI, awaitReplication is called as part of the waitForReplication background hook. Also, we run full collection validation by default when the node shuts down. Another alternative is for the secondary backup node's shutdown to call the rst.stop method with skipValidation: true, which skips collection validation. This ensures we don't run collection validation while the fsm workload and WaitForReplication are in progress.
jason.chan commented on Mon, 18 May 2020 16:32:31 +0000: suganthi.mani points out that another alternative is to switch to background validation in these tests, but this would require the changes in SERVER-47681, and we would also lose the coverage provided by full:true.