...
When VEEAM, or any other backup application that uses BOOST, performs backups with the Virtual Synthetics feature, it creates new backups from existing ones by stitching together parts of the previous backups on the DD and then adding the differences. The earlier backups used for the stitching are called "base files".

Most backup applications read but do not modify the base files used for synthesizing new backup images. VEEAM works differently: when performing backups, it overwrites parts of the base files already on disk.

When outgoing MTree replication is configured for this VEEAM LSU/MTree, a backup file that is being replicated may be modified by BOOST during the synthesis of new backup files. If the source DD is running DDOS 6.x and recipe replication is enabled (a speed/performance optimization option in DDOS 6.x and later), this can result in wrong checksums arriving on the destination DD, which could cause the FS (File System) to fail repeatedly with messages such as the following:

Feb 27 04:05:19 mtree-repl-dd.example.com ddfs[10654]: ERROR: MSG-INTRNL-00001: PANIC: ddr/repl/mrepl_replica.c: mrepl_finish_file_transfer_common: 3712: !(orig_chksum == repl_chksum).
Because of the way VEEAM synthesizes new backups from existing ones, parts of a file can be overwritten while that file is being replicated, if it is also used as a base file for synthesizing new backup images. This confuses recipe replication when the VEEAM Storage Unit is also part of MTree replication, possibly causing the target DD to PANIC.

Note that this defect only applies to the destination end for MTree replication when the source is:

- Running DDOS 6.0.1.0 or earlier (for example, all DDOS 6.0.0.x would be affected)
- Running DDOS 6.x earlier than DDOS 6.0.2.0 or 6.1.1.1
- Running VEEAM backups to an LSU/MTree, and that same MTree is being replicated to the target using MTree replication
- Receiving BOOST backups with Virtual Synthetics enabled to the same LSU/MTree

When this defect is encountered, the replicated destination MTree can become unavailable, with multiple FS process restarts. Those who are using this setup, or are planning to configure their systems in this manner, are encouraged either to employ the workaround explained below or to upgrade to the fixed DDOS 6.0.2.0 or 6.1.1.1 (or any later release).

Note: the same PANIC string on the destination end for MTree replication could occur for issues other than this one, as the error merely indicates that a checksum in the MTree replication snapshot did not match. For any doubts about the issue described here or how to circumvent a problem, please contact your contracted support provider and reference this KB article number 491049.
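
The version conditions listed above can be sketched as a small check. This is a minimal illustration, not a Dell-provided tool: the function names and the `FIXED_RELEASES` table are hypothetical. It assumes that the 6.0.2.0 fix applies to the 6.0.x release family, that the 6.1.1.1 fix applies to the 6.1.x family, and that any other family (5.x, or 6.2 and later) is unaffected, which is one reading of "DDOS 6.x earlier than DDOS 6.0.2.0 or 6.1.1.1". It covers only the version condition; the VEEAM, Virtual Synthetics, and MTree replication conditions still have to be confirmed separately.

```python
# Hypothetical helper: decide whether a source DDOS version falls in the
# affected range described in this article. Names are illustrative only.

# Assumption: first fixed release per release family (major, minor).
FIXED_RELEASES = {
    (6, 0): (6, 0, 2, 0),
    (6, 1): (6, 1, 1, 1),
}


def parse_version(text):
    """Turn '6.0.1.0' or '6.0.1.0-556307' into a tuple of integers."""
    dotted = text.strip().split("-")[0]  # drop an optional build suffix
    return tuple(int(part) for part in dotted.split("."))


def source_version_is_affected(version_text):
    """Return True if the version is in a family with a known fix
    and is older than that fix."""
    version = parse_version(version_text)
    fixed = FIXED_RELEASES.get(version[:2])
    if fixed is None:
        # Assumption: families without an entry (5.x, 6.2+) are unaffected.
        return False
    return version < fixed


if __name__ == "__main__":
    for candidate in ("6.0.0.9", "6.0.1.0", "6.0.2.0", "6.1.1.0", "6.1.1.1"):
        print(candidate, source_version_is_affected(candidate))
```

Tuple comparison is used so that each version component is compared numerically rather than as a string.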
DD Engineering has identified the root cause for the FS PANICs on the destination node and has committed a fix in the following releases:

- DDOS 6.0.2.0 and later
- DDOS 6.1.1.1 and later

Anyone who is affected by this defect, or who is planning to set up a similar configuration, is advised to upgrade the source DD to one of these releases at the earliest opportunity.

For those unwilling to upgrade, or those facing the problem before the fixed release becomes available, there is a workaround. It consists of disabling the recipe replication optimization on the source DDOS 6.x system. This optimization is only present in DDOS 6.x and later; the only drawback of disabling it is that replication speeds drop to those achieved on DDOS 5.7.

Prior to deployment, first confirm that this workaround is applicable to the current setup:

- Check whether the source DD is running a DDOS 6.x release prior to the one that has this fixed (the bug is fixed in DDOS 6.0.2.0 and 6.1.1.1 onwards)
- Confirm that the DD configured for VEEAM backups is also configured for MTree replication, as the source, for the subject LSU/MTree (checking a recent ASUP is the easiest way to confirm), for example:

  CTX: 20 Mode: source Destination: mtree://destination-dd.example.com/data/col1/destination_MTree Enabled: yes

If all of the conditions above apply, this system could be subject to the aforementioned defect, which could cause the destination replication FS to eventually crash.

To apply the workaround, first make sure there are no replication or BOOST backups running, and then make the registry setting change, which requires no downtime. Before commencing this process, please read the CAUTION statement located below the final step of this procedure.

Make sure DD to DD replication is disabled on the source DD running DDOS 6.x:

# replication disable all

Also, make sure there are no ongoing BOOST backups or BOOST MFR to or from the potentially offending VEEAM LSU/MTree before applying the registry setting.
If necessary, temporarily disable backups and MFR to or from this LSU. Ongoing MFR activity can be checked with:

# ddboost file-replication show active all
# ddboost file-replication show stats

This registry change requires SE mode privileges. NOTE: SE commands have been deprecated in DDOS versions 7.7.5.25, 7.10.1.15, 7.13.0.15, 6.2.1.110 and above and are accessible only by Dell employees.

From SE mode, change the registry setting to disable the use of recipe replication:

# se sysparam set RECIPE_REPL_ENABLED=FALSE

Confirm the system parameter was properly set and is showing as "FALSE" (disabled):

# se sysparam show RECIPE_REPL_ENABLED
Name                  Description                                        Current   Default   Override
-------------------   ------------------------------------------------   -------   -------   --------
RECIPE_REPL_ENABLED   Enable recipe replication (apply to source only)   FALSE     TRUE      rpc
-------------------   ------------------------------------------------   -------   -------   --------

You may now re-enable DD to DD replication and resume BOOST backups and BOOST MFR to or from the LSU:

# replication enable all

CAUTION: If recipe processing has been disabled on a Data Domain source system which is also configured as a destination end for replication, the source system(s) for those contexts will also need to have the above process performed (recipe replication disabled).

Also note that after upgrading to a fixed release (DDOS 6.0.2.0 or 6.1.1.1), the setting will need to be restored so that recipe replication can be leveraged; the upgrade will not reset the registry key. After completing the upgrade, re-enable recipe replication by logging into the DD, entering SE privilege mode, and running:

# se sysparam reset RECIPE_REPL_ENABLED

If unsure of the process described above, contact your contracted support provider and reference this KB article, 491049.
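
For anyone who records the output of the confirmation step, the "Current" value can be checked programmatically. The following is a minimal sketch only: the function name `recipe_repl_state` is hypothetical, and it assumes the output of `se sysparam show RECIPE_REPL_ENABLED` has already been saved as text in the layout shown in this article, with the Current, Default and Override columns all populated (as in the sample). How the text is captured from the DD is not shown here.

```python
# Hypothetical parser for saved "se sysparam show" output, as printed in
# this article. It does not connect to a DD; it only inspects text.

def recipe_repl_state(show_output, name="RECIPE_REPL_ENABLED"):
    """Return the Current/Default/Override values for `name`, or None.

    Assumption: the last three whitespace-separated fields on the
    parameter's row are Current, Default and Override, in that order.
    """
    for line in show_output.splitlines():
        tokens = line.split()
        # Row layout: name, description (one or more words), then 3 values.
        if len(tokens) >= 5 and tokens[0] == name:
            current, default, override = tokens[-3:]
            return {"current": current, "default": default, "override": override}
    return None


SAMPLE = """\
Name                  Description                                        Current   Default   Override
-------------------   ------------------------------------------------   -------   -------   --------
RECIPE_REPL_ENABLED   Enable recipe replication (apply to source only)   FALSE     TRUE      rpc
-------------------   ------------------------------------------------   -------   -------   --------
"""

state = recipe_repl_state(SAMPLE)
# The workaround is in place only when the current value is FALSE.
workaround_active = state is not None and state["current"] == "FALSE"

if __name__ == "__main__":
    print(state, workaround_active)
```

If the row is missing, or the columns are laid out differently on a given release, the function returns `None` or misleading values, so the result should be treated as a convenience check rather than a substitute for reading the command output.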