...
The link of a CG falls into an error due to [SYM failed to find snapset], forcing the entire group into an "enabled with no transfer" state.

Error: One or more links of group cg_name are set to replicate snaps and an error occurred in the snap-based replication process. The following errors were received from the storage: Link = cg_name->cg_name_copy, error = [SYM failed to find snapset]

Symptoms found in logs:

/files/home/kos/storage/result.log
ActiveXioArrayHelper_AO_IMPL::xioRefreshConsistentSnapshotFromDevice_i: xioRefreshConsistentSnapshotFromDevice Failed with res.faultString() = SYM failed to find snapset res.arrayRvCode() = e_API_FAILURE
printCommand: methodName = sym.SystemRemoveSnapSet format = ((ssi)(ssi)(ssi)i) numArgs = 10 buffer = (( 0065ff5a961b41979c64b1998bf9xxxx xms 1 )( xxxxxb824ee14c94b5f708ced17f3b85 XIO-HO-C01 1 )( 6fbb339729954axxxxxxxxxx 1 ) 19558 )
XioConnection::executeCommand: Command execution fail. methodName = sym.SystemRemoveSnapSet m_client = 0x7f502dxxxxx server = 0x7f5xxxxxxxx URL: http://172.xx.xxx.xxx:11111/RPC2 this = 0x7f5030165xxxxx
CleanEnvAndReturnRV: Operation failed. rv.faultString() = RPC failed at server. snapset_not_found env.fault_code = -500
XioArrayHelper: RPC failed at server. snapset_not_found, called from function: xioDeleteConsistentSnapshot:3212

/files/home/kos/control/result.log
2018/10/17 10:12:50.135 - #1 - 5040/4313 - WorkManager: GroupCopy(206327186 SiteUID(0x228e3ecc2dxxxxxx) 0): Action refreshArrayConsistentSnapshot failed! value.arrayRvCode() = e_API_FAILURE value.errorStrings() = [SYM failed to find snapset]
2018/10/17 10:12:50.550 - #2 - 5040/4313 - StateChange: lastComputedPipeTargetsMap. copy=GroupCopy(2063271xxx SiteUID(0x228e3ecc2xxxxxxx) 0) to copy= GlobalCopy(SiteUID(0x31ba0c434a00xxxxx) 0) pipe target=PT_CLOSED, reason=No exposed snap to replicate -> PT_CLOSED, reason=Array error
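To check whether collected logs show this signature, the strings quoted above can be searched for directly. The following is a minimal sketch, not a supported tool: the function names `find_snapset_errors` and `scan_file` are hypothetical, and the three search strings are copied from the log excerpts in this article. It assumes the log files have already been extracted to a location where they can be read as plain text.

```python
# Signatures copied from the log excerpts in this article.
SIGNATURES = (
    "SYM failed to find snapset",
    "snapset_not_found",
    "Action refreshArrayConsistentSnapshot failed",
)


def find_snapset_errors(lines):
    """Return (line_number, signature, text) for each line containing a signature.

    Only the first matching signature per line is reported.
    """
    hits = []
    for number, text in enumerate(lines, start=1):
        for signature in SIGNATURES:
            if signature in text:
                hits.append((number, signature, text.strip()))
                break
    return hits


def scan_file(path):
    """Scan one extracted log file (hypothetical helper); undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return find_snapset_errors(handle)
```

For example, `scan_file("/files/home/kos/control/result.log")` would list the matching entries from the control log, assuming the file exists at that path on the machine where the script runs.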
Due to a misalignment between XtremIO and the XMS, RecoverPoint receives previous entries from the XMS when it retrieves the current list of snapshots that need to be discarded. Occasionally an unexpected value is received, and the current snapshot is used for deletion instead of the previous snapshot that should have been discarded. As a result, the snapshot that is actually in use is deleted. Calls to the array then begin to fail for the group associated with the deleted snapshot, which puts the link of the copy into the error state [SYM failed to find snapset]. Because the link of the group is down, the entire CG falls into the "enabled with no transfer" error state.
Workaround:
1. Disable the CG, and then re-enable the CG.
2. Change the tweak t_xioPeriodicalSnapCleanupGatherInterval from 600000 to 600000000 (x1000) on all Production RPAs. This change causes the cleanup tool to run approximately once a week instead of every 10 minutes, which reduces the probability of encountering the issue.

Resolution:
A fix for this issue is available in XtremIO 4.0.27-1, XMS 6.2.1-36.
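The arithmetic behind the tweak change in workaround step 2 can be checked with a short sketch. It assumes the tweak value is expressed in milliseconds, which is inferred from the statement that 600000 corresponds to 10 minutes; the helper name `interval_from_ms` is hypothetical.

```python
from datetime import timedelta


def interval_from_ms(milliseconds):
    """Convert a tweak value to a timedelta, assuming the unit is milliseconds."""
    return timedelta(milliseconds=milliseconds)


default_interval = interval_from_ms(600_000)
workaround_interval = interval_from_ms(600_000_000)

print(default_interval)     # 0:10:00
print(workaround_interval)  # 6 days, 22:40:00
```

Under that assumption the new value works out to just under seven days, which matches the "once a week" description in the workaround.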