Symptom
During an FMC HA switch role operation, the CLI logs indicate that the operation failed with the following error: "HA Swtich Roles Transaction is not valid. Remote peer is not in good state to complete swtich roles task. at /usr/local/sf/lib/perl/5.10.1/SF/Synchronize/HADC.pm line 2705".
As a result, the customer was unable to switch roles successfully and was left with FMC HA in a degraded state.
Conditions
The likely cause of the failure is a synchronization task that was in progress around the time the switch role was initiated, which also appears to be the reason FMC HA went into a degraded state. However, the FMC UI displays no notification. The logs suggest that the switch role attempt is retried without timing out until a fatal failure is encountered 4 hours later: "Fatal task failure (088ead90-97c4-11eb-bb26-5b2381e4eada) Database Switch Roles: Active To Standby : FAILED at /usr/local/sf/lib/perl/5.10.1/SF/ActionQueue.pm line 2522." Even then, no UI notification is shown.
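Because the UI shows no notification, the failure can only be spotted in the logs. The following is a minimal sketch, not a supported tool, of how the two messages quoted above could be searched for in a log file that has already been collected. The function name `scan_lines` and the signature labels are hypothetical, and the log file path is passed in by the caller because this report does not say which file contains the messages.

```python
import re
import sys

# Signatures copied verbatim from the messages quoted in this report,
# including the "Swtich" spelling that appears in the logged text.
SIGNATURES = {
    "switch_roles_invalid": re.compile(r"HA Swtich Roles Transaction is not valid"),
    "fatal_task_failure": re.compile(
        r"Fatal task failure \(([0-9a-f-]+)\) Database Switch Roles"
    ),
}


def scan_lines(lines):
    """Return (line_number, signature_name, text) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((number, name, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is an argument; no default location is assumed.
    with open(sys.argv[1], errors="replace") as handle:
        for number, name, text in scan_lines(handle):
            print(f"{number}: [{name}] {text}")
```

A hit on `switch_roles_invalid` with no later `fatal_task_failure` would be consistent with the switch role attempt still being retried, although that interpretation is an assumption based on the behavior described here.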
Workaround
To recover from the degraded FMC HA status:
1. Pause synchronization from the active FMC and wait until both FMCs come up as standalone FMCs.
2. Resume synchronization from the original active FMC UI.
3. Monitor the FMC HA status and tasks for progress, and wait until the synchronization tasks are completed.
Further Problem Description