Symptoms
The switch rebooted unexpectedly. After the reboot, both CPs were enabled and HA was synchronized.
Fabric OS: v8.2.3c
/fabos/cliexec/hadump:
Local CP (Slot 6, CP0): Active, Cold Recovered
Remote CP (Slot 7, CP1): Standby, Healthy
HA enabled, Heartbeat Up, HA State synchronized
errdump:
[HAM-1004], SLOT 7 | CHASSIS, INFO, Processor rebooted - Reset., reboot.c
[PLAT-5067], SLOT 7 | CHASSIS, INFO, Disabling PCI timeout detect. Rev 4, modular_proc.
The emtraceshow2 output shows that the CP in slot 7 faulted.
NOTE: emtraceshow2 is a root-level command. Its output is included in the supportsave and shows the components starting up after a reboot.
It lists a single FLT (fault) entry for slot 7. There is no indication of any other problem.
/fabos/cliexec/emtraceshow2:
Object  Before     SCN         Send        After      privSt  m  fo  date/time
Slot 7  IN(20000)  FLT(10014)  ----------  IN(20000)  000000  1  0   Jan 29 14:20:51
CP BLADE Slot: 7
Header Version: 2
Power Consume Factor: -40W
Factory Part Num: 60-1000376-10
Factory Serial Num: xxxxx
Manufacture: Day: 16 Month: 6 Year: 2012
Update: Day: 1 Month: 2 Year: 2023
Time Alive: 3742 days
Time Awake: 3 days
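When the supportsave is large, the FLT check described above can be scripted. The following is a minimal Python sketch, not a Brocade tool: the row pattern is an assumption inferred from the single sample row in this article, and the function name find_faults is hypothetical. Adjust the pattern if the emtraceshow2 output in your supportsave is laid out differently.

```python
import re

# Row layout assumed from the one sample row in this article:
# "<Object> <Before> <SCN> ...", e.g. "Slot 7 IN(20000) FLT(10014) ..."
ROW = re.compile(
    r"^(?P<object>\S+(?: \d+)?)\s+"   # object name, e.g. "Slot 7"
    r"(?P<before>\w+\(\w+\))\s+"      # state before the event, e.g. IN(20000)
    r"(?P<scn>\w+\(\w+\))\s+"         # state change notification, e.g. FLT(10014)
)


def find_faults(text):
    """Return (object, scn) pairs for rows whose SCN column starts with FLT."""
    faults = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match and match.group("scn").startswith("FLT"):
            faults.append((match.group("object"), match.group("scn")))
    return faults


sample = """Object Before SCN Send After privSt m fo date/time
Slot 7 IN(20000) FLT(10014) ---------- IN(20000) 000000 1 0 Jan 29 14:20:51"""

for obj, scn in find_faults(sample):
    print(f"{obj}: {scn}")
```

The header line is skipped automatically because its columns do not match the NAME(code) form. Any object reported by the script identifies the slot to inspect before applying the resolution below.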
Resolution
Fix:
1) Reseat the standby CP in slot 7. It then goes through POST and should recover from the error state.
2) If reseating the CP blade does not resolve the issue, replace the CP blade.