...
Crash due to heartbeat timeout from CMCC and CMFP.
* Soft parity errors reported repeatedly.
* CPU HOGs are seen.
* Heartbeat timeouts from CMCC and CMFP are seen.
* Crash due to heartbeat timeout.

Signature:
%EVENTLIB-3-CPUHOG: R0/0: cmand: read asyncon 0x212e813c: 2092ms, Traceback=1#ab88f16d26f26672c358112da7b2bec8 evlib:20471000+AC1C evlib:20471000+AD70 binos:207BB000+DA68 linux-vdso32:810000+3A0 c:1C4A1000+7E268 c:1C4A1000+7F588 c:1C4A1000+7E4E0 c:1C4A1000+6E858 c:1C4A1000+6E8D4 c:1C4A1000+EAA40 env:1E50A000+25354
%EVENTLIB-3-CPUHOG: R0/0: cmand: read asyncon 0x212e813c: 1488ms, Traceback=1#ab88f16d26f26672c358112da7b2bec8 evlib:20471000+AC1C evlib:20471000+AD70 binos:207BB000+DA68 linux-vdso32:810000+3A0 c:1C4A1000+14B940 c:1C4A1000+7E04C c:1C4A1000+6DD6C c:1C4A1000+EAAA4 env:1E50A000+256FC env:1E50A000+24664
%PMAN-5-EXITACTION: R0/0: pvp: Process manager is exiting: Critical process cmcc fault on cc_0_0 (rc=134)
%PMAN-5-EXITACTION: R1/0: pvp: Process manager is exiting: Critical process cmcc fault on cc_1_0 (rc=134)
%PMAN-0-PROCFAILCRIT: R1/0: pvp: A critical process cmcc has failed (rc 134)
%PMAN-3-PROCHOLDDOWN: R1/0: pman: The process cmcc has been helddown (rc 134)
%PMAN-0-PROCFAILCRIT: R0/0: pvp: A critical process cmcc has failed (rc 134)
%PMAN-3-PROCHOLDDOWN: R0/0: pman: The process cmcc has been helddown (rc 134)
%CMCC-3-HB_TIMEOUT: R0/0: cmcc: Peroidic Heartbeat message from RP timed out.
%CMCC-3-HB_TIMEOUT: R1/0: cmcc: Peroidic Heartbeat message from RP timed out.
%CMFP-3-HB_TIMEOUT: R1/0: cman_fp: Peroidic Heartbeat message from RP timed out.
%EVENTLIB-3-CPUHOG: R0/0: cmand: read asyncon 0x212e813c: 1160ms, Traceback=1#ab88f16d26f26672c358112da7b2bec8 evlib:20471000+AC1C evlib:20471000+AD70 binos:207BB000+DA68 linux-vdso32:810000+3A0 pthread:1C647000+1276C env:1E50A000+24790 :20BAF000+11D7D0 :20BAF000+11DAD8 :20BAF000+E49F0 :20BAF000+B01D0
%EVENTLIB-3-CPUHOG: R0/0: cmand: read asyncon 0x212e813c: 2460ms, Traceback=1#ab88f16d26f26672c358112da7b2bec8 evlib:20471000+AC1C evlib:20471000+AD70 binos:207BB000+DA68 linux-vdso32:810000+3A0 c:1C4A1000+3DDA0 c:1C4A1000+60464 c:1C4A1000+71964 c:1C4A1000+6AC44 c:1C4A1000+EADF4 env:1E50A000+25410
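To check whether a collected log matches this signature, a small script can pick out the CPUHOG durations and the heartbeat timeouts. The following is only a minimal sketch: the function name `summarize` and the regular expressions are illustrative assumptions based on the message formats shown in this note, not part of any vendor tooling.

```python
import re

# Assumed patterns, derived only from the message formats quoted in this note.
CPUHOG_RE = re.compile(
    r"%EVENTLIB-3-CPUHOG: (?P<slot>\S+): (?P<proc>\w+): .*?: (?P<ms>\d+)ms"
)
HB_RE = re.compile(r"%(?P<facility>CMCC|CMFP)-3-HB_TIMEOUT: (?P<slot>\S+):")


def summarize(lines):
    """Return (cpu_hogs, hb_timeouts) found in an iterable of log lines.

    cpu_hogs    -> list of (slot, process, duration_ms)
    hb_timeouts -> list of (facility, slot)
    """
    cpu_hogs = []
    hb_timeouts = []
    for line in lines:
        m = CPUHOG_RE.search(line)
        if m:
            cpu_hogs.append((m["slot"], m["proc"], int(m["ms"])))
            continue
        m = HB_RE.search(line)
        if m:
            hb_timeouts.append((m["facility"], m["slot"]))
    return cpu_hogs, hb_timeouts


# Sample lines copied from the signature in this note.
SAMPLE = [
    "%EVENTLIB-3-CPUHOG: R0/0: cmand: read asyncon 0x212e813c: 2092ms, "
    "Traceback=1#ab88f16d26f26672c358112da7b2bec8 evlib:20471000+AC1C",
    "%CMCC-3-HB_TIMEOUT: R0/0: cmcc: Peroidic Heartbeat message from RP timed out.",
    "%CMFP-3-HB_TIMEOUT: R1/0: cman_fp: Peroidic Heartbeat message from RP timed out.",
]

if __name__ == "__main__":
    hogs, timeouts = summarize(SAMPLE)
    for slot, proc, ms in hogs:
        print(f"CPU hog on {slot} in {proc}: {ms} ms")
    for facility, slot in timeouts:
        print(f"{facility} heartbeat timeout on {slot}")
```

A log that shows repeated CPUHOG entries for cmand followed by CMCC and CMFP heartbeat timeouts is consistent with the symptom described here. The misspelling "Peroidic" is matched as it appears in the messages.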
Use an HA node.
On an HA box, the node switches over immediately when such repeated errors occur, so this problem can be avoided.