Symptom
N9K-GX2 switches may reboot unexpectedly due to a crash of the csusd process.
The following log entries are generated when this occurs:
%SYSMGR-SLOT1-2-SERVICE_CRASHED: Service "csusd" (PID 9424) hasn't caught signal 11 (core will be saved).
%SYSMGR-SLOT1-2-HAP_FAILURE_SUP_RESET: Service "csusd" in vdc 1 has had a hap failure
%MODULE-2-MOD_DIAG_FAIL: Module 1 (Serial number: XXXXXXXXXXX) reported failure due to Service on linecard had a hap-reset in device DEV_SYSMGR (device error 0xa65)
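To check collected syslog output for this signature, a small script can match the two SYSMGR messages above. This is a minimal sketch: the function name `find_csusd_crash` and the regular expressions are illustrative assumptions derived only from the sample lines in this note, not from an official message catalog.

```python
import re

# Patterns derived from the sample messages in this note (assumption:
# other occurrences keep the same wording; slot number and PID may vary).
CRASH_RE = re.compile(
    r'%SYSMGR-SLOT\d+-2-SERVICE_CRASHED: Service "csusd" \(PID (\d+)\)'
)
HAP_RE = re.compile(
    r'%SYSMGR-SLOT\d+-2-HAP_FAILURE_SUP_RESET: Service "csusd"'
)


def find_csusd_crash(lines):
    """Return (crash_pids, hap_reset_seen) for an iterable of syslog lines."""
    pids = []
    hap_reset_seen = False
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            pids.append(int(match.group(1)))
        if HAP_RE.search(line):
            hap_reset_seen = True
    return pids, hap_reset_seen


SAMPLE = """\
%SYSMGR-SLOT1-2-SERVICE_CRASHED: Service "csusd" (PID 9424) hasn't caught signal 11 (core will be saved).
%SYSMGR-SLOT1-2-HAP_FAILURE_SUP_RESET: Service "csusd" in vdc 1 has had a hap failure
"""

if __name__ == "__main__":
    print(find_csusd_crash(SAMPLE.splitlines()))
```

A match on both patterns, together with the reset reason and exception log shown in this note, is consistent with this defect; it is not proof on its own.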
`show system reset-reason`
----- reset reason for module 1 (from Supervisor in slot 1) ---
1) At 937014 usecs after Tue Aug 22 09:00:26 2023
Reason: Reset Requested due to Fatal Module Error
Service: System manager
Version: 10.2(4)
`show module internal exceptionlog module 1`
********* Exception info for module 1 ********
exception information --- exception instance 1 ----
Module Slot Number: 1
Device Id : 134
Device Name : System Manager
Device Errorcode : 0x00000a65
Device ID : 00 (0x00)
Device Instance : 00 (0x00)
Dev Type (HW/SW) : 10 (0x0a)
ErrNum (devInfo) : 101 (0x65)
System Errorcode : 0x401e008a Service on linecard had a hap-reset
Error Type : FATAL error
PhyPortLayer : 0x0
Port(s) Affected :
Error Description : csusd hap reset
DSAP : 0 (0x0)
UUID : 1 (0x1)
Time : (null) (Ticks: 64E408AA jiffies)
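The numeric fields in the exception log above can be cross-checked with a few lines of code. The sketch below rests on two assumptions inferred from this single output and not confirmed by the note: that the Device Errorcode packs Device ID, Device Instance, Dev Type, and ErrNum into one byte each, and that the Ticks value is a hexadecimal count of seconds since the Unix epoch. The helper names are hypothetical.

```python
from datetime import datetime, timezone


def split_device_errorcode(code):
    """Split a 32-bit Device Errorcode into four bytes (assumed layout)."""
    return {
        "device_id": (code >> 24) & 0xFF,
        "device_instance": (code >> 16) & 0xFF,
        "dev_type": (code >> 8) & 0xFF,
        "err_num": code & 0xFF,
    }


def ticks_to_utc(ticks_hex):
    """Interpret a hex Ticks value as Unix epoch seconds (assumption)."""
    return datetime.fromtimestamp(int(ticks_hex, 16), tz=timezone.utc)


if __name__ == "__main__":
    print(split_device_errorcode(0x00000A65))
    # {'device_id': 0, 'device_instance': 0, 'dev_type': 10, 'err_num': 101}
    print(ticks_to_utc("64E408AA").isoformat())
    # 2023-08-22T01:00:26+00:00
```

Under these assumptions the decoded fields agree with the Dev Type (0x0a) and ErrNum (0x65) lines, and the decoded time matches the minutes and seconds of the reset-reason timestamp (09:00:26), which would correspond to a device clock set eight hours ahead of UTC.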
Conditions
Affects only N9K-C9332D-GX2B and N9K-C9364D-GX2A switches
running NX-OS 10.2(6), 10.3(3), 10.4(1), or earlier releases.
Workaround
There is no workaround for this issue.
Further Problem Description
The fix for this issue has been committed in 10.2(7)M, 10.3(4a)M, 10.4(2)F and later releases.
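The conditions and fixed releases listed in this note can be expressed as a simple check for inventory audits. This is a hedged sketch, not an official tool: `is_affected` and `parse_version` are hypothetical helpers, the version parsing handles only the `major.minor(maintenance[letter])` form seen here, and treating release trains outside 10.2 through 10.4 as affected (older) or fixed (newer) is an assumption based on the "or earlier" and "and later" wording. A release such as 10.3(4), which sorts before 10.3(4a), is treated as affected; that is also an assumption.

```python
import re

# Platforms named under Conditions.
AFFECTED_MODELS = {"N9K-C9332D-GX2B", "N9K-C9364D-GX2A"}

# First fixed release per train, taken from the fix statement above.
FIRST_FIXED = {(10, 2): (7, ""), (10, 3): (4, "a"), (10, 4): (2, "")}

# Matches e.g. "10.2(4)" or "10.3(4a)M"; a trailing M/F suffix is ignored.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]?)\)")


def parse_version(text):
    """Return ((major, minor), (maintenance, letter)) for an NX-OS version."""
    match = VERSION_RE.match(text.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, maint, letter = match.groups()
    return (int(major), int(minor)), (int(maint), letter)


def is_affected(model, version):
    """True if the model and release fall under the conditions in this note."""
    if model not in AFFECTED_MODELS:
        return False
    train, build = parse_version(version)
    if train in FIRST_FIXED:
        return build < FIRST_FIXED[train]
    # Trains not named in the note: older ones are assumed affected,
    # newer ones are assumed to contain the fix.
    return train < min(FIRST_FIXED)


if __name__ == "__main__":
    print(is_affected("N9K-C9332D-GX2B", "10.2(4)"))
    print(is_affected("N9K-C9364D-GX2A", "10.3(4a)M"))
```

For example, the 10.2(4) release shown in the reset-reason output above is reported as affected on either listed platform, while 10.3(4a)M is not.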