...
a) show logging shows multiple PM faults, including HW_OUTPUT_DISABLED. The set of faults varies from one iteration to the next.

sysadmin-vm:0_RP1#
0/RP1/ADMIN0:Oct 21 11:52:45.216 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Over Current Fault :DECLARE :0/PM0: Power module is under HW_OC_FAULT condition.
0/RP1/ADMIN0:Oct 21 11:52:45.216 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Over Voltage Fault :DECLARE :0/PM0: Power module is under HW_OV_FAULT condition.
0/RP1/ADMIN0:Oct 21 11:52:45.216 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Error (PM_INPUT_STAGE_OT) :DECLARE :0/PM0: Power module is under HW_INPUT_STAGE_OT condition.
0/RP1/ADMIN0:Oct 21 11:52:45.216 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Shutdown (PM_INTERNAL_FAULT) :DECLARE :0/PM0: Power module is under HW_FAULT_INDUCED_SHUTDOWN condition.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Fault (PM_THERMAL_SENSOR_FAULT) :DECLARE :0/PM0: Power module is under HW_THERMAL_SENSOR_FAULT condition.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Error (PM_NO_INPUT_DETECTED) :CLEAR :0/PM0: Power module condition HW_NO_INPUT_DETECTED is cleared.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Output Disabled :DECLARE :0/PM0: Power module is under HW_OUTPUT_DISABLED condition.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: envmon[3384]: %PKT_INFRA-FM-4-FAULT_MINOR : ALARM_MINOR :Power Module Warning(PEC Error) :DECLARE :0/PM0: Power module is under HW_PEC_ERR condition.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: shelf_mgr[3465]: %INFRA-SHELF_MGR-3-HW_FAILURE_EVENT : HW failure event HW_EVENT_FAILURE, event_reason_str 'HW Temp/Thermal Failure' for card 0/PM0
0/RP1/ADMIN0:Oct 21 11:52:51.400 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Over Current Fault :CLEAR :0/PM0: Power module condition HW_OC_FAULT is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.400 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Over Voltage Fault :CLEAR :0/PM0: Power module condition HW_OV_FAULT is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.400 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Error (PM_INPUT_STAGE_OT) :CLEAR :0/PM0: Power module condition HW_INPUT_STAGE_OT is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.400 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Shutdown (PM_INTERNAL_FAULT) :CLEAR :0/PM0: Power module condition HW_FAULT_INDUCED_SHUTDOWN is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.401 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Fault (PM_THERMAL_SENSOR_FAULT) :CLEAR :0/PM0: Power module condition HW_THERMAL_SENSOR_FAULT is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.401 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Error (PM_VIN_VOLT_OOR) :CLEAR :0/PM0: Power module condition HW_VIN_OUT_OF_RANGE is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.401 UTC: envmon[3384]: %PKT_INFRA-FM-4-FAULT_MINOR : ALARM_MINOR :Power Module Warning(PEC Error) :CLEAR :0/PM0: Power module condition HW_PEC_ERR is cleared.
0/RP1/ADMIN0:Oct 21 11:52:51.418 UTC: shelf_mgr[3465]: %INFRA-SHELF_MGR-3-HW_FAILURE_EVENT : HW failure event HW_EVENT_FAILURE, event_reason_str 'No Input or HW Power Failure' for card 0/PM0

b) admin show environment power shows the PSU in the FAILED or NO PWR state.

sysadmin-vm:0_RP1# show environment power location 0/PM7
Thu Oct 21 11:21:04.936 UTC+00:00
0/PM7 3kW-AC 326.1/0.0 /0.0 20.9/0.0 /0.0 10.4 89.1 FAILED or NO PWR

c) show alarms shows multiple alarms, including Power Module Output Disabled.

sysadmin-vm:0_RP1# show alarms
Thu Oct 21 11:19:48.467 UTC+00:00
-------------------------------------------------------------------------------
Active Alarms
-------------------------------------------------------------------------------
Location   Severity   Group     Set time            Description
-------------------------------------------------------------------------------
0/FT2      major      environ   10/21/21 10:03:25   Fan tray is removed from chassis.
0/PM7      major      environ   10/21/21 11:19:11   Power Module Output Disabled.
0/PM7      major      environ   10/21/21 11:19:17   Power Module Fault (LOGIC_CTRL_VOLT_OOR).
0/PM7      major      environ   10/21/21 11:19:17   Power Module Fan Fault.

d) The PSU Fail LED turns amber.

sysadmin-vm:0_RP1# show led location 0/PM7
Thu Oct 21 11:20:54.647 UTC+00:00
=============================================================
Location   LED Name     Mode      Color
=============================================================
0/PM7      0/PM7-Fail   WORKING   AMBER
           0/PM7-OK     WORKING   OFF
Inserting the PSU input feed cable during PEM OIR.
The issue does not occur on every OIR. If it does occur, recover by performing the PEM OIR again.
1. The only known trigger is PSU OIR. The issue is random and may not occur on every OIR.
2. Symptoms: the same as items a) through d) listed above (multiple PM faults in show logging including HW_OUTPUT_DISABLED, the PSU reported as FAILED or NO PWR in show environment power, multiple alarms in show alarms including Power Module Output Disabled, and an amber PSU Fail LED).
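
Because the set of faults in symptom a) changes with each iteration, it can help to reduce a show logging capture to the conditions that were declared but never cleared. The following is a minimal sketch, not a Cisco tool: the function name outstanding_conditions and the regular expression are assumptions written against the envmon message format seen in the capture above, and the sample text is a subset of those messages. In that capture, HW_OUTPUT_DISABLED is the one condition on 0/PM0 with a DECLARE and no later CLEAR.

```python
import re
from collections import defaultdict

# Assumed pattern, derived only from the envmon messages in this report:
#   ... :DECLARE :0/PM0: Power module is under HW_OC_FAULT condition.
#   ... :CLEAR :0/PM0: Power module condition HW_OC_FAULT is cleared.
ALARM_RE = re.compile(
    r":(?P<action>DECLARE|CLEAR) :(?P<loc>\d+/PM\d+): .*?\b(?P<cond>HW_[A-Z_]+)\b"
)


def outstanding_conditions(log_text):
    """Return {location: set of conditions declared but not cleared}."""
    active = defaultdict(set)
    # finditer over the whole text works whether messages are one per line
    # or run together on a single line.
    for m in ALARM_RE.finditer(log_text):
        if m["action"] == "DECLARE":
            active[m["loc"]].add(m["cond"])
        else:
            active[m["loc"]].discard(m["cond"])
    return {loc: conds for loc, conds in active.items() if conds}


# Subset of the messages captured in this report.
SAMPLE = """\
0/RP1/ADMIN0:Oct 21 11:52:45.216 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Over Current Fault :DECLARE :0/PM0: Power module is under HW_OC_FAULT condition.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Output Disabled :DECLARE :0/PM0: Power module is under HW_OUTPUT_DISABLED condition.
0/RP1/ADMIN0:Oct 21 11:52:45.217 UTC: shelf_mgr[3465]: %INFRA-SHELF_MGR-3-HW_FAILURE_EVENT : HW failure event HW_EVENT_FAILURE, event_reason_str 'HW Temp/Thermal Failure' for card 0/PM0
0/RP1/ADMIN0:Oct 21 11:52:51.400 UTC: envmon[3384]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Over Current Fault :CLEAR :0/PM0: Power module condition HW_OC_FAULT is cleared.
"""

if __name__ == "__main__":
    print(outstanding_conditions(SAMPLE))
```

A module that still lists HW_OUTPUT_DISABLED after the burst of CLEAR messages matches the pattern in this report; whether that alone is enough to identify this defect is not established here, so treat the output as a triage aid alongside symptoms b) through d).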