...
Software Events

Symptom Code | Issue | Description | KB to address the issue
Symptom: 15 | node offline | Symptom: 15 Desc: node 0.4 was online, changing | Avamar - Troubleshooting Node Offline; GSAN Degraded Issues on an Avamar System (Resolution Path)
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: hfscheck of cp.20230405060524 failed on error: MSG_ERR_DDR_ERROR | Avamar-Data Domain Integration: Troubleshooting hfscheck failures with MSG_ERR_DDR_ERROR (Resolution Path)
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: hfscheck of cp.20230401142652 failed on error: MSG_ERR_CGSAN_FAILED | Avamar - Troubleshooting HFSCHECK failures due to MSG_ERR_CGSAN_FAILED (Resolution Path)
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: hfscheck of cp.20230331071423 failed on error: MSG_ERR_HFSCHECKERRORS | Avamar: Troubleshooting HFSCHECK Failures (Resolution Path)
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: hfscheck of cp.20230329134440 failed on error: MSG_ERR_TIMEOUT | Avamar - Troubleshooting HFSCHECK Failures due to MSG_ERR_TIMEOUT (Resolution Path)
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: hfscheck of cp.20230328190505 failed on error: MSG_ERR_NODE_DOWN | Avamar - Troubleshooting HFSCHECK failures due to MSG_ERR_NODE_DOWN
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: failed hfscheck maintenance with error MSG_ERR_NO_CHECKPOINT | Avamar - Troubleshooting HFSCHECK failures due to MSG_ERR_NO_CHECKPOINT
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: failed hfscheck maintenance with error MSG_ERR_INPROGRESS | Avamar - Troubleshooting HFSCHECK failures due to MSG_ERR_INPROGRESS
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: hfscheck of cp.20230306140026 failed on error: MSG_ERR_CMD_FAIL | Avamar - Troubleshooting HFSCHECK failures due to MSG_ERR_CMD_FAIL
Symptom: 4004 | hfscheck | Symptom: 4004 Desc: failed hfscheck maintenance with error MSG_ERR_ROLLING_CHECK | HFSCHECK Failed with MSG_ERR_ROLLING_CHECK Due to Avamar Server Time Shift
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_NOPARITY | Avamar - Troubleshooting Garbage Collection (GC) Failures (Resolution Path)
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_DISKFULL | Avamar OS Capacity (Resolution Path)
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_DDR_ERROR | Avamar-Data Domain Integration: Troubleshooting Connectivity issues/All Avamar maintenance activities fail with MSG_ERR_DDR_ERROR (Resolution Path)
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_BADTIMESYNC | Avamar - Symptom Code 4202 - failed garbage collection with error MSG_ERR_BADTIMESYNC
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_IO_ERROR | Avamar - Troubleshooting failed garbage collection with error MSG_ERR_IO_ERROR (Resolution Path)
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_MISC | Avamar - Troubleshooting Garbage Collection Failing with MSG_ERR_MISC (Resolution Path)
Symptom: 4202 | garbage collection | Symptom: 4202 Desc: failed garbage collection with error MSG_ERR_MISMATCH | Avamar - Troubleshooting Garbage Collection (GC) Failures (Resolution Path)
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_DDR_ERROR | Avamar-Data Domain Integration: Troubleshooting Connectivity issues/All Avamar maintenance activities fail with MSG_ERR_DDR_ERROR (Resolution Path)
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_ACCESSMODE | Avamar checkpoint fails with MSG_ERR_ACCESSMODE due to incomplete GSAN index stripes splitting operation
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_DISKFULL | Avamar maintenance tasks fail with "MSG_ERR_DISKFULL" due to data partition operating system capacity >89%
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_TIMEOUT | Avamar - (Internal Only) Checkpoint fails with MSG_ERR_TIMEOUT
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_BADTIMESYNC | Avamar: Checkpoint failed with result "MSG_ERR_BADTIMESYNC."
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_EXCEPTION | Avamar: Checkpoint failed with result MSG_ERR_EXCEPTION and Unix exception "Too many links" due to File System Corruption
Symptom: 4302 | checkpoint | Symptom: 4302 Desc: failed checkpoint maintenance with error MSG_ERR_OFFLINE | Avamar - Troubleshooting Checkpoint Failures (Resolution Path)
Symptom: 22407 | MC flush | Symptom: 22407 Desc: Flush of Administrator Server data to server is overdue. | Flush of Administrator Server data to server is overdue
Symptom: 22409 | hfscheck | Symptom: 22409 Desc: A checkpoint validation (hfscheck) of server checkpoint data is overdue. | Avamar - Symptom Code 22409 - Desc: A checkpoint validation (hfscheck) of server checkpoint data is overdue (Resolution Path)
Symptom: 22416 | Capacity | Symptom: 22416 Desc: The server storage has exceeded maximum operating capacity | Avamar Capacity Troubleshooting, Issues and Questions - All Capacity (Resolution Path)
Symptom: 22632 | partition suspended | Symptom: 22632 Desc: A server disk has become suspended. | Suspended Partitions, Stripes and Hfscheck Failures on Avamar (Symptom Code 22632)
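When triaging, it can help to split an event string into its symptom code and `MSG_ERR_*` token before looking up the KB title. The following is only a minimal sketch: the names `parse_event`, `lookup_kb`, and `KB_BY_SYMPTOM` are hypothetical (they are not part of any Avamar tool), the event strings are assumed to follow the `Symptom: <code> Desc: <text>` shape shown in the software events above, and the dictionary holds just a few entries copied from those events.

```python
import re

# Hypothetical lookup table: a few (symptom code, error token) pairs and the
# KB titles listed for them in the software events above. Not exhaustive.
KB_BY_SYMPTOM = {
    ("4004", "MSG_ERR_TIMEOUT"):
        "Avamar - Troubleshooting HFSCHECK Failures due to MSG_ERR_TIMEOUT (Resolution Path)",
    ("4202", "MSG_ERR_DISKFULL"):
        "Avamar OS Capacity (Resolution Path)",
    ("4302", "MSG_ERR_OFFLINE"):
        "Avamar - Troubleshooting Checkpoint Failures (Resolution Path)",
    ("22416", None):
        "Avamar Capacity Troubleshooting, Issues and Questions - All Capacity (Resolution Path)",
}

# Assumed event shape: "Symptom: <code> Desc: <free text>".
EVENT_RE = re.compile(r"Symptom:\s*(?P<code>[\w.]+)\s+Desc:\s*(?P<desc>.*)")
ERROR_RE = re.compile(r"MSG_ERR_[A-Z_]+")


def parse_event(text):
    """Return (symptom_code, error_token_or_None), or None if the text does not match."""
    match = EVENT_RE.search(text)
    if match is None:
        return None
    error = ERROR_RE.search(match.group("desc"))
    return match.group("code"), (error.group(0) if error else None)


def lookup_kb(text):
    """Return the KB title for an event string, or None when it is not in the table."""
    key = parse_event(text)
    if key is None:
        return None
    return KB_BY_SYMPTOM.get(key)


if __name__ == "__main__":
    event = "Symptom: 4004 Desc: hfscheck of cp.20230329134440 failed on error: MSG_ERR_TIMEOUT"
    print(parse_event(event))
    print(lookup_kb(event))
```

Keying on the pair rather than on the code alone reflects how the events above are organized: codes such as 4004, 4202, and 4302 map to different articles depending on the `MSG_ERR_*` value.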
N/A
Hardware Events

Symptom Code | Issue | Description | KB to address the issue
Symptom: 52600 | HW | Symptom: 52600 Desc: HARDWARE: Aug 13 21:06:17 avamarnode01 ipmiutil: igetevent: 021f 08/13/22 20:37:50 CRT BMC Processor #80 CATERR Proc Config Error 03 [a1 01 01] | -
Symptom: 52600 | HW | Symptom: 52600 Desc: HARDWARE: Apr 19 13:51:10 avamarnode01 ipmiutil: igetevent: 00a5 04/19/22 13:51:09 CRT BMC Critical Interrupt #05 FP NMI Diag Int FP NMI 6f [00 ff ff] | Avamar Gen4s Hardware >> Symptom Code 52600 - ipmiutil: igetevent: CRT BMC Critical Interrupt #05 FP NMI Diag Int FP NMI
Symptom: 52601 | HW | Symptom: 52601 Desc: HARDWARE: Mar 23 14:32:40 avamarnode01 ipmiutil: igetevent: 006d 03/22/22 21:58:44 MAJ EFI System Firmware #06 POST Err Sensor POST Code 8190 6f [a0 90 81] | -
Symptom: 52607 | HW | Symptom: 52607 Desc: HARDWARE: Dec 15 20:27:19 avamarnode01 ipmiutil: igetevent: 00fd 12/15/22 20:57:17 MAJ BMC Fan #32 System Fan 2 Lo Crit thresh actual=1666.00 RPM, threshold=1715.00 RPM | Avamar - Gen4s Hardware >> Symptom Code 52607 - MAJ BMC Fan #32 System Fan 2 Lo Crit thresh actual=x RPM threshold=1715.00 RPM (Resolution Path)
Symptom: 52617 | HW | Symptom: 52617 Desc: HARDWARE: Jan 5 04:36:18 avamarnode01 MR_MONITOR[5766]: Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:2Hardware impending failure general hard drive | Avamar Gen4s Hardware >> Symptom Code 52617 - Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:1Hardware impending failure general hard drive failure, CDB = 0x03 0x00 0x00 0x00 0x40 0x00 , Sense = 0x70 0x00 0x00 0x00
Symptom: 52617 | HW | Symptom: 52617 Desc: HARDWARE: Jan 8 03:30:41 avamarnode01 MR_MONITOR[5728]: Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:0Write protected, CDB = 0x2e 0x0 | Avamar Gen4s Hardware >> Symptom Code 52617 - Controller ID:0 Unexpected sense: PD = Port 0 - 3:2:0Write protected, CDB = 0x2e (Physical Disk); Avamar Gen4s Hardware >> Symptom Code 52617 - Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:12Write protected CDB = 0x2e (SSD)
Symptom: 52617 | HW | Symptom: 52617 Desc: HARDWARE: Jan 13 12:25:06 avamarnode01 MR_MONITOR[17160]: Controller ID: 0 Unexpected sense: PD #012 = Port 0 - 3:2:8Internal target failure, CDB = | -
Symptom: 52617 | HW | Symptom: 52617 Desc: HARDWARE: Jun 11 15:05:40 avamarnode01 MR_MONITOR[6518]: Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:12Information unit CRC error detected, CDB | Avamar Gen4s Hardware >> Symptom Code 52617 - Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:9Information unit CRC error detected (Physical Disk); Avamar Gen4s Hardware >> Symptom Code 52617/52623 - Controller ID: 0 Unexpected sense: PD = Port 0 - 3:2:12Information unit CRC error detected (SSD)
Symptom: 52617 | HW | Symptom: 52617 Desc: HARDWARE: Oct 28 00:00:09 avamarnode01 MR_MONITOR[2768]: Controller ID: 0 Unexpected sense: PD #012 = -:-:0Write error - recommend reassignment, CDB = 0 | Avamar: Gen4s Hardware: Symptom Code 52617 - Unexpected sense: PD = Port 0 - 3:2:12. Write protected (SSD)
Symptom: 52618 | HW | Symptom: 52618 Desc: HARDWARE: Jan 1 12:46:56 avamarnode01 MR_MONITOR[26119]: Controller ID: 0 PD Predictive failure: #012 Port 0 - 3:2:4 | Avamar Gen4s Hardware >> Symptom Code 52618 - Controller ID: 0 PD Predictive failure: Port 0 - #:#:# (Physical Disk ONLY)
Symptom: 52619 | HW | Symptom: 52619 Desc: HARDWARE: Jan 24 17:10:54 avamarnode01 MR_MONITOR[5797]: Controller ID: 0 Fatal firmware error: Line 866 in ../../raid/2108vI2o.c Event ID:15 | -
Symptom: 52620 | HW | Symptom: 52620 Desc: HARDWARE: Feb 9 08:10:07 avamarnode01 MR_MONITOR[2440]: Controller ID: 0 Battery temperature is high | Avamar - Gen4s Hardware >> Symptom Code 52620 - Controller ID: 0 Battery temperature is high
Symptom: 52621 | HW | Symptom: 52621 Desc: HARDWARE: Feb 10 06:56:26 avamarnode01 MR_MONITOR[8760]: Controller ID: 0 BBU disabled; changing WB logical drives to WT, Forced WB VDs are not affected | Avamar : Gen4s Hardware >> Symptom Code 52621 - BBU disabled; changing WB logical drives to WT, Forced
Symptom: 52622 | HW | Symptom: 52622 Desc: HARDWARE: Mar 1 19:20:13 avamarnode01 MR_MONITOR[6682]: Controller ID: 0 Error: Port 0 - 3:2:0 ( Error 250) | -
Symptom: 52622 | HW | Symptom: 52622 Desc: HARDWARE: Mar 20 18:12:22 avamarnode01 MR_MONITOR[18283]: Controller ID: 0 PD Reset: PD #012 = Port 0 - 3:2:0, Error #012 = 3, Path =#012 0 | Avamar - Gen4s Hardware >> Symptom Code 52622 - Controller ID: 0 PD Reset: PD = Port 0 - 3:2:3 Error = 3 Path = 0x5001517E3B0BF0A3
Symptom: 52623 | HW | Symptom: 52623 Desc: HARDWARE: MR_MONITOR[2292]: Controller ID: 0 Unexpected sense: PD #012 = Port 0 - 3:2:11Power on, reset, or bus device reset | Avamar - Gen4S Hardware >> Power on, reset, or bus device reset occurred (POR) - Single Disk
Symptom: 52626 | HW | Symptom: 52626 Desc: HARDWARE: Jan 1 12:55:50 avamarnode01 MR_MONITOR[6681]: Controller ID: 0 Command timeout on PD: PD = Port 0 - 3:2:12No addtional sense informatio | Avamar Gen4S Hardware >> Symptom Code 52622/52626 - Power On, reset, or bus device reset occurred (POR) - Multiple Disks
Symptom: 52627 | HW | Symptom: 52627 Desc: HARDWARE: Oct 21 14:43:11 avamarnode01 ipmiutil: igetevent: 0044 01/01/05 00:00:24 MAJ BMC Voltage #de BB +3.3V Vbat Lo Crit thresh actual=0.89 V, threshold=2.14 V | Avamar Gen4S Hardware >> Symptom Code 52601/52627 - MAJ BMC Voltage #de BB +3.3V Vbat Lo Crit thresh actual=2.05 V, threshold=2.14 V
Symptom: 52628 | HW | Symptom: 52628 Desc: HARDWARE: Apr 21 12:10:53 avamarnode01 MR_MONITOR[1675]: Controller ID: 0 Single-bit ECC error; critical threshold#012 exceeded: ECAR #012 = 959688528 | Gen4s Hardware >> Symptom Code 52619/52628 - Controller ID: 0 Single-bit ECC error; critical threshold exceeded: ECAR = 437816864 , ELOG = 73728, ( Src: Data Bits lane bitmap=0001, bank bitmap=00, elog 12000)
Symptom: 52630 | HW | Symptom: 52630 Desc: HARDWARE: Jan 15 14:34:32 avamarnode01 ipmiutil: igetevent: 009e 01/15/22 14:34:27 CRT Bios Critical Interrupt #04 PCIe Fat Sensor PCIe Fatal Surprise Link Down (00:02.0) 70 [a1 | Avamar Gen4s Hardware >> Symptom Code 52600/52630 - CRT Bios Critical Interrupt PCIe Fat Sensor PCIe Fatal Surprise Link Down
Symptom: 52630 | HW | Symptom: 52630 Desc: HARDWARE: Feb 9 13:00:37 avamarnode01 ipmiutil: igetevent: 0199 02/09/22 12:58:48 MAJ Bios Critical Interrupt #05 PCIe Cor Sensor PCIe Warn Receiver Error (00:01.1) 71 [a0 0 | Avamar Gen4s Hardware >> Symptom Code 52601/52630 - MAJ Bios Critical Interrupt #05 PCIe Cor Sensor PCIe Warn Receiver Error
Symptom: 52632 | HW | Symptom: 52632 Desc: HARDWARE: Jul 20 17:14:47 avamarnode01 ipmiutil: igetevent: 00cf 07/20/22 17:12:13 MAJ BMC Memory #c0 Mem P1 Thrm Trip ECC limit reached, DIMM[63] 6f [2a ff 19] | -
Symptom: 52633 | HW | Symptom: 52633 Desc: HARDWARE: Mar 10 18:44:53 avamarnode01 MR_MONITOR[5703]: Controller ID: 0 Battery has failed and cannot support data retention. Please replace the battery Event | Gen4s Hardware >> Symptom Code 52619/52633 - Controller ID: 0 Battery has failed and cannot support data retention. Please replace the battery Event ID:150
Symptom: 52700 | HW | Symptom: 52700 Desc: HARDWARE: Jan 19 11:05:05 avamarnode01 ipmiutil: igetevent-gen4t: 00c4 01/19/22 11:04:59 CRT BMC Shutdown In Progress #f4 Delayed Reb Reason Code: 0x0e 6f [81 0e ff] | Avamar : Gen4T Hardware >> Symptom Code 52700 - HW_CRIT_BMC_ERR_G4T
Symptom: 52701 | HW | Symptom: 52701 Desc: HARDWARE: Dec 28 15:03:40 avamarnode01 ipmiutil: igetevent-gen4t: 011c 12/28/21 14:13:06 MAJ BMC CMD Status #d2 Power Good (deasserted) 6f [00 ff ff] | Avamar : Gen4T Hardware >> Symptom Code 52701 - HW_MAJOR_BMC_ERR_G4T
Symptom: 52704 | HW | Symptom: 52704 Desc: HARDWARE: Jul 1 00:35:20 avamarnode01 ipmiutil: igetevent-gen4t: 014c 07/01/21 00:35:15 MAJ BMC Temperature #31 DIMM_Bank1 Hi Crit thresh actual=127.00 C, threshold=90.00 C | -
Symptom: 52706 | HW | Symptom: 52706 Desc: HARDWARE: Feb 25 11:39:56 avamarnode01 ipmiutil: igetevent-gen4t: 0158 02/25/22 11:39:50 MIN BMC Temperature #5f Ambient_Temp Hi Noncrit thresh actual=48.00 C, threshold=48. | -
Symptom: 52710 | HW | Symptom: 52710 Desc: A fan sensor detected a fan speed that has exceeded the non-critical rpm threshold within the Gen4T system. | -
Symptom: 52712 | HW | Symptom: 52712 Desc: HARDWARE: Apr 20 19:12:53 avamarnode01 ipmiutil: igetevent-gen4t: 0003 01/01/70 00:00:01 MAJ BMC Voltage #d8 CMOS_Voltage Lo Crit thresh actual=0.96 V, threshold=2.20 V | -
Symptom: 52714 | HW | Symptom: 52714 Desc: HARDWARE: Jul 18 13:59:53 avamarnode01 hwfaultd: DIMM 7 Fault | -
Symptom: 52716 | HW | Symptom: 52716 Desc: HARDWARE: Feb 3 16:20:57 avamarnode01 hwfaultd: SLIC 1 Fault | -
Symptom: 52717 | HW | Symptom: 52717 Desc: HARDWARE: Feb 6 15:22:52 avamarnode01 hwfaultd: PS 0 Fault | -
Symptom: 52718 | HW | Symptom: 52718 Desc: HARDWARE: Jan 18 11:07:04 avamarnode01 hwfaultd: Fan 0 Fault | -
Symptom: 52719 | HW | Symptom: 52719 Desc: HARDWARE: Feb 20 00:24:41 avamarnode01 hwfaultd: I2C 7 Fault | -
Symptom: 52722 | HW | Symptom: 52722 Desc: HARDWARE: Feb 28 19:02:01 avamarnode01 hwfaultd: CPU Module Fault | -
Symptom: 52723 | HW | Symptom: 52723 Desc: HARDWARE: Jan 3 08:46:41 avamarnode01 hwfaultd: Expander Unrecoverable Error | Avamar Server: Gen4T M2400 storage nodes became unresponsive and reported "hwfaultd: Expander Unrecoverable Error" & spontaneous expander reset
Symptom: 52724 | HW | Symptom: 52724 Desc: HARDWARE: Feb 17 11:31:21 avamarnode01 hwfaultd: Expander Critical Error | -
Symptom: 52725 | HW | Symptom: 52725 Desc: HARDWARE: Jul 25 17:34:57 avamarnode01 hwfaultd: Expander Non-Critical Error | Avamar Gen4T: hwfaultd: Expander Non-Critical Error
Symptom: 52726 | HW | Symptom: 52726 Desc: HARDWARE: Aug 24 22:58:25 avamarnode01 resume_updater: libplatform failed to initialize. | Avamar - libplatform failed to initialize
Symptom: 52728 | HW | Symptom: 52728 Desc: HARDWARE: Jan 1 14:00:07 avamarnode01 resume_updater: Chassis resume settings cannot be read. | Avamar : Gen4T Hardware >> Symptom Code 52728 - HW_RESUPD_CHASSIS_ERR_G4T
Symptom: 52728 | HW | Symptom: 52728 Desc: HARDWARE: Jan 3 09:15:04 avamarnode01 resume_updater: The chassis is not an Avamar system, and there is no saved resume settings. It needs to be manually set up. | Avamar Gen4T: Error "The chassis is not an Avamar system, and there is no saved resume settings" after SP replacement
Symptom: 52728 | HW | Symptom: 52728 Desc: HARDWARE: Jan 1 01:00:02 avamarnode01 resume_updater: Chassis resume settings do not match saved values. Need examination. | Avamar - Gen4T: Chassis resume settings do not match saved values
Symptom: 52750 | HW | Symptom: 52750 Desc: HARDWARE: Jun 4 13:50:12 Adaptec Event Monitor: [10001] :ERR: A controller has been removed from the system: controller 1 | -
Symptom: 52751 | HW | Symptom: 52751 Desc: HARDWARE: Jan 10 22:50:02 avamarnode01 Adaptec Event Monitor: [11003] :WRN: One or more logical devices contain a bad stripe: controller 1 ( PM8060-RAID #00192603647 Physical Slo | -
Symptom: 52755 | HW | Symptom: 52755 Desc: HARDWARE: Jan 3 04:02:14 avamarnode01 Adaptec Event Monitor: [12000] :WRN: Logical device is degraded: controller 1 ( PM8060-RAID #FFFFFF00 Physical Slot: 0 ), logical dev | Avamar : Gen4T Hardware >> Symptom Code 52755 - HW_VDISK_LOST_REDUND_G4T
Symptom: 52756 | HW | Symptom: 52756 Desc: HARDWARE: Mar 5 01:25:55 avamarnode01 Adaptec Event Monitor: [12005] :ERR: Rebuild failed: controller 1 ( PM8060-RAID #BD162100991 Physical Slot: 0 ), logical device 0 (vd0 | -
Symptom: 52759 | HW | Symptom: 52759 Desc: HARDWARE: Jan 3 18:13:01 avamarnode01 Adaptec Event Monitor: [13000] :ERR: Failed drive: controller: 1 ( PM8060-RAID #FFFFFF00 Physical Slot: 0 ), channel: 0, deviceID: 11, encl | -
Symptom: 52764 | HW | Symptom: 52764 Desc: HARDWARE: Jan 2 16:32:12 avamarnode01 Adaptec Event Monitor: [13016] :WRN: Bad Block discovered: controller: 1 | Avamar : Gen4T Hardware >> Symptom Code 52764 - HW_PHYS_DISK_BAD_BLOCK_G4T
Symptom: 52766 | HW | Symptom: 52766 Desc: The wearout threshold was reach on an SSD drive. | -
Symptom: 52768 | HW | Symptom: 52768 Desc: HARDWARE: Nov 22 04:41:23 avamarnode01 Adaptec Event Monitor: [14006] :WRN: Backup unit temperature exceeded the over-heat threshold: controller 1 ( PM8060-RAID #00190136942 Ph | -
Symptom: 52769 | HW | Symptom: 52769 Desc: HARDWARE: Mar 13 09:15:47 avamarnode01 Adaptec Event Monitor: [14009] :ERR: Backup unit encountered a fatal condition: controller 1 | -
Symptom: 52805 | HW | Symptom: 52804 Desc: HARDWARE: Jun 21 23:01:02 avamarnode01 hardware_monitor[55820]: MessageID:MEM9072, Created:2023-06-21T18:00:57-05:00, Severity:Critical, Message | Avamar ADS Gen5: Event code=52804 A memory device error occurred
Symptoms: 29.AAA1.902 | High Incoming Box (HIB) | Symptom: 29.AAA1.902 | Troubleshooting SYR HIB Service Requests (Resolution Path)