Symptoms
A WHEA_UNCORRECTABLE_ERROR (124) stop error occurs on a PowerEdge server running Windows after a fatal error on the server's PCI bus. The system event log shows:
Log Name: System
Source: Microsoft-Windows-WER-SystemErrorReporting
Date: 4/01/2014 9:00:12 AM
Event ID: 1001
Task Category: None
Level: Information
Keywords: Classic
User: N/A
Computer: computername.domainname.local
Description:
The computer has rebooted from a bugcheck. The bugcheck was: 0x00000124 (0x0000000000000005, 0xfffffa800d0ce028, 0x0000000000000000, 0x0000000000000000). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 031614-31356-01
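When several servers need to be checked, the bugcheck code and its four parameters can be pulled out of the event description programmatically. The following is a minimal sketch that assumes the description text has already been exported as a string; the function name parse_bugcheck and the regular expression are illustrative, not part of any Windows tooling.

```python
import re

# Matches the bugcheck sentence written in Event ID 1001 descriptions.
# The pattern is an assumption based on the sample event shown above.
BUGCHECK_RE = re.compile(r"The bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)")


def parse_bugcheck(description):
    """Return (code, [param1, ..., param4]) as integers, or None if not found."""
    match = BUGCHECK_RE.search(description)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


sample = (
    "The computer has rebooted from a bugcheck. The bugcheck was: 0x00000124 "
    "(0x0000000000000005, 0xfffffa800d0ce028, 0x0000000000000000, "
    "0x0000000000000000). A dump was saved in: C:\\Windows\\MEMORY.DMP. "
    "Report Id: 031614-31356-01"
)

code, params = parse_bugcheck(sample)
print(hex(code), [hex(p) for p in params])
```

A code of 0x124 with a first parameter of 0x5 corresponds to the event shown above.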
The hardware log for the PowerEdge server has entries similar to the following, indicating a fatal error on the PCI bus.
Sun Mar 16 03:28:02 2014 A bus fatal error was detected on a component at bus 0 device 2 function 0. 0x030002421A2553B1000413186FAA1000h
Sun Mar 16 03:28:02 2014 A bus fatal error was detected on a component at slot 5. 0x020002421A2553B1000413186FAA0085h
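The bus/device/function and slot numbers in these entries identify which component to inspect first. As a rough sketch, assuming the hardware log has been exported as plain text in the format shown above, the location can be extracted as follows; the function name parse_location and the pattern are hypothetical and may need adjusting for other log formats.

```python
import re

# Matches either "bus B device D function F" or "slot S" in a
# bus fatal error entry. Based only on the two sample lines above.
LOCATION_RE = re.compile(
    r"bus fatal error was detected on a component at "
    r"(?:bus (?P<bus>\d+) device (?P<device>\d+) function (?P<function>\d+)"
    r"|slot (?P<slot>\d+))"
)


def parse_location(line):
    """Return a dict of the location fields found in the line, or None."""
    match = LOCATION_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()
            if value is not None}


entries = [
    "Sun Mar 16 03:28:02 2014 A bus fatal error was detected on a component "
    "at bus 0 device 2 function 0. 0x030002421A2553B1000413186FAA1000h",
    "Sun Mar 16 03:28:02 2014 A bus fatal error was detected on a component "
    "at slot 5. 0x020002421A2553B1000413186FAA0085h",
]

locations = [parse_location(entry) for entry in entries]
for location in locations:
    print(location)
```

In the sample entries, the second line points to slot 5, which is the natural starting point for the reseat step in the resolution.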
The resulting dump file contains information similar to the following:
WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error condition.
Arguments:
Arg1: 0000000000000005, Generic Error
Arg2: fffffa800d0ce028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 0000000000000000
Arg4: 0000000000000000
Debugging Details:
------------------
BUGCHECK_STR: 0x124_GenuineIntel
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT_SERVER
CURRENT_IRQL: 0
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: GenuineIntel
IMAGE_NAME: GenuineIntel
DEBUG_FLR_IMAGE_TIMESTAMP: 0
FAILURE_BUCKET_ID: X64_0x124_GenuineIntel_PCIEXPRESS
BUCKET_ID: X64_0x124_GenuineIntel_PCIEXPRESS
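Arg1 and Arg2 from the dump can be turned into something more actionable. The sketch below maps Arg1 to an error source description and builds the WinDbg !errrec command for the WHEA_ERROR_RECORD address in Arg2. Only the value 0x5 (Generic Error) is confirmed by the dump above; the other table entries are an assumption based on Microsoft's bug check 0x124 documentation and should be verified there. The function name describe_whea is illustrative.

```python
# Arg1 values for bug check 0x124. Only 0x5 is confirmed by the dump
# in this article; treat the remaining entries as assumptions to verify.
ERROR_SOURCES = {
    0x0: "Machine check exception",
    0x1: "Corrected machine check",
    0x2: "Corrected platform error",
    0x3: "Nonmaskable interrupt",
    0x4: "Uncorrectable PCI Express error",
    0x5: "Generic error",
}


def describe_whea(arg1, arg2):
    """Return (error source description, debugger command for the record)."""
    source = ERROR_SOURCES.get(arg1, "Unknown error source")
    # !errrec takes the address of the WHEA_ERROR_RECORD structure.
    command = "!errrec {:016x}".format(arg2)
    return source, command


source, command = describe_whea(0x5, 0xfffffa800d0ce028)
print(source)
print(command)
```

Running the generated command in the debugger with the dump loaded displays the error record, which typically names the device that reported the fault.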
Cause
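A hardware problem causes this issue. Stop error 0x124 (WHEA_UNCORRECTABLE_ERROR) indicates that a fatal hardware error has occurred. In this case, the PCIEXPRESS bucket ID in the dump and the bus fatal error entries in the hardware log point to the PCI bus.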
Resolution
1. Determine the hardware component responsible for the PCI bus fatal error. Reseat that device and monitor.
2. Perform diagnostics on the hardware component responsible for the PCI bus fatal error and replace if it is defective.
3. Remove any other PCI devices installed in other slots. Boot the server and monitor it.
4. Reseat the processors, boot the server, and monitor it.
5. Perform diagnostics on the motherboard and processors.
6. Replace anything found to be defective.