...
Occurred without any change
Environment:
VMware ESXi 6.7.0 Update 3
PowerPath/VE 6.6.0.00.00-b117
Server UCSB-B200-M5
PSOD on ESXi host:
Asked to check with VMware and here is what VMware had to say:
===============================================================================
- See FNIC aborts that can lead to PSOD:
2020-05-31T07:40:13.397Z cpu42:2101014)ALERT: NMI: 696: NMI IPI: RIPOFF(base):RBP:CS [0x2e4965(0x418021800000):0x45026ec42009:0xfc8] (Src 0x1, CPU42)
2020-05-31T07:40:53.701Z cpu0:2100039)Backtrace for current CPU #0, worldID=xxxxxxx, fp=0xxxxxxxxxxxxx
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049b7d0:[0x41802190b845]PanicvPanicInt@vmkernel#nover+0x439 stack: 0x3eb2244c50f5d4, 0x418021cd9e48, 0x451b0049b878, 0x0, 0x418000000001
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049b870:[0x41802190bad1]Panic_WithBacktrace@vmkernel#nover+0x56 stack: 0x451b0049b8e0, 0x451b0049b890, 0x451b0049b91c, 0x4180219081d9, 0x2a
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049b8e0:[0x418021af71d0]Heartbeat_DetectCPULockups@vmkernel#nover+0x431 stack: 0x36b0, 0x0, 0xbf68, 0x321b48878, 0x202
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049b960:[0x41802191d252]Timer_BHHandler@vmkernel#nover+0xe3 stack: 0x3eb2244ac00000, 0x451a00000000, 0x1, 0x418040000000, 0x4
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049b9e0:[0x4180218cd1ca]BH_DrainAndDisableInterrupts@vmkernel#nover+0x7b stack: 0x451b0049bab0, 0xef000000ff, 0x0, 0x4180400004d0, 0xffffffffffffffff
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049ba70:[0x4180218f0caa]IntrCookie_VmkernelInterrupt@vmkernel#nover+0xb3 stack: 0xffffffffffffffef, 0x41802194717d, 0x451b0049bb40, 0x0, 0x0
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049ba90:[0x41802194717c]IDT_IntrHandler@vmkernel#nover+0x9d stack: 0x0, 0x418021962067, 0x10b, 0x0, 0x0
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bab0:[0x418021962066]gate_entry@vmkernel#nover+0x67 stack: 0x0, 0x0, 0x0, 0x0, 0x1
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bb70:[0x41802189b89d]Power_ArchPerformWait@vmkernel#nover+0x75 stack: 0x4180400007c0, 0x0, 0x7fffffff00000000, 0x1, 0x418040000000
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bb78:[0x41802189ba26]Power_ArchSetCState@vmkernel#nover+0x8f stack: 0x0, 0x7fffffff00000000, 0x1, 0x418040000000, 0x0
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bbc8:[0x418021b099f2]CpuSchedIdleLoopInt@vmkernel#nover+0x333 stack: 0x450252486000, 0x0, 0x0, 0x1, 0x418040000000
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bc38:[0x418021b0c865]CpuSchedDispatch@vmkernel#nover+0x13de stack: 0x418040000108, 0x451ac2ca3100, 0x451ac00a3100, 0x451b033a3500, 0x418040000080
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bd68:[0x418021b0dd69]CpuSchedWait@vmkernel#nover+0x2f6 stack: 0x1, 0x80000000009b56dc, 0x451b004a32c0, 0x101b0846d, 0x0
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bdf8:[0x418021b0e322]CpuSched_EventQueueWaitShared@vmkernel#nover+0x4f stack: 0x0, 0x4321e1ee7ed0, 0x9146e516dc, 0x418021f94018, 0x4321e1ee7ec8
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049be28:[0x418021f94017]UserThread_QueueWait@(user)#+0x34 stack: 0xd74367, 0x0, 0x3ca08ef, 0x418021fa47ae, 0xffffffffffffffff
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049be58:[0x418021fa47ad]LinuxThread_Futex@(user)#+0x282 stack: 0x121faf160, 0x145a41600, 0x451b0049bec8, 0x418021f48282, 0x9146e516dc
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bee8:[0x418021f4a31b]User_LinuxSyscallHandler@(user)#+0x180 stack: 0x451b0049bfc8, 0x0, 0x0, 0x0, 0x0
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bf28:[0x41802192bb6c]User_LinuxSyscallHandler@vmkernel#nover+0x1d stack: 0x10b, 0x0, 0x0, 0xca, 0x9189fcedbc
2020-05-31T07:40:53.701Z cpu0:2100039)0x451b0049bf38:[0x418021962066]gate_entry@vmkernel#nover+0x67 stack: 0x0, 0xca, 0x9189fcedbc, 0xd74367, 0x9146e51680
2020-05-31T07:40:53.713Z cpu0:2100039)VMware ESXi 6.7.0 [Releasebuild-14320388 x86_64]
2020-05-31T07:38:49.169Z cpu39:2097900)nfnic: : INFO: fnic_abort_cmd: 2801: Abort cmd called for Tag: 0x401 issued time: 30001 ms CMD_STATE: FNIC_IOREQ_CMD_PENDING CDB Opcode: 0x4d sc:0x45a3134249c0 flags: 0x3 lun: 1 target: 0xa00c0
2020-05-31T07:38:49.169Z cpu39:2097900)WARNING: nfnic: : fnic_abort_cmd: 2815: Abort for cmd tag: 0x401 in pending state
2020-05-31T07:38:51.178Z cpu43:2097195)nfnic: : INFO: fnic_fcpio_icmnd_cmpl_handler: 1489: io_req: 0x45a30be00890 sc: 0x45a3134249c0 tag: 0x401 CMD_FLAGS: 0x53 CMD_STATE:FNIC_IOREQ_ABTS_PENDING ABTS pending hdr status: FCPIO_ABORTED scsi_status: 0x0$
2020-05-31T07:38:51.178Z cpu43:2097195)nfnic: : INFO: fnic_fcpio_itmf_cmpl_handler: 1940: fcpio hdr status: FCPIO_TIMEOUT
2020-05-31T07:38:51.178Z cpu43:2097195)nfnic: : INFO: fdls_tgt_send_adisc: 895: sending ADISC to tgt: 0xa00c0
2020-05-31T07:38:51.178Z cpu43:2097195)nfnic: : INFO: fnic_fcpio_itmf_cmpl_handler: 1987: io_req: 0x45a30be00890 sc: 0x45a3134249c0 id: 0x401 CMD_FLAGS: 0x73 CMD_STATE: FNIC_IOREQ_ABTS_PENDINGhdr status: FCPIO_TIMEOUT ABTS cmpl received
2020-05-31T07:39:00.105Z cpu57:2097436)WARNING: nenic: _vnic_dev_cmd2:333: 0000:62:00.0:Timed out devcmd 4
2020-05-31T07:39:00.206Z cpu57:2097436)WARNING: nenic: _vnic_dev_cmd2:333: 0000:62:00.1:Timed out devcmd 4
2020-05-31T07:39:01.181Z cpu42:2098260)nfnic: : INFO: fnic_queuecommand: 718: Tport: 0xa02c0 not ready or not in adisc state
2020-05-31T07:39:01.181Z cpu42:2098260)nfnic: : INFO: fnic_queuecommand: 718: Tport: 0xa00c0 not ready or not in adisc state
2020-05-31T07:39:12.642Z cpu42:2098260)nfnic: : INFO: fnic_queuecommand: 718: Tport: 0xa02c0 not ready or not in adisc state
2020-05-31T07:39:12.642Z cpu42:2098260)nfnic: : INFO: fnic_queuecommand: 718: Tport: 0xa00c0 not ready or not in adisc state
2020-05-31T07:39:20.104Z cpu57:2097436)WARNING: nenic: _vnic_dev_cmd2:333: 0000:62:00.0:Timed out devcmd 4
2020-05-31T07:39:20.204Z cpu57:2097436)WARNING: nenic: _vnic_dev_cmd2:333: 0000:62:00.1:Timed out devcmd 4
2020-05-31T07:39:26.518Z cpu42:2098260)nfnic: : INFO: fnic_queuecommand: 718: Tport: 0xa02c0 not ready or not in adisc state
2020-05-31T07:39:26.518Z cpu42:2098260)nfnic: : INFO: fnic_queuecommand: 718: Tport: 0xa00c0 not ready or not in adisc state
2020-05-31T07:39:26.753Z cpu39:2097900)WARNING: nfnic: : fnic_taskMgmt: 1753: TaskMgmt Abort
2020-05-31T07:39:26.753Z cpu39:2097900)nfnic: : INFO: fnic_taskMgmt: 1788: TaskMgmt abort sc->cdb: 0x12
2020-05-31T07:39:26.753Z cpu39:2097900)nfnic: : INFO: fnic_abort_cmd: 2801: Abort cmd called for Tag: 0x402 issued time: 30001 ms CMD_STATE: FNIC_IOREQ_CMD_PENDING CDB Opcode: 0x12 sc:0x459b25567c80 flags: 0x3 lun: 1 target: 0xa00c0
2020-05-31T07:39:26.753Z cpu39:2097900)WARNING: nfnic: : fnic_abort_cmd: 2815: Abort for cmd tag: 0x402 in pending state
nfnic at 4.0.0.40-1OEM.670.0.0.8169922
- The fnic driver needs to be updated to the latest driver and firmware versions.
- Also make sure that the BIOS is at its latest version, or have Cisco look into that, as I am seeing NMI errors in the PSOD:
2020-05-31T07:40:13.397Z cpu42:2101014)ALERT: NMI: 696: NMI IPI: RIPOFF(base):RBP:CS [0x2e4965(0x418021800000):0x45026ec42009:0xfc8] (Src 0x1, CPU42)
This issue appears to be hardware\driver related. Let's get those updated and monitor and see where we are after those updates.
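The log excerpt from VMware is not in chronological order: the NMI and backtrace (07:40) are quoted before the nfnic aborts that preceded them (07:38 onwards). Below is a rough Python sketch, not an official VMware or Cisco tool, for re-splitting pasted vmkernel entries and sorting the driver-related ones by time. The function names and the filter (nfnic, nenic, ALERT) are my own assumptions for illustration, and SAMPLE is just three entries copied from the excerpt above.

```python
import re

# Zero-width split point in front of every vmkernel timestamp, so entries
# that were pasted without line breaks are separated again.
TS_BOUNDARY = re.compile(r"(?=\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z\s)")
# "<timestamp> cpu<N>:<worldID>)<message>"
ENTRY = re.compile(r"(\d{4}-\d{2}-\d{2}T[\d:.]+Z)\s+cpu(\d+):(\d+)\)(.*)", re.S)


def split_entries(text):
    """Return (timestamp, cpu, world_id, message) tuples from pasted log text."""
    entries = []
    for chunk in TS_BOUNDARY.split(text):
        m = ENTRY.match(chunk.strip())
        if m:
            ts, cpu, world, msg = m.groups()
            entries.append((ts, int(cpu), int(world), msg.strip()))
    return entries


def timeline(text, drivers=("nfnic", "nenic")):
    """Keep driver messages and ALERTs, sorted by timestamp.

    The timestamps share one fixed-width UTC format, so a plain string
    sort is also a chronological sort.
    """
    keep = [
        e for e in split_entries(text)
        if "ALERT" in e[3] or any(d + ":" in e[3] for d in drivers)
    ]
    return sorted(keep, key=lambda e: e[0])


# Three entries from the excerpt, deliberately fused and out of order.
SAMPLE = (
    "2020-05-31T07:40:13.397Z cpu42:2101014)ALERT: NMI: 696: NMI IPI: "
    "RIPOFF(base):RBP:CS [0x2e4965(0x418021800000):0x45026ec42009:0xfc8] (Src 0x1, CPU42)"
    "2020-05-31T07:39:00.105Z cpu57:2097436)WARNING: nenic: _vnic_dev_cmd2:333: "
    "0000:62:00.0:Timed out devcmd 4"
    "2020-05-31T07:38:49.169Z cpu39:2097900)WARNING: nfnic: : fnic_abort_cmd: 2815: "
    "Abort for cmd tag: 0x401 in pending state"
)

if __name__ == "__main__":
    for ts, cpu, world, msg in timeline(SAMPLE):
        print(ts, f"cpu{cpu}", msg[:70])
```

Sorted this way, the sequence reads as nfnic abort, then nenic devcmd timeouts, then the NMI, which is consistent with VMware's reading that the adapter/driver problems came first.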
As per VMware, this issue appears to be hardware/driver related. nfnic is currently at 4.0.0.40-1OEM.670.0.0.8169922, so:
- The nfnic driver needs to be updated to the latest driver and firmware versions.
- Also make sure that the BIOS is at its latest version, or have Cisco look into it, as there are NMI errors in the PSOD.
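For checking several hosts against whatever nfnic version Cisco ends up recommending, a small comparison helper may be handy. This is only a sketch: it assumes the string has the usual VIB `<version>-<release>` shape and compares only the dotted numeric part before the first dash. The target value in the example is a made-up placeholder, not a real recommended version; take the actual one from the Cisco UCS compatibility matrix.

```python
def parse_vib_version(vib):
    """Split '<version>-<release>' into a numeric tuple and the release label."""
    version, _, release = vib.partition("-")
    return tuple(int(part) for part in version.split(".")), release


def needs_update(installed, target):
    """True if the installed driver version is older than the target version."""
    return parse_vib_version(installed)[0] < parse_vib_version(target)[0]


INSTALLED = "4.0.0.40-1OEM.670.0.0.8169922"   # from the VMware reply
PLACEHOLDER_TARGET = "9.9.9.9-example"        # hypothetical, NOT a real release

if __name__ == "__main__":
    print(parse_vib_version(INSTALLED))
    print(needs_update(INSTALLED, PLACEHOLDER_TARGET))
```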