...
When running a Microsoft Windows Failover Clustering (WFC) instance in a VMware ESXi Cluster Across Box (CAB) configuration with shared physical mode Raw Device Mapping (RDM) disks, the Windows event logger reports critical errors in the system logs during storage faults. A SAN storage controller fault or a redundant target port failure might trigger an unexpected failover of WFC resources.

The Windows 2008, Windows 2012, and Windows 2012 R2 system event logs show these critical, error, or warning messages:

Windows 2008:

Event ID: 1135
Cluster node 'node name' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

Event ID: 1069
Cluster resource 'Cluster Disk #' in clustered service or application 'Cluster Group' failed.

Event ID: 1177
The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

Event ID: 7024
The Cluster Service service terminated with service-specific error A quorum of cluster nodes was not present to form a cluster.

Event ID: 7031
The Cluster Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.

Windows 2012 or Windows 2012 R2:

Event ID: 140
The system failed to flush data to the transaction log. Corruption may occur in VolumeId: X:, DeviceName: \Device\HarddiskVolume#. ({Device Busy} The device is currently busy.)

Event ID: 1038
Ownership of cluster disk 'Cluster Disk #' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.

Event ID: 1069
Cluster resource 'Cluster Disk #' of type 'Physical Disk' in clustered role 'X:' failed. The error code was '0xaa' ('The requested resource is in use.'). Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

In the /var/log/vmkernel.log file on the ESXi host, you see warning messages similar to:

WARNING: NMP: nmpUpdatePReservationOnFailover:1264: Unable to check for matching key on failover for device "naa.600a098044306879702b454e48496232"
WARNING: NMP: nmp_DeviceUpdatePathStates:886: Could not drop reservation on failover for NMP device "naa.600a098044306879702b454e48496232".
WARNING: NMP: nmpDeviceAttemptFailover:603: Retry world failover device "naa.600a098044306879702b454e48496232" - issuing command 0x439dc97ccd80

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
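To check a copy of vmkernel.log for the warnings quoted above, a short script can group the matching lines by device. The following is a minimal sketch, not a VMware tool: the function name find_reservation_warnings and the SAMPLE list are illustrative assumptions, and the pattern is built only from the three example messages in this article, so it may miss variants that appear in other builds.

```python
import re

# Pattern built from the three example NMP warnings quoted in this article.
# Other ESXi builds may word these messages differently (assumption).
NMP_WARNING = re.compile(
    r"WARNING: NMP: "
    r"(?P<func>nmpUpdatePReservationOnFailover"
    r"|nmp_DeviceUpdatePathStates"
    r"|nmpDeviceAttemptFailover)"
    r':\d+:.*?"(?P<device>naa\.[0-9a-f]+)"'
)


def find_reservation_warnings(lines):
    """Return {device id: [NMP function names]} for matching warning lines."""
    hits = {}
    for line in lines:
        match = NMP_WARNING.search(line)
        if match:
            hits.setdefault(match.group("device"), []).append(match.group("func"))
    return hits


# Example lines copied from the log excerpt above, plus one unrelated line.
SAMPLE = [
    'WARNING: NMP: nmpUpdatePReservationOnFailover:1264: Unable to check for '
    'matching key on failover for device "naa.600a098044306879702b454e48496232"',
    'WARNING: NMP: nmp_DeviceUpdatePathStates:886: Could not drop reservation '
    'on failover for NMP device "naa.600a098044306879702b454e48496232".',
    'WARNING: NMP: nmpDeviceAttemptFailover:603: Retry world failover device '
    '"naa.600a098044306879702b454e48496232" - issuing command 0x439dc97ccd80',
    "some unrelated vmkernel message",
]

if __name__ == "__main__":
    for device, funcs in find_reservation_warnings(SAMPLE).items():
        print(device, len(funcs), "warning(s):", ", ".join(funcs))
```

To scan a real log, pass an open file object instead of SAMPLE, for example find_reservation_warnings(open("vmkernel.log", errors="replace")) on a copy of the file taken from the host.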
This unexpected resource failover may affect the application or I/O running on the failover cluster. If this issue occurs, you must restart your application.
This issue is resolved in:

VMware ESXi 6.0 Update 1b, available at VMware Downloads.
VMware ESXi 5.5 patch ESXi550-201512001. For more information, see VMware ESXi 5.5, Patch Release ESXi550-201512001 (2135410).
For more information about Microsoft Clustering solutions running on VMware vSphere, see:

Microsoft Cluster Service (MSCS) support on ESXi/ESX (1004617)
MSCS support enhancements in vSphere 5.5 (2052238)
Microsoft Clustering on VMware vSphere: Guidelines for supported configurations (1037959)
Microsoft Windows Server Failover Clustering (WSFC) with shared disks on VMware vSphere 6.x: Guidelines for supported configurations
Setup for Failover Clustering and Microsoft Cluster Service: vSphere 5.5, vSphere 6.0

To view the Windows Event logs:

Windows 2012:
In the right pane of the Server Manager window, click Tools and select Event Viewer from the menu.
In the left pane of the Event Viewer window, go to Event Viewer (Local) > Windows Logs > System.

Windows 2008:
In the left pane of the Server Manager window, go to Server Manager > Diagnostics > Event Viewer > Windows Logs > System.

See also:

Microsoft Cluster Service (MSCS) support on ESXi/ESX
LUN filtering mechanism during RDM creation
Guidelines for Microsoft Clustering on vSphere
MSCS support enhancements in vSphere 5.5
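Once the System log is open in Event Viewer, it can be saved to a CSV file and filtered for the event IDs listed in the symptoms of this article. The following is a minimal sketch under stated assumptions: the column header "Event ID" and the sample rows are hypothetical and should be adjusted to match your actual export, and the function name matching_events is illustrative only.

```python
import csv
import io

# Event IDs listed in the symptoms of this article, grouped by Windows version.
SYMPTOM_EVENT_IDS = {
    "Windows 2008": {1135, 1069, 1177, 7024, 7031},
    "Windows 2012 / 2012 R2": {140, 1038, 1069},
}


def matching_events(csv_text, id_column="Event ID"):
    """Return the CSV rows whose event ID appears in SYMPTOM_EVENT_IDS.

    id_column is an assumed header name; change it to match your export.
    """
    wanted = set().union(*SYMPTOM_EVENT_IDS.values())
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(id_column) or "").strip()
        if value.isdigit() and int(value) in wanted:
            rows.append(row)
    return rows


# Hypothetical export used only to demonstrate the filter.
SAMPLE_CSV = """Level,Source,Event ID
Error,FailoverClustering,1038
Information,Service Control Manager,7036
Critical,FailoverClustering,1135
"""

if __name__ == "__main__":
    for event in matching_events(SAMPLE_CSV):
        print(event["Event ID"], event["Source"], event["Level"])
```

A match only shows that one of the listed events was logged. Correlate its timestamp with the storage fault and the ESXi vmkernel.log warnings before concluding that this issue is the cause.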