...
The first error is a loss of communication with the MD24xx JBOD enclosure. When the connection is restored, the PERC adapter places one or more physical drives in the MD24xx enclosure into a Foreign Configuration state.

Foreign Configuration

A foreign configuration is a set of physical disk drives that contain a RAID configuration that is not managed by the RAID controller they are attached to. The disk configuration is also stored on each disk in the array. If the configuration on a disk does not match what the controller has for that disk, the controller places the disk into a foreign configuration state.

It is important to understand that drives in a foreign state are a symptom of a larger issue; they are NOT the issue itself. Troubleshoot why the PERC adapter placed the drives in a foreign state to begin with. The most likely cause is a SAS connection reset involving the attached MD24xx JBOD storage, the SAS cable(s), or the PERC adapter itself, or possibly a power-related issue.
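The matching rule described above can be illustrated with a small conceptual model. This is only a sketch: the `RaidConfig` fields, the `classify_disk` function, and the slot names are hypothetical, and real PERC firmware keeps far richer metadata and applies its own logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RaidConfig:
    """Hypothetical stand-in for the RAID metadata stored on a disk."""
    array_id: str
    sequence: int  # assumed counter that changes when the array config changes


def classify_disk(controller_record: Optional[RaidConfig],
                  on_disk: Optional[RaidConfig]) -> str:
    """Return the state a controller would assign in this simplified model."""
    if on_disk is None:
        return "unconfigured"   # no RAID metadata on the disk at all
    if controller_record == on_disk:
        return "online"         # disk metadata matches the controller's view
    return "foreign"            # metadata exists but does not match


# What the controller believes about each slot (hypothetical data).
controller_view = {
    "slot0": RaidConfig("vd0", 7),
    "slot1": RaidConfig("vd0", 7),
}

# What is actually written on each disk after the connection returns.
disks = {
    "slot0": RaidConfig("vd0", 7),   # unchanged
    "slot1": RaidConfig("vd0", 6),   # stale metadata after a dropped link
    "slot2": RaidConfig("vdX", 1),   # disk moved in from another system
}

states = {slot: classify_disk(controller_view.get(slot), cfg)
          for slot, cfg in disks.items()}
print(states)
```

In this model, a disk becomes foreign either because its metadata went stale while the link was down or because the controller has no record of it, which is why importing the configuration restores access without addressing the underlying connection problem.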
A PERC adapter can report a foreign configuration for several reasons, including but not limited to the following:
- Moving physical drives from one server to another, or swapping slots, can cause the controller to identify the disks as a foreign configuration.
- Replacing the PERC adapter can cause arrays that previously worked to no longer be recognized after a power outage.
- A RAID configuration may already exist on a replacement set of drives.
- Reseating or removing an external enclosure cable (or a loose cable) while the operating system is running can result in a foreign configuration when the connection is reestablished.
- Improper rebooting or power cycling of a server with attached external storage may cause the disks to be placed in a foreign state.
- A power-related event can cause the enclosure to reset or power cycle. This could be related to a failed PSU, a power outage, or a CPLD reset in the enclosure itself.
The MD24xx JBOD is still a fairly new product and functions differently in several ways compared to previous JBOD devices, such as the MD12xx, MD14xx, MD3060e, or ME484. Changes have been made in how it is validated for setup, configuration, and usage.

Recently, there has been an increase in cases where the PERC adapter places drives installed in the attached MD24xx JBOD enclosure into a foreign state. What all of these cases have in common is that the enclosure is connected to a PERC adapter, which is where a foreign configuration state originates.

The foreign configuration can be imported immediately to restore data access; however, if the root cause of the SAS connection reset is not found, the drives are likely to be placed in a foreign state again. The PERC adapter is often mistakenly identified as the cause of the issue, but most often it is not. As indicated previously, the foreign configuration state is a symptom of the issue; it is usually NOT the issue itself. Any disruption along the SAS communication chain can cause the PERC adapter to place disks in a foreign state if it does not recognize the RAID configuration when the connection is restored.

The following steps help analyze and troubleshoot what is causing the PERC adapter to place the drives in a foreign state.

Confirm that it is a supported configuration according to the Support Matrix:
- Supported Operating Systems
  - Windows Server 2022
  - Windows Server 2019
  - Red Hat Enterprise Linux 9.1
  - Red Hat Enterprise Linux 9.0
  - Red Hat Enterprise Linux 8.7
  - Red Hat Enterprise Linux 8.6
  - SUSE Linux Enterprise Server 15 SP5
- Supported PERC Adapters
  - PERC H840
  - PERC H965e
- Supported SAS Cables
  - P/Ns: F82HG, NX1XW, W1W05, 3J2R2, 39Y00, TV165
  - SAS-4 (24 Gbps, White Tab)
- Supported Hard Drives
  - See the Support Matrix for the specific MD24xx JBOD array.

Confirm any steps taken to troubleshoot the SAS cables:
- Is there at least one SAS cable connected to each EMM on the MD24xx JBOD array? With the MD24xx JBODs, there must be redundant cabling; single cabling to one EMM is NOT supported.
- Have the SAS cables been reseated or replaced?
- Are they using the SAS-4 cables (White Tab) that shipped with their MD24xx JBOD array?
- Is the port LED on the EMM lit?
  - The port LED should be solid Amber if using a SAS-3 endpoint/cable (Blue Tab).
  - The port LED should be solid Green if using a SAS-4 endpoint/cable (White Tab).
- What parts (if any) have been replaced?

Note: A SAS-3 endpoint/cable is a 12 Gbps SAS HBA adapter or cable (with a Blue Tab), such as a PERC H840 or an HBA355e adapter. SAS-3 cables are not supported for use with the MD24xx JBOD array. A SAS-4 endpoint/cable is a 24 Gbps SAS HBA adapter or cable (with a White Tab), such as a PERC H965e. SAS-4 cables are the only cables supported for use with the MD24xx JBOD arrays. If either the SAS cable or the adapter is a SAS-3 version, the port LED is solid Amber.

Confirm the driver and firmware updates:
- What are the PERC adapter's firmware and driver revisions? Update both to the latest revisions.
- What is the MD24xx JBOD EMM firmware revision? Update to the latest revision.
- What is the hard drive firmware revision on the drives that are in a foreign state? Update to the latest revision.
- Is the customer running with the latest Windows updates?
- Have there been any recent firmware or driver updates to the EMM, drives, PERC, and so on?

What other troubleshooting steps have been taken?
- Was any maintenance occurring on the server that involved a reboot of the server and, by extension, the attached MD24xx storage array?
- What logs have been collected (Windows System Event Logs, TSR, EMCGrab, SHMCLI output, vm-support bundles, and so on)?
  - Gather ESXi host diagnostic logs (that is, a vm-support bundle). See the VMware article for details: "vm-support" command in ESX/ESXi to collect diagnostic information (external link)
  - Collect a Linux sosreport. Most Linux distributions include the sosreport utility. See the Red Hat article for details: What is a sos report and how to create one in Red Hat Enterprise Linux? (external link)
  - Collect an EMC grab by following the article: How To Run EMC Grab On Microsoft Windows Hosts.
  - Collect the Powertools SHMCLI output using the HWInfo script. Contact Dell Technical Support to collect this output.
  - Collect output from the PowerVault JBOD EMMs using a serial connection. Contact Dell Technical Support to collect this output.

Confirm what version of iDRAC the customer is using:
- iDRAC 7.10.30.00 has been released, with support for PERC H965e on 16g servers and HBA 355e on 15g and 16g servers.
- iDRAC 7.00.00.171 has been released, with support for PERC H840 and HBA355e on 14g servers.

Any instance of the above not being followed can cause issues. Correct all or most of the above before escalating; otherwise, the issue can recur. Remember to look for SAS connection resets in the logs; this ultimately leads to the solution to this problem.