If the connection between the XMS and the Storage Controllers is down, the XMS cannot manage the Storage Controllers or report their status. Alerts such as the following may appear:

Code     Alert                       Text                                                                          Reason
0400503  disconnected_from_node_mgr  Storage Controller is disconnected from the XMS.                              N/A
0200901  disconnected_from_sys_mgr   Cluster manager is running, but it is disconnected from XMS due to <reason>.  sys_connect_error, no_active_sym, none, and so on
These alerts can have several causes. Most often, a network issue has interrupted the connection between the XMS and the Storage Controllers. The cluster may also have been decommissioned or moved to another XMS.

NOTE: Check whether the alert "The Storage Controller is disconnected from the XMS." is raised and cleared quickly several times a day, or at a specific time every day. If so, it could be related to a job or scan running on the network. Scans and jobs (such as vulnerability scans, SNMP sweeps, or backup jobs) can generate heavy traffic or bursts of packets that saturate the management network. The XMS communicates with the SCs over the management network using HTTPS and SSH. If the network is congested or latency spikes, heartbeat packets may be delayed or dropped, causing temporary disconnect alerts to be raised and then cleared. Excluding the XMS and Storage Controller management IP addresses from network scans usually resolves the issue.
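The check described in the NOTE can be sketched in Python. This is a minimal illustration, not a Dell tool: it assumes the timestamps of the disconnect alerts have already been extracted, and the function name `recurring_hours`, the `min_days` threshold, and the sample timestamps are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def disconnects_by_hour(timestamps):
    """Count how many disconnect alerts fell into each hour of the day (0-23)."""
    return Counter(ts.hour for ts in timestamps)


def recurring_hours(timestamps, min_days=3):
    """Return the hours of day in which alerts occurred on at least
    min_days distinct calendar dates (min_days is an arbitrary threshold)."""
    days_per_hour = {}
    for ts in timestamps:
        days_per_hour.setdefault(ts.hour, set()).add(ts.date())
    return sorted(h for h, days in days_per_hour.items() if len(days) >= min_days)


# Hypothetical alert timestamps, for illustration only.
sample = [
    datetime(2015, 11, 16, 2, 37, 55),
    datetime(2015, 11, 17, 2, 38, 2),
    datetime(2015, 11, 18, 2, 38, 8),
    datetime(2015, 11, 18, 14, 5, 0),
]

print(recurring_hours(sample))
```

An hour that shows up on several different days is a candidate window to compare against the schedules of network scans or backup jobs.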
Scenario 1: XMS is no longer managing the cluster

Alert code 0200901 may indicate that the cluster is no longer managed by this XMS but was never properly removed from it. Verify whether the cluster has been decommissioned or moved to another XMS. If it has, remove the cluster from this XMS using the XMCLI command remove-cluster.

Scenario 2: XMS - Cluster connectivity issue

Use the following steps to troubleshoot network connectivity:

Step 1: Check Network Connections

Log in to the XMS using PuTTY as the admin user. Run the following XMCLI commands to display the IP addresses of the XMS and the Storage Controllers (SCs).

    xmcli (admin)> show-ip-addresses
    Name: xms
    Index: 1
    XMS-IP-Addr: 10.241.174.92
    XMS-IP-Addr-Subnet: 255.255.255.0
    XMS-GW-Addr: 10.241.174.1
    XMS-Secondary-IP-Addr:
    XMS-Secondary-IP-Addr-Subnet:
    XMS-Secondary-GW-Addr:
    Storage-Controller-Name Index Cluster-Name Index Mgr-Addr-Subnet  MGMT-GW-IP
    X1-SC1                  1     SVT-012      2     10.241.192.93/24 10.241.174.1
    X1-SC2                  2     SVT-012      2     10.241.192.94/24 10.241.174.1

    xmcli (admin)> show-xms
    Name Index SW-Version Xms-IP-Addr-With-Subnet Xms-Secondary-IP-Addr-With-Subnet XMS-Mgmt-Ifc REST-API-Protocol-Version IP-Version Secondary-IP-Version Default-User-Inactivity-Timeout Repeating-Alert-Threshold Min-TLS-Version
    xms  1     6.4.4-5    10.241.192.92/24                                          eth0         3.1                       ipv4       ipv6                 10                              10                        1.2

    xmcli (admin)> show-storage-controllers
    Storage-Controller-Name Index Mgr-Addr      IB-Addr-1    IB-Addr-2    X-Brick-Name X-Brick-Index Cluster-Name Index Lifecycle-State Health-State Enabled-State Stop-Reason Conn-State Journal-State SC-HW-Label
    X1-SC1                  1     10.241.192.93 169.254.0.1  169.254.0.2  X1           1             SVT-001      2     healthy         healthy      enabled       none        connected  healthy       X1-SC1
    X1-SC2                  2     10.241.192.94 169.254.0.17 169.254.0.18 X1           1             SVT-001      2     healthy         healthy      enabled       none        connected  healthy       X1-SC2

Run the XMCLI command below with the corresponding Storage Controller ID to test network connectivity from the XMS to that Storage Controller.

    test-xms-storage-controller-connectivity sc-id=1

If the connectivity checks fail, check the network environment for issues that interrupt connectivity between the XMS and the SCs. If the connectivity checks pass, proceed to Step 2.

Step 2: Collect a cluster log bundle for analysis (see the article in the notes for how to collect logs)

Check all the alerts. If the alert "Management link health status cannot be determined" is also reported, there is a network issue in the environment, and the network should be checked. If the alert "The Storage Controller is disconnected from the XMS" was reported once and then removed, and no other alerts were raised, monitor for a couple of days for additional incidents. If further evaluation of the incident is preferred, engage Dell Support.

Alert Example:

    2015-11-18 02:38:08,015: Raised alert: "The Storage Controller is disconnected from the XMS." object: X2-SC1 [3] severity: major
    2015-11-18 02:38:14,408: Removed alert: "The Storage Controller is disconnected from the XMS." object: X2-SC1 [3]

Step 3: Engage Dell Support for further analysis if alert code 0400503 is raised and has not cleared.
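When reviewing the alerts from a log bundle, it can help to see how long each disconnect lasted and which alerts never cleared. The following Python sketch pairs "Raised" and "Removed" lines. It assumes the log lines look exactly like the Alert Example in this article; the regular expression and the function name `alert_durations` are hypothetical and not part of any Dell tooling.

```python
import re
from datetime import datetime

# Pattern modelled on the two Alert Example lines; other log formats
# are not handled and would simply be skipped.
ALERT_RE = re.compile(
    r'(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): '
    r'(?P<action>Raised|Removed) alert: "(?P<text>[^"]+)" '
    r'object: (?P<obj>\S+)'
)


def alert_durations(lines):
    """Pair each Raised line with the next Removed line for the same
    object and alert text. Returns (durations, still_open), where
    durations is a list of (object, seconds) tuples and still_open is
    a sorted list of objects whose alert was raised but never removed."""
    open_alerts = {}
    durations = []
    for line in lines:
        m = ALERT_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        key = (m["obj"], m["text"])
        if m["action"] == "Raised":
            open_alerts[key] = ts
        elif key in open_alerts:
            delta = ts - open_alerts.pop(key)
            durations.append((m["obj"], delta.total_seconds()))
    still_open = sorted(obj for obj, _ in open_alerts)
    return durations, still_open


example = [
    '2015-11-18 02:38:08,015: Raised alert: "The Storage Controller is '
    'disconnected from the XMS." object: X2-SC1 [3] severity: major',
    '2015-11-18 02:38:14,408: Removed alert: "The Storage Controller is '
    'disconnected from the XMS." object: X2-SC1 [3]',
]

durations, still_open = alert_durations(example)
print(durations, still_open)
```

A disconnect that clears within seconds, as in the example, fits the monitor-and-wait guidance in Step 2, while an object left in the still-open list corresponds to the uncleared case in Step 3.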