The PowerFlex system contains multiple Protection Domains (PDs). The PowerFlex MDM cluster members are distributed across the different Protection Domains, and each MDM is deployed on the first SDS server in the SDS order of its Protection Domain.

Vmkernel logs from an impacted SDC show that it was disconnected from the MDM cluster for approximately 18 minutes:

    2024-03-17T07:43:07.386Z cpu105:2098415)WARNING: [32885921148] Disconnected from MDM with ID 1234567895dcce0f
    2024-03-17T08:01:29.796Z cpu104:2098694)WARNING: [32887023552] Connected to MDM with ID 1234567895dcce0f

MDM query_cluster output shows the MDMs:

    Cluster:
        Name: System1, ID: 1234567895dcce0f, Mode: 5_node, State: Normal, Active: 5/5, Replicas: 3/3
    Master MDM:
        Name: server237
    Secondary MDMs:
        Name: server201
        Name: server219

MDM event logs show that multiple SDSs from different PDs were placed into Protected Maintenance Mode (PMM) within minutes of each other:

    2024-03-17 02:33:34.230000:0000572 INFO Command enter_protected_maintenance_mode received, SDS: ID: 1234567890000024 Name: server237
    2024-03-17 02:37:00.207000:0000622 INFO Command enter_protected_maintenance_mode received, SDS: ID: 1234567890000012 Name: server219
    2024-03-17 02:42:07.566000:0000311 INFO Command enter_protected_maintenance_mode received, SDS: ID: 1234567890000000 Name: server201

Multiple SDSs can be in PMM or Instant Maintenance Mode (IMM) simultaneously if they are in different Protection Domains.

Server message logs show that all three MDM servers were powered off within about ten minutes of each other:

    Mar 17 03:32:54 server237 systemd-logind: Power key pressed.
    Mar 17 03:37:16 server219 systemd-logind: Power key pressed.
    Mar 17 03:43:04 server201 systemd-logind: Power key pressed.

The server power-on process was prolonged by the installation of firmware updates.
Output from query_all shows the order of the SDSs in the different Protection Domains:

    Protection Domain 123450000000000 Name: PD1
        SDS ID: 1234567890000000 Name: Sds-server201 <- MDM
        SDS ID: 1234567890000001 Name: Sds-server202
        SDS ID: 1234567890000002 Name: Sds-server203
    Protection Domain 1234500000001 Name: PD2
        SDS ID: 1234567890000012 Name: Sds-server219 <- MDM
        SDS ID: 1234567890000013 Name: Sds-server220
        SDS ID: 1234567890000014 Name: Sds-server221
    Protection Domain 12345000000002 Name: PD3
        SDS ID: 1234567890000024 Name: Sds-server237 <- MDM
        SDS ID: 1234567890000025 Name: Sds-server238
        SDS ID: 1234567890000026 Name: Sds-server239

Impact

SDCs disconnect from the MDM cluster, resulting in the inability to access their storage volumes.
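The disconnection window of roughly 18 minutes quoted from the vmkernel log can be checked with a few lines of Python. This is only an illustrative sketch: the two timestamps are copied from the log lines above, and the helper names (`parse_vmkernel_ts`, `outage_seconds`) are hypothetical, not part of any PowerFlex or ESXi tooling.

```python
from datetime import datetime

# Timestamps copied from the two vmkernel log lines quoted in this article.
DISCONNECTED = "2024-03-17T07:43:07.386Z"
RECONNECTED = "2024-03-17T08:01:29.796Z"


def parse_vmkernel_ts(ts: str) -> datetime:
    """Parse a UTC timestamp in the format shown in the vmkernel log lines."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def outage_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two vmkernel timestamps."""
    return (parse_vmkernel_ts(end) - parse_vmkernel_ts(start)).total_seconds()


if __name__ == "__main__":
    secs = outage_seconds(DISCONNECTED, RECONNECTED)
    print(f"SDC disconnected for {secs:.0f} s ({secs / 60:.1f} min)")
```

The same helper can be pointed at other Disconnected/Connected pairs when comparing several SDCs.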
There is a design issue in PowerFlex Manager (PFxM) in how it handles simultaneous maintenance of servers located in different PDs. The MDMs are dispersed across the PDs and each one sits on the first server in its PD. PFxM places SDS servers into PMM or IMM sequentially within each Protection Domain, following the SDS order. If the MDMs occupy the same position in that order across different Protection Domains, PFxM places those servers into PMM or IMM and reboots them at the same time. This can take multiple MDMs down together and may cause the MDM cluster to go offline.
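The alignment problem described above can be illustrated with a small check. The sketch below is not PFxM logic; it assumes, for illustration only, that one SDS per Protection Domain is serviced per step in the listed order, and that SDS "Sds-serverNNN" runs on host "serverNNN". The layout and MDM host names are taken from the query_all and query_cluster output in this article; the function names are hypothetical.

```python
from typing import Dict, List, Set

# SDS order per Protection Domain, from the query_all output in this article.
PD_LAYOUT: Dict[str, List[str]] = {
    "PD1": ["server201", "server202", "server203"],
    "PD2": ["server219", "server220", "server221"],
    "PD3": ["server237", "server238", "server239"],
}

# MDM hosts (master and secondaries), from the query_cluster output.
MDM_HOSTS: Set[str] = {"server201", "server219", "server237"}


def mdm_hosts_per_step(layout: Dict[str, List[str]],
                       mdm_hosts: Set[str]) -> List[List[str]]:
    """For each step i, list the MDM hosts among the i-th SDS of every PD.

    Assumption: one SDS per PD enters maintenance per step, in listed order.
    """
    steps = max(len(nodes) for nodes in layout.values())
    result = []
    for i in range(steps):
        batch = [nodes[i] for nodes in layout.values() if i < len(nodes)]
        result.append(sorted(h for h in batch if h in mdm_hosts))
    return result


def risky_steps(layout: Dict[str, List[str]], mdm_hosts: Set[str]) -> List[int]:
    """Return the step indexes at which more than one MDM host is serviced."""
    per_step = mdm_hosts_per_step(layout, mdm_hosts)
    return [i for i, hosts in enumerate(per_step) if len(hosts) > 1]


if __name__ == "__main__":
    for step in risky_steps(PD_LAYOUT, MDM_HOSTS):
        hosts = mdm_hosts_per_step(PD_LAYOUT, MDM_HOSTS)[step]
        print(f"step {step}: MDM hosts serviced together: {', '.join(hosts)}")
```

With the layout from this incident, the first step contains all three MDM hosts, which matches the three near-simultaneous PMM commands in the event log.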
To prevent this issue from occurring, there are a few options:

Option 1: Do not upgrade multiple Protection Domains in parallel. Only upgrade the servers in one Protection Domain at a time.

Option 2: Manually select non-MDM members in the other PDs when presented with the upgrade window.

    Protection Domain 123450000000000 Name: PD1
        SDS ID: 1234567890000000 Name: Sds-server001 <- MDM
        SDS ID: 1234567890000001 Name: Sds-server002
        SDS ID: 1234567890000002 Name: Sds-server003
    Protection Domain 1234500000001 Name: PD2
        SDS ID: 1234567890000012 Name: Sds-server004
        SDS ID: 1234567890000013 Name: Sds-server005 <- MDM do not select
        SDS ID: 1234567890000014 Name: Sds-server006
    Protection Domain 12345000000002 Name: PD3
        SDS ID: 1234567890000024 Name: Sds-server007
        SDS ID: 1234567890000025 Name: Sds-server008
        SDS ID: 1234567890000026 Name: Sds-server009 <- MDM do not select

Option 3: Migrate the MDM members to the same PD. This prevents multiple SDSs with MDM services from going into Maintenance Mode simultaneously.

Impacted Version

    PFxM 3.x
    PFMP 4.x

Fixed In Version

    PFxM 3.8.9
    PFMP 4.6.2
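The idea behind Option 2 above, never selecting more than one MDM host in the same parallel batch, can be sketched as a simple planning helper. This is an illustration under stated assumptions, not a PFxM feature: the selection itself is made manually in the upgrade window, the layout and MDM host names are taken from the incident output in this article, and `plan_batches` is a hypothetical name.

```python
from typing import Dict, List, Set

# SDS order per Protection Domain and MDM hosts, as in the incident above.
PD_LAYOUT: Dict[str, List[str]] = {
    "PD1": ["server201", "server202", "server203"],
    "PD2": ["server219", "server220", "server221"],
    "PD3": ["server237", "server238", "server239"],
}
MDM_HOSTS: Set[str] = {"server201", "server219", "server237"}


def plan_batches(layout: Dict[str, List[str]],
                 mdm_hosts: Set[str]) -> List[List[str]]:
    """Group servers into batches of at most one per PD and one MDM per batch."""
    remaining = {pd: list(nodes) for pd, nodes in layout.items()}
    batches: List[List[str]] = []
    while any(remaining.values()):
        batch: List[str] = []
        mdm_in_batch = False
        for nodes in remaining.values():
            # Take the next server in order, skipping MDM hosts once the
            # batch already contains one.
            pick = next(
                (n for n in nodes if not (mdm_in_batch and n in mdm_hosts)),
                None,
            )
            if pick is None:
                continue  # nothing eligible here; defer this PD to a later batch
            nodes.remove(pick)
            batch.append(pick)
            mdm_in_batch = mdm_in_batch or pick in mdm_hosts
        batches.append(batch)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(plan_batches(PD_LAYOUT, MDM_HOSTS), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

A plan produced this way should still be reviewed against the current query_cluster output before any server is placed into maintenance, since MDM roles can move between hosts.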