Impacted versions: ScaleIO 1.32.x, ScaleIO 2.x, ScaleIO 3.x

This specific problem affects ScaleIO running on Ready Nodes installed with ESXi 6.5, where the SATADOM hosting the SVM is responding slowly.

The MDM event logs show the cluster going DEGRADED:

27960 2018-03-21 18:03:25.520 MDM_CLUSTER_NODE_DEGRADED ERROR MDM cluster node is now DEGRADED - node ID 36dfe7d60ed527e2; IPs: [10.171.200.128,10.171.208.128], Port: 9011 is offline.
27963 2018-03-21 18:03:26.186 MDM_CLUSTER_NODE_NORMAL INFO MDM cluster node ID 36dfe7d60ed527e2; IPs: [10.171.200.128,10.171.208.128], Port: 9011 is now in NORMAL state.
27964 2018-03-21 18:03:26.186 MDM_CLUSTER_NORMAL INFO MDM cluster is now in NORMAL mode.

The MDM traces contain the following messages:

21/03 18:03:25.635186 0x7f9ef0740eb0:replFile_WriteUnlocked:00689: WARNING: Harden took too long: 1940 ms
21/03 18:03:25.635206 0x7f9ef0a22eb0:syncer_Resync:00705: Syncer UMT I/Os are now replicated

The ESXi vmkernel.log contains the following errors:

2018-03-21T14:04:52.650Z cpu54:65663)NMP: nmp_ThrottleLogForDevice:3617: Cmd 0x2a (0x439dd1bd0ec0, 65599) to dev "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" on path "vmhba2:C0:T5:L0" Failed: H:0x2 D:0x0 P:0x0 Invalid sens$
2018-03-21T14:04:52.650Z cpu54:65663)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" state in doubt; requested fast path state update...
2018-03-21T14:04:52.650Z cpu54:65663)ScsiDeviceIO: 2927: Cmd(0x439dd1bd0ec0) 0x2a, CmdSN 0x4dcf1a from world 65599 to dev "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" failed H:0x2 D:0x0 P:0x0 Invalid sense data: 0x65 0x20 $
2018-03-21T14:04:52.721Z cpu31:65640)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" state in doubt; requested fast path state update...
2018-03-21T14:04:52.721Z cpu31:65640)ScsiDeviceIO: 2927: Cmd(0x439dd0dc4ec0) 0x2a, CmdSN 0x4dcf1b from world 65599 to dev "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" failed H:0x2 D:0x0 P:0x0 Invalid sense data: 0x65 0x20 $
2018-03-21T14:04:52.861Z cpu31:65640)ScsiDeviceIO: 2927: Cmd(0x439dc13f10c0) 0x2a, CmdSN 0x4dcf1d from world 65599 to dev "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" failed H:0x2 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x$
2018-03-21T14:04:52.931Z cpu31:65640)ScsiDeviceIO: 2927: Cmd(0x439dc1262440) 0x2a, CmdSN 0x4dcf1e from world 65599 to dev "t10.ATA_____SATADOM2DML_3SE__________________________TW00TXXXXXXXXXXXXXXX" failed H:0x2 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x$
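The two log signatures above can be searched for with a short script. The following is a minimal sketch in Python, not a supported diagnostic tool: the function names harden_delays and satadom_write_failures are hypothetical, and the regular expressions are derived only from the excerpts shown in this article, so other builds may format these lines differently.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts in this article (assumption:
# other ScaleIO/ESXi builds may word these lines differently).
HARDEN_RE = re.compile(r"Harden took too long: (\d+) ms")
SCSI_FAIL_RE = re.compile(
    r'ScsiDeviceIO: \d+: Cmd\(0x[0-9a-f]+\) (0x[0-9a-f]+), '
    r'.*? to dev "([^"]+)" failed (H:0x[0-9a-f]+)'
)


def harden_delays(mdm_trace_lines):
    """Return the harden durations (in ms) reported in MDM trace lines."""
    delays = []
    for line in mdm_trace_lines:
        match = HARDEN_RE.search(line)
        if match:
            delays.append(int(match.group(1)))
    return delays


def satadom_write_failures(vmkernel_lines):
    """Count failed SCSI commands per device whose name contains SATADOM."""
    counts = Counter()
    for line in vmkernel_lines:
        match = SCSI_FAIL_RE.search(line)
        if match and "SATADOM" in match.group(2):
            counts[match.group(2)] += 1
    return counts
```

Feeding the functions the lines of an MDM trace file and of vmkernel.log respectively gives a quick indication of whether slow hardens coincide with failed commands on the SATADOM device.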
There is a known issue with the ahci driver used by the SATADOM on ESXi 6.5 (before Update 1). It degrades disk performance, which in turn causes system-wide ScaleIO issues such as the MDM cluster state changes shown above.
This is not a ScaleIO (VxFlex) issue. To resolve it, upgrade to ESXi 6.5 U1 or later. See https://knowledge.broadcom.com/external/article?legacyId=52927
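Whether a host falls in the affected range (ESXi 6.5 before Update 1) can be checked from its version string. The sketch below is illustrative only: is_affected_release is a hypothetical helper, and it assumes the version text has the form "VMware ESXi 6.5.0 build-..." optionally followed by an "Update N" line, as typically printed by the vmware -vl command on the host. Verify the actual output on your hosts before relying on it.

```python
import re


def is_affected_release(version_text):
    """Return True if the text describes ESXi 6.5 earlier than Update 1.

    Assumes text such as "VMware ESXi 6.5.0 build-NNNNNNN" with an
    optional "Update N" marker; this format is an assumption.
    """
    version = re.search(r"ESXi (\d+)\.(\d+)\.\d+", version_text)
    if version is None:
        raise ValueError("no ESXi version found in input")
    major, minor = int(version.group(1)), int(version.group(2))
    if (major, minor) != (6, 5):
        # This article only covers the ESXi 6.5 release line.
        return False
    update = re.search(r"Update (\d+)", version_text)
    update_level = int(update.group(1)) if update else 0
    return update_level < 1
```

A host reported as affected is a candidate for the upgrade described above; a False result only means the host is outside the range this article covers.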