The Physical Coding Sublayer, Media Access Control, Link Level Retry (PML) recovery feature is available starting with the HPE Slingshot 2.3.0 release and the HPE SHS 11.1.0 release. It is disabled by default in both the 2.3 fabric software and the 11.1 host software. The feature is described in Section 10.5.2.10, "PML recovery summary events", of the HPE Slingshot Administration Guide.

Note that this feature must be enabled on the fabric before it is enabled on the host.

PML recovery enables links to recover from transient faults that would otherwise cause the link to flap. Recovery happens without packet loss and with only a brief delay in transmission. This can stabilize the fabric by reducing occasional random disruptions.

Links that undergo frequent PML recovery require hardware action, just as links that flap repeatedly do. Monitor the HsnPmlRecoveryDetected and HsnLinkFlapDetected Redfish events to identify maintenance candidates. Refer to the "PML Recovery Summary Events" section of the HPE Slingshot Administration Guide.

Interaction with auto lane degrade

Auto lane degrade (ALD) is a hardware feature that always occurs before PML recovery can begin. Both features are typically triggered by the same faults, and PML recovery is prohibited for degraded links. Additionally, if degrade recovery is enabled, it bounces the link before PML recovery can occur. Hence there is no known benefit or cost to enabling both features together on the same port; the combination is expected to behave the same as enabling only ALD.

The suggestion is to enable ALD on fabric ports and PML recovery on edge ports. This is because ALD operates on the order of microseconds while PML recovery operates on the order of milliseconds, so if both features are enabled on the same port, ALD always takes effect first.
The PML recovery feature is available beginning with HPE Slingshot release 2.3.0 and HPE SHS release 11.1.0. It is disabled by default in both 2.3 fabric and 11.1 host software.
Note that this feature must be enabled on the fabric before it is enabled on the host.

Enable PML recovery on fabric.

By default, the retry handler SPT timeout, which defines the period hosts wait before resending packets, is set to one second. On systems where this timeout is 0.5 seconds, it is recommended to halve the PML recovery timeout period. The default PML recovery timeout of 60 ms is sufficient unless HPE has provided instructions to change the retry handler timeout. The example below sets the value to 30, half of the default.

Configuring the timeout for edge links also requires setting the pml_rec_timeout parameter of the cxi_sbl kernel module on the host.

Syntax:
fmctl update topology-policies/template-policy configFlagMap.PML_RECOVERY_TIMEOUT="30"

Enable PML recovery on host.

Use ethtool to enable it on hosts as follows:

Syntax:
ethtool --set-priv-flags <hsn-iface> disable-pml-recovery off

ALD operates in microseconds while PML recovery operates in milliseconds. If both features are enabled on the same port, ALD will always operate first.
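To confirm the host-side setting after running the ethtool command above, the private flags can be read back with `ethtool --show-priv-flags <hsn-iface>`. The following is a minimal sketch rather than an HPE-provided tool. It assumes the flag appears in the usual ethtool `name : on|off` layout under the name `disable-pml-recovery`; the helper name `pml_recovery_state` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool --show-priv-flags` output on stdin and
# prints "enabled", "disabled", or "unknown" (flag not listed by the driver).
# Assumption: the flag is listed as "disable-pml-recovery : on|off".
pml_recovery_state() {
    awk -F: '
        # Private flag lines look like "name   : on|off"; trim both fields.
        {
            name = $1; value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
        }
        name == "disable-pml-recovery" {
            # This is a "disable" switch, so off means recovery is enabled.
            state = (value == "off") ? "enabled" : "disabled"
        }
        END { print (state == "" ? "unknown" : state) }
    '
}
```

On a host, it would be used as `ethtool --show-priv-flags <hsn-iface> | pml_recovery_state`. A result of "unknown" suggests the installed host software does not expose the flag.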
Operating Systems Affected: Not Applicable