Symptom
On a Nexus switch running LACP to a server, if the server does not set the Collecting bit in a timely fashion during LACP negotiation, the ports go error-disabled due to a sequence timeout on the Nexus switch. The switch logs look like this:
%ETHPORT-5-IF_SEQ_ERROR: Error ("sequence timeout") communicating with MTS_SAP_ETH_PORT_CHANNEL_MGR for opcode MTS_OPC_ETHPM_PORT_CLEANUP (RID_PORT: Ethernet1/20)
%ETH_PORT_CHANNEL-3-RSP_TIMEOUT: Component MTS_SAP_LACP timed out on response to opcode:MTS_OPC_PCM_RX_HW_ENABLED (for:Ethernet1/20)
%ETHPORT-5-IF_SEQ_ERROR: Error ("required service is not responding") communicating with MTS_SAP_ETH_PORT_CHANNEL_MGR for opcode MTS_OPC_ETHPM_BUNDLE_MEMBER_BRINGUP (RID_PORT: Ethernet1/20)
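To find which ports hit this symptom in a saved log, a short Python sketch may help. It assumes the log lines have the same shape as the samples above; the function name affected_ports is illustrative and not part of any Cisco tooling.

```python
import re

# Mnemonics taken from the sample log lines above.
SYMPTOM_MNEMONICS = ("ETHPORT-5-IF_SEQ_ERROR", "ETH_PORT_CHANNEL-3-RSP_TIMEOUT")

# The interface appears as "(RID_PORT: Ethernet1/20)" or "(for:Ethernet1/20)".
PORT_RE = re.compile(r"\((?:RID_PORT: |for:)(Ethernet\d+(?:/\d+)+)\)")


def affected_ports(lines):
    """Return the set of interfaces named in the symptom messages."""
    ports = set()
    for line in lines:
        if any(mnemonic in line for mnemonic in SYMPTOM_MNEMONICS):
            match = PORT_RE.search(line)
            if match:
                ports.add(match.group(1))
    return ports
```

Lines that do not carry one of the two mnemonics are ignored, even if they mention an interface.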
Conditions
The server is not handling the LACP negotiation correctly.
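To confirm the condition from a packet capture taken on the server-facing link, the state octet in the server's LACP PDUs can be decoded. The sketch below uses the bit layout of the LACP state field as defined in IEEE 802.1AX; the function names and the sample values are illustrative assumptions, not output from a real capture.

```python
# Bit masks of the LACP actor/partner state octet (IEEE 802.1AX).
LACP_STATE_BITS = {
    "activity": 0x01,
    "timeout": 0x02,
    "aggregation": 0x04,
    "synchronization": 0x08,
    "collecting": 0x10,
    "distributing": 0x20,
    "defaulted": 0x40,
    "expired": 0x80,
}


def decode_lacp_state(state):
    """Return the names of the flags set in a LACP state octet."""
    return [name for name, mask in LACP_STATE_BITS.items() if state & mask]


def is_collecting(state):
    """True if the Collecting bit is set in the state octet."""
    return bool(state & LACP_STATE_BITS["collecting"])
```

For example, a state octet of 0x0D (activity, aggregation, synchronization) has Collecting clear, whereas 0x3D has both Collecting and Distributing set.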
Workaround
Perform the following steps to get the ports to come up:
1) Shut down the NICs on the server side
2) Delete the port-channel on the Nexus side
3) Default all the physical interfaces
4) Re-create the port-channel
5) Re-apply the port-channel configuration to the physical interfaces
Now the port-channel will come back up.
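Steps 2 through 5 above can be sketched as a helper that builds the NX-OS configuration-mode command list. This is a minimal sketch under assumptions: the port-channel number, the member list, and the LACP mode are placeholders, the helper name is hypothetical, and the command syntax should be verified against the NX-OS release in use. Step 1 happens on the server and is not covered, and any additional configuration that existed on the port-channel or its members (switchport mode, VLANs, vPC, and so on) is not reproduced here.

```python
def rebuild_port_channel_commands(channel_id, members, mode="active"):
    """Build config-mode commands for steps 2-5 of the workaround.

    channel_id, members, and mode are assumptions supplied by the caller.
    """
    # Step 2: delete the port-channel.
    commands = [f"no interface port-channel {channel_id}"]
    # Step 3: default every physical member interface.
    commands += [f"default interface {member}" for member in members]
    # Step 4: re-create the port-channel.
    commands.append(f"interface port-channel {channel_id}")
    # Step 5: re-apply the channel-group to each member.
    for member in members:
        commands.append(f"interface {member}")
        commands.append(f"channel-group {channel_id} mode {mode}")
    return commands
```

For instance, rebuild_port_channel_commands(10, ["Ethernet1/20"]) yields five commands, starting with "no interface port-channel 10" and ending with "channel-group 10 mode active".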
Further Problem Description
Please note that this "defect" was filed because of a specific issue with an end host behaving incorrectly. The "fix" on the NX-OS side simply adds logging of the issue and its trigger, for ease of troubleshooting and diagnosis.
The ports will continue to go error-disabled if the issue is not addressed on the end host.
Upgrading NX-OS will not resolve this problem, because the problem does not lie within NX-OS. The misbehaving end host must be fixed for the channel to form properly.
The added syslog on NX-OS to spot this issue will have the following format (starting in 8.3(2)):
"%ETH_PORT_CHANNEL-3-PEER_COLLECT_ENABLE_NOT_RCVD: LACP neighbor for Ethernetx/y may not have enabled collect bit in LACP PDU in stipulated time."
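On releases that emit this syslog, a collected log can be scanned for it to count occurrences per interface. The sketch below assumes the message text matches the format quoted above; the function name peers_missing_collect is illustrative.

```python
import re
from collections import Counter

# Pattern built from the syslog format quoted above.
PEER_COLLECT_RE = re.compile(
    r"%ETH_PORT_CHANNEL-3-PEER_COLLECT_ENABLE_NOT_RCVD: "
    r"LACP neighbor for (Ethernet\S+) may not have enabled collect bit"
)


def peers_missing_collect(lines):
    """Count PEER_COLLECT_ENABLE_NOT_RCVD messages per interface."""
    hits = Counter()
    for line in lines:
        match = PEER_COLLECT_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits
```

An interface that appears repeatedly in the result points at an end host that still needs to be fixed.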