...
+ When cabling is removed from both the Active and the Hot-Standby port at the same time, the Active port may not go down immediately but only after a 30-second delay.

`show logging log`
2021 Aug 23 14:00:48 N9K-2 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel2: first operational port changed from Ethernet1/49 to none
2021 Aug 23 14:00:48 N9K-2 %ETH_PORT_CHANNEL-5-PORT_HOT_STANDBY_DOWN: port-channel2: hot-standby port Ethernet1/50 is down
2021 Aug 23 14:01:18 N9K-2 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel2: Ethernet1/49 is down
2021 Aug 23 14:01:18 N9K-2 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel2 is down (No operational members)
2021 Aug 23 14:01:18 N9K-2 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface port-channel2,bandwidth changed to 100000 Kbit
2021 Aug 23 14:01:18 N9K-2 %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet1/49 is down (Link failure)
2021 Aug 23 14:01:18 N9K-2 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel2 is down (No operational members)

+ As the log above shows, the Active LACP port is reported as down with a 30-second delay.
+ During these 30 seconds the interface is logically reported as UP/UP.

`show int eth1/49` ---- > this output was captured between 14:00:48 and 14:01:18
Ethernet1/49 is up
admin state is up, Dedicated Interface
  Belongs to Po2

+ The only reason the interface is brought down at all appears to be the 30-second deadlock timer.

`show system internal ethpm errors`
2021 Aug 23 14:01:18.274093: E_DEBUG ethpm [23970]: ethpm_timer_msg_handler(683): bundle Mem Down Timer for Ethernet1/49 - port-channel2

+ Without this timer we would probably wait indefinitely for the interface to be brought down.
+ The issue appears to lie somewhere between the ETHPM and PORT-CLIENT components, because port-client considers the link down immediately. The Hot-Standby port concept might also play a role here.
Aug 23 14:00:48 2021 00288811 Ethernet1/49 ---- DOWN Link down debounce timer stopped and link is down
Aug 23 14:00:48 2021 00266262 Ethernet1/49 ---- DOWN Link down debounce timer started(0x40e50006)
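
To check whether a given switch shows the same delay, the gap between the FOP_CHANGED and PORT_DOWN messages in `show logging log` can be computed from a saved log. The following is only a minimal Python sketch, not a Cisco tool: the helper names `parse` and `member_down_delay` are hypothetical, and the regular expression assumes the syslog line layout shown in this report.

```python
import re
from datetime import datetime

# Assumed line layout (taken from the excerpt in this report):
# "2021 Aug 23 14:00:48 N9K-2 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel2: ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \S+ "
    r"%(?P<facility>[A-Z_]+)-(?P<sev>\d)-(?P<mnemonic>[A-Z_]+): (?P<msg>.*)$"
)


def parse(line):
    """Return (timestamp, mnemonic, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    # Collapse repeated spaces so single-digit days parse as well.
    ts_text = " ".join(m.group("ts").split())
    ts = datetime.strptime(ts_text, "%Y %b %d %H:%M:%S")
    return ts, m.group("mnemonic"), m.group("msg")


def member_down_delay(lines):
    """Seconds between the first FOP_CHANGED and the first PORT_DOWN message.

    Returns None if either message is missing from the input.
    """
    first_change = None
    port_down = None
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        ts, mnemonic, _msg = parsed
        if mnemonic == "FOP_CHANGED" and first_change is None:
            first_change = ts
        elif mnemonic == "PORT_DOWN" and port_down is None:
            port_down = ts
    if first_change is None or port_down is None:
        return None
    return (port_down - first_change).total_seconds()


# Lines copied from the `show logging log` excerpt in this report.
SAMPLE = [
    "2021 Aug 23 14:00:48 N9K-2 %ETH_PORT_CHANNEL-5-FOP_CHANGED: "
    "port-channel2: first operational port changed from Ethernet1/49 to none",
    "2021 Aug 23 14:00:48 N9K-2 %ETH_PORT_CHANNEL-5-PORT_HOT_STANDBY_DOWN: "
    "port-channel2: hot-standby port Ethernet1/50 is down",
    "2021 Aug 23 14:01:18 N9K-2 %ETH_PORT_CHANNEL-5-PORT_DOWN: "
    "port-channel2: Ethernet1/49 is down",
]

if __name__ == "__main__":
    print(member_down_delay(SAMPLE))  # 30.0 for the excerpt above
```

A result close to 30 seconds would suggest the member was brought down by the timer rather than by the link-down event itself.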
Conditions:
+ The problem was reproduced in a scenario where two Nexus 9K switches in VPC were connected to ASR9K routers over a VPC port-channel.
+ On the ASR side there is an MC-LAG configuration.
+ Per VPC port-channel towards the ASR side, there is one physical port in Hot-Standby mode and one in Active mode.
+ The issue is seen when cabling is removed from both the Hot-Standby and the Active port at the same time on one of the VPC switches (regardless of whether it is the VPC primary or secondary).
+ Only the cabling is removed; transceivers ARE NOT removed from the N9K switches.
+ The issue is not seen on every cable removal, but it occurs frequently.
None