...
*BGP session fails to peer, or existing peers go down:
2017 May 4 13:58:24.709 GMT N9k01 %BGP-5-ADJCHANGE: bgp-65000 [8122] (default) neighbor 2.2.2.2 Down - holdtimer expired error
2017 May 4 13:58:25.092 GMT N9k01 %BGP-5-ADJCHANGE: bgp-65000 [8122] (default) neighbor 3.3.3.3 Down - holdtimer expired error
2017 May 4 14:01:11.855 GMT N9k01 %BGP-5-ADJCHANGE: bgp-65000 [8122] (default) neighbor 4.4.4.4 Down - holdtimer expired error
*Shortly before this, a message such as the following is logged:
2017 May 4 13:58:17.547 GMT N9k01 %ACLQOS-SLOT1-2-ACLQOS_FAILED:ACLQOS failure: BCM SDK API bcm_field_entry_create_id(906) failed for unit 0 with error Entry exists(-8)
*The BGP peer is still reachable via ICMP; other protocols to the peer (e.g. EIGRP) do not seem to be impacted.
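The symptom above is a timing relationship: an ACLQOS_FAILED "Entry exists" message followed shortly by BGP holdtimer expiries. As a rough triage aid, the sketch below scans syslog lines for that pattern. The function names and the 5-minute window are illustrative assumptions, not values taken from this report, and the input is assumed to be in chronological order with an English month abbreviation in the timestamp.

```python
import re
from datetime import datetime, timedelta

# Timestamp layout used in the log lines above, e.g. "2017 May 4 13:58:24.709".
TS_RE = re.compile(r"^(\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3})")
BGP_DOWN_RE = re.compile(r"%BGP-5-ADJCHANGE:.*neighbor (\S+) Down - holdtimer expired")
ACLQOS_RE = re.compile(r"%ACLQOS-SLOT\d+-2-ACLQOS_FAILED:.*Entry exists")


def parse_ts(line):
    """Return the leading timestamp of a syslog line, or None if absent."""
    m = TS_RE.match(line)
    if m is None:
        return None
    # Collapse repeated spaces (e.g. padded day-of-month) before parsing.
    return datetime.strptime(" ".join(m.group(1).split()), "%Y %b %d %H:%M:%S.%f")


def correlate(lines, window=timedelta(minutes=5)):
    """List (neighbor, seconds_since_failure) for each BGP holdtimer expiry
    logged within `window` after the most recent ACLQOS failure.
    The window length is an assumption chosen for illustration."""
    last_failure = None
    hits = []
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if ACLQOS_RE.search(line):
            last_failure = ts
            continue
        m = BGP_DOWN_RE.search(line)
        if m and last_failure is not None:
            delta = ts - last_failure
            if timedelta(0) <= delta <= window:
                hits.append((m.group(1), delta.total_seconds()))
    return hits


# The log lines quoted above, arranged in chronological order.
SAMPLE_LOG = """\
2017 May 4 13:58:17.547 GMT N9k01 %ACLQOS-SLOT1-2-ACLQOS_FAILED:ACLQOS failure: BCM SDK API bcm_field_entry_create_id(906) failed for unit 0 with error Entry exists(-8)
2017 May 4 13:58:24.709 GMT N9k01 %BGP-5-ADJCHANGE: bgp-65000 [8122] (default) neighbor 2.2.2.2 Down - holdtimer expired error
2017 May 4 13:58:25.092 GMT N9k01 %BGP-5-ADJCHANGE: bgp-65000 [8122] (default) neighbor 3.3.3.3 Down - holdtimer expired error
2017 May 4 14:01:11.855 GMT N9k01 %BGP-5-ADJCHANGE: bgp-65000 [8122] (default) neighbor 4.4.4.4 Down - holdtimer expired error
"""

if __name__ == "__main__":
    for neighbor, seconds in correlate(SAMPLE_LOG.splitlines()):
        print(f"{neighbor} went down {seconds:.3f}s after the ACLQOS failure")
```

A match only suggests that the log pattern described here is present; it does not by itself confirm that NAT TCAM was exceeded.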
*BGP traffic traverses a NAT-enabled interface. [The BGP SRC/DST itself is not NAT'd.]
*Issue seen on N9K-C9396PX running both 7.0(3)I4(5) and 7.0(3)I4(6), but will affect other earlier versions and models.
*NAT TCAM is temporarily exceeded (with either static or dynamic entries).
*Not tested on Tahoe platforms.
*To recover from the condition, reload the switch. The situation is preventable by not allowing NAT TCAM to be exceeded, which is a resource exhaustion issue.
*Set the max translations appropriately so that the software limit will be hit before TCAM gets exhausted.
*Free up additional TCAM by disabling atomic updates and/or increase NAT TCAM.
NOTE: Disabling atomic updates causes ACL changes to be disruptive unless "hardware access-list update default-result permit" is configured. For more details on atomic updates, see "Cisco Nexus 9000 Series NX-OS Security Configuration Guide, Release 7.x", section "Atomic ACL Updates". (http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/security/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide_7x/b_Cisco_Nexus_9000_Series_NX-OS_Security_Configuration_Guide_7x_chapter_01001.html#concept_945210FB9986499285C6A00065105AC9)
Simplified topology:
Lo0 1.1.1.1 |---N9k01-----N9k02--| Lo0 2.2.2.2
Neighbor is reachable via ICMP. (Loopbacks are advertised via static or EIGRP.)
*BGP SYN <---- N9k02
*N9k01 ----> BGP SYN ACK ----> Never seen in Ethanalyzer on N9k02.
N9k01# ethanalyzer local interface inband display-filter "ip.addr==2.2.2.2 && tcp.port eq 179" limit-captured-frames 0
Capturing on inband
2017-05-06 13:22:21.537740 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687588 TSER=0 WS=2
2017-05-06 13:22:21.537772 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38429553 TSER=1783687588 WS=2
2017-05-06 13:22:22.537957 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38429854 TSER=1783687588 WS=2
2017-05-06 13:22:22.540082 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687889 TSER=0 WS=2
2017-05-06 13:22:22.540112 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38429854 TSER=1783687588 WS=2
2017-05-06 13:22:24.543437 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783688490 TSER=0 WS=2
2017-05-06 13:22:24.543467 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38430455 TSER=1783687588 WS=2
2017-05-06 13:22:24.737958 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38430514 TSER=1783687588 WS=2
2017-05-06 13:22:28.550065 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783689692 TSER=0 WS=2
2017-05-06 13:22:28.550105 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38431657 TSER=1783687588 WS=2
2017-05-06 13:22:28.937958 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38431774 TSER=1783687588 WS=2
N9k02# ethanalyzer local interface inband display-filter "ip.addr==2.2.2.2 && tcp.port eq 179" limit-captured-frames 0
Capturing on inband
2017-05-06 13:22:21.541887 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687588 TSER=0 WS=2
2017-05-06 13:22:22.544225 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687889 TSER=0 WS=2
2017-05-06 13:22:24.547566 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783688490 TSER=0 WS=2
2017-05-06 13:22:28.554207 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783689692 TSER=0 WS=2
2017-05-06 13:22:36.567565 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783692096 TSER=0 WS=2
*Configuring a DMIRROR on RX of the N9k02 interface that receives the BGP TCP flow from N9k01, to manually copy the traffic to the CPU, allows the BGP session to come up. [This is not a workaround.]
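The two captures show the SYN, ACK leaving N9k01 but never appearing on N9k02's inband interface. To make that comparison repeatable, a minimal sketch along the following lines could diff two saved Ethanalyzer text outputs. The function names are hypothetical, and the regular expression assumes the one-line-per-packet text layout shown in this report.

```python
import re
from collections import Counter

# Matches "SRC -> DST TCP ... [FLAGS]" as printed in the captures above.
PKT_RE = re.compile(
    r"(\d+\.\d+\.\d+\.\d+) -> (\d+\.\d+\.\d+\.\d+) TCP .*?\[([A-Z, ]+)\]"
)


def summarize(capture_text):
    """Count packets per (source, destination, TCP flags) in capture text."""
    counts = Counter()
    for line in capture_text.splitlines():
        m = PKT_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2), m.group(3))] += 1
    return counts


def missing_on_receiver(sender_counts, receiver_counts):
    """Return packet classes captured on the sender but absent on the receiver."""
    return {k: n for k, n in sender_counts.items() if receiver_counts[k] == 0}


# Excerpts of the N9k01 and N9k02 captures quoted in this report.
N9K01_CAPTURE = """\
2017-05-06 13:22:21.537740 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687588 TSER=0 WS=2
2017-05-06 13:22:21.537772 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38429553 TSER=1783687588 WS=2
2017-05-06 13:22:22.537957 1.1.1.1 -> 2.2.2.2 TCP bgp > 47725 [SYN, ACK] Seq=0 Ack=1 Win=14480 Len=0 MSS=1460 TSV=38429854 TSER=1783687588 WS=2
2017-05-06 13:22:22.540082 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687889 TSER=0 WS=2
"""

N9K02_CAPTURE = """\
2017-05-06 13:22:21.541887 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687588 TSER=0 WS=2
2017-05-06 13:22:22.544225 2.2.2.2 -> 1.1.1.1 TCP 47725 > bgp [SYN] Seq=0 Win=14600 Len=0 MSS=1460 TSV=1783687889 TSER=0 WS=2
"""

if __name__ == "__main__":
    lost = missing_on_receiver(summarize(N9K01_CAPTURE), summarize(N9K02_CAPTURE))
    for (src, dst, flags), n in lost.items():
        print(f"{n} x [{flags}] {src} -> {dst} seen on N9k01 but not on N9k02")
```

With these excerpts, the only class reported as missing is the SYN, ACK from 1.1.1.1 to 2.2.2.2, which matches the observation above. The comparison counts packet classes rather than matching individual packets, so it is only a coarse check.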