...
RSVP resource-exhausted log seen on all 4 nodes (this log was seen only once, four months before the RSVP tunnel-down issue):

  2021 Jan 4 23:32:04 AKALB-CORE-CR02 %RSVP-1-LSP_RESOURCE_EXHAUSTED: Failed to get a new LSP resource: 1048576/1024

+ Log seen just once, 32 weeks after the upgrade, on all the nodes.
+ `show ip rsvp internal counters`

  Signaling RX Error   Path   Resv
  No-Path-Info            0     20
  No-Sender-Info          0     13

  Handle Type   Count
  GEN               0
  PSB              20
  PFC              15
  RSB               9

  `show ip rsvp counters all`

  Teardown Reason   Path   Resv
  UNSPECIFIED          0      0
  PATH TIMEOUT         2      2
  RESV TIMEOUT         0      0
  SIGNALED             9      1
  MGMT                 0      0
  POLICY               0      0
  PROXY               20      0
  NO_RESOURCES         0      0

--> I do not see any resource-related teardowns in the counters (NO_RESOURCES is 0).
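The syslog line quoted above can be split into fields when searching collected logs from all four nodes for repeats of the same message. The following is a minimal Python sketch, not a vendor tool: the regular expression is an assumption based only on the single line quoted in these notes, and `parse_syslog` and `MONTHS` are hypothetical names. The meaning of the two numbers in `1048576/1024` is not stated in the notes, so the sketch only extracts them as a pair.

```python
import re
from datetime import datetime

# Sample line copied verbatim from the notes.
LINE = ("2021 Jan 4 23:32:04 AKALB-CORE-CR02 %RSVP-1-LSP_RESOURCE_EXHAUSTED: "
        "Failed to get a new LSP resource: 1048576/1024")

# Assumed layout: "<year> <mon> <day> <time> <host> %<FACILITY>-<sev>-<MNEMONIC>: <text>"
PATTERN = re.compile(
    r"^(?P<year>\d{4}) (?P<mon>[A-Za-z]{3}) +(?P<day>\d{1,2}) "
    r"(?P<hh>\d{2}):(?P<mm>\d{2}):(?P<ss>\d{2}) "
    r"(?P<host>\S+) "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)$"
)

MONTHS = {m: i + 1 for i, m in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])}


def parse_syslog(line):
    """Return a dict of fields, or None if the line does not match the assumed layout."""
    m = PATTERN.match(line.strip())
    if m is None:
        return None
    rec = {
        "ts": datetime(int(m["year"]), MONTHS[m["mon"]], int(m["day"]),
                       int(m["hh"]), int(m["mm"]), int(m["ss"])),
        "host": m["host"],
        "facility": m["facility"],
        "severity": int(m["severity"]),
        "mnemonic": m["mnemonic"],
        "text": m["text"],
    }
    # Trailing "<a>/<b>" pair, kept as raw integers (semantics unknown here).
    pair = re.search(r"(\d+)/(\d+)\s*$", m["text"])
    rec["values"] = (int(pair.group(1)), int(pair.group(2))) if pair else None
    return rec


if __name__ == "__main__":
    print(parse_syslog(LINE))
```

Filtering parsed records on `mnemonic == "LSP_RESOURCE_EXHAUSTED"` and grouping by `host` would show whether the message really occurred only once per node.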
Nexus 7k - 7.3(6)D1(1)
RSVP constraint strict-hop tunnels; template of the primary tunnel config:

  interface tunnel-te1001
    description PAN-CR02->PAN-CR01
    fast-reroute
    path-option 10 explicit name AKPAN-CR02->AKPAN-CR01-PRIMARY
  explicit-path name AKPAN-CR02->AKPAN-CR01-PRIMARY
    index 1 next-address strict
    path-option 20 explicit name AKPAN-CR02->AKPAN-CR01-SECONDARY
  explicit-path name AKPAN-CR02->AKPAN-CR01-SECONDARY
    index 1 exclude-address
    path-option 30 dynamic
RSVP process restart brought back a few tunnels
+ 4 nodes connected to each other in a partial mesh only.
+ Issue seen after one of the links between the nodes flapped.
+ Shut/no shut of the tunnel did not fix it.
+ Process restart on 2 nodes brought up a few tunnels.
+ RSVP process crash (2020) in --> 3 crashes …. The exception is ALB0CORE02, where there was an SPM crash at a similar time (Jan 2020); after that, the upgrade was done.
+ Backtrace is not complete in the crash file.

  2021 Jul 13 09:53:58.127375 rsvp: [8842] PATH: rsvp_path_get_downstream_nbor_rid: psb 0xe122b020, router-id
  2021 Jul 13 09:53:58.127403 rsvp: [8842] RESV: rsvp_resv_update_glbl_nbr [DONE]: rsb (1001)::1(336), psb 0xe122b020: ok
  2021 Jul 13 09:53:58.127422 rsvp: [8842] RESV: rsvp_resv_receive_single_fd [DONE]: rsb 0xe12796ec, valid-syntax ok, refresh 45000ms expires in 63337093sec: ok
  2021 Jul 13 09:53:58.127438 rsvp: [8842] RESV: rsvp_db_rsb_free: rsb=0xe127917c
  2021 Jul 13 09:53:58.127467 rsvp: [8842] RESV: rsvp_resv_receive [DONE]: flow (1001)::Wildcard sender, in-i/f port-channel1.10, count 1 valid-syntax TRUE: ok

+ The debug output above is from a working tunnel.
+ Debug logs and SPAN captures on the non-working tunnels suggest that PATH messages are being sent, but no RESV messages are seen from the remote end.
+ We are not seeing PathErr or ResvErr messages.
+ All the RSVP neighbours are up.
+ There is also one directly connected tunnel which is down.
+ Local lab reproduction was not successful.
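The PATH-sent-but-no-RESV observation above can be checked across larger debug captures with a small summarizer. This is a minimal Python sketch under stated assumptions: the line layout is inferred only from the debug lines quoted in these notes, `summarize` is a hypothetical helper, and reading the number in `(1001)::` as the tunnel ID (matching `tunnel-te1001`) is an assumption, not something the notes confirm.

```python
import re
from collections import Counter

# Debug lines copied verbatim from the working-tunnel capture in the notes.
SAMPLE = [
    "2021 Jul 13 09:53:58.127375 rsvp: [8842] PATH: "
    "rsvp_path_get_downstream_nbor_rid: psb 0xe122b020, router-id",
    "2021 Jul 13 09:53:58.127403 rsvp: [8842] RESV: "
    "rsvp_resv_update_glbl_nbr [DONE]: rsb (1001)::1(336), psb 0xe122b020: ok",
    "2021 Jul 13 09:53:58.127438 rsvp: [8842] RESV: "
    "rsvp_db_rsb_free: rsb=0xe127917c",
    "2021 Jul 13 09:53:58.127467 rsvp: [8842] RESV: rsvp_resv_receive [DONE]: "
    "flow (1001)::Wildcard sender, in-i/f port-channel1.10, count 1 "
    "valid-syntax TRUE: ok",
]

# Assumed layout: "<timestamp> rsvp: [<pid>] <KIND>: <function><rest>"
DEBUG_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} [\d:.]+) rsvp: \[(?P<pid>\d+)\] "
    r"(?P<kind>[A-Z]+): (?P<func>\w+)(?P<rest>.*)$"
)
# Assumption: "(1001)::" carries the tunnel ID of the session.
TUNNEL_RE = re.compile(r"\((\d+)\)::")


def summarize(lines):
    """Count debug lines per kind and collect tunnel IDs seen in RESV lines."""
    kinds = Counter()
    resv_tunnels = set()
    for line in lines:
        m = DEBUG_RE.match(line.strip())
        if m is None:
            continue  # skip anything that does not fit the assumed layout
        kinds[m["kind"]] += 1
        if m["kind"] == "RESV":
            resv_tunnels.update(int(t) for t in TUNNEL_RE.findall(m["rest"]))
    return kinds, resv_tunnels


if __name__ == "__main__":
    kinds, resv_tunnels = summarize(SAMPLE)
    print(dict(kinds), sorted(resv_tunnels))
```

Given a list of configured tunnel IDs, subtracting `resv_tunnels` from it would list the tunnels for which the capture shows no RESV processing at all, which is the symptom described for the non-working tunnels.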