...
Mroute dnld delayed for more than 260secs, RPF Interface 0xb28 (type: 25) missing from OFA, platform route operation will be retried, traffic loss
Bring up
None
Vnc info: 7315-sit-lsr2:2/cisco123 (Problem is on R4) Seeing following errors and mcast traffic loss followed by router reload. Collecting logs, will update more info in the follow up post. Issue: ====== RP/0/RP0/CPU0:sf-d-R4#sh logging | i mrib RP/0/RP1/CPU0:Dec 16 15:15:35.680 PST: mrib6[1157]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0xb prid 0x0] pending for 84.076s RP/0/RP1/CPU0:Dec 16 15:15:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 84.607s RP/0/RP1/CPU0:Dec 16 15:16:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 144.607s RP/0/RP1/CPU0:Dec 16 15:17:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 204.608s RP/0/RP1/CPU0:Dec 16 15:18:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 264.608s LC/0/8/CPU0:Dec 16 15:21:15.503 PST: ipv6_mfwd_partner[201]: %OS-BCDL-7-MSG_STALL : group ipv6_mrib: No data message received for the last 360 seconds on the subgroup: 2 LC/0/8/CPU0:Dec 16 15:21:17.138 PST: fib_mgr[244]: %OS-BCDL-7-MSG_STALL : group ipv4_lmrib: No data message received for the last 360 seconds on the subgroup: 2 LC/0/8/CPU0:Dec 16 15:22:15.503 PST: ipv6_mfwd_partner[201]: %OS-BCDL-3-MSG_HB_MISS : group ipv6_mrib: no heartbeats received in last 360 seconds LC/0/8/CPU0:Dec 16 15:22:17.139 PST: fib_mgr[244]: %OS-BCDL-3-MSG_HB_MISS : group ipv4_lmrib: no heartbeats received in last 360 seconds RP/0/RP0/CPU0:sf-d-R4#sh logging | i ipv4_mfwd_partner LC/0/8/CPU0:Dec 16 15:19:29.118 PST: ipv4_mfwd_partner[222]: %ROUTING-IPV4_MFWD_PLATFORM-7-ERR_GEN : RPF Interface 0xb28 (type: 25) missing from OFA, platform route operation will be retried, rc: (0x2f908002): failed : 'OFA' detected the 'informational' condition 'OFA Infra DB': No such file or directory LC/0/8/CPU0:Dec 16 15:19:29.118 PST: ipv4_mfwd_partner[222]: %ROUTING-IPV4_MFWD-4-FROM_MRIB_UPDATE : MFIB couldn't process update from MRIB : Unable to commit new (0xe0000000):(192.44.51.2,232.1.38.0/32) - Resource temporarily unavailable LC/0/8/CPU0:Dec 16 15:24:42.264 PST: ipv4_mfwd_partner[222]: %ROUTING-IPV4_MFWD-5-MRIB_UPDATE_RETRY : One or more route update not fully processed. Re-downloading... RP/0/RP0/CPU0:sf-d-R4#sh logging | i ipv6_mfwd_partner LC/0/8/CPU0:Dec 16 15:21:15.503 PST: ipv6_mfwd_partner[201]: %OS-BCDL-7-MSG_STALL : group ipv6_mrib: No data message received for the last 360 seconds on the subgroup: 2 LC/0/8/CPU0:Dec 16 15:22:15.503 PST: ipv6_mfwd_partner[201]: %OS-BCDL-3-MSG_HB_MISS : group ipv6_mrib: no heartbeats received in last 360 seconds RP/0/RP0/CPU0:sf-d-R4# Console logs: ============= LC/0/0/CPU0:Dec 16 15:14:19.954 PST: npu_drvr[163]: %FORWARDING-PBR-6-ACL_OOR : PBR ACL OOR, trans_id 316716 NPU 0 ifhandle 0xf800b64, please check ACL resource usage with show controllers npu commands. LC/0/1/CPU0:Dec 16 15:14:20.003 PST: npu_drvr[265]: %FORWARDING-PBR-6-ACL_OOR : PBR ACL OOR, trans_id 254220 NPU 1 ifhandle 0xf800c14, please check ACL resource usage with show controllers npu commands. LC/0/3/CPU0:Dec 16 15:14:20.596 PST: npu_drvr[297]: %FORWARDING-PBR-6-ACL_OOR : PBR ACL OOR, trans_id 308593 NPU 0 ifhandle 0xf800b64, please check ACL resource usage with show controllers npu commands. LC/0/11/CPU0:Dec 16 15:14:20.605 PST: npu_drvr[216]: %FORWARDING-PBR-6-ACL_OOR : PBR ACL OOR, trans_id 262792 NPU 1 ifhandle 0xf800c14, please check ACL resource usage with show controllers npu commands. RP/0/RP1/CPU0:Dec 16 15:14:20.713 PST: bgp[1070]: %ROUTING-BGP-5-ADJCHANGE : neighbor 192:64:7e::2 Up (VRF: vpn126) (AS: 2) RP/0/RP1/CPU0:Dec 16 15:14:21.874 PST: instorch[349]: %INFRA-INSTALL-6-NEW_NODES_REGENERATE_CACHE : Install process restarted on some nodes, or new nodes joined since fingerprinting cache generation started: 0/RP0/CPU0, 0/10/CPU0, 0/0/CPU0, 0/3/CPU0, 0/11/CPU0, 0/1/CPU0, 0/8/CPU0. Generation will be rescheduled. RP/0/RP0/CPU0:Dec 16 15:14:22.007 PST: fpd_client[312]: %PLATFORM-CPA_INTF_FPD-3-ACCESS_ERROR : Node 0/RP0/CPU0 FPD ACCESS failure 'perform_rest_operation failed [@|lg^?^?] for url https://10.1.0.2:443/xyz/openbmc_project/software/BmcUbootGolden/action/fpdinfo' RP/0/RP1/CPU0:Dec 16 15:14:22.009 PST: shelfmgr[353]: %PLATFORM-SHELFMGR-6-INFO_LOG : 0/RP0/BMC0 is operational RP/0/RP1/CPU0:Dec 16 15:14:22.758 PST: isis[1003]: %ROUTING-ISIS-5-ADJCHANGE : ISIS (core): Adjacency to 49.0001.0010.9400.00bb (HundredGigE0/0/0/45.179) (L1) Up, New adjacency RP/0/RP1/CPU0:Dec 16 15:14:22.758 PST: isis[1003]: %ROUTING-ISIS-5-ADJCHANGE : ISIS (core): Adjacency to 49.0001.0010.9400.0093 (HundredGigE0/0/0/45.139) (L1) Up, New adjacency RP/0/RP1/CPU0:Dec 16 15:14:22.763 PST: isis[1003]: %ROUTING-ISIS-5-ADJCHANGE : ISIS (core): Adjacency to 49.0001.0010.9400.007d (HundredGigE0/0/0/45.117) (L1) Up, New adjacency RP/0/RP1/CPU0:Dec 16 15:14:23.112 PST: mpls_ldp[1184]: %ROUTING-LDP-5-NSR_SYNC_START : Initial synchronization started for 1 peers RP/0/RP1/CPU0:Dec 16 15:14:23.463 PST: bgp[1070]: %ROUTING-BGP-5-ADJCHANGE : neighbor 192:64:c4::2 Up (VRF: vpn196) (AS: 2) RP/0/RP1/CPU0:Dec 16 15:14:23.467 PST: mpls_ldp[1184]: %ROUTING-LDP-5-NSR_SYNC_PEER_DONE : VRF 'default' (0x60000000), Peer 12.12.12.8:0 initial synchronization done RP/0/RP1/CPU0:Dec 16 15:14:23.472 PST: mpls_ldp[1184]: %ROUTING-LDP-5-NSR_SYNC_DONE : Initial synchronization successfully done for 1 peers RP/0/RP0/CPU0:Dec 16 15:14:24.021 PST: fpd_client[312]: %PLATFORM-CPA_INTF_FPD-3-ACCESS_ERROR : Node 0/RP0/CPU0 FPD ACCESS failure 'perform_rest_operation failed [ RP/0/RP1/CPU0:Dec 16 15:14:24.333 PST: fpd_client[350]: %PLATFORM-CPA_INTF_FPD-3-ACCESS_ERROR : Node 0/RP1/CPU0 FPD ACCESS failure 'perform_rest_operation failed [`(3R}^?] for url https://10.1.0.2:443/xyz/openbmc_project/software/BmcFpga/action/fpdinfo' 9400.0084 (HundredGigE0/0/0/45.124) (L1) Up, New adjacency LC/0/8/CPU0:Dec 16 15:15:04.789 PST: processmgr[51]: %OS-SYSMGR-7-DEBUG : Feature:pbr_ea pbr_ea[265] (pid 7480) has not sent proc-ready within 90 seconds RP/0/RP1/CPU0:Dec 16 15:15:04.791 PST: shelfmgr[353]: %PLATFORM-SHELFMGR-6-INFO_LOG : 0/8/CPU0 is operational LC/0/8/CPU0:Dec 16 15:15:04.821 PST: sysmgr_log[69579]: %OS-SYSMGR-7-CHECK_LOG : /pkg/bin/sysmgr_debug_script invoked for: (pbr_ea) process did not signal EOI and for level:999 . Output is in /var/log/sysmgr_debug/debug.node0_8_CPU0.8134.log RP/0/RP1/CPU0:Dec 16 15:15:13.984 PST: ifmgr[263]: %OS-MMAP_PEER-7-CONNECT : Connect from process 6859 to 26065 skipped: Connection refused RP/0/RP1/CPU0:Dec 16 15:15:14.031 PST: fib_mgr[198]: %OS-MMAP_PEER-7-CONNECT : Connect from process 7217 to 27379 skipped: Connection refused LC/0/8/CPU0:Dec 16 15:16:34.881 PST: processmgr[51]: %OS-SYSMGR-7-DEBUG : Feature:span_ea span_ea[277] (pid 8155) has not sent proc-ready within 90 seconds LC/0/8/CPU0:Dec 16 15:16:34.909 PST: sysmgr_log[66057]: %OS-SYSMGR-7-CHECK_LOG : /pkg/bin/sysmgr_debug_script invoked for : (span_ea) process did not signal EOI Output is in /var/log/sysmgr_debug/debug.node0_8_CPU0.8708.log RP/0/RP1/CPU0:Dec 16 15:16:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 144.607s LC/0/8/CPU0:Dec 16 15:16:41.291 PST: fib_mgr[244]: %ROUTING-FIB-4-RETRYDB_NONEMPTY : 2054 FIB object(s) have been in IPv4 retry queue for at least 120 seconds RP/0/RP1/CPU0:Dec 16 15:16:47.778 PST: snmpd[1002]: %SNMP-SNMP-3-TRAP_DELAY_TIMER : Trap delay timer expired for Host 6.1.18.179:162 (Vrf default). Queued traps will be dropped RP/0/RP1/CPU0:Dec 16 15:17:01.701 PST: fib_mgr[198]: %OS-MMAP_PEER-7-CONNECT : Connect from process 7217 to 28340 skipped: Connection refused LC/0/8/CPU0:Dec 16 15:16:34.881 PST: processmgr[51]: %OS-SYSMGR-7-DEBUG : Feature:span_ea span_ea[277] (pid 8155) has not sent proc-ready within 90 seconds LC/0/8/CPU0:Dec 16 15:16:34.909 PST: sysmgr_log[66057]: %OS-SYSMGR-7-CHECK_LOG : /pkg/bin/sysmgr_debug_script invoked for : (span_ea) process did not signal EOI Output is in /var/log/sysmgr_debug/debug.node0_8_CPU0.8708.log RP/0/RP1/CPU0:Dec 16 15:16:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 144.607s LC/0/8/CPU0:Dec 16 15:16:41.291 PST: fib_mgr[244]: %ROUTING-FIB-4-RETRYDB_NONEMPTY : 2054 FIB object(s) have been in IPv4 retry queue for at least 120 seconds RP/0/RP1/CPU0:Dec 16 15:16:47.778 PST: snmpd[1002]: %SNMP-SNMP-3-TRAP_DELAY_TIMER : Trap delay timer expired for Host 6.1.18.179:162 (Vrf default). Queued traps will be dropped RP/0/RP1/CPU0:Dec 16 15:17:01.701 PST: fib_mgr[198]: %OS-MMAP_PEER-7-CONNECT : Connect from process 7217 to 28340 skipped: Connection refused LC/0/8/CPU0:Dec 16 15:17:12.191 PST: fib_mgr[244]: %ROUTING-FIB-4-RETRYDB_NONEMPTY : 1606 FIB object(s) have been in IPv6 retry queue for at least 120 seconds LC/0/8/CPU0:Dec 16 15:17:28.687 PST: fib_mgr[244]: %ROUTING-FIB-4-RETRYDB_NONEMPTY : 1659 FIB object(s) have been in Main retry queue for at least 120 seconds RP/0/RP1/CPU0:Dec 16 15:17:33.344 PST: fib_mgr[198]: %OS-MMAP_PEER-7-CONNECT : Connect from process 7217 to 28340 skipped: Connection refused RP/0/RP1/CPU0:Dec 16 15:17:33.500 PST: ntpd[364]: %IP-IP_NTP-5-HP_CONN_RECOVERED : High priority NTP peer connection recovered - Stratum 15->2. RP/0/RP1/CPU0:Dec 16 15:17:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 204.608s RP/0/RP0/CPU0:Dec 16 15:17:45.326 PST: fib_mgr[351]: %ROUTING-FIB-4-RETRYDB_NONEMPTY : 17 FIB object(s) have been in IPv4 retry queue for at least 240 seconds RP/0/RP1/CPU0:Dec 16 15:17:46.050 PST: isis[1003]: %ROUTING-ISIS-5-NSR_NOTIFY_READY : ISIS (core): ISIS NSR set nsr ready. rc 0 RP/0/RP1/CPU0:Dec 16 15:17:50.438 PST: fib_mgr[198]: %OS-MMAP_PEER-7-CONNECT : Connect from process 7217 to 28340 skipped: Connection refused RP/0/RP1/CPU0:Dec 16 15:17:56.090 PST: processmgr[51]: %OS-SYSMGR-3-ERROR : auth_service_exppubkey(1) (jid 1188) exited, will be respawned with a delay (slow-restart) RP/0/RP1/CPU0:Dec 16 15:17:56.090 PST: processmgr[51]: %OS-SYSMGR-3-ERROR : auth_service_exppubkey(1188) (fail count 7) will be respawned in 120 seconds RP/0/RP1/CPU0:Dec 16 15:18:32.828 PST: rt_check_mgr[1181]: %OS-RT_CHECK-4-NODE_FAILED : ipv6-mpls detected node(s) did not reply to background scan id 5 RP/0/RP1/CPU0:Dec 16 15:18:35.758 PST: mrib[1156]: %ROUTING-MRIB-4-GEN_WARN : Mroute dnld delayed,Repulse [bcdl_agent cid 0x19 prid 0x0] pending for 264.608s LC/0/3/CPU0:Dec 16 15:18:53.332 PST: npu_drvr[297]: %PLATFORM-OFA-4-OOR_YELLOW : NPU 0, Table iprte, Resource lpm_tcam LC/0/3/CPU0:Dec 16 15:18:53.333 PST: npu_drvr[297]: %PLATFORM-OFA-4-OOR_YELLOW : NPU 1, Table iprte, Resource lpm_tcam LC/0/3/CPU0:Dec 16 15:18:53.799 PST: npu_drvr[297]: %PLATFORM-OFA-5-OOR_GREEN : NPU 1, Table iprte, Resource lpm_tcam LC/0/3/CPU0:Dec 16 15:18:53.850 PST: npu_drvr[297]: %PLATFORM-OFA-5-OOR_GREEN : NPU 0, Table iprte, Resource lpm_tcam LC/0/1/CPU0:Dec 16 15:18:53.919 PST: npu_drvr[265]: %PLATFORM-OFA-4-OOR_YELLOW : NPU 0, Table iprte, Resource lpm_tcam LC/0/1/CPU0:Dec 16 15:18:53.936 PST: npu_drvr[265]: %PLATFORM-OFA-4-OOR_YELLOW : NPU 1, Table iprte, Resource lpm_tcam LC/0/1/CPU0:Dec 16 15:18:54.274 PST: npu_drvr[265]: %PLATFORM-OFA-5-OOR_GREEN : NPU 0, Table iprte, Resource lpm_tcam LC/0/1/CPU0:Dec 16 15:18:54.274 PST: npu_drvr[265]: %PLATFORM-OFA-5-OOR_GREEN : NPU 1, Table iprte, Resource lpm_tcam LC/0/8/CPU0:Dec 16 15:19:29.118 PST: ipv4_mfwd_partner[222]: %ROUTING-IPV4_MFWD_PLATFORM-7-ERR_GEN : RPF Interface 0xb28 (type: 25) missing from OFA, platform route operation will be retried, rc: (0x2f908002): failed : 'OFA' detected the 'informational' condition 'OFA Infra DB': No such file or directory LC/0/8/CPU0:Dec 16 15:19:29.118 PST: ipv4_mfwd_partner[222]: %ROUTING-IPV4_MFWD-4-FROM_MRIB_UPDATE : MFIB couldn't process update from MRIB : Unable to commit new (0xe0000000):(192.44.51.2,232.1.38.0/32) - Resource temporarily unavailable RP/0/RP1/CPU0:Dec 16 15:19:34.436 PST: bgp[1070]: %ROUTING-BGP-7-SEND_EOR_WAIT_EXPIRED : [18088] : Default wait timer expired for update-group 0.2 sub-group 0.1 for table TBL:default (25/70), Sending EOR now