...
During an MPLS core link flap or other network changes, IPNHGROUP, MPLSNH, or other OFA objects may remain in a delete pending state because their child objects were not cleanly deleted. As a result, the corresponding hardware objects (ECMP FECs, FECs, and ENCAPs) are not deleted properly. The leaked entries accumulate and lead to an out-of-resource (OOR) condition, which may show up as HW programming failure logs, FIB-related tracebacks, and similar errors, and ultimately cause traffic drops.
The issue is seen when the BGP PIC feature is used with primary and backup BGP next hops, but other conditions may also trigger it.
To cleanly delete the objects, a reload of the line card or router is recommended.
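Before resorting to a reload, it can help to confirm which IPNHGROUP objects are actually stuck. The following is a minimal sketch, not an official tool: it assumes the plain-text layout of the `dpa_ipnhgroup_show_client` dump shown in the identification section of this note (the `pending(cr/up/dl)` counters and the `fec_id` field), and that layout may differ between releases. The function names `parse_elements` and `find_delete_pending` are hypothetical.

```python
import re

# Patterns derived from the sample dump in this note (assumed layout).
ELEMENT_RE = re.compile(r"ipnhgroup element \d+ \(hdl:(0x[0-9a-fA-F]+)\)")
PENDING_RE = re.compile(r"pending\(cr/up/dl\):(\d+)/(\d+)/(\d+)")
# \b keeps this from matching inside "ecmp_base_fec_id".
FEC_RE = re.compile(r"\bfec_id\s*=>\s*(0x[0-9a-fA-F]+)")


def parse_elements(dump_text):
    """Split the dump into one record per 'ipnhgroup element' block."""
    elements = []
    for line in dump_text.splitlines():
        m = ELEMENT_RE.search(line)
        if m:
            elements.append({"hdl": m.group(1), "pending_dl": 0, "fec_id": None})
        if not elements:
            continue  # skip any preamble before the first element
        cur = elements[-1]
        m = PENDING_RE.search(line)
        if m:
            # Third counter is the number of pending deletes (cr/up/dl).
            cur["pending_dl"] = int(m.group(3))
        m = FEC_RE.search(line)
        if m and cur["fec_id"] is None:
            cur["fec_id"] = m.group(1)
    return elements


def find_delete_pending(dump_text):
    """Return (handle, fec_id) for every element with a pending delete."""
    return [
        (e["hdl"], e["fec_id"])
        for e in parse_elements(dump_text)
        if e["pending_dl"] > 0
    ]
```

A possible use is to run it over the saved dump, for example `find_delete_pending(open("ipnh_dump").read())`, and compare the reported FEC IDs against the HW programming failure logs.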
ipnhgroup issue identification
==============================

- Verify the ipnhgroup count in the ipnhgroup dump:

    [node0_RP0_CPU0:~]$dpa_ipnhgroup_show_client -c
    Table [IPNHGROUP] has 57 entries in DB
    Table [IPNHGROUP] had 208 as highest count @ Wed Nov 2 04:21:32 2022
    [node0_RP0_CPU0:~]$
    [node0_RP0_CPU0:~]$dpa_ipnhgroup_show_client > ipnh_dump

- This file is also available from show tech ofa: unzip the ofa_show_objs_node0/x/y file.

- During problem scenarios, the child objects of this nhgroup are in delete pending state, hence the ecmpfec is not deleted cleanly:

    ipnhgroup element 0 (hdl:0x8c1534e8):
     base
       |-- dpd_slf - pending(cr/up/dl):0/0/0, sibling:0x87ac5be8, child:0, num_parents:1, visits:0   <<<<< 0/0/1 in failure scenarios
             color_mask:0, last_bwalk_id:0 num_bwalks_started:0
       |-- flag - 0
       |-- keylen - 12
       |-- trans_id - 410
       |-- create_trans_id - 410
       |-- obj_handle - 0x8c1534e8
       |-- obj_rc - 0x0
       |-- reason - 0
       |-- table_operation - 6
       |-- total_obj_size - 24944
       |-- idempotent - 0
       |-- inflight - 0
       |-- table_prop - jid=223 mtime=(GMT)2022.Nov.02 04:13:48.590004
       |-- (cont'd) - replayed=0times
       `-- obj_rc - 0:Success   <<<<<<<<<<<<< 'OFA DB delete pending' in failure scenarios
     ofa_npu_id_t npu_id => 0
     uint32_t num_paths => 1
     uint32_t obj_ref_cnt => 1
     uint64_t object_id => 0
     ofa_if_t fec_id => 0x2000ffe5   <<<<<<<<<<<<< ecmp fec (starting with 20000)
     ofa_if_t ecmp_base_fec_id => (not set)
     int num_primary_paths => 1
     int num_bkup_paths => 0
     ofa_bool_t is_cascaded => (not set)
     uint8_t force_fec_level => (not set)
     uint8_t force_ecmp => 0
     uint8_t indirection_ecmp_fec => 0   <<<<<<<<< indirection is 1 in failure scenarios

The following information is related to the MPLSNH issue
========================================================

The encap (for example, 0x400158ca in the triage traces) was programmed in the HW for label 0x427f. CEF triggered the delete event to OFA. OFA marked the MPLSNH object as delete pending because the child object IPNHGROUP-1 was still referring to the MPLSNH (as its parent).
IPNHGROUP-1 also received a delete event, but OFA marked it as delete pending because it was still referenced by IPNHGROUP-2. There was no delete event for IPNHGROUP-2. Hence the resources (encap and FEC) associated with the parent objects, IPNHGROUP-1 and MPLSNH-1, were not freed and remained held in the HW. When the grid later allocated the same encap for another purpose, programming it failed: the SDK returned an error because the entry was already present in the HW with different content.

Log analysis showed that IPNHGROUP-2 was the FEC associated with the RSHLDI. During a modify with in_place_modify=0, the RSHLDI creates a new LDI_KEY and programs the new FEC, but it missed deleting the old FEC. As a result, the FEC associated with the old key was left hanging in OFA, and its parent objects (FEC/encap) were not freed even though they were marked for deletion.

The fix: during an RSHLDI modify with in_place_modify FALSE, if old_eng_ctx has a valid FEC and working_eng_ctx does not have the FEC, then after programming the new FEC, delete the FEC associated with old_eng_ctx and clear the fec_key in old_eng_ctx to all zeros. When the parent objects such as IPNHGROUP or MPLSNH then get a delete event, they free their resources in the HW.
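
The reference chain and the effect of the fix can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the OFA or RSHLDI code: the `Obj` class, the dictionary-based contexts, and the `rshldi_modify` and `run_scenario` functions are hypothetical stand-ins, and "freeing" an object here simply flips a flag instead of releasing a HW FEC or encap.

```python
class Obj:
    """Toy OFA object: it cannot be freed while a child still refers to it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent        # object this one refers to
        self.children = 0           # live objects referring to this one
        self.delete_pending = False
        self.freed = False
        if parent is not None:
            parent.children += 1

    def delete(self):
        """Handle a delete event for this object."""
        if self.freed:
            return
        if self.children > 0:
            self.delete_pending = True   # still referenced: defer the free
            return
        self._free()

    def _free(self):
        self.freed = True
        self.delete_pending = False
        parent = self.parent
        if parent is not None:
            parent.children -= 1
            # A parent waiting in delete pending can now be released too.
            if parent.delete_pending and parent.children == 0:
                parent._free()


def rshldi_modify(old_ctx, working_ctx, apply_fix):
    """Modify with in_place_modify FALSE: program a new FEC under a new key."""
    needs_cleanup = old_ctx["fec"] is not None and working_ctx["fec"] is None
    working_ctx["fec"] = Obj("NEW-FEC")          # program the new FEC
    if apply_fix and needs_cleanup:
        old_ctx["fec"].delete()                  # delete the old FEC
        old_ctx["fec"] = None                    # stands in for zeroing fec_key


def run_scenario(apply_fix):
    """Return the names of objects left in delete pending state."""
    mplsnh = Obj("MPLSNH-1")
    grp1 = Obj("IPNHGROUP-1", parent=mplsnh)
    grp2 = Obj("IPNHGROUP-2", parent=grp1)       # old FEC held by the RSHLDI
    old_ctx, working_ctx = {"fec": grp2}, {"fec": None}

    rshldi_modify(old_ctx, working_ctx, apply_fix)
    mplsnh.delete()                              # delete events from CEF
    grp1.delete()
    return [o.name for o in (mplsnh, grp1, grp2) if o.delete_pending]
```

In this model, `run_scenario(apply_fix=False)` leaves MPLSNH-1 and IPNHGROUP-1 in delete pending state because IPNHGROUP-2 is never deleted, while `run_scenario(apply_fix=True)` leaves nothing pending: deleting the old FEC during the modify removes the last reference, so the later delete events can free the chain.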