Symptom
ISIS microloop avoidance is not triggered by a local link down or BFD down event.
Conditions
The problem occurs under specific timing conditions, when an updated LSP triggers SPF before
the link or BFD down event is detected locally.
The problem is more likely to happen under the following conditions:
- extremely aggressive initial LSP generation and SPF timers (for example, both set to 1 ms)
- the event happens on the remote side (interface shutdown, etc.) and local failure detection is slow (copper interfaces, subinterfaces with slow BFD, L1/L2 infrastructure between the routers, etc.)
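The race described above can be illustrated with a rough timing sketch. This is not the router implementation: the function name, the simple additive model (LSP generation delay + flooding delay + SPF initial delay) and all numeric values are hypothetical and chosen only for illustration.

```python
def spf_precedes_local_detection(lsp_gen_delay_ms, flood_delay_ms,
                                 spf_initial_delay_ms, local_detect_ms):
    """Return True when SPF, triggered by the remote LSP, would start
    before the local link/BFD down event is detected.

    All times are in milliseconds, measured from the remote failure event.
    The additive model is an assumption made for this sketch.
    """
    spf_start_ms = lsp_gen_delay_ms + flood_delay_ms + spf_initial_delay_ms
    return spf_start_ms < local_detect_ms


# Hypothetical case: 1 ms LSP generation and SPF timers, 5 ms flooding,
# slow local detection (for example BFD at 3 x 300 ms = 900 ms).
aggressive = spf_precedes_local_detection(1, 5, 1, 900)

# Hypothetical case: 50 ms timers and fast local detection (10 ms).
conservative = spf_precedes_local_detection(50, 5, 50, 10)

print(aggressive, conservative)
```

In the first case SPF starts well before the local event is detected, which is the situation in which microloop avoidance is not triggered. In the second case the local trigger arrives first.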
Workaround
1/ Avoid using aggressive LSP generation and SPF timers.
2/ Use "microloop avoidance segment-routing", which is a more advanced solution with the following benefits:
- works for both local and remote (not directly connected) links
- supports more triggers: link down, link up, metric change, overload bit set/clear
- has a more advanced detection mechanism that ignores LSP changes that do not affect the topology
Further Problem Description
This fix does not cover the case where SPF is first triggered by a remote LSP change without the local trigger having been received. For regular microloop avoidance, the local trigger (interface or BFD down) must be received before SPF starts. Otherwise, the FRR backup paths are not activated in hardware, and delaying the RIB update would only slow down network convergence.
If this case needs to be covered, use segment-routing microloop avoidance.