Symptom
Traffic between pods might be blackholed when all spine switches are reloaded or upgraded within a window of less than 30 minutes.
Conditions
This issue occurs if a second spine switch is reloaded or upgraded shortly after the first spine switch.
When Spine1 is reloaded, it comes up with overload set for both ISIS and OSPF, and it advertises the worst (maximum) metric while it downloads policies and programs the hardware. ISIS then becomes operational, but OSPF continues to advertise max-metric for 15 more minutes. If the second spine switch is upgraded or reloaded during this period, both switches advertise max-metric at the same time, which causes connectivity issues.
Workaround
Wait until OSPF returns to the normally calculated metrics on Spine1. To confirm this, use the "show ip ospf database detail" command on the IPN router. Start the Spine2 upgrade only when Spine1 no longer advertises max-metric.
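
The check in this workaround can be scripted against saved command output. The following is a minimal sketch, not a supported tool. It assumes that the output of "show ip ospf database detail" contains an "Advertising Router: <router-id>" line for each LSA, followed by per-link lines that contain "metric: <value>"; verify this against the actual output of your IPN router before relying on it. The function names and the sample router IDs are hypothetical. The value 65535 (0xFFFF) is the link metric that RFC 6987 defines for max-metric router-LSAs.

```python
import re

# 0xFFFF: link metric advertised in a max-metric router-LSA (RFC 6987).
MAX_LINK_METRIC = 65535

# Assumed line formats; adjust to match the real CLI output.
ADV_ROUTER_RE = re.compile(r"Advertising Router:\s*(\S+)", re.IGNORECASE)
METRIC_RE = re.compile(r"\bmetric:\s*(\d+)", re.IGNORECASE)


def parse_link_metrics(output):
    """Map each advertising router ID to the list of link metrics it advertises."""
    metrics = {}
    current = None
    for line in output.splitlines():
        match = ADV_ROUTER_RE.search(line)
        if match:
            current = match.group(1)
            metrics.setdefault(current, [])
            continue
        match = METRIC_RE.search(line)
        if match and current is not None:
            metrics[current].append(int(match.group(1)))
    return metrics


def safe_to_reload_next_spine(output, spine1_router_id):
    """Return True only if Spine1 advertises links and none is at max-metric.

    A router ID with no LSA or no link metrics in the output is treated as
    not safe, because Spine1 may not have re-established OSPF yet.
    """
    link_metrics = parse_link_metrics(output).get(spine1_router_id, [])
    if not link_metrics:
        return False
    return MAX_LINK_METRIC not in link_metrics
```

For example, save the IPN router output to a file, read it into a string, and call safe_to_reload_next_spine(text, "<Spine1 router ID>"). Start the Spine2 upgrade only when the function returns True.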