Symptom
An ASR1002-X with 8 GB of memory running Version 15.1 becomes unresponsive within 5-6 days, after which the router crashes with a watchdog timeout. For testing, they upgraded to 15.4(3)S4 and then to 15.5(3)S1a, but the same issue occurred on both releases.
An IOSD memory leak is observed in the acl-handle process.
Logs captured during the issue, before the crash on Dec 2:
Dec 2 13:31:47.777 EST: %PLATFORM-3-ELEMENT_CRITICAL: SIP0: smand: RP/0: Committed Memory value 115% exceeds critical level 100%
CMD: 'show plat softw status contr br' 14:06:32 EST Wed Dec 2 2015
IOSXE-WATCHDOG: Process = Exec
.Dec 2 14:07:04.115 EST: %SCHED-0-WATCHDOG: Scheduler running for a long time, more than the maximum configured (120) secs.
-Traceback= 1#a3fe01abba2bac2871f0e4442db8a494 ld-linux-x86-64:7FED81E8A000+8A
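The committed-memory alert above is the earliest sign of the leak in these logs, so it can be useful to pull the percentage out of collected syslog and watch it over time. The following is a minimal Python sketch, not a Cisco tool: the function name `parse_committed_memory` is hypothetical, and the pattern is written only against the single message format shown above, so other variants of the message may need adjustments.

```python
import re

# Pattern matched against the one ELEMENT_CRITICAL line seen in this case.
# Assumption: other releases or severities may word the message differently.
COMMITTED_RE = re.compile(
    r"%PLATFORM-3-ELEMENT_CRITICAL: (?P<source>.+?): "
    r"Committed Memory value (?P<value>\d+)% "
    r"exceeds critical level (?P<limit>\d+)%"
)


def parse_committed_memory(line):
    """Return source, value and limit from a committed-memory alert, or None."""
    match = COMMITTED_RE.search(line)
    if match is None:
        return None
    return {
        "source": match.group("source"),
        "value": int(match.group("value")),
        "limit": int(match.group("limit")),
    }


sample = (
    "Dec 2 13:31:47.777 EST: %PLATFORM-3-ELEMENT_CRITICAL: SIP0: smand: "
    "RP/0: Committed Memory value 115% exceeds critical level 100%"
)
alert = parse_committed_memory(sample)
print(alert["source"], alert["value"], alert["limit"])
```

Lines that do not contain the alert return `None`, so the function can be applied to every line of a log file and the results filtered.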
Conditions
PfR is enabled on the head-end Master Controller (MC) and the Border Routers (BR).
Workaround
They disabled PfR on the head-end Master Controller and the Border Routers (the DMVPN hub routers). Memory usage has been stable since then.
--------------------------------------------------------------
NADC2-PfR-MC#sh pfr ma
OER state: disabled
NADC2R4-DMVPN-HUB2#sh pfr bo
OER BR 10.220.253.52 DISABLED, MC 172.27.0.186 UP/DOWN: DOWN
Conn Status: CLOSED
OER Netflow Status: ENABLED, PORT: 3949
Version: 3.3 MC Version: 0.0
Nbar Status: Inactive
Exits
NADC1R2-DMVPN-HUB1#sh pfr bo
OER BR 10.220.253.51 DISABLED, MC 172.27.0.186 UP/DOWN: DOWN
Conn Status: RETRY
OER Netflow Status: ENABLED, PORT: 3949
Version: 3.3 MC Version: 0.0
Nbar Status: Inactive
Exits
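When the workaround has to be confirmed on several hub routers, the border-router output above can be checked by script rather than by eye. This is a minimal Python sketch under stated assumptions: the names `parse_border_status` and `pfr_is_disabled` are hypothetical, the patterns are written only against the output lines shown above, and how the text is collected from the router is left out.

```python
import re

# Patterns matched against the border-router output shown in this case.
# Assumption: other releases may format these lines differently.
BR_LINE = re.compile(
    r"OER BR (?P<border>\S+) (?P<state>\w+), "
    r"MC (?P<master>\S+) UP/DOWN: (?P<link>\w+)"
)
CONN_LINE = re.compile(r"Conn Status: (?P<conn>\w+)")


def parse_border_status(output):
    """Extract BR address, BR state, MC address, MC link and Conn Status."""
    br = BR_LINE.search(output)
    if br is None:
        return None
    status = br.groupdict()
    conn = CONN_LINE.search(output)
    status["conn"] = conn.group("conn") if conn else None
    return status


def pfr_is_disabled(status):
    """True when the BR reports DISABLED and the MC session is DOWN."""
    return status["state"] == "DISABLED" and status["link"] == "DOWN"


hub2_output = """\
OER BR 10.220.253.52 DISABLED, MC 172.27.0.186 UP/DOWN: DOWN
Conn Status: CLOSED
OER Netflow Status: ENABLED, PORT: 3949
"""
hub2 = parse_border_status(hub2_output)
print(hub2["border"], hub2["state"], hub2["conn"], pfr_is_disabled(hub2))
```

The check deliberately ignores the Conn Status value, since the two hubs above report CLOSED and RETRY respectively while both show the border router as DISABLED.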